lucene-dev mailing list archives

From "Niall O'Connor (Commented) (JIRA)" <>
Subject [jira] [Commented] (SOLR-2094) When using a XPathEntityProcessor nested within a SQLEntityProcessor, the xpathReader isn't reinitilized for each new document
Date Wed, 07 Mar 2012 21:36:58 GMT


Niall O'Connor commented on SOLR-2094:

in dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/

I removed the "if (xpathReader == null)" check from the init method of XPathEntityProcessor
and rebuilt the package, so that the XPathReader is re-initialized on every call to init.

I didn't commit this change since there was no activity on this issue. 
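
The effect of that null check can be illustrated with a small self-contained sketch. This is not Solr code: SketchProcessor, XPathReinitSketch, and the sample tissue codes "A1" and "B2" are hypothetical names invented for illustration, and the sketch assumes that the ${...} variable in the xpath is resolved at the moment the reader is built, which is what the symptoms in the report suggest.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for XPathEntityProcessor; NOT the real Solr class.
class SketchProcessor {
    private final String xpathTemplate;
    private final boolean guardWithNullCheck;
    // Plays the role of xpathReader: built from the template plus the parent row.
    private String resolvedXpath;

    SketchProcessor(String xpathTemplate, boolean guardWithNullCheck) {
        this.xpathTemplate = xpathTemplate;
        this.guardWithNullCheck = guardWithNullCheck;
    }

    // Called once per parent row, mirroring init being called for each document.
    void init(Map<String, String> parentRow) {
        // With the guard, the value is built only on the first call and then reused.
        if (!guardWithNullCheck || resolvedXpath == null) {
            resolvedXpath = resolve(parentRow);
        }
    }

    // Assumption: variables are substituted when the reader is built.
    private String resolve(Map<String, String> parentRow) {
        String out = xpathTemplate;
        for (Map.Entry<String, String> e : parentRow.entrySet()) {
            out = out.replace("${" + e.getKey() + "}", e.getValue());
        }
        return out;
    }

    String resolvedXpath() {
        return resolvedXpath;
    }
}

class XPathReinitSketch {
    // Runs two parent rows through one processor and records the xpath used for each.
    static List<String> run(boolean guardWithNullCheck) {
        SketchProcessor p = new SketchProcessor(
            "/tissue-samples/tissue[@fma-id='${Sequence.sampleTissueCode}']/parent-path",
            guardWithNullCheck);
        List<String> seen = new ArrayList<>();
        for (String code : new String[] {"A1", "B2"}) {
            p.init(Collections.singletonMap("Sequence.sampleTissueCode", code));
            seen.add(p.resolvedXpath());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("with null check:    " + run(true));
        System.out.println("without null check: " + run(false));
    }
}
```

With the guard in place, both rows end up with the xpath built for the first row; without it, each row gets its own. The trade-off is that the reader is rebuilt for every parent row, which may be the reason the original code caches it.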
> When using a XPathEntityProcessor nested within a SQLEntityProcessor, the xpathReader isn't reinitilized for each new document
> -------------------------------------------------------------------------------------------------------------------------------
>                 Key: SOLR-2094
>                 URL:
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.4.1
>         Environment: Solr 1.4
>            Reporter: Niall O'Connor
> I have a DIH config with a SqlEntityProcessor that retrieves a table. I then have a sub-entity
> of type XPathEntityProcessor, which takes a value from the table as input for parsing
> an XML doc.
> I find that the first document is created correctly, but the xpathReader of the
> XPathEntityProcessor is not reinitialized for the following documents, so the first document's
> input is used for all of them.
> <dataSource name="hivseqdb" driver="com.mysql.jdbc.Driver"
>             url="l"
>             user="hivseqdb" password="hivseqdb" batchSize="1"/>
> <dataSource name="xmlFile" type="FileDataSource"/>
> <document>
>     <entity name="Sequence" dataSource="hivseqdb" pk="se_id"
>             query="SELECT * FROM hivseqdb.sequenceentry where se_id != '1'">
>         <entity name="FMA_Tissue_Hierarchy"
>                 dataSource="xmlFile"
>                 pk="fma-id"
>                 forEach="/tissue-samples"
>                 processor="XPathEntityProcessor"
>                 url="/opt/hivseqdb/solr/conf/sub_ontology_translated.xml"
>                 stream="true">
>             <field column="tissue-antology-parent-path" xpath="/tissue-samples/tissue[@fma-id='${Sequence.sampleTissueCode}']/parent-path"/>
>         </entity>
>     </entity>
> </document>
> DocBuilder does call init on the XPathEntityProcessor, but there is a conditional in the
> init method that checks whether the xpathReader is null:
>   public void init(Context context) {
>     super.init(context);
>     if (xpathReader == null)
>       initXpathReader();
>     pk = context.getEntityAttribute("pk");
>     dataSource = context.getDataSource();
>     rowIterator = null;
>   }
> So the xpathReader is used again and again. Is there a way to reinitialize the xpathReader
> for every document? Or what is the specific design reason for preserving it?
