lucene-dev mailing list archives

From "Niall O'Connor (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2094) When using a XPathEntityProcessor nested within a SQLEntityProcessor, the xpathReader isn't reinitilized for each new document
Date Wed, 07 Mar 2012 21:36:58 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224735#comment-13224735 ]

Niall O'Connor commented on SOLR-2094:
--------------------------------------

In dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/XPathEntityProcessor.java:

I removed the "if (xpathReader == null)" check from the init method of XPathEntityProcessor and
rebuilt the package, so that the XPathReader is re-initialized on every call to init.

I didn't commit this change since there was no activity on this issue. 
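
For readers following the thread, here is a minimal, self-contained sketch of the effect of that guard. It is a toy model, not the Solr code: ToyProcessor, Reader, the run helper and the tissue codes "A1" and "B2" are all hypothetical names chosen for illustration. It also assumes that the ${...} placeholder in the xpath is resolved at the moment the reader is built, which is what the reported behavior suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class XpathReaderInitSketch {

    /** Hypothetical stand-in for a reader whose xpath is fixed when it is built. */
    static final class Reader {
        final String resolvedXpath;

        Reader(String resolvedXpath) {
            this.resolvedXpath = resolvedXpath;
        }
    }

    /** Toy processor that mimics the guarded init; NOT the real Solr class. */
    static final class ToyProcessor {
        private final String xpathTemplate;
        private final boolean rebuildOnEveryInit;
        private Reader reader;

        ToyProcessor(String xpathTemplate, boolean rebuildOnEveryInit) {
            this.xpathTemplate = xpathTemplate;
            this.rebuildOnEveryInit = rebuildOnEveryInit;
        }

        /** Called once per parent row, as init is called per document in the report. */
        void init(Map<String, String> parentRow) {
            // With rebuildOnEveryInit == false this is the "if (reader == null)" guard.
            if (rebuildOnEveryInit || reader == null) {
                String xpath = xpathTemplate;
                // Assumption: placeholders are substituted when the reader is created.
                for (Map.Entry<String, String> e : parentRow.entrySet()) {
                    xpath = xpath.replace("${" + e.getKey() + "}", e.getValue());
                }
                reader = new Reader(xpath);
            }
        }

        String currentXpath() {
            return reader.resolvedXpath;
        }
    }

    /** Runs one init per tissue code and records the xpath in effect for each row. */
    static List<String> run(boolean rebuildOnEveryInit, List<String> tissueCodes) {
        ToyProcessor processor = new ToyProcessor(
            "/tissue-samples/tissue[@fma-id='${Sequence.sampleTissueCode}']/parent-path",
            rebuildOnEveryInit);
        List<String> seen = new ArrayList<>();
        for (String code : tissueCodes) {
            processor.init(Map.of("Sequence.sampleTissueCode", code));
            seen.add(processor.currentXpath());
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> codes = List.of("A1", "B2"); // hypothetical parent-row values
        System.out.println("guarded init: " + run(false, codes));
        System.out.println("rebuilt init: " + run(true, codes));
    }
}
```

With the guard in place, the second row still sees the xpath resolved for the first row. Rebuilding on every init picks up the new value, at the cost of constructing a reader per document.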
 
                
> When using a XPathEntityProcessor nested within a SQLEntityProcessor, the xpathReader
isn't reinitilized for each new document 
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2094
>                 URL: https://issues.apache.org/jira/browse/SOLR-2094
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.4.1
>         Environment: Solr 1.4
>            Reporter: Niall O'Connor
>
> I have a DIH config with a SqlEntityProcessor that retrieves a table. I then have a sub-entity
of type XPathEntityProcessor, which takes a value from the table as input for parsing
an XML doc.
> I find that the first document is created correctly, but the xpathReader of the
XPathEntityProcessor is not reinitialized for the following documents, so the first document's
input is reused.
> <dataSource name="hivseqdb" driver="com.mysql.jdbc.Driver"
>             url="l"
>             user="hivseqdb" password="hivseqdb" batchSize="1"/>
>
> <dataSource name="xmlFile" type="FileDataSource" />
>
> <document>
>   <entity name="Sequence" dataSource="hivseqdb" pk="se_id"
>           query="SELECT * FROM hivseqdb.sequenceentry where se_id != '1'">
>     <entity name="FMA_Tissue_Hierarchy"
>             dataSource="xmlFile"
>             pk="fma-id"
>             forEach="/tissue-samples"
>             processor="XPathEntityProcessor"
>             url="/opt/hivseqdb/solr/conf/sub_ontology_translated.xml"
>             stream="true">
>       <field column="tissue-antology-parent-path" xpath="/tissue-samples/tissue[@fma-id='${Sequence.sampleTissueCode}']/parent-path"/>
>     </entity>
>   </entity>
> </document>
> DocBuilder does call init on the XPathEntityProcessor, but there is a conditional in the
init method that checks whether the xpathReader is null:
>   public void init(Context context) {
>     super.init(context);
>     if (xpathReader == null)
>       initXpathReader();
>     pk = context.getEntityAttribute("pk");
>     dataSource = context.getDataSource();
>     rowIterator = null;
>   }
> So the xpathReader is reused again and again. Is there a way to reinitialize the xpathReader
for every document? Or what is the specific design reason for preserving it?
> 		
> 		

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


