lucene-solr-user mailing list archives

From Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@corp.aol.com>
Subject Re: pk vs. uniqueKey with DIH delta-import
Date Thu, 18 Jun 2009 09:00:51 GMT
I have raised an issue and fixed it:
https://issues.apache.org/jira/browse/SOLR-1228
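
As an untested sketch of the point quoted below: assuming the delete step
looks up the column named by pk ("board_id") in each row returned by
deletedPkQuery, aliasing the concatenated value back to that name would make
the query return board-<some-id> under the expected column. The alias is an
assumption, not something confirmed in this thread, and whether the delete
then matches Solr's uniqueKey may depend on the DIH version in use.

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/db"/>
  <document name="tests">
    <!-- Assumption: deleted rows are read by the pk column name, so the
         templated id value is aliased to "board_id" in deletedPkQuery. -->
    <entity name="test"
            pk="board_id"
            transformer="TemplateTransformer"
            deletedPkQuery="select concat('board-', board_id) as board_id from boards where deleted = 'Y'"
            query="select * from boards where deleted = 'N'"
            deltaImportQuery="select * from boards where deleted = 'N'"
            deltaQuery="select * from boards where deleted = 'N'"
            preImportDeleteQuery="datasource:board">
      <field column="id" template="board-${test.board_id}"/>
      <field column="datasource" template="board"/>
      <field column="title" />
    </entity>
  </document>
</dataConfig>
```

Only the deletedPkQuery differs from the config quoted below; every other
attribute is copied unchanged.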

2009/6/18 Noble Paul നോബിള്‍  नोब्ळ् <noble.paul@corp.aol.com>:
> Apparently the row returns a null 'board_id'.
>
> Your stack trace suggests this. Even once that is fixed, I guess it may not
> work, because you are storing the id as
>
> board-${test.board_id}
>
> and unless your query returns something like board-<some-id>, it may
> not work for you.
>
> Anyway, I shall put in a fix in DIH to avoid this NPE.
>
> On Thu, Jun 18, 2009 at 2:17 AM, Erik Hatcher<erik@ehatchersolutions.com> wrote:
>> First - DIH has worked pretty well in a new customer engagement of ours.
>>  We've easily imported tens of millions of records with no problem.  Kudos
>> to the developers/contributors to DIH - it got us up and running quickly.
>>  But now we're delving into more complexities and having some issues.
>>
>> Now on to my current issue, doing a delta-import such that records marked as
>> "deleted" in the database are removed from Solr using deletedPkQuery.
>>
>> Here's a config I'm using against a mocked test database:
>>
>> <dataConfig>
>>  <dataSource driver="com.mysql.jdbc.Driver"
>> url="jdbc:mysql://localhost/db"/>
>>  <document name="tests">
>>    <entity name="test"
>>            pk="board_id"
>>            transformer="TemplateTransformer"
>>            deletedPkQuery="select board_id from boards where deleted = 'Y'"
>>            query="select * from boards where deleted = 'N'"
>>            deltaImportQuery="select * from boards where deleted = 'N'"
>>            deltaQuery="select * from boards where deleted = 'N'"
>>            preImportDeleteQuery="datasource:board">
>>      <field column="id" template="board-${test.board_id}"/>
>>      <field column="datasource" template="board"/>
>>      <field column="title" />
>>    </entity>
>>  </document>
>> </dataConfig>
>>
>> Note that the uniqueKey in Solr is the "id" field.  And its value is a
>> template board-<PK>.
>>
>> I noticed that the javadoc comment on DocBuilder#collectDelta says "Note: In
>> our definition, unique key of Solr document is the primary key of the top
>> level entity".  This of course isn't really an appropriate assumption.
>>
>> I also tried a deletedPkQuery of "select concat('board-',board_id) from
>> boards where deleted = 'Y'", but got an NPE (relevant stack trace below).
>>
>> It seems that deletedPkQuery only works if the pk and Solr's uniqueKey field
>> use the same value.  Is that the case?  If this is the case we'll need to
>> fix this somehow.  Any suggestions?
>>
>> Thanks,
>>        Erik
>>
>> stack trace from scenario mentioned above:
>> SEVERE: Delta Import Failed
>> java.lang.NullPointerException
>>        at org.apache.solr.handler.dataimport.SolrWriter.deleteDoc(SolrWriter.java:83)
>>        at org.apache.solr.handler.dataimport.DocBuilder.deleteAll(DocBuilder.java:275)
>>        at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:247)
>>        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
>>        at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:337)
>>
>>
>
>
>
> --
> -----------------------------------------------------
> Noble Paul | Principal Engineer| AOL | http://aol.com
>



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com
