lucene-dev mailing list archives

From <>
Subject Now, a lost data problem with trunk too
Date Tue, 14 Sep 2010 08:37:48 GMT
Hi folks,

It looks like the handle leak may be real - Simon Willnauer has been looking at it and could
not find an explanation for the behavior I have been seeing.  But before we got too far on
that problem, I encountered what appears to be an even more serious problem.  Specifically,
I'm losing field data out of some records.

The index I'm building is fairly large - some 25M records when complete.  What I'm seeing
is that searches on the main searchable field ("value") do not find all the records they
should.  I was able to locate one such record just now:

curl "http://localhost:8983/solr/nose/standard?fl=*,score&q=id:\"POI|DEU:205:20187477:1014564|brandenburger+tor\""
<?xml version="1.0" encoding="UTF-8"?>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">95</int>
  <lst name="params">
    <str name="q">id:"POI|DEU:205:20187477:1014564|brandenburger tor"</str>
    <str name="fl">*,score</str>
  </lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="17.335964">
  <doc>
    <float name="score">17.335964</float>
    <str name="entityid">POI|DEU:205:20187477:1014564|brandenburger tor</str>
    <str name="id">POI|DEU:205:20187477:1014564|brandenburger tor</str>
    <str name="reference">brandenburger tor, potsdam, deutschland</str>
    <str name="type">poi</str>
    ...
  </doc>
</result>

... but it is completely missing the supposedly required "value" field, which the schema declares as:

   <!-- The value field.  This contains the actual string that will be matched.-->
   <field name="value" type="string_idx"  required="true" stored="false"/>

The code that does the indexing is straightforward, and *some* of the records of this class
are indeed searchable via the "value" field, but others aren't.  I know the "value" field
is non-empty, because it is used to construct the "id" field, which is correct above.

Simon is also looking into this one, but if anyone else has advice for figuring out what's
going wrong, please let me know.  FWIW, this is a trunk build from Monday morning.

