lucene-solr-dev mailing list archives

From Ryan McKinley <ryan...@gmail.com>
Subject Re: lucene 2.9 migration issues -- MultiReader vs IndexReader document ids
Date Thu, 23 Apr 2009 23:37:35 GMT
thanks!


On Apr 23, 2009, at 6:32 PM, Mark Miller wrote:

> Looks like it's my fault. Auto resolution was moved up to
> IndexSearcher in Lucene, and it looks like SolrIndexSearcher is not
> tickling it first. I'll take a look.
>
> - Mark
>
> Ryan McKinley wrote:
>> Ok, not totally resolved...
>>
>> Things work fine when I have my custom Filter alone or with other
>> Filters; however, if I add a query string to the mix it breaks with
>> an IllegalStateException:
>>
>> java.lang.IllegalStateException: Auto should be resolved before now
>>    at org.apache.lucene.search.FieldSortedHitQueue$1.createValue(FieldSortedHitQueue.java:216)
>>    at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:73)
>>    at org.apache.lucene.search.FieldSortedHitQueue.getCachedComparator(FieldSortedHitQueue.java:168)
>>    at org.apache.lucene.search.FieldSortedHitQueue.<init>(FieldSortedHitQueue.java:58)
>>    at org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1214)
>>    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:924)
>>    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:345)
>>    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:171)
>>    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>>    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>>    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1330)
>>    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
>>
>> This is for a query:
>>  /solr/flat/select?q=SGID&bounds=-144 2.4 -72 67 WITHIN
>> bounds=XXX triggers my custom filter to kick in.
>>
>> Any thoughts on where to look?  This error is new since upgrading
>> the Lucene libs (in recent Solr).
>>
>> Thanks!
>> ryan
>>
>>
>> On Apr 20, 2009, at 7:14 PM, Ryan McKinley wrote:
>>
>>> thanks!
>>>
>>> everything got better when I removed my logic to cache based on  
>>> the index modification time.
>>>
>>>
>>> On Apr 20, 2009, at 4:51 PM, Yonik Seeley wrote:
>>>
>>>> On Mon, Apr 20, 2009 at 4:17 PM, Ryan McKinley  
>>>> <ryantxu@gmail.com> wrote:
>>>>> This issue started on java-user, but I am moving it to solr-dev:
>>>>> http://www.lucidimagination.com/search/document/46481456bc214ccb/bitset_filter_arrayindexoutofboundsexception
>>>>>
>>>>> I am using solr trunk and building an RTree from stored document  
>>>>> fields.
>>>>> This process worked fine until a recent change in 2.9 that uses a
>>>>> different document id strategy than I was used to.
>>>>>
>>>>> In that thread, Yonik suggested:
>>>>> - pop back to the top level from the sub-reader, if you really  
>>>>> need a single
>>>>> set
>>>>> - if a set-per-reader will work, then cache per segment (better  
>>>>> for
>>>>> incremental updates anyway)
>>>>>
>>>>> I'm not quite sure what you mean by a "set-per-reader".
>>>>
>>>> I meant RTree per reader (per segment reader).
>>>>
>>>>> Previously I was building a single RTree and using it until the
>>>>> last modified time had changed.  This avoided rebuilding the tree
>>>>> anytime a new reader was opened and the index had not changed.
>>>>
>>>> I *think* that our use of re-open will return the same IndexReader
>>>> instance if nothing has changed... so you shouldn't have to do
>>>> that yourself.
>>>>
>>>>> I'm fine building a new RTree for each reader if
>>>>> that is required.
>>>>
>>>> If that works just as well, it will put you in a better position  
>>>> for
>>>> faster incremental updates... new RTrees will be built only for  
>>>> those
>>>> segments that have changed.
>>>>
>>>>> Is there any existing code that deals with this situation?
>>>>
>>>> To cache an RTree per reader, you could use the same logic as
>>>> FieldCache uses... a weak map with the reader as the key.
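That FieldCache-style pattern (a weak map keyed on the reader) can be sketched generically; the names below are mine, not Lucene's, and "RTree" is a stand-in for whatever per-segment structure you build:

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical per-reader cache, mirroring the pattern Yonik describes:
// a weak map keyed on the (segment) reader, so a cached entry is dropped
// automatically once its reader is closed and garbage collected.
class PerReaderCache<K, V> {
    private final Map<K, V> cache =
        Collections.synchronizedMap(new WeakHashMap<K, V>());

    interface Builder<K, V> {
        V build(K key);
    }

    V get(K reader, Builder<K, V> builder) {
        V value = cache.get(reader);
        if (value == null) {
            // Rebuild only for readers we have not seen (new/changed segments).
            value = builder.build(reader);
            cache.put(reader, value);
        }
        return value;
    }
}
```

Because the map holds the reader weakly, discarding a segment reader lets its cached tree be collected along with it, which is what makes the per-segment approach cheap for incremental updates.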
>>>>
>>>> If a single top-level RTree that covers the entire index works  
>>>> better
>>>> for you, then you can cache the RTree based on the top level multi
>>>> reader and translate the ids... that was my fix for  
>>>> ExternalFileField.
>>>> See FileFloatSource.getValues() for the implementation.
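The id translation mentioned here boils down to offsetting each segment's local doc ids by that segment's base, i.e. the sum of maxDoc() over the preceding segments. A self-contained illustration (hypothetical class, not Solr code):

```java
// Maps a per-segment doc id to a top-level (multi-reader) doc id.
// docBase[i] is the first top-level id belonging to segment i.
class DocIdTranslator {
    private final int[] docBase;

    DocIdTranslator(int[] segmentMaxDocs) {
        docBase = new int[segmentMaxDocs.length];
        int base = 0;
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            docBase[i] = base;
            base += segmentMaxDocs[i]; // next segment starts after this one
        }
    }

    int toTopLevel(int segment, int perSegmentDocId) {
        return docBase[segment] + perSegmentDocId;
    }
}
```

So with segments of maxDoc 10, 5, and 20, doc 2 of the second segment is top-level doc 12.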
>>>>
>>>>
>>>>> - - - -
>>>>>
>>>>> Yonik also suggested:
>>>>>
>>>>> Relatively new in 2.9, you can pass null to enumerate over all  
>>>>> non-deleted
>>>>> docs:
>>>>> TermDocs td = reader.termDocs(null);
>>>>>
>>>>> It would probably be a lot faster to iterate over indexed values  
>>>>> though.
>>>>>
>>>>> If I iterate over indexed values (from the FieldCache, I presume),
>>>>> then how do I get access to the document id?
>>>>
>>>> IndexReader.terms(Term t) returns a TermEnum that can iterate over
>>>> terms, starting at t.
>>>> IndexReader.termDocs(Term t or TermEnum te) will give you the  
>>>> list of
>>>> documents that match a term.
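Putting those two calls together, walking a field's indexed values and the doc ids that carry each value looks roughly like this against the 2.9-era API (a sketch, untested; check the signatures against your Lucene version):

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.TermEnum;

class FieldValueScanner {
    // Enumerate every indexed value of 'field' and the matching doc ids.
    static void scan(IndexReader reader, String field) throws IOException {
        // terms(t) positions the enum at the first term >= t,
        // so an empty text starts at the field's first value.
        TermEnum terms = reader.terms(new Term(field, ""));
        TermDocs docs = reader.termDocs();
        try {
            do {
                Term t = terms.term();
                if (t == null || !t.field().equals(field)) {
                    break; // ran past the last term of our field
                }
                docs.seek(t);
                while (docs.next()) {
                    int docId = docs.doc(); // relative to *this* reader
                    // build an RTree entry from (t.text(), docId) here
                }
            } while (terms.next());
        } finally {
            terms.close();
            docs.close();
        }
    }
}
```

If you run this per segment reader, the doc ids are segment-local and need the docBase offset to become top-level ids.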
>>>>
>>>>
>>>> -Yonik
>>>
>>
>
>
> -- 
> - Mark
>
> http://www.lucidimagination.com
>
>
>

