lucene-solr-dev mailing list archives

From Ryan McKinley <ryan...@gmail.com>
Subject Re: lucene 2.9 migration issues -- MultiReader vs IndexReader document ids
Date Fri, 24 Apr 2009 15:37:09 GMT
Yes, that would be great!  the changes we need are in rev 768275:
http://svn.apache.org/viewvc?view=rev&revision=768275

thanks



On Apr 24, 2009, at 11:23 AM, Shalin Shekhar Mangar wrote:

> Yes, I upgraded the lucene jars a few hours ago for trie api updates.
> Do you want me to upgrade them again?
>
> On Fri, Apr 24, 2009 at 7:51 PM, Mark Miller <markrmiller@gmail.com> wrote:
>
>> I think Shalin upgraded the jars this morning, so I'd just grab them
>> again real quick.
>>
>> 4/4 4:46 am : Upgraded to Lucene 2.9-dev r768228
>>
>>
>> Ryan McKinley wrote:
>>
>>> thanks Mark!
>>>
>>> how far is lucene /trunk from what is currently in solr?
>>>
>>> Is it something we should consider upgrading?
>>>
>>>
>>> On Apr 24, 2009, at 8:30 AM, Mark Miller wrote:
>>>
>>>> I just committed a fix Ryan - should work with upgraded Lucene jars.
>>>>
>>>> - Mark
>>>>
>>>> Ryan McKinley wrote:
>>>>
>>>>> thanks!
>>>>>
>>>>>
>>>>> On Apr 23, 2009, at 6:32 PM, Mark Miller wrote:
>>>>>
>>>>>> Looks like it's my fault. Auto resolution was moved up to IndexSearcher
>>>>>> in Lucene, and it looks like SolrIndexSearcher is not tickling it first.
>>>>>> I'll take a look.
>>>>>>
>>>>>> - Mark
>>>>>>
>>>>>> Ryan McKinley wrote:
>>>>>>
>>>>>>> Ok, not totally resolved....
>>>>>>>
>>>>>>> Things work fine when I have my custom Filter alone or with other
>>>>>>> Filters, however if I add a query string to the mix it breaks with an
>>>>>>> IllegalStateException:
>>>>>>>
>>>>>>> java.lang.IllegalStateException: Auto should be resolved before now
>>>>>>>   at org.apache.lucene.search.FieldSortedHitQueue$1.createValue(FieldSortedHitQueue.java:216)
>>>>>>>   at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:73)
>>>>>>>   at org.apache.lucene.search.FieldSortedHitQueue.getCachedComparator(FieldSortedHitQueue.java:168)
>>>>>>>   at org.apache.lucene.search.FieldSortedHitQueue.<init>(FieldSortedHitQueue.java:58)
>>>>>>>   at org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1214)
>>>>>>>   at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:924)
>>>>>>>   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:345)
>>>>>>>   at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:171)
>>>>>>>   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>>>>>>>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>>>>>>>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1330)
>>>>>>>   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
>>>>>>>
>>>>>>>
>>>>>>> This is for a query:
>>>>>>> /solr/flat/select?q=SGID&bounds=-144 2.4 -72 67 WITHIN
>>>>>>> bounds=XXX triggers my custom filter to kick in.
>>>>>>>
>>>>>>> Any thoughts where to look?  This error is new since upgrading the
>>>>>>> lucene libs (in recent solr)
>>>>>>>
>>>>>>> Thanks!
>>>>>>> ryan
>>>>>>>
>>>>>>>
>>>>>>> On Apr 20, 2009, at 7:14 PM, Ryan McKinley wrote:
>>>>>>>
>>>>>>>> thanks!
>>>>>>>>
>>>>>>>> everything got better when I removed my logic to cache based on the
>>>>>>>> index modification time.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Apr 20, 2009, at 4:51 PM, Yonik Seeley wrote:
>>>>>>>>
>>>>>>>>> On Mon, Apr 20, 2009 at 4:17 PM, Ryan McKinley <ryantxu@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> This issue started on java-user, but I am moving it to solr-dev:
>>>>>>>>>>
>>>>>>>>>> http://www.lucidimagination.com/search/document/46481456bc214ccb/bitset_filter_arrayindexoutofboundsexception
>>>>>>>>>>
>>>>>>>>>> I am using solr trunk and building an RTree from stored document
>>>>>>>>>> fields. This process worked fine until a recent change in 2.9 that
>>>>>>>>>> has a different document id strategy than I was used to.
>>>>>>>>>>
>>>>>>>>>> In that thread, Yonik suggested:
>>>>>>>>>> - pop back to the top level from the sub-reader, if you really need
>>>>>>>>>>   a single set
>>>>>>>>>> - if a set-per-reader will work, then cache per segment (better for
>>>>>>>>>>   incremental updates anyway)
>>>>>>>>>>
>>>>>>>>>> I'm not quite sure what you mean by a "set-per-reader".
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I meant RTree per reader (per segment reader).
>>>>>>>>>
>>>>>>>>>> Previously I was building a single RTree and using it until the last
>>>>>>>>>> modified time had changed.  This avoided building an index anytime a
>>>>>>>>>> new reader was opened and the index had not changed.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I *think* that our use of re-open will return the same IndexReader
>>>>>>>>> instance if nothing has changed... so you shouldn't have to try and
>>>>>>>>> do that yourself.
>>>>>>>>>
>>>>>>>>>> I'm fine building a new RTree for each reader if that is required.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If that works just as well, it will put you in a better position for
>>>>>>>>> faster incremental updates... new RTrees will be built only for those
>>>>>>>>> segments that have changed.
>>>>>>>>>
>>>>>>>>>> Is there any existing code that deals with this situation?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> To cache an RTree per reader, you could use the same logic as
>>>>>>>>> FieldCache uses... a weak map with the reader as the key.
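
The weak-map idea above could be sketched roughly as follows. This is only an illustration under assumptions: `PerReaderCache` and its `builder` function are hypothetical names, not Lucene or Solr classes, and the reader type is left generic so the snippet has no Lucene dependency. It relies on the key using identity-based equals/hashCode, and on the cached value not holding a strong reference back to its reader (otherwise the entry can never be collected).

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/**
 * Hypothetical per-reader cache: one derived structure (e.g. an RTree)
 * per reader instance. Entries become collectable once the reader itself
 * is no longer strongly referenced elsewhere.
 */
final class PerReaderCache<R, V> {
    // Weak keys: the map alone does not keep a closed reader alive.
    private final Map<R, V> cache = new WeakHashMap<>();
    // Builds the structure for a reader that has not been seen yet.
    private final Function<R, V> builder;

    PerReaderCache(Function<R, V> builder) {
        this.builder = builder;
    }

    /** Returns the cached value for this reader, building it on first use. */
    synchronized V get(R reader) {
        V value = cache.get(reader);
        if (value == null) {
            value = builder.apply(reader);
            cache.put(reader, value);
        }
        return value;
    }
}
```

Keyed on segment readers, a cache like this only rebuilds for segments that are new since the last reopen; unchanged segments keep returning the structure already built.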
>>>>>>>>>
>>>>>>>>> If a single top-level RTree that covers the entire index works better
>>>>>>>>> for you, then you can cache the RTree based on the top level multi
>>>>>>>>> reader and translate the ids... that was my fix for ExternalFileField.
>>>>>>>>> See FileFloatSource.getValues() for the implementation.
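
The id translation mentioned above comes down to offset arithmetic. The sketch below is not the FileFloatSource code; it assumes a multi reader numbers documents by concatenating its sub-readers in order, so each sub-reader has a starting offset equal to the sum of the maxDoc values before it. `DocBases` and its method names are hypothetical.

```java
/** Hypothetical helper for mapping segment-local doc ids to top-level ids and back. */
final class DocBases {
    private DocBases() {}

    /**
     * starts[i] is the top-level id of local doc 0 in sub-reader i;
     * starts[maxDocs.length] is the total maxDoc of the whole index.
     */
    static int[] starts(int[] maxDocs) {
        int[] starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
        return starts;
    }

    /** Segment-local id -> top-level id. */
    static int toTopLevel(int[] starts, int sub, int localDoc) {
        if (localDoc < 0 || localDoc >= starts[sub + 1] - starts[sub]) {
            throw new IllegalArgumentException("localDoc out of range: " + localDoc);
        }
        return starts[sub] + localDoc;
    }

    /** Top-level id -> index of the sub-reader that contains it (linear scan for clarity). */
    static int subIndex(int[] starts, int topDoc) {
        for (int i = 0; i < starts.length - 1; i++) {
            if (topDoc >= starts[i] && topDoc < starts[i + 1]) {
                return i;
            }
        }
        throw new IllegalArgumentException("topDoc out of range: " + topDoc);
    }
}
```

A filter that is handed a sub-reader could then look up a structure cached against the top-level reader by adding that sub-reader's start offset to each local id.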
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> - - - -
>>>>>>>>>>
>>>>>>>>>> Yonik also suggested:
>>>>>>>>>>
>>>>>>>>>> Relatively new in 2.9, you can pass null to enumerate over all
>>>>>>>>>> non-deleted docs:
>>>>>>>>>> TermDocs td = reader.termDocs(null);
>>>>>>>>>>
>>>>>>>>>> It would probably be a lot faster to iterate over indexed values
>>>>>>>>>> though.
>>>>>>>>>>
>>>>>>>>>> If I iterate over indexed values (from the FieldCache I presume) then
>>>>>>>>>> how do I get access to the document id?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> IndexReader.terms(Term t) returns a TermEnum that can iterate over
>>>>>>>>> terms, starting at t.
>>>>>>>>> IndexReader.termDocs(Term t or TermEnum te) will give you the list of
>>>>>>>>> documents that match a term.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -Yonik
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> - Mark
>>>>>>
>>>>>> http://www.lucidimagination.com
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> - Mark
>>>>
>>>> http://www.lucidimagination.com
>>>>
>>>>
>>>>
>>>>
>>>
>>
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>
>>
>>
>>
>
>
> -- 
> Regards,
> Shalin Shekhar Mangar.

