lucene-java-user mailing list archives

From Michael Sokolov <msoko...@safaribooksonline.com>
Subject Re: MultiReader docid reliability
Date Fri, 30 May 2014 14:41:35 GMT
There is a Solr document cache that holds field values too, see: 
http://wiki.apache.org/solr/SolrCaching

Maybe take this question over to the solr mailing list?

-Mike

On 5/30/2014 10:32 AM, Alan Woodward wrote:
> Solr caches hold Lucene docids, which are invalidated every time a new
> searcher is opened. The various fields for a response aren't cached as far
> as I know; they're reloaded on each request. But loading the fields for 10
> documents is typically very fast compared to searching over a very large
> collection.
>
> Alan Woodward
> www.flax.co.uk
>
>
> On 30 May 2014, at 11:20, Nicola Buso wrote:
>
>> Hi Alan,
>>
>> just to make it more typical (yes, there are no IndexWriters open on
>> those indexes): how does Solr cache results? The first thing I would like
>> to do is store the doc ids and go back to the reader for the real
>> content. Does Solr store the whole results with all values?
>>
>>
>> nicola.
>>
>>
>> On Fri, 2014-05-30 at 11:05 +0100, Alan Woodward wrote:
>>> If the index is truly unchanging (i.e. there's no IndexWriter open on
>>> it) then I guess the document numbers will be stable across reopens.
>>> But this is a pretty specialized situation, and the docs are really
>>> there to warn you off trying to rely on this for more typical uses.
>>>
>>> Alan Woodward
>>> www.flax.co.uk
>>>
>>>
>>>
>>> On 30 May 2014, at 10:39, Nicola Buso wrote:
>>>
>>>> Hi Alan,
>>>>
>>>> thanks a lot for the reply.
>>>>
>>>> From what I understood of your reply, if the index is not changing
>>>> (no adds, deletes, or even updates), the doc ids seen by the MultiReader
>>>> will not change if you open that unchanged index multiple times, even
>>>> in different environments.
>>>>
>>>> If my understanding is correct, the word "ephemeral" in the API docs
>>>> could be elaborated a bit more.
>>>>
>>>>
>>>> nicola
>>>>
>>>> On Fri, 2014-05-30 at 09:26 +0100, Alan Woodward wrote:
>>>>> Hi Nicola,
>>>>>
>>>>>
>>>>> 1) A session here means as long as you have that MultiReader open.
>>>>> IndexReaders see a snapshot of the index and so document ids
>>>>> shouldn't change over the lifetime of an IndexReader, even if the
>>>>> index is being updated.
>>>>>
>>>>>
>>>>> 2) MultiReader just takes an array of subindexes, so as long as the
>>>>> subindexes are passed to the MultiReader constructor in the same order
>>>>> on both machines, the docBase assigned to each reader context should
>>>>> be the same.
>>>>>
>>>>> Alan Woodward
>>>>> www.flax.co.uk
>>>>>
>>>>>
>>>>>
>>>>> On 29 May 2014, at 14:29, Nicola Buso wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> from the javadocs:
>>>>>>
>>>>>> ----
>>>>>> For efficiency, in this API documents are often referred to via
>>>>>> document numbers, non-negative integers which each name a unique
>>>>>> document in the index. These document numbers are ephemeral -- they
>>>>>> may change as documents are added to and deleted from an index.
>>>>>> Clients should thus not rely on a given document having the same
>>>>>> number between sessions.
>>>>>> ----
>>>>>>
>>>>>> What does "sessions" mean in this context? Search sessions?
>>>>>>
>>>>>> 1) If I have an index that does not change (no deletes or updates)
>>>>>> and I keep the MultiReader open, can a docid change when I execute
>>>>>> the same search several times on that reader?
>>>>>>
>>>>>> 2) Will opening the same set of indexes in a MultiReader on different
>>>>>> machines assign different docids to the same document at runtime, or
>>>>>> does the algorithm that calculates the docids guarantee that static
>>>>>> indexes will have the same docids on different machines (and thus in
>>>>>> separate JVMs)?
>>>>>>
>>>>>>
>>>>>> nicola.
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Nicola Buso <nbuso@ebi.ac.uk>
>>>>>> EMBL-EBI
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail:
>>>>>> java-user-help@lucene.apache.org
>>>>>>
>>>>>>
>>>>>
>>>> -- 
>>>> Nicola Buso <nbuso@ebi.ac.uk>
>>>> EMBL-EBI
>>>>
>>>>
>>>
>> -- 
>> Nicola Buso
>> Software Engineer - Web Production Team
>>
>> European Bioinformatics Institute (EMBL-EBI)
>> European Molecular Biology Laboratory
>>
>> Wellcome Trust Genome Campus
>> Hinxton
>> Cambridge CB10 1SD
>> United Kingdom
>>
>> URL: http://www.ebi.ac.uk
>>
>
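
The point about sub-index order and docBase in the quoted thread can be illustrated with a small sketch. It assumes that a composite reader numbers documents by giving each sub-reader a docBase equal to the sum of the maxDoc values of the sub-readers before it, so that a global docid is docBase plus the local docid. The class and method names (DocBaseSketch, docBases, globalDocId) and the segment sizes are hypothetical; the code only mimics the arithmetic and does not call Lucene.

```java
import java.util.Arrays;

// Hypothetical stand-in for composite-reader docid arithmetic; no Lucene classes are used.
final class DocBaseSketch {

    // docBase of sub-reader i = sum of maxDoc over sub-readers 0..i-1 (assumed numbering scheme).
    static int[] docBases(int[] maxDocs) {
        int[] bases = new int[maxDocs.length];
        int next = 0;
        for (int i = 0; i < maxDocs.length; i++) {
            bases[i] = next;
            next += maxDocs[i];
        }
        return bases;
    }

    // Global docid of a document = docBase of its sub-reader + its local docid.
    static int globalDocId(int[] bases, int subReader, int localDocId) {
        return bases[subReader] + localDocId;
    }

    public static void main(String[] args) {
        // Made-up maxDoc values for three sub-indexes, passed in two different orders.
        int[] orderA = {100, 250, 40};
        int[] orderB = {250, 100, 40};

        System.out.println(Arrays.toString(docBases(orderA))); // [0, 100, 350]
        System.out.println(Arrays.toString(docBases(orderB))); // [0, 250, 350]
    }
}
```

With the same unchanged sub-indexes passed in the same order, the docBase values (and therefore the global docids) come out identical on every machine. Swapping two sub-indexes shifts the docids of every document behind the swap, which is why a cache of docids is only meaningful for the reader arrangement that produced it.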



