lucene-java-user mailing list archives

From "Ganesh" <emailg...@yahoo.co.in>
Subject Re: Multisearcher will maintain index order sorting?
Date Fri, 24 Oct 2008 06:30:13 GMT
Any comments or suggestions are greatly appreciated.

Regards
Ganesh

----- Original Message ----- 
From: "Ganesh" <emailgane@yahoo.co.in>
To: <java-user@lucene.apache.org>
Sent: Thursday, October 23, 2008 3:45 PM
Subject: Re: Multisearcher will maintain index order sorting?


> After searching the second index, MultiSearcher adds the maxdocid of the
> first index to each resulting docid. In my case that would be 3. After
> incrementing the docid, the document is inserted into the
> FieldDocSortedHitQueue. FieldDocSortedHitQueue is an extension of the
> priority queue and should sort in increasing order. It should insert
> docid 3 after 2, not after 0.
>
> Code snippet from MultiSearcher.java
> --------------------------------------
> if (hq == null) hq = new FieldDocSortedHitQueue (docs.fields, n);
> .....
> for (int j = 0; j < scoreDocs.length; j++) { // merge scoreDocs into hq
>   ScoreDoc scoreDoc = scoreDocs[j];
>   scoreDoc.doc += starts[i];  // doc id is incremented here
>   if (!hq.insert (scoreDoc))  // insertion should sort automatically
>     break;
> }
>
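
One possible explanation for the observed order, offered as an assumption
rather than something confirmed in this thread: if each sub-searcher records
its sort value (the docID local to its own index) before starts[i] is added,
and the queue compares those recorded values with the adjusted docID only as
a tie-breaker, then the merge interleaves the two indexes. The sketch below
is plain Java with hypothetical names (MergedHit, MergeOrderSketch); it does
not use any Lucene classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a sorted hit; not a Lucene class.
final class MergedHit {
    final int sortValue; // docID local to the sub-index, captured before the offset is added
    int doc;             // docID after the offset is added

    MergedHit(int localDoc) {
        this.sortValue = localDoc;
        this.doc = localDoc;
    }
}

final class MergeOrderSketch {
    // Merges hits from indexCount indexes that each hold docsPerIndex documents.
    static List<Integer> mergedOrder(int indexCount, int docsPerIndex) {
        List<MergedHit> hits = new ArrayList<>();
        for (int i = 0; i < indexCount; i++) {
            int start = i * docsPerIndex;  // plays the role of starts[i]
            for (int j = 0; j < docsPerIndex; j++) {
                MergedHit hit = new MergedHit(j);
                hit.doc += start;          // same adjustment as scoreDoc.doc += starts[i]
                hits.add(hit);
            }
        }
        // Assumption: the stale local value is compared first, then the adjusted docID.
        hits.sort(Comparator.comparingInt((MergedHit h) -> h.sortValue)
                            .thenComparingInt(h -> h.doc));
        List<Integer> order = new ArrayList<>();
        for (MergedHit h : hits) {
            order.add(h.doc);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(mergedOrder(2, 3)); // prints [0, 3, 1, 4, 2, 5]
    }
}
```

Under that assumption, the two-index, three-document case from the earlier
message reproduces the reported 0,3,1,4,2,5 order.
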
> Regards
> Ganesh
>
>
> ----- Original Message ----- 
> From: "Hadi Forghani" <hadi4i@gmail.com>
> To: <java-user@lucene.apache.org>
> Sent: Thursday, October 23, 2008 3:25 PM
> Subject: Re: Multisearcher will maintain index order sorting?
>
>
>> Because when you want to fetch the X from the second index, you should
>> pass docId=3 to MultiSearcher, and MultiSearcher finds the sub-searcher
>> for that document by calculating with the lengths of all sub-searchers.
>> For example, when you get the doc with docID 3 (the second X),
>> MultiSearcher (see the code of MultiSearcher's doc(int i)) subtracts 3
>> from your docID (because the first searcher has 3 documents), passes
>> zero to the second searcher, and returns doc 0 from it.
>> Note that MultiSearcher actually finds the sub-searcher by binary
>> search, not exactly as described above.
>>
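
The lookup described above can be sketched roughly as follows. This is a
minimal illustration, not the MultiSearcher source: the class and method
names are hypothetical, and it assumes every sub-index is non-empty, so the
starts array is strictly increasing.

```java
import java.util.Arrays;

final class SubSearcherLookupSketch {
    // starts[i] is the first global docID that belongs to sub-searcher i.
    static int subSearcherIndex(int[] starts, int globalDoc) {
        int pos = Arrays.binarySearch(starts, globalDoc);
        // Exact hit: globalDoc is the first document of that sub-searcher.
        // Otherwise binarySearch returns -(insertionPoint) - 1, and the owner
        // is the sub-searcher just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    // Converts a global docID into the docID local to its sub-searcher.
    static int localDoc(int[] starts, int globalDoc) {
        return globalDoc - starts[subSearcherIndex(starts, globalDoc)];
    }

    public static void main(String[] args) {
        int[] starts = {0, 3}; // IndexA holds docs 0..2, IndexB holds docs 3..5
        System.out.println(subSearcherIndex(starts, 3) + " " + localDoc(starts, 3)); // prints 1 0
    }
}
```

With starts = {0, 3}, global docID 3 maps to the second searcher (index 1)
and local document 0, matching the example above.
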
>> On Thu, Oct 23, 2008 at 12:47 PM, Ganesh <emailgane@yahoo.co.in> wrote:
>>
>>> In IndexA there are 3 docs
>>> DocID, Terms
>>> 0,X
>>> 1,X Y
>>> 2,X Z
>>>
>>> In IndexB there are 3 docs
>>> DocID, Terms
>>> 0,X
>>> 1,X Y
>>> 2,X Z
>>>
>>> When I sort in index order using MultiSearcher and
>>> ParallelMultiSearcher, it returns the results
>>> 0,X
>>> 3,X
>>> 1,X Y
>>> 4,X Y
>>> 2,X Z
>>> 5,X Z
>>>
>>> But it should be in the order of 0,1,2,3,4,5. Could anyone explain why?
>>>
>>> Regards
>>> Ganesh
>>>
>>> ----- Original Message ----- From: "Ganesh" <emailgane@yahoo.co.in>
>>> To: <java-user@lucene.apache.org>
>>> Sent: Thursday, October 23, 2008 1:37 PM
>>>
>>> Subject: Re: Multisearcher will maintain index order sorting?
>>>
>>>
>>>> MultiSearcher and ParallelMultiSearcher, when asked to sort by doc
>>>> (index order), merge the results by the docID within each DB.
>>>>
>>>> Regards
>>>> Ganesh
>>>>
>>>> ----- Original Message ----- From: "Paul Smith" <psmith@aconex.com>
>>>> To: <java-user@lucene.apache.org>
>>>> Sent: Thursday, October 23, 2008 10:57 AM
>>>> Subject: Re: Multisearcher will maintain index order sorting?
>>>>
>>>>
>>>>
>>>>> On 23/10/2008, at 4:20 PM, Ganesh wrote:
>>>>>
>>>>>> My index DB has 10 million records and it will grow to 30 million.
>>>>>> Currently I am using a millisecond timestamp and the RAM consumption
>>>>>> is high. I will change the resolution to minute. I am using 2
>>>>>> searcher objects, refreshing each other every minute. When I do a
>>>>>> warmup query with a sort on timestamp, the CPU spikes to 100%, and
>>>>>> this happens every minute. To avoid these issues, I am planning to
>>>>>> break up my DB and sort in index order.
>>>>>>
>>>>>> Will MultiSearcher maintain index order when sorting?
>>>>>>
>>>>>
>>>>>
>>>>> If you need to keep the millisecond accuracy, break down the timestamp
>>>>> into 3 fields: day, time, millisecond, and sort on 3 fields. This way
>>>>> each field has a much smaller number of distinct values and will
>>>>> occupy vastly less memory over time. I don't think there's much
>>>>> overhead in this approach either, because in most cases the top-level
>>>>> field (day) will provide most of the sorting ability, and Lucene will
>>>>> only need to hit the time & millisecond fields less frequently for
>>>>> comparison.
>>>>>
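
A minimal sketch of the three-field split suggested above, in plain Java
without any Lucene calls. The exact meaning of "time" is not spelled out in
the message, so this assumes it is the second within the day (UTC); the
class and method names are hypothetical.

```java
final class TimestampFieldsSketch {
    static final long MILLIS_PER_DAY = 86_400_000L;

    // Returns {day, secondOfDay, millisecond} for a UTC epoch timestamp in milliseconds.
    static long[] split(long epochMillis) {
        long day = Math.floorDiv(epochMillis, MILLIS_PER_DAY);
        long millisOfDay = Math.floorMod(epochMillis, MILLIS_PER_DAY);
        return new long[] { day, millisOfDay / 1000, millisOfDay % 1000 };
    }

    // Compares two timestamps field by field: day first, then time, then millisecond.
    static int compareByFields(long a, long b) {
        long[] fa = split(a);
        long[] fb = split(b);
        for (int i = 0; i < fa.length; i++) {
            int c = Long.compare(fa[i], fb[i]);
            if (c != 0) {
                return c; // later fields are consulted only when earlier ones tie
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        long t = 1224829813123L; // hypothetical sample value
        long[] f = split(t);
        System.out.println(f[0] + " " + f[1] + " " + f[2]);
    }
}
```

Under this assumption the time field has at most 86,400 distinct values and
the millisecond field at most 1,000, while comparing field by field gives
the same order as comparing the raw timestamps.
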
>>>>> I believe MultiSearcher does a merge sort of the 2 (or more)
>>>>> sub-searchers, so there is an overhead in using it versus a single
>>>>> searcher.
>>>>>
>>>>> Paul
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>
>>>>>
>>>
>>>
>>
>
> 


