lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ian Lea <ian....@gmail.com>
Subject Re: document object
Date Fri, 11 Mar 2011 10:11:57 GMT
If I've read this right you are saying that you need to look at fields
A and D for 1000 docs but B, C and E for just one.  If that is right
then lazy loading/FieldSelector will help.

But even loading just A and D for 1000 hits will inevitably take time.
 As already suggested, you could look at caching to speed that up.


--
Ian.




On Fri, Mar 11, 2011 at 6:04 AM, suman.holani <suman.holani@zapak.co.in> wrote:
> Hello Erick,
>
> Hits .length is 1800
>
> Version is lucene 3.0.3
>
> I need the entire result set . As I ll be fetching records which satisfy the
> search conditions. And will be validating them wrt to current counts ,
> scheduling the successful resultset.Selecting one of them on basis of random
> scheduling.
>
> I cannot take page wise result. As that will lead to starvation of documents
> which are at end.
>
> I cannot add validating current counts onto index as it is changing v
> frequently. So not possible to change entire index everytime for that.
>
> Let me know of some soln .
>
>
> Let say there are 5 fields in indexing . A, B C ,D ,E
>
> when I search 1000 records are fetched
> I wanna use A, D for the time being for validating the records wrt counts.
> Note:fields B,C,E is nt required now, bt I am fetching it and storing in a
> list
>
> A,D in list are given to another process for validation
> After validation 700 records are in list
> Of wchich one of the record displayed after scheduling with entire fields A,
> B,C,D,E
>
> Regards,
> Suman
>
>
>
>
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: Thursday, March 10, 2011 7:46 PM
> To: java-user@lucene.apache.org
> Subject: Re: document object
>
> If you're loading 100,000 documents, you can expect it to be slow. If
> you're loading 10 documents, it should be quite fast... So how big is
> hits.length?
>
> And what version of Lucene are you using? The Hits object has been
> deprecated for quite some time I believe.....
>
> The problem here is that you're loading the entire result set. This is
> rarely the right thing to do, which is why paging is used normally.
>
> Why do you need to load the entire result set? That seems to be the
> crux of the issue.
>
> Best
> Erick
>
> On Thu, Mar 10, 2011 at 5:22 AM, Anshum <anshumg@gmail.com> wrote:
>> Depends on your data. I know that's a vague answer but that's the point.
>> What you could do is use FieldCache if memory and data let you do so.
> Would
>> it?
>>
>> --
>> Anshum Gupta
>> http://ai-cafe.blogspot.com
>>
>>
>> On Thu, Mar 10, 2011 at 3:12 PM, suman.holani
> <suman.holani@zapak.co.in>wrote:
>>
>>> Hi Anshum,
>>>
>>> Thanks for prompt reply.
>>>
>>> I am only storing the fields in index , which I want to get/fetch after
>>> search.
>>>
>>> The area I am not sure is when we call searcher/reader class to
> initialize
>>> Document object is heavy?
>>> Can we use something else in that place, which doesnot needs to load all
>>> doc
>>> again.
>>>
>>> Regards,
>>> Suman
>>>
>>>
>>> -----Original Message-----
>>> From: Anshum [mailto:anshumg@gmail.com]
>>> Sent: Thursday, March 10, 2011 3:11 PM
>>> To: java-user@lucene.apache.org
>>> Subject: Re: document object
>>>
>>> Hi Suman,
>>> Do you need to load/use all fields that you have stored in the index? If
>>> that's not the case I'd suggest you to use the
>>>
>>>
>>> public Document
>>> <
>>>
> http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/document/Doc
>>> ument.html>
>>> *doc*(int i, FieldSelector fieldSelector)
>>>
>>>
>>>
> http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/search/IndexS
>>> earcher.html#doc(int,
>>> org.apache.lucene.document.FieldSelector)
>>> <
>>>
> http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/search/Index
>>> Searcher.html#doc(int,
>>> org.apache.lucene.document.FieldSelector)>function .
>>> This should help you. Also, otherwise if you're using very selective
> field
>>> which may be used though a FieldCache it'd be a nice thing to do.
>>>
>>> Hope that helps.
>>> --
>>> Anshum Gupta
>>> http://ai-cafe.blogspot.com
>>>
>>>
>>> On Thu, Mar 10, 2011 at 3:01 PM, suman.holani
>>> <suman.holani@zapak.co.in>wrote:
>>>
>>> >
>>> >
>>> > Hi,
>>> >
>>> >
>>> >
>>> > I am facing the  problem
>>> >
>>> >
>>> >
>>> > The line in the loop is going very slow giving me a performance hit
>>> >
>>> >  for (int i = 0; i < hits.length; ++i) {
>>> >
>>> >
>>> >
>>> >                int docId = hits[i].doc;
>>> >
>>> >                Document d = searcher.doc(docId);  //problem
>>> >
>>> > }
>>> >
>>> >
>>> >
>>> > How can I improve this. Please give me an example of the improved code
>>> >
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > Suman
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > Ps :
>>> >
>>> > In one of post Erick said ..
>>> >
>>> >
>>> >
>>> > this line is really suspicious:
>>> >
>>> > Document document = this.indexReader.document(doc)
>>> >
>>> > From the Javadoc for HitCollector.collect:
>>> >
>>> > Note: This is called in an inner search loop. For good search
>>> performance,
>>> > implementations of this method should not call
>>> >
>>> >
>>>
>>>
> Searcher.doc(int)<file:///C:/lucene-2.1.0/docs/api/org/apache/lucene/search/
>>> > Searcher.html#doc%28int%29>or
>>> >
>>> >
>>>
>>>
> IndexReader.document(int)<file:///C:/lucene-2.1.0/docs/api/org/apache/lucene
>>> > /index/IndexReader.html#document%28int%29>on
>>> > every document number encountered. Doing so can slow searches by an
>>> > order
>>> > of magnitude or more.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message