lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From mark harwood <markharw...@yahoo.co.uk>
Subject Re: About the search efficiency based on document's length
Date Fri, 21 Sep 2007 09:52:41 GMT
This may be of interest:

http://mail-archives.apache.org/mod_mbox/lucene-java-user/200707.mbox/%3c62924.70068.qm@web26010.mail.ukl.yahoo.com%3e


Cheers
Mark


----- Original Message ----
From: Karl Wettin <karl.wettin@gmail.com>
To: java-user@lucene.apache.org
Sent: Friday, 21 September, 2007 8:35:05 AM
Subject: Re: About the search efficiency based on document's length


21 sep 2007 kl. 09.09 skrev Jarvis:

> Storing data in a document will not affect search speed.
>
> This is helpful .

Someone should probably confirm that though.

>
> And another question :)
>
> When I make a search which will return 500000 results , it will be  
> very
> inefficient when I want to get the document between the No.450000 to
> No.450010 or some back document . Why was it ? Or some solution ?

I suppose you are referring to the class Hits? It should only be an  
extra cost if you iterate a lot of documents priot to index 450000,  
as that will force it to replace the query now and then.

It is a pretty simple peice of code. Go right ahead and take a look  
at it:

<http://svn.apache.org/repos/asf/lucene/java/trunk/src/java/org/ 
apache/lucene/search/Hits.java>


-- 
karl




> Thanks,
>         Jarvis .
>
>
> -----Original Message-----
> From: Karl Wettin [mailto:karl.wettin@gmail.com]
> Sent: Friday, September 21, 2007 2:45 PM
> To: java-user@lucene.apache.org
> Subject: Re: About the search efficiency based on document's length
>
> 21 sep 2007 kl. 08.23 skrev Jarvis:
>
>> There is a question about the document’s length and search  
>> efficiency.
>
>> Two ways to index some html pages(ignore some information): one is
>> both
>> store and index the html content in lucene dictionary, the other is
>> just
>> index the content . For the first method is there a efficiency  
>> problem
>> compare to the second besides the folder size increase?
>
> Not sure I understand your question, but I'll give it a go.
>
> As far as I know, storing data in a document will not affect search
> speed. However, loading large amounts of data to a Document will of
> course consume resources. Therefor it is possible to pass a
> FieldSelector to the IndexReader when you retrieve a Document,
> allowing you to define what fields to ignore, load, lazy load, et c.
>
> I hope this helps.
>
> -- 
> karl
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org






      ___________________________________________________________
Yahoo! Answers - Got a question? Someone out there knows the answer. Try it
now.
http://uk.answers.yahoo.com/ 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message