lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Victor Kabdebon <victor.kabde...@gmail.com>
Subject Re: Question from a new user : IndexSearcher.doc
Date Sun, 20 Jun 2010 20:04:32 GMT
Hello Simon,

As I told you, I am quite new with Lucene, so there are many things that
might be wrong.
I'm using Lucene to make a search service for a website that has a large
amount of information daily. This amount of information is directly avaible
as text in a Cassandra Database.
There might be as much as 10.000 new documents added daily, and yes my
concern is it possible to retrieve more documents than the integer max value
?
I don't really see also how the IndexSearcher.doc( ) really works, because
it seems like we give this method an ID and it is going to search in the
indexed documents. So what exactly is going to do this
IndexSearcher.doc(int) ?

*Or are you concerned about retrieving all documents
containing term "XY" if the number of documents matching is large?*
*
*

I'm also concerned by this problem, yes

Could you explain me a little bit how it works, and how Lucene enables one
to retrieve a very large number of documents even if it uses int ?

Thank you for your answers,
Victor

2010/6/20 Simon Willnauer <simon.willnauer@googlemail.com>

> Hi, maybe I don't understand your question correctly. Are you asking
> if you could run into problems if you retrieve more documents than
> integer max value? Or are you concerned about retrieving all documents
> containing term "XY" if the number of documents matching is large? If
> you are afraid of loading all documents matched from a stored field I
> guess you are doing something wrong.
> What are you using lucene for?
>
> simon
>
> On Sun, Jun 20, 2010 at 8:00 PM, Victor Kabdebon
> <victor.kabdebon@gmail.com> wrote:
> > Hello everybody,
> >
> > I am new to Apache Lucene and it seems to fit perfectly my needs for my
> > application.
> > However I'm a little concerned about something (pardon me if it's a
> > recurrent question, I've searched the archives but I didn't find
> something
> > about that)
> >
> > So here is my case :
> >
> > I have index a few files (like 10) and I'm trying to search something
> stupid
> > in it. The word "test". So after opening everything etc... (assuming it
> > works also) I do that :
> >
> > *Term test = new Term("text_comment","test");*
> > *        Query query = new TermQuery(test);*
> > *        TopDocs top = searcher.search(query, 10);*
> >
> > I want to recover the first document (I have 2 documents in TopDocs), I
> do :
> >
> > *IndexSearcher.doc( top[0].doc)*
> >
> > I searched a little bit in javadoc and I saw that this method uses "int"
> as
> > a parameter
> > I'm a little bit concerned about this... At the moment, I have 10
> documents
> > so that's ok, but if I want to index let's say 20 files documents, how
> will
> > the IndexSearcher.doc(int) be able to retrieve documents ?
> > Same problem if 100.000 files have the word "test" in "text_comment" will
> I
> > still be able to get these 100.000 documents or is it going to be a
> problem
> > ?
> >
> > Thank you very much.
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message