lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From jian chen <chenjian1...@gmail.com>
Subject Re: Large queries
Date Sun, 16 Oct 2005 18:15:27 GMT
Hi, Trond,

By the way, it appears to me that Lucene uses the iterator pattern a lot,
like SegmentTermEnum, TermDocs, TermPositions, etc. Each iterator uses the
underlying fix sized buffer to load a chunck of data at a time. So, even you
have millions of documents, you shouldn't run into memory problem given
Lucene's index data structure.

That said, there are some parameters that you can tweak given really large
amount of documents. But I suggest that you don't need to worry about it
unless you really run into it.

Cheers,

Jian

On 10/16/05, Trond Aksel Myklebust <tamyk@online.no> wrote:
>
> How is Lucene handling very large queries? I have 6million documents,
> which
> each has a "docID" field. There is a total of 20000 distinct docID's, so
> many documents got the same docID which consists of a filename (only name,
> not path).
>
> Sometimes, I must get all documents that has one of 10 docID's, and
> sometimes I need to get all documents that has one of 10000 docIDs. Is
> there
> any other way than doing a query: docID:(file1 file2 file3 file4..) ?
>
>
>
> Trond A Myklebust
>
>
>
>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message