lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: maxDoc and arrays
Date Thu, 24 May 2007 20:22:25 GMT
>From the Javadoc for IndexReader.....

Returns one greater than the largest possible document number. This may be
used to, e.g., determine how big to allocate an array which will have an
element for every document number in an index.

Isn't that what you're wondering about?

Erick

On 5/24/07, Carlos Pita <carlosjosepita@gmail.com> wrote:
>
> That's no problem, I can regenerate my entire extra data structure upon
> periodic index optimization. That way the array size will be about  the
> size
> of the index. What I find more difficult is to know the id of the last
> added/removed document. I need it to update the in-mem structure upon more
> fine-grained index changes.  Any ideas?
>
> TIA.
> Cheers,
> Carlos
>
> On 5/24/07, Erick Erickson <erickerickson@gmail.com> wrote:
> >
> > Document IDs will be re-utilized, after, say, optimization.
> > One consequence of this is that optimization will change the IDs
> > of *existing* documents.
> >
> > You're right, that numdocs may well be shorter than maxdocs.
> > That's what I get for reading quickly...
> >
> > Best
> > Erick
> >
> > On 5/24/07, Carlos Pita <carlosjosepita@gmail.com> wrote:
> > >
> > > >
> > > >
> > > > No. It will always be at least as large as the total documents. But
> > that
> > > > will also count deleted documents.
> > >
> > >
> > >
> > > Do you mean that deleted document ids won't be reutilized, so the
> index
> > > maxDoc will grow more and more with time? Isn't there any way to
> > compress
> > > the range? It seems strange to me, considering that an example in the
> > book
> > > suggests to use the document id as an array index for an array of
> maxDoc
> > > elements.
> > >
> > > Cheers,
> > > Carlos
> > >
> > > Why wouldn't numdocs serve?
> > > >
> > > > Best
> > > > Erick
> > > >
> > > >
> > > > The motivation of this question is that I want to associate some
> info
> > to
> > > > > each document in the index, and in order to access this additional
> > > data
> > > > in
> > > > > O(1) I would like to do this through an array indexing. But the
> > array
> > > > size
> > > > > shouldn't be a lot greater than the total number of documents. I
> see
> > > > that
> > > > > something similar is done in the example of section 6.1 of Lucene
> in
> > > > > Action,
> > > > > but for sorting purposes, which is not my case.
> > > > >
> > > > > Related to this: how can update my array of extra data when
> > documents
> > > > are
> > > > > added/removed to/from the index? Is there any feedback mechanism
> by
> > > > means
> > > > > of
> > > > > callbacks or event handlers?
> > > > >
> > > > > Thank you in advance.
> > > > > Regards,
> > > > > Carlos
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message