lucene-java-user mailing list archives

From Sriram Sankar <san...@gmail.com>
Subject Re: segments and sorting
Date Mon, 17 Jun 2013 23:05:08 GMT
I'm sorry - I meant "DocValue" not "FieldValue".  Slide 20 in the following
deck talks about the 2GB limit.

Sriram.

http://www.slideshare.net/lucenerevolution/willnauer-simon-doc-values-column-stride-fields-in-lucene


On Sat, Jun 15, 2013 at 1:52 AM, Adrien Grand <jpountz@gmail.com> wrote:

> Hi,
>
> On Fri, Jun 14, 2013 at 11:24 PM, Sriram Sankar <sankar@gmail.com> wrote:
> > For my use case of having all docs sorted by a static rank and being able
> > to cut off retrieval after a certain number of docs, I have to sort all my
> > docs using the static rank (and Lucene 4 has a way to do this).
> >
> > When an index has multiple segments, how does this sorting work?  Is each
> > segment sorted independently?  Or is it possible for me to control this -
> > and have a single segment?
>
> You can sort each segment independently or have a single segment; both
> options are available. To have a single segment, you just need to wrap
> your top-level index reader with SlowCompositeReaderWrapper before
> wrapping it again in a SortingAtomicReader and calling
> IndexWriter.addIndexes.
>
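
To illustrate why the choice between per-segment sorting and a single sorted segment matters for cutting off retrieval, here is a minimal plain-Java sketch. It does not use the Lucene API; the `Hit` class, the method names, and the "higher rank is better" convention are all assumptions made for the example. Each segment is modelled as a list already sorted by static rank, so scanning can stop after the first n matches. With several independently sorted segments, the cutoff has to be applied per segment and the partial results merged; with a single segment, one scan is enough.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class SortedSegmentCutoff {

    /** Hypothetical hit: a document id plus its static rank (higher is better). */
    static final class Hit {
        final int docId;
        final int rank;

        Hit(int docId, int rank) {
            this.docId = docId;
            this.rank = rank;
        }
    }

    /** Scans one segment that is sorted by descending rank and stops after n matches. */
    static List<Hit> firstMatches(List<Hit> sortedSegment, IntPredicate matches, int n) {
        List<Hit> out = new ArrayList<>();
        for (Hit h : sortedSegment) {
            if (out.size() == n) {
                break; // early cutoff: later docs in this segment cannot rank higher
            }
            if (matches.test(h.docId)) {
                out.add(h);
            }
        }
        return out;
    }

    /** Applies the cutoff to every segment, then merges and keeps the best n overall. */
    static List<Hit> topN(List<List<Hit>> segments, IntPredicate matches, int n) {
        List<Hit> merged = new ArrayList<>();
        for (List<Hit> segment : segments) {
            merged.addAll(firstMatches(segment, matches, n));
        }
        merged.sort((a, b) -> Integer.compare(b.rank, a.rank)); // descending rank
        return new ArrayList<>(merged.subList(0, Math.min(n, merged.size())));
    }

    public static void main(String[] args) {
        List<Hit> seg1 = List.of(new Hit(1, 90), new Hit(2, 70), new Hit(3, 10));
        List<Hit> seg2 = List.of(new Hit(4, 80), new Hit(5, 60));
        for (Hit h : topN(List.of(seg1, seg2), doc -> true, 2)) {
            System.out.println("doc " + h.docId + " rank " + h.rank);
        }
    }
}
```

With a single sorted segment, `topN` reduces to one call to `firstMatches`, which is the simplification the single-segment option buys.
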
> > Assuming I have a single segment, are there any other constraints?  I read
> > somewhere that FieldValues have a limit of 2GB per segment - is this true?
>
> What do you mean by "FieldValue"? If you are referring to stored
> fields, a single field value cannot be larger than 2GB because the API
> uses ints. But some codecs enforce lower limits, for example the
> current default stored fields format enforces that the sum of the
> sizes of all fields of a _single_ document is less than 2GB (which is
> already much more than what typical users need). I think the major
> limitation is that a single Lucene index cannot have more than 2
> billion documents, but you can store your data in several physical
> shards to work around this limitation and merge results at search
> time.
>
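
The sharding workaround described above can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not Lucene code: the per-index ceiling is taken to be `Integer.MAX_VALUE` (inferred from the remark that the API uses ints), and `ShardHit`, `shardsNeeded`, and `mergeTopK` are hypothetical names. A hit is identified by a shard number plus a shard-local doc id, since a single int can no longer address every document once the data spans several indexes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class ShardMerge {

    /** Assumed per-index document ceiling, derived from int doc ids. */
    static final long MAX_DOCS_PER_INDEX = Integer.MAX_VALUE;

    /** Ceiling division in long arithmetic: how many shards hold totalDocs documents. */
    static long shardsNeeded(long totalDocs) {
        return (totalDocs + MAX_DOCS_PER_INDEX - 1) / MAX_DOCS_PER_INDEX;
    }

    /** Hypothetical hit: doc ids are only unique within their shard. */
    static final class ShardHit {
        final int shard;
        final int docId;
        final float score;

        ShardHit(int shard, int docId, float score) {
            this.shard = shard;
            this.docId = docId;
            this.score = score;
        }
    }

    /** Merges the per-shard result lists and returns the k best hits, best first. */
    static List<ShardHit> mergeTopK(List<List<ShardHit>> perShard, int k) {
        // Min-heap on score: the weakest of the current best k sits at the head.
        PriorityQueue<ShardHit> heap =
                new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
        for (List<ShardHit> hits : perShard) {
            for (ShardHit h : hits) {
                heap.add(h);
                if (heap.size() > k) {
                    heap.poll(); // drop the weakest hit once more than k are held
                }
            }
        }
        List<ShardHit> out = new ArrayList<>(heap);
        out.sort((a, b) -> Float.compare(b.score, a.score)); // descending score
        return out;
    }

    public static void main(String[] args) {
        System.out.println("shards for 5 billion docs: " + shardsNeeded(5_000_000_000L));
    }
}
```

Each shard only needs to return its own top k hits for the merged top k to be correct, so the merge cost stays proportional to the number of shards times k rather than to the total number of matches.
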
> --
> Adrien
