lucene-java-user mailing list archives

From aravinth thangasami <aravinththangas...@gmail.com>
Subject Re: Adding Docvalues to a Field
Date Sat, 06 May 2017 03:09:43 GMT
I will try it.

Thanks Uwe :)

On Sat, May 6, 2017 at 4:02 AM, Uwe Schindler <uwe@thetaphi.de> wrote:

> Hi Aravinth,
>
> To get rid of the partially merged (mixed) docvalues fields you can use
> the following additional approach on top of my previous mail:
>
> > Erick was referring to Solr. To fix your issue without fully reindexing,
> > you can use merging to update the whole index. To do this, use the
> > following approach:
> >
> > Wrap your index using UninvertingReader. Then get all LeafReaders using
> > the leaves() method.
>
> The problem with that approach is that all leaves that have partial (!)
> DocValues are seen by UninvertingReader as already having DocValues, so
> they just return the partial DocValues and no uninverting is done. So we
> have to trick UninvertingReader into ignoring the already existing
> (partial) DocValues. Instead of wrapping the whole IndexReader, we change
> the workflow:
>
> - Get all leaves() of the broken docvalues/non-docvalues index.
> - Wrap all those LeafReader instances using an anonymous FilterLeafReader
> instance, overriding all the DocValues-related methods to return "null"
> instead of calling super. This hides all partially existing doc values (not
> from FieldInfos, but that should not hurt). The consumer of this reader
> will see no DocValues.
> - Then wrap those filtered readers with new UninvertingReader(filteredLeaf).
> This adds back fresh DocValues, recalculated from the uninverted fields.
> Be sure to get the types right, otherwise you will get merge errors
> (incompatible field types).
> - Then wrap all those uninverting leaves with
> SlowCodecReaderWrapper.wrap(). This makes them mergeable (it's slow and
> costs memory, but works).
>
> The remaining steps are as described before:
>
> > Then create a new index with IndexWriter and use
> > IndexWriter.addIndexes(CodecReader...) and pass in the previously created
> > wrappers, ideally one by one. Those readers are slow, but ready to be
> > merged into a new index with DocValues. The empty writer will then import
> > the wrapped index and take the emulated DocValues. This may take some
> > time, but afterwards you have an index with all fields having their
> > DocValues on disk. Uninverting is no longer needed.
> >
> > I hope that helps. I can post code that should do this. There is no
> > ready-to-use tool available, because you need to correctly configure the
> > uninverter.
> >
> > Uwe
> >
> > On 5 May 2017 22:12:13 CEST, aravinth thangasami
> > <aravinththangasami@gmail.com> wrote:
> > >Thanks Erick
> > >
> > >On Fri, May 5, 2017 at 9:19 PM, Erick Erickson
> > ><erickerickson@gmail.com> wrote:
> > >
> > >> In a word, "no". You must re-index from scratch. Worse, now that you
> > >> have some segments thinking the fields are docValues and some not and
> > >> maybe some mixed, I know of no way to un-entangle them.
> > >>
> > >> I'd create a new collection and re-index it entirely, then use
> > >> collection aliasing to point the applications at the new collection.
> > >>
> > >> Best,
> > >> Erick
> > >>
> > >> On Fri, May 5, 2017 at 2:49 AM, aravinth thangasami
> > >> <aravinththangasami@gmail.com> wrote:
> > >> > Hi all,
> > >> >
> > >> > While moving from Lucene 4 to Lucene 5, we ran into the following
> > >> > issue. We have enabled doc values in Lucene 5; we did not previously
> > >> > use doc values in Lucene 4.
> > >> >
> > >> > Using UninvertingReader, sorting works fine until the first merge
> > >> > happens. After a merge, documents from the older version without doc
> > >> > values affect the sort order.
> > >> >
> > >> > Is there any way to solve this issue without reindexing?
> > >> >
> > >> > What is your opinion on it?
> > >> >
> > >> > I was thinking about these two approaches. Are they possible?
> > >> >
> > >> > 1. Can UninvertingReader be made to store the doc values it builds
> > >> > to disk?
> > >> > 2. During a merge, can IndexWriter be made to write doc values for
> > >> > documents that have none?
> > >> >
> > >> >
> > >> >
> > >> > Thanks
> > >> > Aravinth
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >> For additional commands, e-mail: java-user-help@lucene.apache.org
> > >>
> > >>
> >
> > --
> > Uwe Schindler
> > Achterdiek 19, 28357 Bremen
> > https://www.thetaphi.de
>
>
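
For illustration, the reason the quoted workflow hides the partial DocValues before uninverting can be modeled in plain Java. This is a toy sketch, not Lucene code: `Leaf`, `mixedSegment`, `hideDocValues` and `uninvert` are hypothetical stand-ins for a segment reader, the FilterLeafReader step and the UninvertingReader step. The only behavior it assumes is the one described in the quoted mail, namely that existing doc values are returned as they are and values are rebuilt from the indexed field only when none are visible.

```java
import java.util.Arrays;

/** Toy model only: these types are hypothetical stand-ins, not Lucene classes. */
final class DocValuesRepairModel {

    /** Stand-in for one segment (a "leaf"). */
    interface Leaf {
        int maxDoc();

        /** Value recoverable from the inverted index for a document. */
        long indexedValue(int doc);

        /** Stored doc values, or null if none are exposed; entries may be null (partial). */
        Long[] docValues();
    }

    /** A mixed segment: every document is indexed, but only some have stored doc values. */
    static Leaf mixedSegment(long[] indexed, Long[] partialDocValues) {
        return new Leaf() {
            public int maxDoc() { return indexed.length; }
            public long indexedValue(int doc) { return indexed[doc]; }
            public Long[] docValues() { return partialDocValues; }
        };
    }

    /** Models the filter step: delegate everything, but report doc values as absent. */
    static Leaf hideDocValues(Leaf in) {
        return new Leaf() {
            public int maxDoc() { return in.maxDoc(); }
            public long indexedValue(int doc) { return in.indexedValue(doc); }
            public Long[] docValues() { return null; }
        };
    }

    /** Models the uninverting step: visible doc values win; otherwise rebuild them. */
    static Leaf uninvert(Leaf in) {
        return new Leaf() {
            public int maxDoc() { return in.maxDoc(); }
            public long indexedValue(int doc) { return in.indexedValue(doc); }
            public Long[] docValues() {
                Long[] existing = in.docValues();
                if (existing != null) {
                    return existing; // partial values pass through untouched
                }
                Long[] rebuilt = new Long[in.maxDoc()];
                for (int doc = 0; doc < rebuilt.length; doc++) {
                    rebuilt[doc] = in.indexedValue(doc);
                }
                return rebuilt;
            }
        };
    }

    public static void main(String[] args) {
        // Three indexed documents; only document 0 has a stored doc value.
        Leaf mixed = mixedSegment(new long[] {30, 10, 20}, new Long[] {30L, null, null});

        // Uninverting directly keeps the gaps.
        System.out.println(Arrays.toString(uninvert(mixed).docValues()));

        // Hiding first forces a full rebuild.
        System.out.println(Arrays.toString(uninvert(hideDocValues(mixed)).docValues()));
    }
}
```

In this model the first call leaves documents 1 and 2 without values, while the second yields a value for every document. That corresponds to wrapping the filtered leaves rather than the original ones in the quoted steps. The real readers additionally need the per-field type mapping and the SlowCodecReaderWrapper step before merging, neither of which is modeled here.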
