lucene-java-user mailing list archives

From András Péteri <apet...@b2international.com>
Subject Re: Maintaining sorting order (stored fields vs DocValue fields) while upgrading Lucene version
Date Sun, 02 Jul 2017 16:53:52 GMT
Hi,

Note that if you are using Lucene directly, 5.x introduced LUCENE-6064 [1]
[2], which adds checks to ensure that the sort field has corresponding
doc values of the expected type. Indexed-only fields can be used for sorting
only via an UninvertingReader, at the cost of increased heap usage [3]. Solr
handles the "index-only" cases transparently [4].

[1] https://issues.apache.org/jira/browse/LUCENE-6064
[2] https://github.com/apache/lucene-solr/commit/e696967770b350518fcd3f88050511349d5607a6#diff-9e0559ce8f7317732e59c3be337716a2R62
[3] https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.5.4/lucene/misc/src/java/org/apache/lucene/uninverting/UninvertingReader.java#L49
[4] https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-ThesortParameter
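
As a rough illustration of why uninverting costs heap, here is a toy model in
plain Java. It is not Lucene code and does not reflect how Lucene lays out its
data; the names UninvertSketch, uninvert and sortDocs are made up for this
sketch, and it assumes at most one term per document in the field. The point is
only that an inverted index maps term -> documents, while sorting needs
document -> value, so the second structure has to be built somewhere: on the
heap at search time (uninverting) or on disk at index time (doc values).

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: a hypothetical sketch, not Lucene's implementation. */
public class UninvertSketch {

    /**
     * "Uninverts" a term -> docIds postings map into a docId -> term column.
     * Assumes each document has at most one term in the field.
     */
    public static String[] uninvert(Map<String, int[]> postings, int maxDoc) {
        String[] column = new String[maxDoc]; // one heap slot per document
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            for (int docId : entry.getValue()) {
                column[docId] = entry.getKey();
            }
        }
        return column;
    }

    /** Sorts doc ids by column value; documents without a value go last. */
    public static int[] sortDocs(String[] column) {
        Integer[] ids = new Integer[column.length];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        // Arrays.sort on objects is stable, so ties keep doc id order.
        Arrays.sort(ids, Comparator.comparing(
                (Integer id) -> column[id],
                Comparator.nullsLast(Comparator.<String>naturalOrder())));
        int[] sorted = new int[ids.length];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = ids[i];
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new TreeMap<>();
        postings.put("apple", new int[] {2});
        postings.put("banana", new int[] {0, 3});
        postings.put("cherry", new int[] {1});

        String[] column = uninvert(postings, 5); // doc 4 has no value
        System.out.println(Arrays.toString(column));
        System.out.println(Arrays.toString(sortDocs(column)));
    }
}
```

With the sample postings above, the sorted doc id order comes out as
[2, 0, 3, 1, 4]. In this model the column array is the part that an
UninvertingReader would have to hold in memory, and that doc values would
instead have written at index time.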

Regards,
András

On Fri, Jun 30, 2017 at 4:44 AM, Erick Erickson <erickerickson@gmail.com>
wrote:

> 1>  Is it correct that stored fields can only be sorted on if they become a
> DocValue field in 5.x
>
> No. Indexed-only fields can still be used to sort. DocValues are just more
> efficient at load time and don't consume as much of the Java heap.
> Essentially, the latter can be thought of as moving the "uninverted"
> structure from heap to MMap space.
>
> That said, I can't think of any _good_ reason to continue to sort on
> indexed="true" docValues="false" fields. Use DocValues.
>
> 2> When "updating" stored fields to DocValue fields, is it required to
> update all documents in the index at the same time?
>
> Yes. I'm assuming here you're talking about changing the schema definition
> to include docValues="true". In general I advocate re-indexing everything
> when upgrading major versions. Technically, if you want to do some
> "interesting" things with low-level Lucene, you can upgrade your index; Uwe
> Schindler outlined the process. I copied what he said but don't understand
> it ;).
>
> I've seen some situations where people will define a _new_ field with both,
> gradually re-index and when all the docs have been updated switch to using
> the new field. That assumes that it's just impossible to reindex all at
> once.
>
> The question I have to ask... Why upgrade just to 5x? Solr is releasing 7.0
> very shortly. I can't think of a really good reason not to jump to 6x
> unless you have heavy customizations and the like. Even in that case you'll
> have to upgrade eventually. And if you wind up re-indexing everything
> anyway, it seems like stopping at 5x is unnecessary.
>
> Best,
> Erick
>
> On Thu, Jun 29, 2017 at 6:45 PM, Florian Buetow <fbuetow@mimecast.com>
> wrote:
>
> >
> > Hi,
> >
> >
> >
> > I am in the process of updating a large index from Lucene 4.x to 5.x and
> > have two questions related to the sorting order.
> >
> >
> >
> > 1. Is it correct that stored fields can only be sorted on if they become a
> > DocValue field in 5.x?
> >
> > 2. When "updating" stored fields to DocValue fields, is it required to
> > update all documents in the index at the same time?
> >
> >
> >
> > Thank you in advance for your help.
> >
> >
> >
> > Best regards
> >
> > Florian
>
