From: José Tomás Atria
Date: Wed, 24 May 2017 17:30:03 +0000
Subject: Indexing strategies for metadata fields
To: java-user@lucene.apache.org

Hello all,

I'm trying to come up with a reasonable indexing strategy for my documents' metadata, and I'm seeing some weird undocumented behaviours.

My original approach was to build fields like these:

    FieldType ft = new FieldType();
    ft.setDocValuesType( DocValuesType.SORTED );
    ft.setIndexOptions( IndexOptions.DOCS );
    ft.setStored( true );

thinking that it would be useful to have the doc's metadata available for searching, sorting and faceting, and for value retrieval from doc numbers.

However, for some mysterious reason, having fields with the above configuration resulted in an empty index after merging segments:

    $ ls indexDir
    write.lock segments_1

even though I could see the index writer creating index files in the dir during indexing. Independently of which combination of FieldType options I used, adding those fields to documents when indexing always produced the same empty index after merge/commit, but I would have to test more thoroughly to be sure of this.

So, first of all: *any clue why the merge is wiping out the index when and only when these docValues/stored fields are added?* Should I try to reproduce it and file a bug?

Second: *is it possible to have one field that is both docValues and stored, and also indexed?* Why not? And if not, shouldn't FieldType warn you about this (like Field warns about non-stored, non-indexed fields)?

Finally: *if this is not possible, what is the suggested strategy to make a given metadata field accessible from documents and useful for sorting/faceting?* Adding it twice?

thanks!
jta

-- 
sent from a phone. please excuse terseness and tpyos.
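When putting together a reproduction for the empty-index symptom described above (only write.lock and segments_1 left in the directory), it may help to check the directory contents programmatically instead of eyeballing `ls`. The following is a minimal sketch using only the JDK, not part of the original post. The class name `IndexDirCheck` and its methods are hypothetical, and it rests on the assumption that Lucene's per-segment files start with an underscore (for example `_0.cfs` or `_0.si`), while `write.lock` and `segments_N` are directory-level bookkeeping files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not a Lucene API: inspects file names only.
final class IndexDirCheck {

    // Assumption: per-segment files are named with a leading underscore
    // (e.g. "_0.cfs"); "write.lock" and "segments_N" do not count as data.
    static boolean hasSegmentFiles(List<String> fileNames) {
        return fileNames.stream().anyMatch(name -> name.startsWith("_"));
    }

    // Returns the sorted names of the entries directly inside indexDir.
    static List<String> listFileNames(Path indexDir) throws IOException {
        try (Stream<Path> entries = Files.list(indexDir)) {
            return entries
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the directory name used in the post.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "indexDir");
        List<String> names = listFileNames(indexDir);
        System.out.println("files: " + names);
        System.out.println(hasSegmentFiles(names)
                ? "segment files present"
                : "no segment files: index looks empty");
    }
}
```

Running it once while indexing is in progress and once after the merge/commit would show whether segment files were written and then removed, which is the behaviour reported here.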