From: Adrien Grand
Date: Tue, 15 Mar 2016 16:14:43 +0000
Subject: Re: Canonicalize stored fields (small set of possible values)
To: java-user

You can still give an id to each value on the application side if you want
to avoid repeating values. Otherwise, even without doing anything, things
should not be too bad thanks to stored fields compression.

On Tue, 15 Mar 2016 at 16:56, Erick Erickson wrote:

> In a word, "no". When you set stored="true", Solr (well
> actually Lucene) puts a compressed verbatim copy on disk.
>
> "Disk space is cheap" is the usual response here ;)
>
> Best,
> Erick
>
> On Tue, Mar 15, 2016 at 8:35 AM, Andreas Sewe wrote:
> > Hi,
> >
> > I have an index in which each document has an indexed & stored "kind"
> > StringField, which has a small set of possible values (about 10).
> >
> > Alas, Lucene (5.2.1) stores these field values over and over again,
> > which seems wasteful. Is there a way to avoid this, while still having
> > the fields' value available as stored?
> >
> > Best wishes,
> >
> > Andreas
> >
> > --
> > Codetrails GmbH
> > The knowledge transfer company
> >
> > Robert-Bosch-Str. 7, 64293 Darmstadt
> > Phone: +49-6151-276-7092
> > Mobile: +49-170-811-3791
> > http://www.codetrails.com/
> >
> > Managing Director: Dr. Marcel Bruch
> > Handelsregister: Darmstadt HRB 91940
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
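
For illustration, the suggestion at the top of this reply (giving each value an id on the application side) could be sketched roughly as below. This is only a sketch under assumptions: the class name KindCodec, its methods, and the sample values in the usage note are hypothetical and not from the thread, and how the id is then written to and read from the index is left out on purpose.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper (not part of Lucene): maps each distinct "kind"
 * value to a small integer id and back, so the application can store
 * the id in place of the repeated string.
 */
final class KindCodec {
    // Position in this list is the id of the value.
    private final List<String> values = new ArrayList<>();
    // Reverse lookup from value to id.
    private final Map<String, Integer> ids = new HashMap<>();

    /** Registers the known values in order; duplicates are ignored. */
    KindCodec(List<String> knownValues) {
        for (String value : knownValues) {
            // The next free id is the current list size.
            if (ids.putIfAbsent(value, values.size()) == null) {
                values.add(value);
            }
        }
    }

    /** Returns the id to store for the given value. */
    int idOf(String value) {
        Integer id = ids.get(value);
        if (id == null) {
            throw new IllegalArgumentException("unknown kind: " + value);
        }
        return id;
    }

    /** Returns the original value for an id read back from storage. */
    String valueOf(int id) {
        if (id < 0 || id >= values.size()) {
            throw new IllegalArgumentException("unknown id: " + id);
        }
        return values.get(id);
    }
}
```

With, say, new KindCodec(Arrays.asList("class", "method", "field")), idOf("method") gives the id to store and valueOf(...) turns a stored id back into the string. The list of values has to stay in a fixed order and only ever grow at the end, otherwise ids already stored would decode to the wrong value. Whether this saves much over stored fields compression would need to be measured.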