From: Michael McCandless
Date: Mon, 23 Oct 2017 06:23:02 -0400
Subject: Re: DocValues and SearcherManager
To: Lucene Users, chris@bammers.net

Hmm the document you get back from a reader is NOT the same document you
had indexed, so you cannot retrieve a doc from a reader, tweak it, re-index
it, and hope everything survived.

In particular, your doc values field "id" is not stored, so when you
retrieve it from the reader, there is no id field, and so when you then
replace that document, the new document has no id field.

You could just add a StoredField("id", id) and then the id is there, but it
will be a simple stored field, not a doc values field. You must then build
a new Document instance for indexing, where you convert that id back into a
doc values field.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Oct 20, 2017 at 5:28 AM, Chris and Helen Bamford wrote:

> Hi,
>
> I am using Lucene 4.10.3 and have a problem retrieving a DocValue field of
> a document using SearcherManager after I have updated a stored field value.
>
> The document has two key values: 'state' (stored Field) and 'id'
> (BinaryDocValue).
>
> After the document is indexed, it undergoes the following chain of events:
>
> - it is retrieved from the index by 'state' (using a Searcher obtained by
>   SearcherManager.maybeRefresh() & searcherManager.acquire())
> - the 'state' field's value is changed and the document is updated using
>   the IndexWriter from the SearcherManager (indexWriter.updateDocument(Term,
>   Document))
>
> This all works fine.
>
> The problem comes when I want to match on the docValue 'id' reusing the
> same Searcher (SearcherManager.maybeRefresh() +
> searcherManager.acquire()), which does not work.
>
> I'm no expert but it seems that when the document is retrieved by 'state'
> it has only stored fields in the list, so when updated it ends up calling
> FieldInfos.addOrUpdate that discards FieldInfo of the docValue field 'myId'
> from the list. Afterwards it is impossible to retrieve the docValue using
> the same searcher (searcherManager.maybeRefresh() +
> searcherManager.acquire()).
>
> If a new reader is obtained the docValue match/update is possible but this
> is a performance critical piece of code and I was hoping to reuse the same
> SearcherManager.
>
> The unit test here shows the problem:
>
> public class SearcherManagerFailureTest {
>     private static final String indexPath = "/tmp/mytestindex";
>
>     private IndexWriter indexWriter;
>     private SearcherManager searcherManager;
>     public Directory directory;
>
>     @Before
>     public void beforeTest() throws Exception {
>         // Setup
>         directory = FSDirectory.open(new File(indexPath));
>         IndexWriterConfig idxCfg = new IndexWriterConfig(Version.LUCENE_4_10_3,
>                 new WhitespaceAnalyzer());
>         idxCfg.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
>         indexWriter = new IndexWriter(directory, idxCfg);
>         searcherManager = new SearcherManager(indexWriter, true,
>                 new SearcherFactory());
>     }
>
>     @After
>     public void afterTest() throws Exception {
>         if (indexWriter != null) {
>             indexWriter.commit();
>             indexWriter.close();
>         }
>
>         if (searcherManager != null) {
>             searcherManager.close();
>         }
>
>         if (directory != null) {
>             directory.close();
>         }
>
>         for (File file : new File(indexPath).listFiles())
>             if (!file.isDirectory())
>                 file.delete();
>     }
>
>     @Test
>     public void TestSearcherManagerFails() throws Exception {
>
>         //Indexing
>         Document doc = new Document();
>         FieldType ft = new FieldType(TextField.TYPE_STORED);
>         ft.setTokenized(false);
>         doc.add(new Field("docId", "doc1", ft));
>         doc.add(new Field("state", "added", ft));
>         doc.add(new BinaryDocValuesField("id", new BytesRef("first")));
>         indexWriter.addDocument(doc);
>         indexWriter.commit();
>
>         //Search by state
>         searcherManager.maybeRefresh();
>         IndexSearcher searcher = searcherManager.acquire();
>         TopDocs topDocs = searcher.search(
>                 new TermQuery(new Term("state", "added")), null, 1);
>         Document indexedDoc = searcher.doc(topDocs.scoreDocs[0].doc);
>
>         //Update document
>         String docId = indexedDoc.get("docId");
>         Term term = new Term("docId", docId);
>         Field stateField = (Field) indexedDoc.getField("state");
>         stateField.setStringValue("processed");
>         indexWriter.updateDocument(term, indexedDoc);
>
>         //Try get docValue
>         searcherManager.maybeRefresh();
>         IndexSearcher newSearcher = searcherManager.acquire();
>
>         BinaryDocValues docValues = MultiDocValues.getBinaryValues(
>                 newSearcher.getIndexReader(), "id");
>         Assert.assertEquals(null, docValues);
>
>         Directory newDirectory = FSDirectory.open(new File(indexPath));
>         BinaryDocValues docValues2 = MultiDocValues.getBinaryValues(
>                 DirectoryReader.open(newDirectory), "id");
>         Assert.assertNotSame(null, docValues2);
>
>         if (newDirectory != null) {
>             newDirectory.close();
>         }
>     }
> }
>
>
> Can anyone advise?
>
> Thanks
>
> - Chris
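
As a rough illustration of the point made in the reply at the top of this
message (a document read back from a reader carries only its stored fields,
so re-indexing it as is drops the doc values "id"), here is a toy model in
plain Java. It is only a sketch under loud assumptions: RoundTripDemo,
IndexedDoc and every method on them are hypothetical names invented for this
illustration and are not part of Lucene. The two maps merely stand in for
stored fields and doc values fields.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these classes are hypothetical and are NOT Lucene's API.
final class RoundTripDemo {

    /** What the writer receives: stored fields and doc values kept separately. */
    static final class IndexedDoc {
        final Map<String, String> stored = new HashMap<>();    // models stored fields
        final Map<String, String> docValues = new HashMap<>(); // models doc values fields
    }

    /** The document as first indexed, with "id" added as doc value AND stored field. */
    static IndexedDoc original() {
        IndexedDoc doc = new IndexedDoc();
        doc.stored.put("docId", "doc1");
        doc.stored.put("state", "added");
        doc.stored.put("id", "first");     // models the extra stored copy of the id
        doc.docValues.put("id", "first");  // models the doc values field
        return doc;
    }

    /** Models retrieving a document from a reader: only stored fields come back. */
    static IndexedDoc retrieve(IndexedDoc indexed) {
        IndexedDoc loaded = new IndexedDoc();
        loaded.stored.putAll(indexed.stored);
        return loaded; // docValues deliberately left empty
    }

    /** The failing pattern: tweak the retrieved document and re-index it unchanged. */
    static IndexedDoc naiveUpdate(IndexedDoc indexed) {
        IndexedDoc loaded = retrieve(indexed);
        loaded.stored.put("state", "processed");
        return loaded; // the "id" doc value is gone
    }

    /** The suggested pattern: build a fresh document and restore "id" as a doc value. */
    static IndexedDoc rebuildUpdate(IndexedDoc indexed) {
        IndexedDoc loaded = retrieve(indexed);
        IndexedDoc fresh = new IndexedDoc();
        fresh.stored.put("docId", loaded.stored.get("docId"));
        fresh.stored.put("state", "processed");
        String id = loaded.stored.get("id"); // present only because it was also stored
        fresh.stored.put("id", id);          // keep it stored for the next round trip
        fresh.docValues.put("id", id);       // convert it back into a doc value
        return fresh;
    }

    public static void main(String[] args) {
        IndexedDoc doc = original();
        System.out.println("naive update keeps id doc value:   "
                + naiveUpdate(doc).docValues.containsKey("id"));
        System.out.println("rebuild update keeps id doc value: "
                + rebuildUpdate(doc).docValues.containsKey("id"));
    }
}
```

Running main prints false for the naive update and true for the rebuilt
document. In real Lucene code the same shape would presumably mean adding
both a StoredField and a BinaryDocValuesField for "id", and constructing a
new Document before calling updateDocument, as the reply describes.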