From: Michael McCandless
To: java-user@lucene.apache.org
Subject: Re: more on isDeleted
Date: Mon, 15 Sep 2008 15:57:12 -0400
Message-Id: <244F0204-CDB0-4FD4-A7A6-0C8854492515@mikemccandless.com>
In-Reply-To: <1bcb7c7f0809151231i1ebf465ele2a9eca8f0c094cc@mail.gmail.com>

Until we can get realtime search integrated into Lucene (which I'm
gradually trying to work on) I think the answer is no -- for now you
have to keep your own record of which docIDs you've deleted.

Because IndexWriter allows deletes by query and term (and also by
docID, privately, when a document hits a non-aborting exception) it's
tricky to give real-time isDeleted for a docID.

My current thinking on how to do this (once we add realtime search) is
that when you ask IndexWriter for a new IndexReader, which searches the
full index in the Directory plus all adds/deletes buffered in
IndexWriter's RAM buffer, it must "materialize" all such buffered
deletes down to docID. Those deletes that are against existing
segments in the index will be flushed at that point to those segments;
the deletes that apply only to buffered docs will be held in RAM and
used by the RAMIndexSearcher that searches IndexWriter's buffer.

Mike

Cam Bazz wrote:

> So, apart from the searcher, is there anyway to access the deletion
> marks in an indexWriter.
>
> I have a live cache - and I was keeping two caches, ones for new adds,
> other for deletes.
> I am trying to get rid of deleted cache, and ask the index if a
> fetched document is marked deleted.
>
> Best.
>
> -C.B.
>
> On Mon, Sep 15, 2008 at 10:20 PM, Michael McCandless wrote:
>>
>> You'll have to open a new IndexReader after the delete is committed.
>>
>> An IndexReader (or IndexSearcher) only searches the point-in-time
>> snapshot of the index as of when it was opened.
>>
>> Mike
>>
>> Cam Bazz wrote:
>>
>>> Hello,
>>>
>>> Here is what I am trying to do:
>>>
>>> dir = FSDirectory.getDirectory("/test");
>>> writer = new IndexWriter(dir, analyzer, true,
>>>     new IndexWriter.MaxFieldLength(2));
>>> writer.setMaxBufferedDocs(IndexWriter.DISABLE_AUTO_FLUSH);
>>>
>>> Document da = new Document();
>>> da.add(new Field("word", "a", Field.Store.YES,
>>>     Field.Index.NOT_ANALYZED_NO_NORMS));
>>>
>>> Document db = new Document();
>>> db.add(new Field("word", "b", Field.Store.YES,
>>>     Field.Index.NOT_ANALYZED_NO_NORMS));
>>>
>>> writer.addDocument(da);
>>> writer.addDocument(db);
>>>
>>> writer.commit();
>>>
>>> searcher = new IndexSearcher(dir);
>>>
>>> writer.deleteDocuments(new Term("word", "a"));
>>> writer.commit();
>>>
>>> TopDocCollector collector = new TopDocCollector(10);
>>> searcher.search(new TermQuery(new Term("word","a")), collector);
>>> ScoreDoc[] hits = collector.topDocs().scoreDocs;
>>> for (int i = 0; i < hits.length; i++) {
>>>     int docId = hits[i].doc;
>>>     Document d = searcher.doc(docId);
>>>     System.out.println(writer.hasDeletions());
>>>     System.out.println(searcher.getIndexReader().isDeleted(docId));
>>>     System.out.println(d.get("word"));
>>> }
>>>
>>> searcher.close();
>>> writer.close();
>>> dir.close();
>>>
>>>
>>> well I am trying to check if an document has been deleted without
>>> refreshing the searcher. maybe i should access indexreader in a
>>> different way?
>>> the isDeleted() always returns false. that is the problem right now.
>>>
>>> Best.
>>>
>>> -C.B.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
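
A rough illustration of the "keep your own record" advice in the reply, for readers of the archive. This is only a sketch and not part of Lucene: the class `PendingDeletes` and all of its method names are hypothetical. It assumes every document carries a unique key (such as the "word" field in the quoted example) and that a single thread performs the delete, commit and reopen steps. It tracks keys rather than docIDs, because a delete by term never tells the application which docIDs were affected.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical application-side record of deletes that an already-open
 * searcher cannot see yet. Not a Lucene class.
 */
final class PendingDeletes {
    // Keys (e.g. values of a unique "word" field) deleted since the
    // current searcher was opened. The concurrent set lets search
    // threads read it while the indexing thread updates it.
    private final Set<String> deletedKeys = ConcurrentHashMap.newKeySet();

    /** Call right after issuing the delete for this key on the writer. */
    void markDeleted(String key) {
        deletedKeys.add(key);
    }

    /** True if a hit carrying this key should be treated as deleted. */
    boolean isDeleted(String key) {
        return deletedKeys.contains(key);
    }

    /**
     * Call once the deletes are committed and a new reader/searcher has
     * been opened; the new snapshot already reflects those deletes.
     */
    void onSearcherReopened() {
        deletedKeys.clear();
    }

    /** Number of deletes not yet visible to the open searcher. */
    int size() {
        return deletedKeys.size();
    }
}
```

In the quoted loop, the application would then check `pending.isDeleted(d.get("word"))` instead of `isDeleted(docId)` on the stale reader. If deletes can arrive on other threads between the commit and the reopen, clearing the whole set would lose them, so a real implementation would need something more careful than `clear()`.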