Date: Mon, 6 Aug 2012 13:40:37 +0200
Subject: Re: questions about DocValues in 4.0 alpha
From: Simon Willnauer
To: java-user@lucene.apache.org

hey,

On Mon, Aug 6, 2012 at 11:34 AM, Li Li wrote:
> hi everyone,
>     in lucene 4.0 alpha, I found the DocValues are available and gave
> it a try. I am following the slides in
> http://www.slideshare.net/lucenerevolution/willnauer-simon-doc-values-column-stride-fields-in-lucene
>     I have got 2 questions.
> 1. is DocValues updatable now?

no not yet.

simon

>
> 2. How can I get docBase of an AtomicReader?
>     in Collector, it's easy to get docBase. But I need to get
> docValues after scoring. I find
> AtomicReader.getTopReaderContext().docBaseInParent
> and subReader.getTopReaderContext().docBase. But neither of them is correct.
>     So I have to iterate through all subReaders and use maxDoc()
> to find suitable subReader for a docID. any better method to find
> corresponding AtomicReader of a docID?
>
>     File d=new File("./testIndex");
>     IndexWriterConfig cfg=new IndexWriterConfig(Version.LUCENE_40, new
> WhitespaceAnalyzer(Version.LUCENE_40));
>     cfg.setOpenMode(OpenMode.CREATE);
>     Directory dir=FSDirectory.open(d);
>     IndexWriter writer=new IndexWriter(dir,cfg);
>     FieldType titleFieldType=new FieldType();
>     titleFieldType.setStored(true);
>     titleFieldType.setIndexed(true);
>     titleFieldType.setTokenized(true);
>     titleFieldType.setOmitNorms(true);
>
>     Document doc=new Document();
>     Field f=new Field("title","a b c",titleFieldType);
>     doc.add(f);
>
>     FloatDocValuesField dvf=new FloatDocValuesField("pagerank", 0.8f);
>     doc.add(dvf);
>
>     writer.addDocument(doc);
>
>     doc=new Document();
>     doc.add(new Field("title","b d",titleFieldType));
>     dvf=new FloatDocValuesField("pagerank", 0.5f);
>     doc.add(dvf);
>     writer.addDocument(doc);
>
>     writer.commit();
>
>     doc=new Document();
>     doc.add(new Field("title","a c",titleFieldType));
>     dvf=new FloatDocValuesField("pagerank", 0.5f);
>     doc.add(dvf);
>     writer.addDocument(doc);
>
>
>     DirectoryReader reader=DirectoryReader.open(writer, true);
>     IndexSearcher searcher=new IndexSearcher(reader);
>     Query q=new TermQuery(new Term("title","a"));
>     TopDocs topDocs=searcher.search(q, 10);
>     Set<String> fieldsNeedLoaded=new HashSet<String>(1);
>     fieldsNeedLoaded.add("title");
>     @SuppressWarnings("unchecked")
>     List<AtomicReader> subReaders=(List<AtomicReader>)
> reader.getSequentialSubReaders();
>     Source[] sources=new Source[subReaders.size()];
>     int idx=0;
>     for(AtomicReader subReader:subReaders){
>         sources[idx++]=subReader.docValues("pagerank").getSource();
>     }
>
>     for(int i=0;i<topDocs.scoreDocs.length;i++){
>         int docId=topDocs.scoreDocs[i].doc;
>         float score=topDocs.scoreDocs[i].score;
>         //get title
>         Document document=searcher.document(docId, fieldsNeedLoaded);
>         System.out.println("title: " +document.get("title")+" score: "+score);
>         idx=-1;
>         int docBase=0;
>         for(AtomicReader subReader:subReaders){
>             idx++;
>             //int docBase=subReader.getTopReaderContext().docBaseInParent;
>
>             int realDoc=docId-docBase;
>             if(realDoc>=0&&realDoc<subReader.maxDoc()){
>                 double pagerank=sources[idx].getFloat(realDoc);
>                 System.out.println(pagerank);
>                 break;
>             }
>             docBase+=subReader.maxDoc();
>         }
>     }
> }
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
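
On question 2: the quoted code rescans all sub-readers for every hit. One alternative is to compute each sub-reader's docBase once, as a running sum of maxDoc() values, and then binary-search that array for each docID. Below is a minimal sketch of that idea in plain Java, with no Lucene dependency. DocBaseLookup and its methods are hypothetical names made up for illustration, and the segment sizes in main (2 and 1) are only an assumption about what the indexing code above would produce. Lucene's ReaderUtil appears to offer a subIndex helper built on the same idea; its exact signature in 4.0 alpha has not been checked here.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Lucene): maps a top-level docID to a sub-reader. */
final class DocBaseLookup {
    // docBases[i] is the first top-level docID that belongs to sub-reader i.
    private final int[] docBases;
    private final int totalDocs;

    /** maxDocs[i] stands in for subReaders.get(i).maxDoc(), in reader order. */
    DocBaseLookup(int[] maxDocs) {
        docBases = new int[maxDocs.length];
        int base = 0;
        for (int i = 0; i < maxDocs.length; i++) {
            docBases[i] = base;
            base += maxDocs[i];
        }
        totalDocs = base;
    }

    /** Index of the sub-reader that contains the given top-level docID. */
    int subIndex(int docId) {
        if (docId < 0 || docId >= totalDocs) {
            throw new IllegalArgumentException("docId out of range: " + docId);
        }
        int pos = Arrays.binarySearch(docBases, docId);
        if (pos < 0) {
            // Not an exact docBase: binarySearch returned -(insertionPoint) - 1,
            // and the owning sub-reader is the one just before the insertion point.
            return -pos - 2;
        }
        // Exact hit: skip past empty sub-readers that share the same docBase.
        while (pos + 1 < docBases.length && docBases[pos + 1] == docId) {
            pos++;
        }
        return pos;
    }

    /** docID relative to its own sub-reader (the "realDoc" of the quoted code). */
    int localDoc(int docId) {
        return docId - docBases[subIndex(docId)];
    }

    public static void main(String[] args) {
        // Assumed layout: one segment with 2 docs, one segment with 1 doc.
        DocBaseLookup lookup = new DocBaseLookup(new int[] {2, 1});
        for (int docId = 0; docId < 3; docId++) {
            System.out.println("doc " + docId + " is in sub-reader "
                + lookup.subIndex(docId) + " as local doc " + lookup.localDoc(docId));
        }
    }
}
```

With something like this in place, the inner loop over subReaders could shrink to a lookup along the lines of sources[lookup.subIndex(docId)].getFloat(lookup.localDoc(docId)), at a cost of O(log n) per hit instead of O(n).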