lucene-java-user mailing list archives

From Li Li <fancye...@gmail.com>
Subject questions about DocValues in 4.0 alpha
Date Mon, 06 Aug 2012 09:34:17 GMT
Hi everyone,
    In Lucene 4.0 alpha I found that DocValues are available and gave
them a try, following the slides at
http://www.slideshare.net/lucenerevolution/willnauer-simon-doc-values-column-stride-fields-in-lucene
    I have two questions.
    1. Are DocValues updatable now?

    2. How can I get the docBase of an AtomicReader?
        In a Collector it is easy to get the docBase, but I need to read
the DocValues after scoring. I tried
AtomicReader.getTopReaderContext().docBaseInParent
    and subReader.getTopReaderContext().docBase, but neither gives the correct value.
        So I have to iterate through all subReaders and use maxDoc()
to find the subReader that contains a given docID. Is there a better way to
find the AtomicReader for a docID? My test code:
		File d=new File("./testIndex");
		IndexWriterConfig cfg=new IndexWriterConfig(Version.LUCENE_40,
				new WhitespaceAnalyzer(Version.LUCENE_40));
		cfg.setOpenMode(OpenMode.CREATE);
		Directory dir=FSDirectory.open(d);
		IndexWriter writer=new IndexWriter(dir,cfg);
		FieldType titleFieldType=new FieldType();
		titleFieldType.setStored(true);
		titleFieldType.setIndexed(true);
		titleFieldType.setTokenized(true);
		titleFieldType.setOmitNorms(true);
		
		Document doc=new Document();		
		Field f=new Field("title","a b c",titleFieldType);
		doc.add(f);
		
		FloatDocValuesField dvf=new FloatDocValuesField("pagerank", 0.8f);
		doc.add(dvf);
		
		writer.addDocument(doc);
		
		doc=new Document();
		doc.add(new Field("title","b d",titleFieldType));
		dvf=new FloatDocValuesField("pagerank", 0.5f);
		doc.add(dvf);
		writer.addDocument(doc);
		
		writer.commit();
		
		doc=new Document();
		doc.add(new Field("title","a c",titleFieldType));
		dvf=new FloatDocValuesField("pagerank", 0.5f);
		doc.add(dvf);
		writer.addDocument(doc);
		
		
		DirectoryReader reader=DirectoryReader.open(writer, true);
		IndexSearcher searcher=new IndexSearcher(reader);
		Query q=new TermQuery(new Term("title","a"));
		TopDocs topDocs=searcher.search(q, 10);
		Set<String> fieldsNeedLoaded=new HashSet<String>(1);
		fieldsNeedLoaded.add("title");
		@SuppressWarnings("unchecked")
		List<AtomicReader> subReaders=
				(List<AtomicReader>) reader.getSequentialSubReaders();
		Source[] sources=new Source[subReaders.size()];
		int idx=0;
		for(AtomicReader subReader:subReaders){
			sources[idx++]=subReader.docValues("pagerank").getSource();
		}
		
		//iterate over the returned hits; totalHits can exceed the 10 docs requested
		for(int i=0;i<topDocs.scoreDocs.length;i++){
			int docId=topDocs.scoreDocs[i].doc;
			float score=topDocs.scoreDocs[i].score;
			//get title
			Document document=searcher.document(docId, fieldsNeedLoaded);
			System.out.println("title: " +document.get("title")+" score: "+score);
			idx=-1;
			int docBase=0;
			for(AtomicReader subReader:subReaders){
				idx++;
				//int docBase=subReader.getTopReaderContext().docBaseInParent;
				
				int realDoc=docId-docBase;
				if(realDoc>=0&&realDoc<subReader.maxDoc()){
					double pagerank=sources[idx].getFloat(realDoc);
					System.out.println(pagerank);
					break;
				}
				docBase+=subReader.maxDoc();
			}
		}
	}
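
For comparison, here is a minimal sketch of an alternative to the linear scan above: precompute each subReader's docBase once, then binary-search that array for every hit. It is plain Java with no Lucene dependency; the class SegmentLookup and its methods are hypothetical names made up for illustration, and the maxDoc values {2, 1} are only what I assume the test index above produces (two docs before the commit, one after). I believe later 4.0 builds expose something similar through ReaderUtil.subIndex and AtomicReaderContext.docBase, but I have not checked that against 4.0 alpha.

```java
final class SegmentLookup {

    // starts[i] is the docBase of segment i; starts[n] is the total doc count.
    static int[] docBases(int[] maxDocs) {
        int[] starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
        return starts;
    }

    // Returns the index of the segment containing the top-level docId,
    // i.e. the largest i with starts[i] <= docId.
    static int subIndex(int docId, int[] starts) {
        int n = starts.length - 1;
        if (docId < 0 || docId >= starts[n]) {
            throw new IllegalArgumentException("docId out of range: " + docId);
        }
        int lo = 0;
        int hi = n - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1; // round up so the loop always shrinks
            if (starts[mid] <= docId) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Assumed segment sizes for the test index above (hypothetical values).
        int[] starts = docBases(new int[] {2, 1});
        for (int docId = 0; docId < starts[starts.length - 1]; docId++) {
            int seg = subIndex(docId, starts);
            int localDoc = docId - starts[seg]; // segment-relative doc id
            System.out.println("doc " + docId + " -> segment " + seg
                    + ", local doc " + localDoc);
        }
    }
}
```

With this, the inner loop over subReaders would reduce to one subIndex call per hit, and the segment-relative id is docId - starts[seg], which is the value to pass to sources[seg].getFloat(...).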
