lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: How to export lucene index to a simple text file?
Date Tue, 21 Sep 2010 16:29:55 GMT
Hi,

Retrieve a TermEnum from the IndexReader and iterate it. That gives you every
term in the index along with its docFreq, which is the second column in your
table. Then, for each term, position a TermDocs enum on that term to get all of
its document ids. See the Javadocs of IndexReader, TermEnum and TermDocs for
details.
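
The steps above can be sketched roughly as follows. This is only a minimal
sketch, assuming the Lucene 3.0-era API (TermEnum and TermDocs were removed in
Lucene 4). The class name IndexDump, the posting() helper and the two
command-line arguments are made up for illustration. Note that TermDocs only
gives the raw within-document term frequency, so that is what gets written as
the second number of each pair; the rank value from the table in the question
would have to be computed separately.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.store.FSDirectory;

public class IndexDump {

    // Hypothetical helper: formats one posting as "<docId, value>",
    // following the layout shown in the question.
    static String posting(int docId, int value) {
        return "<" + docId + ", " + value + ">";
    }

    public static void main(String[] args) throws IOException {
        // Assumed arguments: args[0] = index directory, args[1] = output text file.
        IndexReader reader = IndexReader.open(FSDirectory.open(new File(args[0])), true);
        PrintWriter out = new PrintWriter(args[1], "UTF-8");
        TermEnum terms = reader.terms(); // unpositioned until the first next()
        try {
            while (terms.next()) {
                Term term = terms.term();
                StringBuilder row = new StringBuilder();
                // Column 1: field and term text; column 2: document frequency.
                row.append(term.field()).append(':').append(term.text())
                   .append('\t').append(terms.docFreq()).append('\t');

                // Column 3: every document that contains the term.
                TermDocs docs = reader.termDocs(term);
                try {
                    while (docs.next()) {
                        // docs.freq() is the raw term frequency in this document,
                        // not a score or rank.
                        row.append(posting(docs.doc(), docs.freq())).append(' ');
                    }
                } finally {
                    docs.close();
                }
                out.println(row.toString().trim());
            }
        } finally {
            terms.close();
            out.close();
            reader.close();
        }
    }
}
```

Run it with the index directory and the output file as arguments. The enum
walks all fields, so the field name is written in front of each term; filter on
term.field() if only one field is wanted.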

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Sahin Buyrukbilen [mailto:sahin.buyrukbilen@gmail.com]
> Sent: Tuesday, September 21, 2010 9:12 AM
> To: java-user@lucene.apache.org
> Subject: How to export lucene index to a simple text file?
> 
> Hi,
> 
> I am currently working on a project about private information retrieval
> and I need to have an inverted index file in txt format as follows:
> 
> Term t    freq t    Inverted list for t
> -------------------------------------------------------------------------
> and       1         <6, 0.159>
> big       2         <2, 0.148> <3, 0.088>
> dark      1         <6, 0.079>
> ...
> 
> Here each <number1, number2> pair means: number1 is the ID of a document
> in which term t exists, and number2 is the rank of t in that document.
> 
> I have created an index from 5492 txt files, however the index is composed
> of different files and most of the data is not in the text format.
> 
> could somebody guide me to achieve this?
> 
> Thank you
> 
> Sahin.

