lucene-java-user mailing list archives

From Sahin Buyrukbilen <sahin.buyrukbi...@gmail.com>
Subject Re: How to export lucene index to a simple text file?
Date Tue, 21 Sep 2010 16:33:04 GMT
Thank you Uwe, I will read the docs and try to do it. However, do you have any
example code? I need one because I am not very familiar with Java.

Thank you.

Sahin

On Tue, Sep 21, 2010 at 12:29 PM, Uwe Schindler <uwe@thetaphi.de> wrote:

> Hi,
>
> Retrieve a TermEnum from the IndexReader and iterate over it. That gives you
> all terms, and for each term you can retrieve the docFreq, which is the second
> column in your table. Finally, for each term, position a TermDocs enum on that
> term to get all document IDs. See the docs of IndexReader/TermEnum/TermDocs
> for details.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
> > -----Original Message-----
> > From: Sahin Buyrukbilen [mailto:sahin.buyrukbilen@gmail.com]
> > Sent: Tuesday, September 21, 2010 9:12 AM
> > To: java-user@lucene.apache.org
> > Subject: How to export lucene index to a simple text file?
> >
> > Hi,
> >
> > I am currently working on a project about private information retrieval
> > and I need to have an inverted index file in txt format as follows:
> >
> > Term t    freq t    Inverted list for t
> > -------------------------------------------------------------------------
> > and       1         <6, 0.159>
> > big       2         <2, 0.148> <3, 0.088>
> > dark      1         <6, 0.079>
> > .
> > .
> > .
> > .
> >
> > Here each <number1, number2> pair means: number1 is the ID of a document
> > in which term t exists, and number2 is the rank of t in that document.
> >
> > I have created an index from 5492 txt files; however, the index is
> > composed of different files and most of the data is not in text format.
> >
> > Could somebody guide me to achieve this?
> >
> > Thank you
> >
> > Sahin.
>
>
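
The approach Uwe describes in the quoted message (walk all terms, then walk
the documents of each term) can be sketched in plain Java as follows. This is
only an illustration of the loop structure and the requested output format,
not code from the thread: the class InvertedIndexExport, the Posting type, and
the add/export methods are hypothetical names, and a TreeMap stands in for the
real index so that the example runs without Lucene on the classpath. The
weights are copied from the sample table; how they would be computed is not
stated in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class InvertedIndexExport {

    /** One entry of an inverted list: a document ID and a weight for the term in that document. */
    public static final class Posting {
        final int docId;
        final double weight; // placeholder for the "rank" column; its definition is an assumption
        Posting(int docId, double weight) {
            this.docId = docId;
            this.weight = weight;
        }
    }

    /** Records that the given term occurs in the given document with the given weight. */
    public static void add(SortedMap<String, List<Posting>> index,
                           String term, int docId, double weight) {
        index.computeIfAbsent(term, k -> new ArrayList<>()).add(new Posting(docId, weight));
    }

    /** Writes one tab-separated line per term: term, document frequency, <docId, weight> pairs. */
    public static String export(SortedMap<String, List<Posting>> index) {
        StringBuilder out = new StringBuilder();
        // Outer loop: every term in sorted order (the role TermEnum plays in Lucene 3.x).
        for (Map.Entry<String, List<Posting>> entry : index.entrySet()) {
            List<Posting> postings = entry.getValue();
            // Document frequency = number of documents that contain the term.
            out.append(entry.getKey()).append('\t').append(postings.size()).append('\t');
            // Inner loop: every document containing the term (the role TermDocs plays).
            for (int i = 0; i < postings.size(); i++) {
                Posting p = postings.get(i);
                if (i > 0) {
                    out.append(' ');
                }
                out.append(String.format(Locale.ROOT, "<%d, %.3f>", p.docId, p.weight));
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SortedMap<String, List<Posting>> index = new TreeMap<>();
        add(index, "big", 2, 0.148);
        add(index, "big", 3, 0.088);
        add(index, "and", 6, 0.159);
        add(index, "dark", 6, 0.079);
        System.out.print(export(index));
    }
}
```

Against a real Lucene 3.x index, the outer loop would be driven by
IndexReader.terms() and TermEnum.next(), with TermEnum.docFreq() supplying the
second column, and the inner loop by IndexReader.termDocs(term),
TermDocs.next() and TermDocs.doc(). TermDocs.freq() returns only the raw
within-document frequency, so any normalized "rank" value would have to be
computed separately.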
