lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: LineDocMaker usage
Date Thu, 07 Aug 2008 09:02:32 GMT

Also, since the docs from a line file are so regular, you can gain
indexing throughput by re-using the Document & Field instances: create
them once, and inside the while loop just use Field.setValue(...) to
change the value per document.
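
A rough sketch of that loop shape, for illustration only. To keep it runnable without Lucene on the classpath, LineField, LineDoc and FakeWriter are hypothetical stand-ins (they are not Lucene classes), and the field name "contents" is an assumption. In real code the same structure would use Lucene's Document and Field with IndexWriter.addDocument(...).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReuseSketch {
    /** Hypothetical stand-in for a Lucene Field: a name plus a mutable value. */
    static final class LineField {
        final String name;
        private String value;
        LineField(String name, String value) { this.name = name; this.value = value; }
        void setValue(String value) { this.value = value; }
        String value() { return value; }
    }

    /** Hypothetical stand-in for a Lucene Document: an ordered list of fields. */
    static final class LineDoc {
        final List<LineField> fields = new ArrayList<LineField>();
        void add(LineField f) { fields.add(f); }
    }

    /** Hypothetical stand-in for IndexWriter: copies each field value when a document is added. */
    static final class FakeWriter {
        final List<String> indexed = new ArrayList<String>();
        void addDocument(LineDoc doc) {
            for (LineField f : doc.fields) {
                indexed.add(f.name + "=" + f.value());
            }
        }
    }

    /** Adds one document per line, allocating the document and its field only once. */
    static List<String> indexLines(BufferedReader br) throws IOException {
        FakeWriter writer = new FakeWriter();
        LineDoc doc = new LineDoc();                     // created once, outside the loop
        LineField body = new LineField("contents", ""); // created once, outside the loop
        doc.add(body);

        String line;
        while ((line = br.readLine()) != null) {
            body.setValue(line);      // only the value changes per line
            writer.addDocument(doc);  // the value is read before this call returns
        }
        return writer.indexed;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new StringReader("first line\nsecond line\n"));
        // prints [contents=first line, contents=second line]
        System.out.println(indexLines(br));
    }
}
```

Re-use is safe here because each value is consumed during the addDocument call, so overwriting it on the next iteration does not affect documents already added.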

Mike

Anshum wrote:

> Hi,
> How about just opening the file and reading through it, adding one
> document per line? That should be pretty straightforward.
> I'm just writing the snippet here; it might have issues, as I didn't try to
> compile it.
>
>    // Needs java.io.* plus Lucene's IndexWriter, StandardAnalyzer,
>    // Document and Field; indexDir is your index location.
>    IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), true);
>
>    File f = new File("textfile.txt");
>    BufferedReader br = new BufferedReader(new FileReader(f));
>    String strLine;
>    while ((strLine = br.readLine()) != null) {
>        Document doc = new Document();
>        // Path of the source file
>        doc.add(new Field("filename", f.getCanonicalPath(),
>                Field.Store.YES, Field.Index.TOKENIZED));
>        // The line itself ("contents" is a placeholder field name);
>        // Store/Index options depend on how you want to index it
>        doc.add(new Field("contents", strLine,
>                Field.Store.YES, Field.Index.TOKENIZED));
>        writer.addDocument(doc);
>    }
>    br.close();
>    writer.close();
>
>
> Also, I have tokenized the content and stored it so that it can be
> fetched; you might just want to keep a reference key instead of storing
> the entire content, though. That's up to you.
>
> --
> Anshum
> http://ai-cafe.blogspot.com
>
>
>
> On Thu, Aug 7, 2008 at 1:42 AM, Brittany Jacobs <bjacobs@jbmanagement.com> wrote:
>
>> Hello, I am new to all this.  I need to read in a text file and have each
>> line in the file be a document.
>>
>> The LineDocMaker seems to be intended for this purpose.  But I can't figure
>> out how to read the data into it.
>>
>> Any examples would be greatly appreciated.
>>
>>
>>
>>
>
>
> -- 
> --
> The facts expressed here belong to everybody, the opinions to me.
> The distinction is yours to draw............



