lucene-dev mailing list archives

From "Haipeng Du" <flyabove...@hotmail.com>
Subject RE: I am new to lucene
Date Tue, 14 Sep 2004 15:50:35 GMT
Thanks Aviran.
But can I still search the documents by content if I use
Field.Text("content", reader)? I do not want to store the document content.
Thanks a lot.
Haipeng
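
Going by the description quoted below (a Reader-valued field is tokenized and indexed but not stored verbatim), searching on such a field should still work; only reading the original text back is lost. The following is a minimal sketch of that distinction. It is a toy model built on java.util only, not Lucene code: the class ToyIndex, its methods, and the sample path and text are all hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: this is NOT Lucene and does not use any Lucene classes.
class ToyIndex {
    // "field:term" -> ids of the documents that contain the term in that field
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // one map per document: field name -> verbatim value (stored fields only)
    private final List<Map<String, String>> storedFields = new ArrayList<>();

    /** Starts a new document and returns its id. */
    int newDocument() {
        storedFields.add(new HashMap<>());
        return storedFields.size() - 1;
    }

    /** Tokenizes and indexes the text; keeps the original only if store is true. */
    void addField(int docId, String field, String text, boolean store) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(field + ":" + token, k -> new TreeSet<>()).add(docId);
            }
        }
        if (store) {
            storedFields.get(docId).put(field, text);
        }
    }

    /** Returns the ids of the documents whose field contains the term. */
    Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term.toLowerCase(), new TreeSet<>());
    }

    /** Returns the stored value, or null if the field was indexed but not stored. */
    String get(int docId, String field) {
        return storedFields.get(docId).get(field);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        int doc = index.newDocument();
        index.addField(doc, "path", "/docs/report.txt", true);          // stored
        index.addField(doc, "content", "Lucene indexes tokens", false); // indexed only

        System.out.println(index.search("content", "tokens")); // the document is found
        System.out.println(index.get(doc, "content"));          // null: nothing was stored
        System.out.println(index.get(doc, "path"));             // the stored value comes back
    }
}
```

In this model the "content" field behaves like the Reader-valued field described in the thread: a query on it matches the document, but get returns null for it, while a stored field such as "path" can be used to locate the original file.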


>From: "Aviran" <amordo@infosciences.com>
>Reply-To: "Lucene Developers List" <lucene-dev@jakarta.apache.org>
>To: "'Lucene Developers List'" <lucene-dev@jakarta.apache.org>
>Subject: RE: I am new to lucene
>Date: Tue, 14 Sep 2004 11:28:08 -0400
>
>1. Field.Text("content", Reader) constructs a Reader-valued Field that is
>tokenized and indexed but not stored in the index verbatim, so you cannot
>retrieve the text. You need to use Field.Text("content", String) to be able
>to read back the content.
>2. You can use an open source project called PDFBox which can extract text
>from a PDF document.
>
>Aviran
>
>-----Original Message-----
>From: Haipeng Du [mailto:flyabovesun@hotmail.com]
>Sent: Tuesday, September 14, 2004 11:18 AM
>To: lucene-dev@jakarta.apache.org
>Subject: I am new to lucene
>
>
>Hi, everyone:
>I am new to Lucene and have a couple of questions.
>(1) When I use Field.Text("content", Reader) to index the file content, I
>cannot retrieve it when I search. Here is part of the code:
>     Analyzer analyzer = new StopAnalyzer();
>     Searcher searcher = new IndexSearcher(indexPath);
>     Query query = QueryParser.parse(queryString, key2, analyzer);
>     Hits hits = searcher.search(query);
>I cannot find the field when I use hits.doc(i).get("content"): it is
>null. But I can get the values of all other fields the same way. How can I
>get the content?
>(2) Does Lucene have a way to index PDF content? Which API is the best and
>easiest to use for converting PDF to text?
>Please reply. Thanks a lot.
>Haipeng
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-dev-help@jakarta.apache.org



