lucene-java-user mailing list archives

From "DES" <>
Subject Re: Indexing terms only
Date Wed, 22 Dec 2004 16:45:17 GMT
I actually use Field.Text(String,String) to add documents to my index. Maybe 
I do not understand the way an analyzer works, but I thought that all German 
articles (der, die, das, etc.) would be filtered out. However, if I use Luke 
to view my index, the original text is stored in the field in full. What I 
need is a term vector that I can create from an indexed document field, so 
this field should contain terms only.
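
To make clear what I mean by "terms only", here is a minimal plain-Java sketch. It is not Lucene code: the class name TermsOnlySketch, the tiny stopword list, and the tokenizing rule are all assumptions made for illustration, and there is no stemming. It only shows the two things I am trying to separate: the original text as it would be stored, and the analyzed terms (with a term-frequency vector) as they would be indexed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; this is not how GermanAnalyzer is implemented.
class TermsOnlySketch {
    // Tiny stopword list assumed for the example; a real analyzer uses a longer one.
    static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "der", "die", "das", "und", "ein", "eine"));

    // Analysis step: split on non-letters, lowercase, drop stopwords (no stemming).
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            String term = token.toLowerCase(Locale.GERMAN);
            if (!STOPWORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    // Term vector: each analyzed term mapped to its frequency in the text.
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> freqs = new TreeMap<>();
        for (String term : analyze(text)) {
            freqs.merge(term, 1, Integer::sum);
        }
        return freqs;
    }

    public static void main(String[] args) {
        String original = "Der Hund und die Katze jagen das Haus der Katze";
        // A stored field keeps the original text untouched.
        System.out.println("stored: " + original);
        // The indexed side only ever sees the analyzed terms.
        System.out.println("terms:  " + analyze(original));
        System.out.println("vector: " + termVector(original));
    }
}
```

In this sketch the stored string never passes through analyze(), which matches what I see in Luke: the analyzer shapes the indexed terms, not the stored value. As far as I can tell from the Lucene 1.4 Javadoc, Field.UnStored(String, String) indexes and tokenizes without storing, and term vectors have to be switched on per field before IndexReader.getTermFreqVector can return them, but I have not verified this for other versions.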

> Whether or not the text is stored in the index is a different concern
> than how it is analyzed.  If you want the text to be indexed, and not
> stored, then use the Field.Text(String, String) method or the
> appropriate constructor when adding a field to the Document.  You'll
> need to also store a reference to the actual file (URL, Path, etc) in
> the document so it can be retrieved from the doc returned in the Hits
> object.
> Or did I completely misunderstand the question?
> -Mike
> On Wed, 22 Dec 2004 17:23:24 +0100, DES <> wrote:
>> hi
>> I need to index my text so that the index contains only tokenized, stemmed 
>> words without stopwords etc. The text is German, so I tried to use 
>> GermanAnalyzer, but it stores the whole text, not terms. Please give me a tip 
>> how to index terms only. Thanks!
>> DES