lucene-java-user mailing list archives

From "Martin O'Shea" <app...@dsl.pipex.com>
Subject Use of hyphens in StandardAnalyzer
Date Sun, 24 Oct 2010 19:58:41 GMT
Hello,

I have a StandardAnalyzer set up that retrieves the words and their
frequencies from a single document, using a TermVectorMapper that populates
a HashMap.

However, if I use the following text as a field in my document:

addDoc(w, "lucene Lawton-Browne Lucene");

The word frequencies returned in the HashMap are:

browne 1
lucene 2
lawton 1

The problem is the words 'lawton' and 'browne'. If this is an actual
'double-barreled' name, can Lucene recognise 'Lawton-Browne' as a single
term instead of splitting it at the hyphen?
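
To illustrate the difference I mean, here is a minimal sketch in plain Java.
It does not use Lucene at all and is not how StandardAnalyzer is implemented;
the class HyphenTokenSketch and the method countTerms are made-up names. With
keepHyphens set to false it lowercases the text and splits on every character
that is not a letter or digit, which gives the same counts as above. With
keepHyphens set to true the hyphen is kept as part of the word, which is the
behaviour I am after.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class HyphenTokenSketch {

    // Hypothetical helper, not part of Lucene: lowercases the text, splits it
    // into tokens, and counts how often each token occurs.
    static Map<String, Integer> countTerms(String text, boolean keepHyphens) {
        // keepHyphens == false: any run of characters that are not letters or
        // digits separates tokens, so "Lawton-Browne" becomes two tokens.
        // keepHyphens == true: the hyphen is also allowed inside a token.
        String separators = keepHyphens ? "[^\\p{L}\\p{N}-]+" : "[^\\p{L}\\p{N}]+";
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split(separators)) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "lucene Lawton-Browne Lucene";
        System.out.println(countTerms(text, false)); // hyphen splits the name
        System.out.println(countTerms(text, true));  // name kept as one token
    }
}
```

In the second call the map contains a single entry for 'lawton-browne' with a
count of 1, alongside 'lucene' with a count of 2.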

 

I've tried combinations of:

addDoc(w, "lucene \"Lawton-Browne\" Lucene");

and the same with single quotes, but without success.

Thanks,

Martin O'Shea