lucene-general mailing list archives

From Lee <lee...@gmail.com>
Subject Re: indexing Guides? Indexing names
Date Tue, 10 Jun 2014 17:49:24 GMT

On 10/06/2014 18:40, Ted Dunning wrote:
>
 > On Tue, Jun 10, 2014 at 8:08 AM, Lee Goddard <leegee@gmail.com> wrote:
 >
 > Is it possible to weight the individual initials as words?
 >
 > Would you recommend employing a stemmer?
 >
 >
 > Yes, it is definitely possible.  But don't just use any stemmer.  You
 > need to adapt something so that you preserve initial letters, and
 > likely use heuristics such as possibly preserving case.

Am I going to have to write a parser in Java for that, or is it a matter 
of combining what is in the box? I've previously created indexes of photos 
(with my own parser) and indexes of documents, but indexing a single 
company name is quite a new idea to me.
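
To make the question concrete, here is a minimal sketch of the kind of 
tokenising described above, in plain Java with no Lucene classes. The 
class name NameTokenizer and its rules are assumptions made up for this 
illustration: split on anything that is not a letter or digit, keep 
single initials as tokens of their own, and preserve case only for 
all-uppercase tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: a sketch of one possible heuristic.
class NameTokenizer {

    // Splits a company name into tokens, keeping initials and acronyms intact.
    static List<String> tokenize(String name) {
        List<String> tokens = new ArrayList<>();
        // Split on runs of non-alphanumerics, so "J.P." yields "J" and "P".
        for (String raw : name.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // a leading separator produces one empty string
            }
            // All-uppercase tokens containing at least one letter
            // (initials, "SE", "IBM") keep their case.
            boolean allUpper = raw.equals(raw.toUpperCase(Locale.ROOT))
                    && !raw.equals(raw.toLowerCase(Locale.ROOT));
            tokens.add(allUpper ? raw : raw.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }
}
```

With these rules, NameTokenizer.tokenize("J.P. Morgan & Co. Limited") 
gives the tokens J, P, morgan, co, limited. Whether the same effect can 
be had purely from the analyzers that ship with Lucene is exactly what I 
am unsure of.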

> You will also probably want to include alternative forms in other
 > fields.  These would include nicknames, stock symbols and
 > abbreviations.

Not in this case: it's simply an interface for finding information held 
by the state on the affairs of a company, so the only alternative forms 
are of the final element of the registered company name. It might be 
'Limited' but people may search for 'ltd'; it may be 'SE' but people may 
search for 'european'.
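
A minimal sketch of that mapping, again in plain Java rather than 
anything from Lucene. The class name SuffixAliases and the two entries 
in its table are assumptions taken from the examples above; a real table 
would need many more legal forms. For simplicity it lowercases the whole 
name, and the idea is to run the same method over both the registered 
name at index time and the user's query, so that the two converge on one 
form.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps variants of the final word to one canonical form.
class SuffixAliases {

    private static final Map<String, String> CANONICAL = new HashMap<>();
    static {
        CANONICAL.put("ltd", "limited");
        CANONICAL.put("european", "se");
    }

    // Lowercases the name and replaces a known alias in the last word.
    static String normalizeLastWord(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return trimmed;
        }
        int cut = trimmed.lastIndexOf(' ');          // -1 for a single word
        String head = trimmed.substring(0, cut + 1); // everything before the last word
        String last = trimmed.substring(cut + 1);
        // Strip punctuation so that "Ltd." matches the "ltd" entry.
        String key = last.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{N}]", "");
        return head.toLowerCase(Locale.ROOT) + CANONICAL.getOrDefault(key, key);
    }
}
```

Here both "Acme Ltd." and "Acme Limited" come out as "acme limited", and 
both "Foo European" and "Foo SE" come out as "foo se".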

TIA
Lee
