lucene-solr-user mailing list archives

From Alexandre Rafalovitch <arafa...@gmail.com>
Subject Re: Using SOLR to search for Names from RDBMS
Date Thu, 02 Mar 2017 19:32:09 GMT
You would absolutely want to read the book "Relevant Search" first. It is
based on Elasticsearch examples, but the concepts map to Solr (and
there is an appendix).

(The following is mostly for names and phone numbers; I don't know about addresses.)

The core issue is that you will want to set up a bunch of copyFields to
create different levels of analysis precision for the names, with the
goal that more precise versions of the name match with a higher boost.
Otherwise, you are going to have an issue where a common surname
matches against the name (Smith Jones) and gets really boosted.
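
To make the boosting idea concrete, here is a minimal sketch of the request parameters the middleware might send. It assumes an edismax query and three hypothetical copyField targets (name_exact, name_lower, name_phonetic); the field names and boost values are illustrative only and would need tuning against real data.

```python
from urllib.parse import urlencode

# Hypothetical field names: each would be a copyField target of the same
# source name field, analyzed more loosely as you go down the list.
NAME_FIELDS = [
    ("name_exact", 10.0),    # least processing, highest boost
    ("name_lower", 5.0),     # tokenized and lowercased
    ("name_phonetic", 1.0),  # phonetic encoding, lowest boost
]


def build_name_query(user_input, fields=NAME_FIELDS):
    """Return Solr request parameters that search all name fields,
    weighting the more precise fields higher through edismax's qf."""
    qf = " ".join(f"{name}^{boost:g}" for name, boost in fields)
    return {"defType": "edismax", "q": user_input, "qf": qf}


params = build_name_query("Smith Jones")
print(urlencode(params))
```

With this layout, a document that matches in the precise field scores above one that only matches phonetically, which is the effect described above.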

Then, you will want to have a look at phonetic mapping. There are
several different algorithms, depending on the kinds of names you get.
Some are better for Western names, some are better for Eastern European
ones. You can mix them, again balancing the boosts.
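
For readers who have not seen a phonetic algorithm before, the sketch below implements classic American Soundex, chosen here only because it is short. It is not necessarily the algorithm to pick for your data, and in Solr the encoding would be done by a phonetic filter in the field's analysis chain rather than by application code.

```python
# Letter groups that American Soundex treats as sounding alike.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Encode a name as one letter plus three digits (American Soundex)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    encoded = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (first + "".join(encoded) + "000")[:4]


for name in ("Smith", "Smyth", "Robert", "Rupert"):
    print(name, soundex(name))
```

Names that sound alike collapse to the same code, which is why a phonetic field should carry a lower boost than the more precise fields.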

You have to decide whether you are doing one big search box (could get
messy) or one where people can enter different elements in different
boxes. The latter is easier to tune, but you need to pass the data to
Solr from your middleware. This example may help to see how to
construct the search line when only some of the search fields are provided:
https://gist.github.com/arafalov/5e04884e5aefaf46678c
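
Separately from the gist above, here is a minimal middleware-side sketch of the same idea: only the boxes the user actually filled in contribute a clause. The field names are hypothetical, and joining the clauses with AND is an assumption; your schema and matching rules may call for something else.

```python
def quote_phrase(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_query(**fields):
    """Build a Solr q string from whichever search boxes were filled in.

    Empty, whitespace-only, or missing values are skipped. If nothing
    was provided, fall back to the match-all query.
    """
    clauses = [
        f"{name}:{quote_phrase(value.strip())}"
        for name, value in fields.items()
        if value and value.strip()
    ]
    return " AND ".join(clauses) if clauses else "*:*"


# Hypothetical field names; last_name was left blank by the user.
print(build_query(first_name="John", last_name="", phone="555 1234"))
```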

Also, if you search phone numbers, you may want to do suffix search
(last n digits of the number). I recommend squishing all non-digits,
reversing the string and doing EdgeNGrams. Makes it a lot easier. I
did a presentation on this a couple of years back; I could probably
dig it out if you need more details.
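
The phone number trick can be sketched in plain Python as follows. In Solr these steps would live in the field's analysis chain; the code only illustrates why reversing turns a suffix search into a prefix match. The gram sizes (4 to 10) are assumptions, not recommendations.

```python
import re


def reversed_digits(raw):
    """Drop everything that is not a digit, then reverse the string."""
    return re.sub(r"[^0-9]", "", raw)[::-1]


def edge_ngrams(text, min_gram=4, max_gram=10):
    """Return the prefixes of text with lengths min_gram..max_gram."""
    longest = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, longest + 1)]


def index_terms(phone):
    """Terms that would be stored for a phone number at index time."""
    return edge_ngrams(reversed_digits(phone))


def matches_suffix(stored_phone, typed_suffix):
    """True if the typed digits are a suffix of the stored number.

    The query side is only normalized and reversed, not n-grammed.
    """
    return reversed_digits(typed_suffix) in index_terms(stored_phone)


print(index_terms("+1 (555) 123-4567"))
print(matches_suffix("+1 (555) 123-4567", "123-4567"))
```

Because the stored string is reversed, the last n digits of the number become its first n characters, and edge n-grams index exactly those prefixes.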

Regards,
   Alex.

----
http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 2 March 2017 at 13:11, Bijesh EB <e.b.bijesh@gmail.com> wrote:
> Hi All,
>
> First off, what a fabulous job you all are doing creating and supporting an
> open source solution! Great Work and many thanks for that.
>
>  I am reasonably new to SOLR and our team is trying to integrate SOLR to a
> structured database to help with Searching Person Records (first name, last
> name, address etc).
>  Our developers (who also are currently learning SOLR) have managed to get
> the initial set up done, but when I am trying to test, I do see that there
> are a lot of seemingly unrelated results coming up potentially because of
> the combinations of algorithms used.
> Because of that, I was wondering, is there any subset of algorithms that
> is recommended when working with a relational DB for items such as
> names and addresses, as opposed to searching a website for a string etc.
> Also is there any link to algorithms and their behaviours with some
> examples that non technical persons can relate to.
>
> I am not a technical person myself, but I am trying to learn from the
> experts here who might have been there and done that many times over.
>
> Thanks,
>
> Bijesh  EB
