lucene-solr-user mailing list archives

From Renaud Delbru <renaud.del...@deri.org>
Subject Re: Search across related/correlated multivalue fields in Solr
Date Wed, 27 Apr 2011 18:17:19 GMT
Hi,

you might want to look at the SIREn plugin [1,2], which allows you to 
index and query 1:N relationships such as yours, in a tabular data 
format [3].

[1] http://siren.sindice.com/
[2] https://github.com/rdelbru/SIREn
[3] https://dev.deri.ie/confluence/display/SIREn/Indexing+and+Searching+Tabular+Data

Kind Regards,
-- 
Renaud Delbru

On 27/04/11 18:30, ronotica wrote:
> The nature of my project is such that search is needed and specifically
> search across related entities. We want to perform several queries involving
> a correlation between two or more properties of a given entity in a
> collection.
>
> To put things in context, here is a snippet of the domain:
>
> Student { firstname, lastname }
> Education { degreeCode, degreeYear, institution }
>
> The database tables look like so:
>
> STUDENT
> ----------
> STUDENT_ID  FNAME    LNAME
> 100         John     Doe
> 200         Rasheed  Jones
> 300         Mary     Hampton
>
> EDUCATION
> -------------
> EDUCATION_ID  DEGREE_CODE  DEGREE_YR  INSTITUTION  STUDENT_ID
> 1             MD           2008       OHIO_ST      100
> 2             PHD          2010       YALE         100
> 3             MS           2007       OHIO_ST      200
> 4             MD           2010       YALE         300
>
> A student can have many educations. Currently, our documents look like this
> in Solr:
>
> DOC_ID  STUDENT_ID  FNAME    LNAME    DEGREE_CODE  DEGREE_YR  INSTITUTION
> 100     100         John     Doe      MD PHD       2008 2010  OHIO_ST YALE
> 101     200         Rasheed  Jones    MS           2007       OHIO_ST
> 102     300         Mary     Hampton  MD           2010       YALE
>
> Searching for all students who graduated from OHIO_ST in 2010 currently
> gives a hit (John Doe) when it shouldn't.
>
> What is the best way to overcome this issue in Solr? This is only
> happening when I am searching across correlated fields, mainly because the
> data has been denormalized and Lucene has no notion of relationships between
> the various fields.
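
To make the false hit concrete, here is a minimal Python sketch of the behaviour described above. It is not Solr code: it only models a flattened document as independent lists of values, and the helper name matches_flat is hypothetical, introduced purely for illustration.

```python
# Flattened document for John Doe, as in the table above. Each
# multivalued field is an independent list, so the link between a
# degree year and its institution is lost.
john = {
    "DEGREE_CODE": ["MD", "PHD"],
    "DEGREE_YR": ["2008", "2010"],
    "INSTITUTION": ["OHIO_ST", "YALE"],
}


def matches_flat(doc, **criteria):
    """Return True if every queried field contains the queried value.

    Nothing checks that the matching values come from the same
    education record, which is the source of the false hit.
    (Hypothetical helper, not part of any Solr API.)
    """
    return all(value in doc.get(field, []) for field, value in criteria.items())


# OHIO_ST comes from the 2008 MD and 2010 from the Yale PhD,
# yet both conditions are satisfied on the flattened document.
false_hit = matches_flat(john, INSTITUTION="OHIO_ST", DEGREE_YR="2010")
print(false_hit)  # True
```

Each condition is satisfied by a different education, so the conjunction succeeds even though no single record matches both.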
>
> One way that has come to mind is to have separate documents for "education"
> and perform multiple searches to get at an answer. Besides this, is there
> any other way? Does Solr provide any elegant solution for this?
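
The "separate documents" idea can be sketched in plain Python as follows, assuming one record per row of the EDUCATION table. This only illustrates why per-record matching avoids the false hit. It says nothing about how Solr would index or join such documents, and students_matching is a hypothetical helper.

```python
# One record per education, copied from the EDUCATION table above.
educations = [
    {"EDUCATION_ID": 1, "DEGREE_CODE": "MD", "DEGREE_YR": "2008",
     "INSTITUTION": "OHIO_ST", "STUDENT_ID": 100},
    {"EDUCATION_ID": 2, "DEGREE_CODE": "PHD", "DEGREE_YR": "2010",
     "INSTITUTION": "YALE", "STUDENT_ID": 100},
    {"EDUCATION_ID": 3, "DEGREE_CODE": "MS", "DEGREE_YR": "2007",
     "INSTITUTION": "OHIO_ST", "STUDENT_ID": 200},
    {"EDUCATION_ID": 4, "DEGREE_CODE": "MD", "DEGREE_YR": "2010",
     "INSTITUTION": "YALE", "STUDENT_ID": 300},
]


def students_matching(records, **criteria):
    """Return sorted student IDs having at least one education record
    that satisfies all criteria at once. (Hypothetical helper.)"""
    return sorted({
        record["STUDENT_ID"]
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    })


ohio_2010 = students_matching(educations, INSTITUTION="OHIO_ST", DEGREE_YR="2010")
yale_2010 = students_matching(educations, INSTITUTION="YALE", DEGREE_YR="2010")
print(ohio_2010)  # []
print(yale_2010)  # [100, 300]
```

Because all criteria are tested against a single record, John Doe no longer matches OHIO_ST in 2010. The cost is a second step to map the matching student IDs back to student data.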
>
> Any help will be greatly appreciated.
>
> Thanks.
>
> PS: We have about 15 of these kinds of relationships, all relating to the
> student, and would like to perform searches on each of them.
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Search-across-related-correlated-multivalue-fields-in-Solr-tp2871176p2871176.html
> Sent from the Solr - User mailing list archive at Nabble.com.

