lucene-dev mailing list archives

From Jacek Plebanek <pleba...@gmail.com>
Subject Federated search in Solr - proposal
Date Thu, 10 May 2012 16:31:11 GMT
Hello,

I'm starting work on federated search algorithms for my PhD research.
I'll use Solr to implement them (since I have two years of experience
with Solr at work).

I thought that at least part of my work could be useful to the Solr
project and that I could contribute some code, specifically the
components/modifications needed to add federated search support to Solr.

By "Federated Search" I mean searching across heterogeneous data sources
(something different from the existing Distributed Search implemented in
Solr) - allowing Solr to merge results not only from SolrServer
instances, but also from external sources (e.g. search engines using a
different API). The use case would look like this:
- the user sends a request to Solr (e.g. SearchRequest)
- Solr handles the request internally and/or sends it to other Solr
instances (the current Distributed Search) AND sends it to the specified
external data sources using dedicated adapters.
- Solr merges the results from the Solr instances with the results from
the external collections and returns the combined results to the user.
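
The three steps above can be sketched roughly as follows. This is only an
illustration in Python, not Solr code: Hit, Adapter and federated_search
are hypothetical names, and the merge step naively sorts by raw score,
which assumes scores from different sources are comparable (in general
they are not).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    doc_id: str
    score: float
    source: str  # name of the collection that returned this hit


# Hypothetical adapter contract: (query, rows) -> list of hits.
# A Solr instance and an external engine would each sit behind one adapter.
Adapter = Callable[[str, int], List[Hit]]


def federated_search(query: str, adapters: Dict[str, Adapter], rows: int = 10) -> List[Hit]:
    """Send the query to every adapter and return one combined result list."""
    hits: List[Hit] = []
    for adapter in adapters.values():
        hits.extend(adapter(query, rows))
    # Naive merge: assumes raw scores are comparable across sources.
    hits.sort(key=lambda h: h.score, reverse=True)
    return hits[:rows]
```

A real implementation would query the sources in parallel and replace the
naive sort with a proper result merging strategy.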

This scenario requires support for the four common parts of federated
search:
- collection representation (external collections probably won't provide
the same information as Solr, such as tf-idf)
- collection selection (predict which collections are likely to return
relevant results and send the search request only to them)
- result merging (merge results based on more limited information than
Solr provides)
- connection to external sources (a common API for writing custom
collection adapters)
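
To make the first three parts more concrete, here is a minimal sketch in
Python. The techniques are simple stand-ins chosen for illustration, not
the algorithms I intend to propose: the representation is a document
frequency table built from sampled documents, selection sums the document
frequencies of the query terms, and merging uses per-source min-max score
normalization. All function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def build_representation(sampled_docs: List[str]) -> Counter:
    """Collection representation: document frequency of each term in a sample."""
    df: Counter = Counter()
    for doc in sampled_docs:
        df.update(set(doc.lower().split()))
    return df


def select_collections(query: str, reps: Dict[str, Counter], k: int = 2) -> List[str]:
    """Collection selection: keep the k collections whose samples best match the query."""
    terms = query.lower().split()
    scored = {name: sum(rep[t] for t in terms) for name, rep in reps.items()}
    ranked = sorted((n for n in scored if scored[n] > 0), key=lambda n: (-scored[n], n))
    return ranked[:k]


def normalize(results: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Rescale one source's scores to [0, 1] so sources become roughly comparable."""
    if not results:
        return []
    scores = [s for _, s in results]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [(doc, 1.0) for doc, _ in results]
    return [(doc, (s - lo) / (hi - lo)) for doc, s in results]


def merge(per_source: Dict[str, List[Tuple[str, float]]], rows: int = 10) -> List[Tuple[str, float, str]]:
    """Result merging: normalize per source, then sort all hits together."""
    merged = []
    for source, results in per_source.items():
        merged.extend((doc, score, source) for doc, score in normalize(results))
    merged.sort(key=lambda x: (-x[1], x[2], x[0]))
    return merged[:rows]
```

Min-max normalization only needs the scores returned by each source, which
fits the case where an external collection exposes no term statistics.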

I thought I would write some federated search components - a framework
that allows developers to implement custom algorithms/plugins for each
part of the federated search scenario.


What do you think about that?


Sorry for my English :)

Jacek Plebanek

Interdisciplinary Centre for Mathematical and Computational Modelling
University of Warsaw, Poland

