lucene-solr-dev mailing list archives

From "Noble Paul (JIRA)" <>
Subject [jira] Commented: (SOLR-469) Data Import RequestHandler
Date Tue, 22 Apr 2008 15:00:22 GMT


Noble Paul commented on SOLR-469:

Hi Grant,
We started off with something like that and very soon realized that it cannot scale beyond
the most basic use cases.
We need the ability to apply transformations such as splitting and merging fields.
Sometimes we need to put in a totally different piece of data,
e.g. if a value is 1-5 put in the string 'low', if it is 5-10 put in 'medium', and so on.
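
A minimal sketch of the kind of transformation meant here, assuming a hypothetical RowTransformer interface and made-up field names ("rating", "full_name"); none of these names come from the attached patch. The ranges 1-5 and 5-10 overlap at 5, so the sketch assumes 5 counts as 'low'.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plug-in point; illustrative only, not the API of the SOLR-469 patch.
interface RowTransformer {
    Map<String, Object> transform(Map<String, Object> row);
}

class RatingBucketTransformer implements RowTransformer {

    // Maps a numeric value to a label. Assumption: 5 belongs to 'low'.
    static String bucket(int value) {
        if (value >= 1 && value <= 5) return "low";
        if (value > 5 && value <= 10) return "medium";
        return "other"; // stand-in for whatever further ranges the business needs
    }

    @Override
    public Map<String, Object> transform(Map<String, Object> row) {
        // Copy the row so the caller's map is left untouched.
        Map<String, Object> out = new HashMap<>(row);

        // Replace-with-different-data case: number in, label out.
        Object rating = row.get("rating");
        if (rating instanceof Integer) {
            out.put("rating_label", bucket((Integer) rating));
        }

        // Splitting case: one column becomes two fields.
        Object name = row.get("full_name");
        if (name instanceof String) {
            String[] parts = ((String) name).trim().split("\\s+", 2);
            out.put("first_name", parts[0]);
            if (parts.length > 1) {
                out.put("last_name", parts[1]);
            }
        }
        return out;
    }
}
```

Merging fields would follow the same shape: read two keys from the row and write one combined key to the output map.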

All of these are driven by business requirements.

There is also the need to join one table with another using values from the first table, or
to merge one table with many tables.

Then we had use cases where data comes from a DB and, using a key, we have to fetch data from
an XML/HTTP data source, and so on.

So the fundamental design, or the 'kernel', of the system is supposed to be totally agnostic
of the use cases, and we let users plug in implementations in Java/JS etc. so that they
can do what they actually want. We also want to share some of the components that can be
common for others.
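
As a rough illustration of that split between kernel and plug-ins, the sketch below shows a kernel that only iterates rows and applies whatever functions were plugged in. The class ImportKernel, the method plug, and the in-memory maps standing in for the DB and the xml/http source are all hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical kernel: knows nothing about the use case, only runs plug-ins in order.
class ImportKernel {
    private final Iterable<Map<String, Object>> source;
    private final List<Function<Map<String, Object>, Map<String, Object>>> plugins =
            new ArrayList<>();

    ImportKernel(Iterable<Map<String, Object>> source) {
        this.source = source;
    }

    // Registers a user-supplied step; returns this so calls can be chained.
    ImportKernel plug(Function<Map<String, Object>, Map<String, Object>> plugin) {
        plugins.add(plugin);
        return this;
    }

    // Pulls every row from the source and passes it through each plug-in.
    List<Map<String, Object>> run() {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (Map<String, Object> row : source) {
            Map<String, Object> doc = new HashMap<>(row);
            for (Function<Map<String, Object>, Map<String, Object>> plugin : plugins) {
                doc = plugin.apply(doc);
            }
            docs.add(doc);
        }
        return docs;
    }
}

class ImportKernelDemo {
    public static void main(String[] args) {
        // Stand-in for rows read from a DB table.
        List<Map<String, Object>> dbRows = new ArrayList<>();
        Map<String, Object> row = new HashMap<>();
        row.put("id", 1);
        dbRows.add(row);

        // Stand-in for a second (e.g. xml/http) source, looked up by key.
        Map<Object, String> remote = new HashMap<>();
        remote.put(1, "description fetched by key");

        List<Map<String, Object>> docs = new ImportKernel(dbRows)
                .plug(r -> {
                    Map<String, Object> out = new HashMap<>(r);
                    out.put("description", remote.get(r.get("id")));
                    return out;
                })
                .run();
        System.out.println(docs);
    }
}
```

The join against the second source lives entirely in the plug-in, so the kernel does not change when the business rules do.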

> Data Import RequestHandler
> --------------------------
>                 Key: SOLR-469
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Noble Paul
>            Assignee: Grant Ingersoll
>             Fix For: 1.3
>         Attachments: SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch,
>                      SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch
> We need a RequestHandler which can import data from a DB or other data sources into the
> Solr index. Think of it as an advanced form of the SqlUpload Plugin (SOLR-103).
> The way it works is as follows.
>     * Provide a configuration file (xml) to the Handler which takes in the necessary
>       SQL queries and mappings to a solr schema
>           - It also takes in a properties file for the data source configuration
>     * Given the configuration it can also generate the solr schema.xml
>     * It is registered as a RequestHandler which can take two commands:
>           - do-full-import - dumps all the data from the Database into the index (based
>             on the SQL query in configuration)
>           - do-delta-import - dumps all the data that has changed since last import
>             (we assume a modified-timestamp column in tables)
>     * It provides an admin page
>           - where we can schedule it to be run automatically at regular intervals
>           - It shows the status of the Handler (idle, full-import, delta-import)
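
The difference between the two commands in the description above can be sketched as follows, assuming an in-memory list in place of the database table and a hypothetical Row type carrying the modified timestamp. The class and method names are illustrative and are not the handler's actual API.

```java
import java.util.ArrayList;
import java.util.List;

class DeltaImportSketch {

    // Hypothetical row: an id plus the modified-timestamp column (epoch millis).
    static final class Row {
        final int id;
        final long modifiedAt;

        Row(int id, long modifiedAt) {
            this.id = id;
            this.modifiedAt = modifiedAt;
        }
    }

    // do-full-import: every row is sent to the index.
    static List<Row> fullImport(List<Row> table) {
        return new ArrayList<>(table);
    }

    // do-delta-import: only rows modified after the last import are sent.
    static List<Row> deltaImport(List<Row> table, long lastImportTime) {
        List<Row> changed = new ArrayList<>();
        for (Row row : table) {
            if (row.modifiedAt > lastImportTime) {
                changed.add(row);
            }
        }
        return changed;
    }
}
```

Against a real database the same filter would normally be pushed into the SQL query's WHERE clause on the timestamp column instead of being applied in memory.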

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
