lucene-solr-dev mailing list archives

From "Noble Paul (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-469) Data Import RequestHandler
Date Tue, 22 Apr 2008 13:18:28 GMT

    [ https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591293#action_12591293 ]

Noble Paul commented on SOLR-469:
---------------------------------

Passing a single SQL query would limit the usefulness, because most use cases need to
join more than one table.

But it is possible to pass the whole dataconfig itself as a request parameter. We currently
use that in the interactive development mode.

We have tried hard to cut down the verbosity of the configuration, patch after patch. Now
the 'metadata', i.e. the extra information other than the queries themselves, is minimal. We
rely on information that already exists elsewhere, such as the schema, to achieve this.

The connections are created once and reused throughout one import. The details needed
to create the connections are given in the configuration (see documentation).
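
As a rough illustration of the connection handling described above, here is a minimal
Python sketch. It is not code from the patch: the table names, the CONNECTION_DETAILS
dictionary and the function names are all hypothetical, and sqlite3 stands in for
whatever database the configuration points at. It only shows one connection being
opened from configured details and then reused for every query of a single import,
including a per-row query against a second table.

```python
import sqlite3

# Hypothetical connection details; in the handler these would come from the configuration.
CONNECTION_DETAILS = {"database": ":memory:"}


def open_connection(details):
    """Create the connection once, from the configured details."""
    return sqlite3.connect(details["database"])


def import_documents(conn):
    """Build one document per item, reusing the same connection for every query."""
    documents = []
    items = conn.execute("SELECT id, name FROM item ORDER BY id").fetchall()
    for item_id, name in items:
        # Second table queried through the same connection: no reconnect per row.
        rows = conn.execute(
            "SELECT description FROM feature WHERE item_id = ? ORDER BY description",
            (item_id,),
        ).fetchall()
        documents.append(
            {"id": item_id, "name": name, "features": [r[0] for r in rows]}
        )
    return documents


def seed_example_data(conn):
    """Create two small illustrative tables (assumed data, not part of the patch)."""
    conn.executescript(
        """
        CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE feature (item_id INTEGER, description TEXT);
        INSERT INTO item VALUES (1, 'disk'), (2, 'cable');
        INSERT INTO feature VALUES (1, 'fast'), (1, '500GB'), (2, 'usb');
        """
    )


conn = open_connection(CONNECTION_DETAILS)
try:
    seed_example_data(conn)
    docs = import_documents(conn)
finally:
    conn.close()  # released only after the whole import has finished
print(docs)
```

The sketch also shows why a single SQL query can be too narrow: each document here
draws on two tables.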




> Data Import RequestHandler
> --------------------------
>
>                 Key: SOLR-469
>                 URL: https://issues.apache.org/jira/browse/SOLR-469
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Noble Paul
>            Assignee: Grant Ingersoll
>             Fix For: 1.3
>
>         Attachments: SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch,
>                      SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch
>
>
> We need a RequestHandler which can import data from a DB or other data sources into the
> Solr index. Think of it as an advanced form of the SqlUpload Plugin (SOLR-103).
> The way it works is as follows.
>     * Provide a configuration file (XML) to the handler which takes in the necessary
>       SQL queries and mappings to a Solr schema
>           - It also takes in a properties file for the data source configuration
>     * Given the configuration it can also generate the Solr schema.xml
>     * It is registered as a RequestHandler which can take two commands: do-full-import,
>       do-delta-import
>           - do-full-import - dumps all the data from the database into the index (based
>             on the SQL query in the configuration)
>           - do-delta-import - dumps all the data that has changed since the last import.
>             (We assume a modified-timestamp column in tables)
>     * It provides an admin page
>           - where we can schedule it to be run automatically at regular intervals
>           - It shows the status of the handler (idle, full-import, delta-import)
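
The difference between the two commands in the quoted description can be sketched
roughly as follows. This is an illustrative Python example, not the handler's
implementation: the item table, the last_modified column name and both function names
are assumptions, and sqlite3 is used only so the sketch is self-contained. A full import
reads every row, while a delta import reads only the rows whose modified-timestamp is
later than the time of the previous import.

```python
import sqlite3


def full_import(conn):
    """Return every row: the do-full-import case."""
    return conn.execute("SELECT id, name FROM item ORDER BY id").fetchall()


def delta_import(conn, last_import_time):
    """Return only rows modified after the previous import: the do-delta-import case."""
    return conn.execute(
        "SELECT id, name FROM item WHERE last_modified > ? ORDER BY id",
        (last_import_time,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical table with the assumed modified-timestamp column.
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, last_modified TEXT);
    INSERT INTO item VALUES
        (1, 'disk',  '2008-04-20 10:00:00'),
        (2, 'cable', '2008-04-22 09:30:00');
    """
)
all_rows = full_import(conn)
# Timestamps are stored as ISO-style strings, so string comparison orders them by time.
changed_rows = delta_import(conn, "2008-04-21 00:00:00")
conn.close()
print(all_rows, changed_rows)
```

The delta query depends on the timestamp of the last successful import being recorded
somewhere between runs; where that is stored is left open here.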

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

