From: "Noble Paul (JIRA)"
To: solr-dev@lucene.apache.org
Subject: [jira] Commented: (SOLR-469) Data Import RequestHandler
Date: Tue, 22 Apr 2008 08:00:22 -0700 (PDT)
Message-ID: <550204726.1208876422365.JavaMail.jira@brutus>
In-Reply-To: <14055668.1201860068880.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591320#action_12591320 ]

Noble Paul commented on SOLR-469:
---------------------------------

Hi Grant, we started off with
something like that and very soon realized that it cannot scale beyond the most basic use cases. We need the ability to apply transformations such as splitting and merging fields. Sometimes we need to put in a totally different piece of data, e.g. if a value is 1-5 put in the string 'low', if it is 5-10 put in 'medium', and so on. All of these are driven by business requirements.

There is also the need to join one table with another using the values in one table, or to merge one table with many tables. Then we had use cases where data comes from a DB and, using a key, we have to fetch data from an XML/HTTP data source, and so on.

So the fundamental design, or the 'kernel', of the system is supposed to be totally agnostic of the use cases, and we let users plug in implementations in Java/JS etc. so that they can do what they actually want. We also want to share some of the components that may be common to others.

> Data Import RequestHandler
> --------------------------
>
>                 Key: SOLR-469
>                 URL: https://issues.apache.org/jira/browse/SOLR-469
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Noble Paul
>            Assignee: Grant Ingersoll
>             Fix For: 1.3
>
>         Attachments: SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch
>
>
> We need a RequestHandler which can import data from a DB or other data sources into the Solr index. Think of it as an advanced form of the SqlUpload Plugin (SOLR-103).
> The way it works is as follows.
>     * Provide a configuration file (XML) to the handler which takes in the necessary SQL queries and mappings to a Solr schema
>           - It also takes in a properties file for the data source configuration
>     * Given the configuration it can also generate the Solr schema.xml
>     * It is registered as a RequestHandler which can take two commands: do-full-import, do-delta-import
>           - do-full-import - dumps all the data from the database into the index (based on the SQL query in the configuration)
>           - do-delta-import - dumps all the data that has changed since the last import. (We assume a modified-timestamp column in tables)
>     * It provides an admin page
>           - where we can schedule it to be run automatically at regular intervals
>           - It shows the status of the handler (idle, full-import, delta-import)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
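
To make the transformation use cases from the comment above a bit more concrete (mapping a numeric value to 'low'/'medium', splitting a field, and a kernel that stays agnostic of what each step does), here is a minimal Java sketch. It is not taken from the SOLR-469 patch: the `RowTransformer` interface, the `applyAll` method, and the column names `rating` and `tags` are all hypothetical, and because the comment's ranges overlap at 5, the sketch assumes 1-4 means 'low' and 5-10 means 'medium'.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TransformerSketch {

    /** Hypothetical plug-in point: takes one row and returns the (possibly modified) row. */
    interface RowTransformer {
        Map<String, Object> transform(Map<String, Object> row);
    }

    /** Adds a label for the numeric "rating" column; the boundary handling is an assumption. */
    static final RowTransformer BUCKET_RATING = row -> {
        Object raw = row.get("rating");
        if (raw instanceof Number) {
            int v = ((Number) raw).intValue();
            if (v >= 1 && v < 5) {
                row.put("rating_label", "low");
            } else if (v >= 5 && v <= 10) {
                row.put("rating_label", "medium");
            }
        }
        return row;
    };

    /** Splits a comma-separated "tags" column into a list of trimmed values. */
    static final RowTransformer SPLIT_TAGS = row -> {
        Object raw = row.get("tags");
        if (raw instanceof String) {
            List<String> parts = new ArrayList<>();
            for (String p : ((String) raw).split(",")) {
                parts.add(p.trim());
            }
            row.put("tags", parts);
        }
        return row;
    };

    /** The "kernel" only runs the chain in order; it knows nothing about the individual steps. */
    static Map<String, Object> applyAll(Map<String, Object> row, List<RowTransformer> chain) {
        // Work on a copy so the caller's row is left untouched.
        Map<String, Object> current = new LinkedHashMap<>(row);
        for (RowTransformer t : chain) {
            current = t.transform(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 42);
        row.put("rating", 3);
        row.put("tags", "solr, lucene");
        System.out.println(applyAll(row, Arrays.asList(BUCKET_RATING, SPLIT_TAGS)));
    }
}
```

The point of the sketch is only the shape: business-specific rules live in small pluggable steps, and the code that drives the import just applies whatever chain it is given.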