lucene-solr-dev mailing list archives

From "Chris Moser (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-469) Data Import RequestHandler
Date Tue, 13 May 2008 00:06:55 GMT

    [ https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596237#action_12596237 ]

Chris Moser commented on SOLR-469:
----------------------------------

Hi Shalin, 

I'm indexing forums with Solr and have tables with a structure similar to this:

{code}
posts
------
forumid int
messageid int
deleted boolean
message text

forums
------
forumid int
name text
deleted boolean

{code}

The simplified data query I'm running goes like this:

{code}
SELECT 
   p.forumid,
   p.messageid,
   IF (p.deleted OR f.deleted,true,false) as deleted,
   p.message
  
FROM 
   posts p, forums f

WHERE
   f.forumid = p.forumid
{code}

The query checks whether either the post or its forum is deleted, and marks the post as
deleted in the index in either case (which is why I'm doing the join).  The problem I'm running
into is that the importer extends the WHERE clause like this:

{code}
WHERE 
   f.forumid = p.forumid and forumid=123 and messageid=123456789
{code}

In this case, the _forumid=123_ part is ambiguous (forumid exists in both the posts and the
forums tables), so it causes a SQL error.  To work around this, I added an attribute to the
entity definition (pkTable) whose value is prepended to _forumid=123_, so the importer
generates _pkTable.forumid=123_ instead.
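
As a sketch of the effect, assuming pkTable is set to the posts alias (p) and that the prefix is applied to every primary-key column the importer appends (both are assumptions; only the forumid case is described above), the generated query would look roughly like this:

{code}
SELECT
   p.forumid,
   p.messageid,
   IF (p.deleted OR f.deleted,true,false) as deleted,
   p.message

FROM
   posts p, forums f

WHERE
   -- join condition from the original data query
   f.forumid = p.forumid
   -- conditions appended by the importer, now qualified with the pkTable value (assumed to be p)
   and p.forumid=123 and p.messageid=123456789
{code}

With both columns qualified, the forumid reference is no longer ambiguous.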

Not sure if this is the best way to do it but it fixed the problem :)

> Data Import RequestHandler
> --------------------------
>
>                 Key: SOLR-469
>                 URL: https://issues.apache.org/jira/browse/SOLR-469
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Noble Paul
>            Assignee: Grant Ingersoll
>             Fix For: 1.3
>
>         Attachments: SOLR-469-contrib.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch
>
>
> We need a RequestHandler which can import data from a DB or other dataSources into the Solr index. Think of it as an advanced form of the SqlUpload Plugin (SOLR-103).
> The way it works is as follows.
>     * Provide a configuration file (xml) to the Handler which takes in the necessary SQL queries and mappings to a solr schema
>           - It also takes in a properties file for the data source configuration
>     * Given the configuration it can also generate the solr schema.xml
>     * It is registered as a RequestHandler which can take two commands: do-full-import, do-delta-import
>           - do-full-import - dumps all the data from the Database into the index (based on the SQL query in configuration)
>           - do-delta-import - dumps all the data that has changed since last import. (We assume a modified-timestamp column in tables)
>     * It provides an admin page
>           - where we can schedule it to be run automatically at regular intervals
>           - It shows the status of the Handler (idle, full-import, delta-import)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

