lucene-solr-user mailing list archives

From Shalin Shekhar Mangar <shalinman...@gmail.com>
Subject Re: Indexing and searching of sharded/ partitioned databases and tables
Date Wed, 07 Oct 2009 11:51:09 GMT
On Wed, Oct 7, 2009 at 5:09 PM, Sandeep Tagore <sandeep.tagore@gmail.com> wrote:

>
> Hi Jayant,
> You can use Solr to achieve your objective.
> The data-config.xml which you posted is incomplete.
>
>
Sandeep, the data-config that Jayant posted is not incomplete. The <field>
declaration is not necessary if the name of the column in the database and
the field name in schema.xml are the same.
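
As a minimal sketch of that point, reusing the table and column names from
your sample and assuming schema.xml already defines fields named id, name
and category, the entity can be written without any <field> children:

```xml
<document>
  <!-- No <field> elements: the columns id, name and category are assumed
       to match same-named fields in schema.xml, so they map implicitly. -->
  <entity name="Tbl1" query="select id,name,category from Tbl1" />
</document>
```

A <field> element is only needed when a column has to be renamed or
transformed on its way into the index.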


> I would like to suggest you a way to index the full data.
> Try to index a database at a time. Sample xml conf.....
>
> <dataSource type="JdbcDataSource" name="ds1" driver="com.mysql.jdbc.Driver"
> url="jdbc:mysql://localhost/Db1" user="user-name" password="password" />
>  <document name="Tbl1">
>   <entity name="Tbl1" query="select id,name,category from Tbl1">
>            <field column="id" name="id" />
>            <field column="name" name="name" />
>            <field column="category" name="category" />
> </entity></document>
> <document name="Tbl2">
>   <entity name="Tbl2" query="select id,name,category from Tbl2">
>            <field column="id" name="id" />
>            <field column="name" name="name" />
>            <field column="category" name="category" />
> </entity></document>
> <document name="Tbl3">
>   <entity name="Tbl3" query="select id,name,category from Tbl3">
>            <field column="id" name="id" />
>            <field column="name" name="name" />
>            <field column="category" name="category" />
> </entity></document>
>
> You can write an automated program which will change the DB conf details in
> that xml and fire the full import command. You can use
> http://localhost:8983/solr/dataimport url to check the status of the data
> import.
>
>
You could do that, but I don't think it is required. If you do want to do
this, it is possible to post the data-config.xml to /dataimport (this is how
dataimport.jsp works).


> But be careful while declaring the <uniqueKey> field. Make sure that you
> are
> not overwriting the records.
>

Yes, good point. That is a typical problem with sharded databases that use
auto-increment primary keys. If the keys are not unique across shards, you
can concatenate the shard name with the value of the primary key.
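
One way to do the concatenation is in the SQL query itself. The following is
only a sketch under some assumptions: the second database "Db2", the data
source name "ds2" and the entity names are hypothetical, the shards are MySQL
(so CONCAT is available), and the <uniqueKey> field in schema.xml is a string
type so it can hold the prefixed value.

```xml
<dataConfig>
  <!-- One dataSource per shard. "Db2" and "ds2" are made up for illustration. -->
  <dataSource type="JdbcDataSource" name="ds1" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/Db1" user="user-name" password="password" />
  <dataSource type="JdbcDataSource" name="ds2" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/Db2" user="user-name" password="password" />
  <document>
    <!-- Prefix each id with its shard name so rows from different shards
         cannot overwrite each other in the index. -->
    <entity name="Db1_Tbl1" dataSource="ds1"
            query="select concat('Db1-', id) as id, name, category from Tbl1" />
    <entity name="Db2_Tbl1" dataSource="ds2"
            query="select concat('Db2-', id) as id, name, category from Tbl1" />
  </document>
</dataConfig>
```

With this layout, row 1 of Tbl1 in Db1 is indexed as "Db1-1" and row 1 of
Tbl1 in Db2 as "Db2-1", so both documents are kept.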

-- 
Regards,
Shalin Shekhar Mangar.
