lucene-solr-user mailing list archives

From Robert Veliz <rob...@mavenbridge.com>
Subject Re: Multi-core support for indexing multiple servers
Date Tue, 12 Nov 2013 14:41:08 GMT
I have two sources/servers; one of them is Magento. Since Magento has a more-or-less
out-of-the-box integration with Solr, my thought was to run the Solr server from the Magento
instance and then use DIH to pull in and merge content from the other source/server. Does
that seem feasible and appropriate? I spec'd it out and it seems to make sense...

R
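
A minimal sketch of the merge step discussed in this thread, written in plain Python rather than DIH or SolrJ. It assumes both sources expose records that share a key; the field name `sku`, the sample records, and the rule that the Magento side wins on conflicting fields are all hypothetical choices made for illustration.

```python
def merge_by_key(primary, secondary, key="sku"):
    """Merge two lists of dict records on a shared key.

    Fields from `primary` win on conflict; records that exist in only
    one source are kept unchanged.
    """
    merged = {}
    for record in secondary:
        merged[record[key]] = dict(record)
    for record in primary:
        # Overlay the primary record on whatever the secondary supplied.
        merged.setdefault(record[key], {}).update(record)
    return list(merged.values())


# Hypothetical sample data standing in for the two servers.
magento_products = [
    {"sku": "A1", "name": "Widget", "price": 9.99},
    {"sku": "B2", "name": "Gadget", "price": 19.99},
]
other_source = [
    {"sku": "A1", "description": "A small widget", "name": "widget (legacy)"},
    {"sku": "C3", "description": "Only in the second source"},
]

docs = merge_by_key(magento_products, other_source)
```

Each merged dict would then become one Solr document. Whether the merge happens in DIH, in SolrJ, or in a script like this, the shared key has to exist in both sources.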

> On Nov 11, 2013, at 11:25 PM, Liu Bo <diablo47@gmail.com> wrote:
> 
> Like Erick said, merging data from different data sources can be very
> difficult. SolrJ is much easier to use, but it may need a separate application
> to handle the indexing process if you don't want to extend Solr much.
> 
> I eventually ended up with a customized request handler that uses SolrWriter
> from the DIH package to index data.
> 
> That way I can fully control the indexing process. Much like with SolrJ, you
> write code to convert your data into SolrInputDocument objects and then pass
> them to SolrWriter, which handles the rest.
> 
> 
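
The convert-then-post flow described above can be sketched without SolrJ or SolrWriter. This is not Liu Bo's request handler: it swaps in Solr's XML update message format, which a client sends to the core's update handler, and it uses only the Python standard library. The sample record and its field names are hypothetical.

```python
import xml.etree.ElementTree as ET


def to_solr_add_xml(docs):
    """Serialize dict records as a Solr XML <add> update message."""
    add = ET.Element("add")
    for record in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            # Multi-valued fields become repeated <field> elements.
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc, "field", name=name)
                field.text = str(v)
    return ET.tostring(add, encoding="unicode")


# Hypothetical record; field names must match the target schema.
payload = to_solr_add_xml(
    [{"id": "A1", "name": "Widget", "tags": ["new", "sale"]}]
)
```

The resulting string would be sent as the body of an HTTP POST with an XML content type to the core's update handler, followed by a commit. The host, port, and core name depend on the installation and are not given in this thread.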
>> On 8 November 2013 21:46, Erick Erickson <erickerickson@gmail.com> wrote:
>> 
>> Yep, you can define multiple data sources for use with DIH.
>> 
>> Combining data from those multiple sources into a single
>> index can be a bit tricky with DIH. I tend to prefer SolrJ,
>> but that's mostly personal preference, especially if
>> I want to get some parallelism going on.
>> 
>> But whatever works
>> 
>> Erick
>> 
>> 
>> On Thu, Nov 7, 2013 at 11:17 PM, manju16832003 <manju16832003@gmail.com> wrote:
>> 
>>> Erick,
>>> Just a question :-), wouldn't it be easy to use DIH to pull data from
>>> multiple data sources?
>>> 
>>> I use DIH to do that comfortably. I have three data sources:
>>> - MySQL
>>> - A URLDataSource that returns XML from a .NET application
>>> - A URLDataSource that connects to an API and returns XML
>>> 
>>> Here is part of data-config data source settings
>>> <dataSource type="JdbcDataSource" name="solr"
>>>     driver="com.mysql.jdbc.Driver"
>>>     url="jdbc:mysql://localhost/employeeDB" batchSize="-1" user="root"
>>>     password="root"/>
>>> <dataSource name="CRMServer" type="URLDataSource" encoding="UTF-8"
>>>     connectionTimeout="5000" readTimeout="10000"/>
>>> <dataSource name="ImageServer" type="URLDataSource" encoding="UTF-8"
>>>     connectionTimeout="5000" readTimeout="10000"/>
>>> 
>>> 
>>> Of course, I do the same in the application.
>>> To construct my results, I connect to MySQL and those two data sources.
>>> 
>>> Basically we have two points of indexing:
>>> - Using DIH for the one-time indexing
>>> - In the application, whenever there is a transaction on the details that
>>>   we are storing in Solr.
>>> 
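
For the first of those two indexing points, a DIH import is started with an HTTP request to the handler. The sketch below only builds that request URL. It assumes the handler is registered at `/dataimport` in solrconfig.xml (the usual convention, but configurable); the base URL and the core name `products` are hypothetical.

```python
from urllib.parse import urlencode


def dih_command_url(base_url, core, command="full-import", **params):
    """Build the URL for a DataImportHandler command.

    Assumes the handler is mapped to /dataimport for the given core.
    """
    query = urlencode({"command": command, **params})
    return f"{base_url}/{core}/dataimport?{query}"


# Hypothetical host and core name.
url = dih_command_url(
    "http://localhost:8983/solr", "products", clean="true", commit="true"
)
```

Requesting that URL starts the import, and the same helper with `command="status"` gives the URL to poll for progress. The second indexing point, per-transaction updates from the application, would go to the core's update handler rather than through DIH.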
>>> 
>>> 
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://lucene.472066.n3.nabble.com/Multi-core-support-for-indexing-multiple-servers-tp4099729p4099933.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 
> -- 
> All the best
> 
> Liu Bo
