lucene-solr-user mailing list archives

From Sergio Martín Cantero <sergio.mar...@playence.com>
Subject Re: Use DIH with more than one entity at the same time
Date Fri, 18 May 2012 14:32:40 GMT
I see.
What I need is not multiple threads for one entity but multiple entities 
at the same time.

What I have done is register a separately named DIH handler for each 
entity in solrconfig.xml, although they all use the same data-import-config.xml.
Something like:
<!-- Used for simultaneous full-import with various entities -->
<requestHandler name="/dataimportUsers"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import-config.xml</str>
  </lst>
</requestHandler>
<!-- Used for simultaneous full-import with various entities -->
<requestHandler name="/dataimportProducts"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import-config.xml</str>
  </lst>
</requestHandler>

Then I can run each entity at the same time with:
http://localhost:8080/solr/dataimportUsers?command=full-import&entity=users
http://localhost:8080/solr/dataimportProducts?command=full-import&entity=products

Both users and products are entities defined in the same data-import-config.xml.
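
For illustration, here is a minimal Python sketch of firing both requests at once from a script. The handler names, entity names and base URL are the ones from this message; the helper names (build_import_url, trigger, trigger_all) are hypothetical and not part of Solr. The optional clean argument only mirrors the clean=false parameter used in the first message of this thread; it is left out of the URL unless set.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL and handler/entity names as used in this message.
SOLR_BASE = "http://localhost:8080/solr"
IMPORTS = {
    "dataimportUsers": "users",
    "dataimportProducts": "products",
}


def build_import_url(handler, entity, clean=None, base=SOLR_BASE):
    """Build the full-import URL for one handler/entity pair.

    `clean` is optional (assumption: pass False to send clean=false,
    as in the first message of the thread); it is omitted when None.
    """
    params = {"command": "full-import", "entity": entity}
    if clean is not None:
        params["clean"] = "true" if clean else "false"
    return f"{base}/{handler}?{urlencode(params)}"


def trigger(url, timeout=30):
    """Send the request and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status


def trigger_all(imports=IMPORTS, clean=None):
    """Send every import request in parallel, one thread per handler."""
    urls = [build_import_url(h, e, clean) for h, e in imports.items()]
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return dict(zip(urls, pool.map(trigger, urls)))
```

Calling trigger_all() against a running Solr instance sends both requests without waiting for either import to finish. The returned status codes only show that the requests were accepted, not that the imports completed.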

This way, I don't need to wait for users to finish before running products.
This allows me to call full-import for users, let's say, every 15 min and 
for products every 10 min, without waiting until one has finished. 
The two imports can overlap.

Any drawback to this approach?

Thanks!!

Sergio

On 18/05/12 16:21, Dyer, James wrote:
>
> "threads" lets you run a single entity with multiple threads, so it's 
> probably not what you wanted. What we've done here is partition the 
> source data and then we have multiple handlers running at the same 
> time, each processing its own partition. So we multi-thread the import 
> without using the "threads" parameter.
>
> Even if this sounds like something useful, I recommend against using 
> it. "threads" has tons of bugs, although some fixes were made for Solr 
> 3.6. For Solr 4.0 this feature is removed.
>
> *James Dyer*
>
> E-Commerce Systems
>
> Ingram Content Group
>
> (615) 213-4311
>
> *From:*Sergio Martín Cantero [mailto:sergio.martin@playence.com]
> *Sent:* Friday, May 18, 2012 6:23 AM
> *To:* solr-user@lucene.apache.org
> *Cc:* Dyer, James
> *Subject:* Re: Use DIH with more than one entity at the same time
>
> What the wiki indicates actually works, although it's not what I 
> wanted. I have tried it and it works fine.
>
> I have also tried Jack's approach and it also works fine (and is what I 
> was looking for :-)
>
> Still, I have one more question. You wrote: "This is a 1.4.1 
> installation, back when there was no "threads" option in DIH." I'm 
> using Solr 3.5. What would the use of threads change? How could I take 
> advantage of it, instead of declaring various DIHs in solrconfig.xml?
>
> Thanks a lot!
>
>
> On 17/05/12 18:33, Dyer, James wrote:
>
> The wiki here indicates that you can specify "entity" more than once on the request and
> it will run multiple entities at the same time, in the same handler:
> http://wiki.apache.org/solr/DataImportHandler#Commands
>   
> But I can't say for sure that this actually works! Having been in the DIH code, I would
> think such a feature is buggy at best, if it works at all. But if you try it, let us know
> how it works for you. Also, if anyone else out there is using multiple "entity" parameters
> to get entities running in parallel, I'd be interested in hearing about it.
>   
> But the approach taken in the link Jack cites below does work. It's a pain to set
> up, though.
>   
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
>   
> From: Jack Krupansky [mailto:jack@basetechnology.com]
> Sent: Thursday, May 17, 2012 10:21 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Use DIH with more than one entity at the same time
>   
> Okay, the answer is “Yes, sort of, but...”
>   
> “One annoyance is because of how DIH is designed, you need a separate handler set up
> in solrconfig.xml for each DIH you plan to run.  So you have to plan in advance how many DIH
> instances you want to run, which config files they'll use, etc.”
>   
> See:
> http://lucene.472066.n3.nabble.com/Multiple-dataimport-processes-to-same-core-td3645525.html
>   
> -- Jack Krupansky
>   
> From: Sergio Martín Cantero <sergio.martin@playence.com>
> Sent: Thursday, May 17, 2012 11:07 AM
> To: solr-user@lucene.apache.org
> Cc: Jack Krupansky <jack@basetechnology.com>
> Subject: Re: Use DIH with more than one entity at the same time
>   
> Thanks Jack, but that's not what I want.
>   
> I don't want multiple entities in one invocation, but two simultaneous invocations of
> the DIH with different entities.
>   
> Thanks.
> Sergio Martín Cantero
>   
> Office (ES) +34 91 733 73 97
>   
> playence Spain SL
>   
> sergio.martin@playence.com
>   
> Calle Vicente Gaceo 19
>   
> 28029 Madrid - España
>   
>   
>   
>   
> On 17/05/12 17:04, Jack Krupansky wrote:
> Yes. From the doc:
>   
> "Multiple 'entity' parameters can be passed on to run multiple entities at once. If nothing
> is passed, all entities are executed."
>   
> See:
> http://wiki.apache.org/solr/DataImportHandler
>   
> But that is one invocation of DIH, not two separate updates as you tried.
>   
> -- Jack Krupansky
>   
> -----Original Message----- From: Sergio Martín Cantero
> Sent: Thursday, May 17, 2012 10:46 AM
> To: solr-user@lucene.apache.org
> Subject: Use DIH with more than one entity at the same time
>   
> I'm new to this list, so... Hello everybody.
>   
> I'm trying to run the DIH with more than one entity at the same time,
> but only the first entity I call is being indexed. The other doesn't get
> any response.
> For example:
> First call:
> http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=users
> Before the indexing has finished, I call:
> http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=products
>   
> The second call doesn't have any effect, and the products are not
> indexed at all.
>   
> Isn't it possible to run more than one full import for different
> entities at the same time?
>   
> Thanks a lot for your help
> Sergio
