lucene-dev mailing list archives

From Per Steffensen <st...@designware.dk>
Subject Re: Multiple shards for one collection on the same Solr server
Date Mon, 26 Nov 2012 14:55:28 GMT
Mark Miller wrote:
> The Collections API was fairly rushed - so that 4.0 had something easier than the CoreAdmin
> API.
>
Yes, I see. Our collection-creation code is more sophisticated than 
yours. We would probably like to migrate to the Solr Collection API now 
anyway, so that we are already using it when features are added later.
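For context, the difference between what we do today and what we would migrate to is roughly this (a minimal sketch in plain Java; the host, core and collection names are made up, and details such as instanceDir are left out):

    import java.io.InputStream;
    import java.net.URL;

    public class CreateCollectionSketch {

        // Helper: fire a GET request and drain the response (error handling omitted).
        static void get(String url) throws Exception {
            InputStream in = new URL(url).openStream();
            try {
                while (in.read() != -1) { /* ignore body */ }
            } finally {
                in.close();
            }
        }

        public static void main(String[] args) throws Exception {
            // What we do today: create each core explicitly via the CoreAdmin API,
            // which lets us put several shards of "mycollection" on the same server.
            get("http://solr1:8983/solr/admin/cores?action=CREATE"
                + "&name=mycollection_shard1&collection=mycollection&shard=shard1");
            get("http://solr1:8983/solr/admin/cores?action=CREATE"
                + "&name=mycollection_shard2&collection=mycollection&shard=shard2");

            // What we would like to migrate to: one Collections API call, where Solr
            // itself picks the nodes (randomly among live nodes in 4.0).
            get("http://solr1:8983/solr/admin/collections?action=CREATE"
                + "&name=mycollection&numShards=8&replicationFactor=1");
        }
    }
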
> Due to that, it has a variety of limitations:
>
> 1. It only picks instances for a collection one way - randomly from the list of live
> instances. This means it's no good for multiple shards on the same instance. You should have
> enough instances to satisfy numShards X replicationFactor (although just being short on
> replicationFactor will currently just use what is there)
>
Well, I think it shuffles the list of live nodes and then begins 
assigning shards from one end. That is OK for us for now. But it will 
not wrap around the list of live nodes when there are more shards 
(shards * replicas) than instances. This could easily be achieved 
without making a very fancy allocation algorithm - see the sketch below.
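Something as simple as this would do (hypothetical names, not the actual Overseer code): shuffle the live nodes once and then assign round-robin with a modulo, so that numShards * replicationFactor may exceed the number of nodes:

    import java.util.*;

    public class RoundRobinAssignment {

        // Assign every replica of every shard to a node, wrapping around the live-node list.
        static Map<String, List<String>> assign(List<String> liveNodes,
                                                int numShards, int replicationFactor) {
            List<String> nodes = new ArrayList<String>(liveNodes);
            Collections.shuffle(nodes);                             // spread load randomly
            Map<String, List<String>> assignment = new LinkedHashMap<String, List<String>>();
            int next = 0;
            for (int shard = 1; shard <= numShards; shard++) {
                List<String> replicas = new ArrayList<String>();
                for (int r = 0; r < replicationFactor; r++) {
                    replicas.add(nodes.get(next++ % nodes.size())); // wrap around
                }
                assignment.put("shard" + shard, replicas);
            }
            return assignment;
        }

        public static void main(String[] args) {
            // 4 nodes, 8 shards, 1 replica each: every node gets two shards instead of
            // four shards never being started.
            System.out.println(assign(Arrays.asList("solr1", "solr2", "solr3", "solr4"), 8, 1));
        }
    }
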
> 2. It randomly chooses which instances to use rather than allowing manual specification
> or looking at existing cores.
>
A manual spec would be nice, to be able to control everything if you 
really want to. But you probably also want to provide different built-in 
shard-allocation strategies that can be used out of the box, e.g. an 
"AlwaysAssignNextShardToInstanceWithFewestShardsAlready" strategy. There 
are also other concerns that might be more interesting for people to 
have built into assignment algorithms - e.g. a rack-aware algorithm 
that assigns replicas of the same slice to instances running on 
different "racks".
> 3. You cannot get responses of success or failure other than polling for the expected
> results later.
>
Well, we do that anyway, and we will keep doing it in our own code for now.
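Essentially a loop like this (the allShardsActive check is just a stand-in for however you read the cluster state, e.g. clusterstate.json through ZkStateReader):

    import java.util.List;
    import java.util.concurrent.TimeUnit;

    public class WaitForCollection {

        // Stand-in for a real check against the cluster state (e.g. via ZkStateReader).
        static boolean allShardsActive(String collection, List<String> expectedShards) {
            // ... verify that every expected shard of the collection has an active leader ...
            return false; // placeholder
        }

        // Poll until the expected shards are active, or give up after a timeout.
        static void waitForCollection(String collection, List<String> expectedShards,
                                      long timeoutSeconds) throws InterruptedException {
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(timeoutSeconds);
            while (System.nanoTime() < deadline) {
                if (allShardsActive(collection, expectedShards)) {
                    return;                    // the cluster state reports success
                }
                TimeUnit.SECONDS.sleep(1);     // back off and poll again
            }
            throw new IllegalStateException("Collection " + collection
                + " did not become active within " + timeoutSeconds + " seconds");
        }
    }
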
>
> Someone has a patch up for 3 that I hope to look at soon - others have contributed bug
> fixes that will be in 4.1. We still need to add the ability to control placement in other
> ways though.
>
> I would say there are def plans, but I don't personally know exactly when I'll find the
> time for it, if others don't jump in.
>
Well, I would like to jump in with respect to adding support for running 
several shards of the same collection on the same instance - it is just 
so damn hard to get you to commit stuff :-) and we really don't want to 
have too many differences between our Solr and Apache Solr (we have 
enough already - SOLR-3178 etc.). It seems like support for several 
shards on the same instance is the only feature missing from the 
Collection API before we can "live with it".
> - Mark
>   
Regards, Per Steffensen
> On Nov 26, 2012, at 4:57 AM, Per Steffensen <steff@designware.dk> wrote:
>
>   
>> Hi
>>
>> Before upgrading to Solr 4.0.0 we used to handle our collection creation ourselves,
>> by creating each shard through the low-level CoreAdmin API. We used to create multiple
>> shards under the same collection on each Solr server. Performance tests have shown that
>> this is a good idea, and it is also a good idea for easy elasticity later on - it is much
>> easier to move an entire existing shard from one Solr server to another one that just
>> joined the cluster than it is to split an existing shard between the Solr server that used
>> to run it and the new one.
>>
>> Now we are trying to migrate to the Solr Collection API for creation of collections,
>> but it seems like it will not accept multiple shards under the same collection running on
>> the same Solr server. E.g. if we have 4 Solr servers and ask to have a collection created
>> with 8 shards, all 8 shards will be "created" but only 4 of them will actually run - one on
>> each Solr server.
>>
>> Is there a good reason why Solr does not allow multiple shards under the same collection
>> to run on the same Solr server, or is it just made this way "by coincidence"? In general I
>> seek info on the matter - is it planned for later? etc.
>>
>> Thanks!
>>
>> Regards, Per Steffensen
>>

