lucene-solr-dev mailing list archives

From Andrzej Bialecki <...@getopt.org>
Subject Re: Solr Cloud wiki and branch notes
Date Sat, 16 Jan 2010 19:40:38 GMT
On 2010-01-16 18:18, Yonik Seeley wrote:
> On Fri, Jan 15, 2010 at 7:36 PM, Andrzej Bialecki<ab@getopt.org>  wrote:
>> Hi,
>>
>> My 0.02 PLN on the subject ...
>>
>> Terminology
>> -----------
>> First the terminology: reading your emails I have a feeling that my head is
>> about to explode. We have to agree on the vocabulary, otherwise we have no
>> hope of reaching any consensus.
>
> We not only need more standardized terminology for email, but for
> exact strings to put in zookeeper.

Indeed.

>> I propose the following vocabulary that has
>> been in use and is generally understood:
>>
>> * (global) search index: a complete collection of all indexed documents.
>> From a conceptual point of view, this is our complete search space.
>
> We're currently using "collection".  Notice how you had to add
> (global) to clarify what you meant.  I fear that a sentence like "what
> index are you querying" would need constant clarification.

I avoided the word "collection" because Solr deploys various cores 
under "collectionX" names, leading users to assume that core == 
collection. "Global index" is two words, but it's unambiguous. I'm fine 
with "collection" if we clarify its definition and avoid using the term 
for anything else.

>
>> * index shard: a non-overlapping part of the search index.
>
> When you get down to modeling it, this gets a little squishy and is
> hard to avoid using two words.
> Say the complete collection is covered by ShardX and ShardY.
>
> A way to model this is like so:
>
> /collection
>    /ShardX
>      /node1 [url=..., version=...]
>      /node2 [url=..., version=...]
>      /node3 [url=..., version=...]
>
> It becomes clearer that there are logical shards and physical shards.
> If shards are updateable, they may have different versions at
> different times.

Yes, but they are supposed to be eventually consistent - that's where 
the replication comes in.

> It may also be that all the physical shards go down,
> but the logical "ShardX" remains.

Yes, as a missing piece of the global index, not currently served by any 
node, and thus leading to incomplete results.
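
The logical/physical distinction above can be sketched as a tiny 
in-memory model (Python, purely illustrative - the class and field 
names are mine, not Solr's or ZooKeeper's):

```python
# Illustrative in-memory model of the /collection layout discussed above;
# names and URLs are hypothetical, not actual SolrCloud/ZooKeeper API.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Replica:                 # a physical shard living on one node
    url: str
    version: int               # replicas may briefly diverge in version

@dataclass
class LogicalShard:            # survives even if all its replicas go down
    name: str
    replicas: List[Replica] = field(default_factory=list)

collection = {
    "ShardX": LogicalShard("ShardX", [Replica("http://node1/solr", 5),
                                      Replica("http://node2/solr", 5),
                                      Replica("http://node3/solr", 4)]),
    "ShardY": LogicalShard("ShardY", []),   # all physical copies are down
}

# A shard with no live replicas is a missing piece of the global index:
missing = [s.name for s in collection.values() if not s.replicas]
print(missing)  # -> ['ShardY']
```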

>
> Even the statement "what shard did that response come from" becomes
> ambiguous since we could be talking a part of the index (ShardX) or we
> could be talking about the specific physical shard/server (it came
> from node2).

Agreed - but it could be as simple as qualifying this with "from shardX 
on node2".

This would be quite natural if you consider that even the same query, 
submitted again, could be answered by a different set of nodes that 
manage the same set of shards. E.g. with two nodes {n1, n2}, two shards 
{s1, s2}, and a replication factor of 2, the selection of which shard on 
which node contributes to the result list could look like this (time 
on the Y axis):

q1 {n1:s1,n2:s2}
q2 {n1:s2,n2:s1}
...
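
To make the rotation above concrete, here's a small sketch (Python; 
round-robin is just one possible selection policy, not necessarily what 
Solr would do):

```python
# Hypothetical replica selection: every query must cover each logical
# shard exactly once, but may pick a different node's copy each time.
from itertools import count

replicas = {                 # shard -> nodes holding a copy (replication factor 2)
    "s1": ["n1", "n2"],
    "s2": ["n1", "n2"],
}

_tick = count()

def plan_query():
    """One (shard -> node) assignment per query, rotated round-robin."""
    t = next(_tick)
    return {shard: nodes[(t + i) % len(nodes)]
            for i, (shard, nodes) in enumerate(replicas.items())}

print(plan_query())  # q1: {'s1': 'n1', 's2': 'n2'}
print(plan_query())  # q2: {'s1': 'n2', 's2': 'n1'}
```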


>
>> All shards in the
>> system form together the complete search space of the search index. E.g.
>> having initially one big index I could divide it into multiple shards using
>> MultiPassIndexSplitter, and if I combined all the shards again, using
>> IndexMerger, I should obtain the original complete search index (modulo
>> changed Lucene docids .. doesn't matter). I strongly believe in
>> micro-sharding, because small shards are much easier to handle and
>> replicate. Also, since we control the shards, we don't have to deal with
>> overlapping shards, which is the curse of P2P search.
>
> Prohibiting overlapping shards effectively prohibits ever merging or
> splitting shards online (it could only be an offline or blocking
> operation).  Anyway, in the opaque shard model (where clients create
> shards, and we don't know how they partitioned them), shards would
> have to be non-overlapping.

The opaque model means it's more difficult to support updates. IMHO it 
makes sense to start with a set of stricter assumptions in order to 
build something workable, and then relax them as we gain experience.
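
A toy check of the split-then-merge invariant claimed above (Python; 
hash partitioning stands in for MultiPassIndexSplitter, and list 
concatenation for IndexMerger - this is not Lucene code):

```python
# Non-overlapping shards cover the whole index; merging them back yields
# the original document set (modulo docid order, as noted above).
docs = [f"doc{i}" for i in range(10)]

def split(docs, n):
    shards = [[] for _ in range(n)]
    for d in docs:
        shards[hash(d) % n].append(d)    # each doc goes to exactly one shard
    return shards

def merge(shards):
    return [d for shard in shards for d in shard]

shards = split(docs, 4)
assert sum(len(s) for s in shards) == len(docs)     # shards do not overlap
assert sorted(merge(shards)) == sorted(docs)        # merge recovers the index
```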

>
> As far as the future (allocation and rebalancing), I'm happy with a
> small-shard approach that avoids merging and splitting.  It carries
> some other nice little side benefits as well.
>
>> * partitioning: a method whereby we can determine the target shard ID based
>> on a doc ID.
>
> I think we're all using partitioning the same way, but that's a
> narrower definition than needed.
> A user may partition the index, and Solr may not have the mapping of
> docid to shard.

See above - of course this would be cool and extra convenient for users, 
but it's much more difficult to implement in a way that supports updates.
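
The definition above (doc ID -> shard ID) amounts to a pure function, 
which could be sketched like this (Python; the md5 scheme and names are 
illustrative, not Solr's actual routing):

```python
# When the system owns this mapping, an update for a given doc ID can be
# routed straight to the shard that holds it; in the opaque model the
# mapping lives with the client instead, and Solr cannot do the routing.
import hashlib

NUM_SHARDS = 4

def shard_id(doc_id: str) -> int:
    """Stable: the same doc ID always maps to the same shard."""
    h = int(hashlib.md5(doc_id.encode("utf-8")).hexdigest(), 16)
    return h % NUM_SHARDS

print(shard_id("doc42"))                       # always the same value for "doc42"
assert shard_id("doc42") == shard_id("doc42")  # deterministic routing
```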

>
> You've also used some slightly new terminology... "shard ID" as
> opposed to just shard, which reinforces the need for different
> terminology for the physical vs the logical.

You got me ;) Yes, when I say "shard" I mean the logical entity, as 
defined by a set of documents; a physical shard I would call a replica.


>> Now, to translate this into Solr-speak: depending on the details of the
>> design, and the evolution of Solr, one search node could be one Solr
>> instance that manages one shard per core.
>
> A solr core is a bit too heavyweight for a microshard though.
> I think a single solr core really needs to be able to handle multiple
> shards for this to become practical.

Ok. This is actually related to the issue below (witness SOLR-1366).

>
>> Let's forget here about the
>> current distributed search component, and the current replication
>
> Heh.  I think this is what is causing some of the mismatches...
> different starting points and different assumptions.

They work well under the current assumptions, and are known to work 
poorly with the design we are discussing.

>
>> - they
>> could be useful in this design as a raw transport mechanism, but someone
>> else would be calling the shots (see below).
>
> Seems like we need to be flexible in allowing customers to call the
> shots to varying degrees.

Eventually, yes - but initially I fear we won't be able to come up with 
a model that allows this much flexibility and is still implementable in 
a reasonable time ...

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

