lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2592) Pluggable shard lookup mechanism for SolrCloud
Date Sun, 16 Sep 2012 15:37:07 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456587#comment-13456587 ]

Yonik Seeley commented on SOLR-2592:
------------------------------------

This issue should be about "custom hashing".  "Custom sharding" is different, and covers many
other things such as time-based sharding.
I haven't had a chance to look at the patches here, but it seems like simpler may be better.
I like the approach of just hashing on the id field.

For example, if you have an email search application and want to co-locate all of a user's
data on the same shard, then simply use ids of the form
userid:emailid  (or whatever separator we choose - we should choose one by default that is
less likely to accidentally clash with normal ids, and also one that works well in URLs and
hopefully in query parser syntax w/o the need for escaping).

And then when you hash, you simply take the upper bits from the hash of the userid and the
lower bits from the hash of the emailid to construct the hash that selects the node placement.
The only real configurable part you need is where the split is (i.e. how many bits for each side).
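
A minimal sketch of the bit-splitting idea, not taken from any of the attached patches.
The class name, the ':' separator, the prefixBits parameter and the stand-in hash32
function (FNV-1a with a final mixing step) are all assumptions made for illustration;
the hash function and separator that Solr would actually use are not specified in this
comment.

```java
import java.nio.charset.StandardCharsets;

class CompositeIdHash {
    // Hypothetical separator; the comment above leaves the actual choice open.
    static final char SEPARATOR = ':';

    // Stand-in 32-bit hash (FNV-1a followed by an avalanche mix).
    // Assumption for illustration only, not the hash Solr uses.
    static int hash32(String s) {
        int h = 0x811c9dc5;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Upper prefixBits come from the hash of the part before the separator,
    // the remaining lower bits from the hash of the part after it.
    static int compositeHash(String id, int prefixBits) {
        if (prefixBits < 1 || prefixBits > 31) {
            throw new IllegalArgumentException("prefixBits must be in 1..31");
        }
        int sep = id.indexOf(SEPARATOR);
        if (sep < 0) {
            return hash32(id); // plain id: hash the whole string
        }
        int upperMask = -1 << (32 - prefixBits);
        int prefixHash = hash32(id.substring(0, sep));
        int suffixHash = hash32(id.substring(sep + 1));
        return (prefixHash & upperMask) | (suffixHash & ~upperMask);
    }

    public static void main(String[] args) {
        int a = compositeHash("alice:msg1", 16);
        int b = compositeHash("alice:msg2", 16);
        // Same userid, so both documents share the upper 16 bits of the hash.
        System.out.println((a >>> 16) == (b >>> 16)); // true
    }
}
```

If each shard owns a contiguous range of the 32-bit hash space, ids that share the upper
bits fall into a narrow band of that space, which is what would keep one user's documents
together. The lower bits still spread that user's documents within the band.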

                
> Pluggable shard lookup mechanism for SolrCloud
> ----------------------------------------------
>
>                 Key: SOLR-2592
>                 URL: https://issues.apache.org/jira/browse/SOLR-2592
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>    Affects Versions: 4.0-ALPHA
>            Reporter: Noble Paul
>            Assignee: Mark Miller
>         Attachments: dbq_fix.patch, pluggable_sharding.patch, pluggable_sharding_V2.patch,
> SOLR-2592.patch, SOLR-2592_r1373086.patch, SOLR-2592_r1384367.patch, SOLR-2592_rev_2.patch,
> SOLR_2592_solr_4_0_0_BETA_ShardPartitioner.patch
>
>
> If the data in a cloud can be partitioned on some criteria (say range, hash, attribute
> value, etc.), it will be easy to narrow down the search to a smaller subset of shards and,
> in effect, achieve more efficient search.


