lucene-dev mailing list archives

From "Michael Garski (JIRA)" <>
Subject [jira] [Commented] (SOLR-2592) Pluggable shard lookup mechanism for SolrCloud
Date Sat, 15 Sep 2012 01:38:07 GMT


Michael Garski commented on SOLR-2592:

Dan, perhaps you could just reverse the composite id portions and use a filter query to restrict
your query to a subset of the data in the shard. Using your example, you would have unique
ids doc=Sep2012_1234 and doc=Oct2012_4567, with each document containing a field with the
value Sep2012 or Oct2012 respectively. That way, if you started with a small number of
shards and both of those documents wound up on the same shard, the search would be restricted
to just the shard where they reside, and the filter would be applied to include only the docs
you want. In the patch I submitted, if you only wanted to query the documents from Sep2012
you would add the parameter shard.keys=Sep2012 and the appropriate filter query. I take that
exact approach with user-specific data to ensure all docs for a specific user reside in the
same shard, so the query is executed only against that shard and returns docs for only that
user.
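
As a rough sketch of the request described above, the following Python builds the query string.
Only the shard.keys parameter comes from the patch; the host, the collection name, and the
field name "month" are assumptions made up for illustration.

```python
from urllib.parse import urlencode


def build_shard_query(base_url, shard_key, filter_field, user_query="*:*"):
    """Build a select URL that targets one shard key and filters on it."""
    params = [
        ("q", user_query),
        # shard.keys is the parameter introduced by the patch on this issue.
        ("shard.keys", shard_key),
        # The filter query keeps only docs for this key, since other keys
        # may hash to the same shard. filter_field is a hypothetical name.
        ("fq", f"{filter_field}:{shard_key}"),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical host and collection, used only to show the shape of the URL.
url = build_shard_query(
    "http://localhost:8983/solr/collection1/select", "Sep2012", "month"
)
print(url)
```

The filter query matters because shard.keys only selects which shard is searched; it does not
by itself exclude Oct2012 docs that happen to live on that shard.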

The downside of that approach for your use case is that, with a potentially low number of
unique values being hashed, you could wind up with an uneven distribution of data across
the shards.
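
To make the skew concrete, here is a small sketch that hashes a handful of month keys onto
shards and totals the docs per shard. The md5-modulo hash is only a stand-in and is not
claimed to be what the patch uses, and the doc counts are made-up numbers.

```python
import hashlib
from collections import Counter


def shard_for(key, num_shards):
    """Map a shard key to a shard index using a stand-in hash (md5 modulo)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def docs_per_shard(doc_counts, num_shards):
    """Total docs per shard, given a mapping of shard key -> doc count."""
    totals = Counter({shard: 0 for shard in range(num_shards)})
    for key, count in doc_counts.items():
        totals[shard_for(key, num_shards)] += count
    return totals


# Illustrative numbers only: three distinct keys spread over eight shards.
monthly = {"Sep2012": 1000, "Oct2012": 1200, "Nov2012": 900}
totals = docs_per_shard(monthly, 8)
for shard in sorted(totals):
    print(shard, totals[shard])
```

With three distinct keys and eight shards, at least five shards stay empty no matter which
hash is used, which is the uneven distribution in question.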

It sounds like what you really want is a date-based distribution policy that will add new
shards to the collection each month. Does that sound about right?
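
A minimal sketch of what such a policy might look like, assuming one shard per calendar
month. The shard naming scheme and the route function are hypothetical and do not come from
any of the attached patches.

```python
from datetime import date


def monthly_shard(doc_date):
    """Name the shard for a document's month (hypothetical naming scheme)."""
    return f"shard_{doc_date:%Y_%m}"


def route(doc_date, known_shards):
    """Return (shard_name, created); a new month adds a shard to known_shards."""
    name = monthly_shard(doc_date)
    created = name not in known_shards
    if created:
        known_shards.add(name)
    return name, created


shards = set()
first = route(date(2012, 9, 15), shards)
second = route(date(2012, 9, 30), shards)
third = route(date(2012, 10, 1), shards)
print(first, second, third)
```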
> Pluggable shard lookup mechanism for SolrCloud
> ----------------------------------------------
>                 Key: SOLR-2592
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>    Affects Versions: 4.0-ALPHA
>            Reporter: Noble Paul
>            Assignee: Mark Miller
>         Attachments: dbq_fix.patch, pluggable_sharding.patch, pluggable_sharding_V2.patch,
SOLR-2592.patch, SOLR-2592_r1373086.patch, SOLR-2592_r1384367.patch, SOLR-2592_rev_2.patch,
> If the data in a cloud can be partitioned on some criteria (say range, hash, attribute
> value, etc.), it will be easy to narrow down the search to a smaller subset of shards and,
> in effect, achieve more efficient search.

This message is automatically generated by JIRA.