lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "NewSolrCloudDesign" by YonikSeeley
Date Thu, 18 Aug 2011 13:55:09 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "NewSolrCloudDesign" page has been changed by YonikSeeley:
http://wiki.apache.org/solr/NewSolrCloudDesign?action=diff&rev1=5&rev2=6

Comment:
no need for max_hash_value

  The hash is added as an indexed field in the doc and it is immutable. This may also be used during an index split.
  
  The hash function is pluggable. It accepts a document and returns a consistent, positive integer hash value. The system provides a default hash function that uses the content of a configured, required, and immutable field (by default, the unique_key field) to calculate hash values.
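
A minimal sketch of what such a pluggable hash function could look like, in Java. The names `DocHasher` and `FieldHasher`, the `Map`-based document, and the use of `String.hashCode()` are assumptions made for illustration only; the page does not specify an interface or a hash algorithm.

```java
import java.util.Map;

final class HashSketch {
    /** Hypothetical plug-in point: maps a document to a non-negative int. */
    interface DocHasher {
        int hash(Map<String, Object> doc);
    }

    /** Hypothetical default: hashes the value of one required field. */
    static final class FieldHasher implements DocHasher {
        private final String field;

        FieldHasher(String field) {
            this.field = field;
        }

        @Override
        public int hash(Map<String, Object> doc) {
            Object value = doc.get(field);
            if (value == null) {
                // The field is described as required, so reject documents without it.
                throw new IllegalArgumentException("missing required field: " + field);
            }
            // String.hashCode() is a stand-in hash; clearing the sign bit
            // keeps the result from ever being negative.
            return value.toString().hashCode() & 0x7fffffff;
        }
    }
}
```

Because the field is immutable and the function is deterministic, the same document always maps to the same hash value, which is what lets the hash be stored once as an indexed field.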
+ 
+ === Using full hash range ===
+ Alternatively, there need not be any max_hash_value: the full 32 bits of the hash can be used, since each shard will have a range of hash values anyway.
+ Avoiding a configurable max_hash_value makes things easier for clients that want related hash values next to each other. For example, in an email search application, one could construct a hashcode as follows: {{{
+ (hash(user_id)<<24) | (hash(message_id)>>>8)
+ }}}
+ Because the top 8 bits of the hashcode are derived from the user_id, all of a given user's emails are guaranteed to fall in the same 1/256th portion of the cluster. At search time, this information can be used to query only that portion of the cluster.
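
The following Java sketch works through the composite hashcode above and the hash range a search for one user would need to cover. The class name, the helper names `rangeStart` and `rangeEnd`, and the use of `String.hashCode()` as the underlying `hash` are assumptions for illustration; the page does not name a hash function or a query API.

```java
final class CompositeHash {
    // Stand-in 32-bit hash; the wiki text does not say which function is used.
    static int hash(String s) {
        return s.hashCode();
    }

    /** Top 8 bits come from the user, low 24 bits from the message. */
    static int compositeHash(String userId, String messageId) {
        return (hash(userId) << 24) | (hash(messageId) >>> 8);
    }

    /** Smallest hashcode any message of this user can have (low 24 bits all 0). */
    static int rangeStart(String userId) {
        return hash(userId) << 24;
    }

    /** Largest hashcode any message of this user can have (low 24 bits all 1). */
    static int rangeEnd(String userId) {
        return (hash(userId) << 24) | 0x00FFFFFF;
    }
}
```

Every hashcode for a given user shares the same top byte, so it lies between `rangeStart(userId)` and `rangeEnd(userId)` inclusive, and a query for that user would only need to reach shards whose hash ranges overlap that interval. Note that `<< 24` keeps only the low 8 bits of the user hash, so the user hash should mix well in its low bits.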
  
  == Shard Assignment ==
  
