Date: Mon, 21 May 2012 18:21:41 +0000 (UTC)
From: "Michael Garski (JIRA)"
To: dev@lucene.apache.org
Message-ID: <1424136546.4139.1337624501465.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <176709306.6153.1308125267048.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (SOLR-2592) Pluggable shard lookup mechanism for SolrCloud

     [ https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Garski updated SOLR-2592:
---------------------------------

    Attachment: pluggable_sharding_V2.patch

Here is an update to my original patch that accounts for the requirement of hashing based on unique id and works as follows:

1. Configure a ShardKeyParserFactory in SolrConfig under config/shardKeyParserFactory.
If none is configured, the default implementation shards on the document's unique id. The default configuration is equivalent to:
{code:xml} {code}

2. The ShardKeyParser has two methods: one parses a shard key out of the unique id, the other out of a delete by query. The default implementation returns the string value of the unique id when parsing the unique id, so that the document is forwarded to its specific shard, and returns null when parsing a delete by query, so that the delete is broadcast to the entire collection.

3. Queries can be directed to a subset of shards in the collection by specifying one or more shard keys in the request parameter 'shard.keys'.

Notes:

There are no distinct unit tests for this change yet; however, all current unit tests pass. Switching to hashing on the string value rather than the indexed value is how I realized, through a failing test, that the real-time get component requires support for hashing based on the document's unique id.

By hashing on the string values rather than indexed values, the solrj client can direct queries to a specific shard; however, this is not yet implemented.

I put the hashing function in the oas.common.cloud.HashPartioner class, which encapsulates the hashing and partitioning in one place. I can see a desire for pluggable collection partitioning, where a collection could be partitioned on time periods or some other criteria, but that is outside the scope of pluggable shard hashing.
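
To illustrate items 2 and 3 and the hashing note above, here is a minimal, self-contained Java sketch. It is not taken from the patch. The interface shape, the method names (parseFromUniqueId, parseFromDeleteQuery), the use of String.hashCode with a modulo as the hash partitioning, and the comma-separated form of 'shard.keys' are all assumptions made for illustration only.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: every name and signature here is an assumption,
// not the API introduced by pluggable_sharding_V2.patch.
public class ShardKeySketch {

    /** Assumed shape of the two-method parser described in item 2. */
    interface ShardKeyParser {
        /** Returns the shard key for a document's unique id. */
        String parseFromUniqueId(String uniqueId);

        /** Returns the shard key for a delete by query, or null to broadcast. */
        String parseFromDeleteQuery(String query);
    }

    /** Default behavior as described: the id string is the key; deletes by query broadcast. */
    static class DefaultShardKeyParser implements ShardKeyParser {
        @Override
        public String parseFromUniqueId(String uniqueId) {
            return uniqueId;
        }

        @Override
        public String parseFromDeleteQuery(String query) {
            return null;
        }
    }

    /** Stand-in hash partitioning (assumption): hash the string value, then take a modulo. */
    static int shardFor(String shardKey, int numShards) {
        return Math.floorMod(shardKey.hashCode(), numShards);
    }

    /** Returns every shard index in the collection. */
    static Set<Integer> allShards(int numShards) {
        Set<Integer> shards = new TreeSet<>();
        for (int i = 0; i < numShards; i++) {
            shards.add(i);
        }
        return shards;
    }

    /** Update routing: one shard for a non-null key, the whole collection for null. */
    static Set<Integer> shardsForUpdate(String shardKey, int numShards) {
        if (shardKey == null) {
            return allShards(numShards);
        }
        Set<Integer> shards = new TreeSet<>();
        shards.add(shardFor(shardKey, numShards));
        return shards;
    }

    /** Query routing: assumes 'shard.keys' is a comma-separated list; empty means all shards. */
    static Set<Integer> shardsForQuery(String shardKeysParam, int numShards) {
        if (shardKeysParam == null || shardKeysParam.trim().isEmpty()) {
            return allShards(numShards);
        }
        Set<Integer> shards = new TreeSet<>();
        for (String key : shardKeysParam.split(",")) {
            String trimmed = key.trim();
            if (!trimmed.isEmpty()) {
                shards.add(shardFor(trimmed, numShards));
            }
        }
        return shards;
    }

    public static void main(String[] args) {
        ShardKeyParser parser = new DefaultShardKeyParser();
        // A document update goes to exactly one shard.
        System.out.println(shardsForUpdate(parser.parseFromUniqueId("doc-42"), 4));
        // A delete by query goes to every shard, because the parser returns null.
        System.out.println(shardsForUpdate(parser.parseFromDeleteQuery("type:old"), 4));
        // A query restricted by shard keys goes only to the shards those keys hash to.
        System.out.println(shardsForQuery("customerA,customerB", 4));
    }
}
```

Because the key is hashed as a plain string, a client holding only the id (or a shard key) can compute the target shard without access to the indexed form of the field, which is the property the note about the solrj client relies on.
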
> Pluggable shard lookup mechanism for SolrCloud
> ----------------------------------------------
>
>                 Key: SOLR-2592
>                 URL: https://issues.apache.org/jira/browse/SOLR-2592
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>    Affects Versions: 4.0
>            Reporter: Noble Paul
>         Attachments: pluggable_sharding.patch, pluggable_sharding_V2.patch
>
>
> If the data in a cloud can be partitioned on some criteria (say range, hash, attribute value, etc.), it will be easy to narrow down the search to a smaller subset of shards and, in effect, achieve more efficient search.