From: Jason Rutherglen
To: solr-dev@lucene.apache.org
Date: Thu, 14 Jan 2010 15:21:27 -0800
Subject: Re: SolrCloud logical shards
Message-ID: <85d3c3b61001141521t357898b6hadca02c0eae5a9d6@mail.gmail.com>
In-Reply-To: <4B4FA18E.5040202@gmail.com>

Uri,

> "core" to represent a single index and "shard" to be
> represented by a single core

Can you elaborate on what you mean? Isn't a core a single index too?
It seems like "shard" was used to represent a remote index (perhaps?).
Though here I'd prefer "remote core", because to the uninitiated Solr
outsider it's immediately obvious (i.e. they need only know what a
core is, in the Solr glossary or term dictionary).

In Google vernacular, which is where the name shard came from, a
"shard" is basically a local sub-index
http://research.google.com/archive/googlecluster.html
where there would be many "shards" per server. However, that's a
digression at this point.

I personally prefer relatively straightforward, self-evident names
rather than inventing new language for fairly simple concepts. Slice,
even though it comes from our buddy Yonik, probably doesn't make any
immediate sense to external users when compared with the word shard.
Of course software projects have a tendency to create their own words
to somewhat mystify users into believing in some sort of magic
occurring underneath. If that's what we're after, it's cool, I mean
that makes sense.
And I don't mean to be derogatory here; however, this is an open
source project created in part to educate users on search and to be
made as easily accessible as possible, to the greatest number of users
possible. I think Doug did a great job of this when Lucene started,
with amazingly succinct code for fairly complex concepts (e.g.,
anti-mystification of search).

Jason

On Thu, Jan 14, 2010 at 2:58 PM, Uri Boness wrote:
> Although Jason has some valid points here, I'm with Yonik here. I do believe
> that we've gotten used to the terms "core" to represent a single index and
> "shard" to be represented by a single core. A "node" seems to indicate a
> machine or a JVM. Changing any of these (informal perhaps) definitions will
> only cause confusion. That's why I think a "slice" is a good solution now...
> first it's a new term for a new view of the index (logical shards AFAIK don't
> really exist yet) so people won't need to get used to it, but it's also
> descriptive and intuitive. I do like Jason's idea about having a protocol
> attached to the URLs.
>
> Cheers,
> Uri
>
> Jason Rutherglen wrote:
>>>
>>> But I've kind of gotten used to thinking of shards as the
>>> actual physical queryable things...
>>>
>>
>> I think a mistake was made referring to Solr cores as shards.
>> It's the same thing with 2 different names. Slices adds yet
>> another name which seems to imply the same thing yet again. I'd
>> rather see disambiguation here, and call them cores (partially
>> because that's what's in the code and on the wiki), and cores
>> only. It's a Solr specific term, it's going to be confused with
>> microprocessor cores, but at least there's only one name, which
>> as search people, we know creates fewer posting lists :).
>>
>> Logical groupings of cores can occur, which can be aptly named
>> core groups. This way I can submit a query to a core group, and
>> it's reasonable to assume I'm hitting N cores. Further, cores
>> could point to a logical or physical entity via a URL.
>> (As a side note, I've always found it odd that the shards param
>> to RequestHandler lacks the protocol. What if I want to use
>> HTTPS, for example?)
>>
>> So there could be http://host/solr/core1 (physical),
>> core://megacorename (logical),
>> coregroup://supergreatcoregroupname (a group of cores) in the
>> "shards" parameter (whose name can perhaps be changed for
>> clarity in a future release). Then people can mix and match and
>> we won't have many different XML elements floating around. We'd
>> have a simple list of URLs that are transposed into a real
>> physical network request.
>>
>>
>> On Thu, Jan 14, 2010 at 12:56 PM, Yonik Seeley
>> wrote:
>>
>>>
>>> On Thu, Jan 14, 2010 at 1:38 PM, Yonik Seeley
>>> wrote:
>>>
>>>>
>>>> On Thu, Jan 14, 2010 at 12:46 PM, Yonik Seeley
>>>> wrote:
>>>>
>>>>>
>>>>> I'm actually starting to lean toward "slice" instead of "logical
>>>>> shard".
>>>>>
>>>
>>> Alternate terminology could be "index" for the actual physical Lucene
>>> index (and also enough of the URL that unambiguously identifies it),
>>> and then "shard" could be the logical entity.
>>>
>>> But I've kind of gotten used to thinking of shards as the actual
>>> physical queryable things...
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>>
>>
>>
>
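
The URL-based "shards" list proposed in the quoted message (http://, core://, coregroup://) could be resolved roughly as follows. This is only a sketch under stated assumptions, not anything that exists in Solr: the core:// and coregroup:// schemes are the proposal from the thread, and the lookup tables, host names, and the functions resolve and resolve_shards are hypothetical names made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical lookup tables. In a real cluster these would presumably
# come from cluster configuration, not hard-coded dictionaries.
LOGICAL_CORES = {
    "megacorename": "http://host1/solr/core1",
}
CORE_GROUPS = {
    "supergreatcoregroupname": [
        "core://megacorename",
        "https://host2/solr/core2",
    ],
}


def resolve(entry, seen=frozenset()):
    """Expand one entry of the shards list into physical HTTP(S) URLs."""
    if entry in seen:
        raise ValueError(f"cyclic reference: {entry!r}")
    seen = seen | {entry}
    parts = urlsplit(entry)
    if parts.scheme in ("http", "https"):
        # Already a physical address; the protocol is kept as given.
        return [entry]
    if parts.scheme == "core":
        # A logical core name maps to a single target.
        return resolve(LOGICAL_CORES[parts.netloc], seen)
    if parts.scheme == "coregroup":
        # A core group expands to every member, which may itself be logical.
        urls = []
        for member in CORE_GROUPS[parts.netloc]:
            urls.extend(resolve(member, seen))
        return urls
    raise ValueError(f"unsupported scheme in shards entry: {entry!r}")


def resolve_shards(param):
    """Turn a comma-separated shards value into a de-duplicated URL list."""
    urls = []
    for entry in param.split(","):
        for url in resolve(entry.strip()):
            if url not in urls:
                urls.append(url)
    return urls
```

With these tables, resolve_shards("http://host/solr/core1, coregroup://supergreatcoregroupname") yields three physical URLs, one of them over HTTPS, which is the mix-and-match behaviour the message describes. Unknown core or group names surface as a KeyError in this sketch.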