lucene-solr-dev mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: SolrCloud logical shards
Date Thu, 14 Jan 2010 23:29:36 GMT
"Shard" has the interesting additional implication that it is one part of a
composite index made up of many sub-indexes.

A lucene index could be a complete index or a shard.  I would presume the
same of what might be called a core.

On Thu, Jan 14, 2010 at 3:21 PM, Jason Rutherglen <
jason.rutherglen@gmail.com> wrote:

> Uri,
>
> > "core" to represent a single index and "shard" to be
> > represented by a single core
>
> Can you elaborate on what you mean? Isn't a core a single index
> too? It seems like shard was used to represent a remote index
> (perhaps?). Though here I'd prefer "remote core", because to the
> uninitiated Solr outsider it's immediately obvious (i.e. they
> need only know what a core is, in the Solr glossary or term
> dictionary).
>
> In Google vernacular, which is where the name shard came from, a
> "shard" is basically a local sub-index
> (http://research.google.com/archive/googlecluster.html), where
> there would be many "shards" per server. However, that's a
> digression at this point.
>
> I personally prefer relatively straightforward names, that are
> self-evident, rather than inventing new language for fairly
> simple concepts. Slice, even though it comes from our buddy
> Yonik, probably doesn't make any immediate sense to external
> users when compared with the word shard. Of course software
> projects have a tendency to create their own words to somewhat
> mystify users into believing in some sort of magic occurring
> underneath. If that's what we're after, it's cool, I mean that
> makes sense. I don't mean to be derogatory here; however, this
> is an open source project created in part to educate users on
> search and to be as easily accessible as possible to the
> greatest number of users possible. I think Doug did a great job
> of this when Lucene started, with amazingly succinct code for
> fairly complex concepts (e.g., anti-mystification of search).
>
> Jason
>
> On Thu, Jan 14, 2010 at 2:58 PM, Uri Boness <uboness@gmail.com> wrote:
> > Although Jason has some valid points, I'm with Yonik here. I do believe
> > that we've gotten used to the terms "core" to represent a single index
> > and "shard" to be represented by a single core. A "node" seems to
> > indicate a machine or a JVM. Changing any of these (perhaps informal)
> > definitions will only cause confusion. That's why I think "slice" is a
> > good solution now: first, it's a new term for a new view of the index
> > (logical shards, AFAIK, don't really exist yet), so people won't need to
> > get used to it, but it's also descriptive and intuitive. I do like
> > Jason's idea about having a protocol attached to the URLs.
> >
> > Cheers,
> > Uri
> >
> > Jason Rutherglen wrote:
> >>>
> >>> But I've kind of gotten used to thinking of shards as the
> >>> actual physical queryable things...
> >>>
> >>
> >> I think a mistake was made referring to Solr cores as shards.
> >> It's the same thing with 2 different names. "Slice" adds yet
> >> another name which seems to imply the same thing yet again. I'd
> >> rather see disambiguation here, and call them cores (partially
> >> because that's what's in the code and on the wiki), and cores
> >> only. It's a Solr-specific term, and it's going to be confused with
> >> microprocessor cores, but at least there's only one name, which,
> >> as search people, we know creates fewer posting lists :).
> >>
> >> Logical groupings of cores can occur, which can be aptly named
> >> core groups. This way I can submit a query to a core group, and
> >> it's reasonable to assume I'm hitting N cores. Further, cores
> >> could point to a logical or physical entity via a URL. (As a
> >> side note, I've always found it odd that the shards param to
> >> RequestHandler lacks the protocol; what if I want to use HTTPS,
> >> for example?)
> >>
> >> So there could be http://host/solr/core1 (physical),
> >> core://megacorename (logical),
> >> coregroup://supergreatcoregroupname (a group of cores) in the
> >> "shards" parameter (whose name can perhaps be changed for
> >> clarity in a future release). Then people can mix and match and
> >> we won't have many different XML elements floating around. We'd
> >> have a simple list of URLs that are transposed into a real
> >> physical network request.
> >>
> >>
> >> On Thu, Jan 14, 2010 at 12:56 PM, Yonik Seeley
> >> <yonik@lucidimagination.com> wrote:
> >>
> >>>
> >>> On Thu, Jan 14, 2010 at 1:38 PM, Yonik Seeley
> >>> <yonik@lucidimagination.com> wrote:
> >>>
> >>>>
> >>>> On Thu, Jan 14, 2010 at 12:46 PM, Yonik Seeley
> >>>> <yonik@lucidimagination.com> wrote:
> >>>>
> >>>>>
> >>>>> I'm actually starting to lean toward "slice" instead of "logical
> >>>>> shard".
> >>>>>
> >>>
> >>> Alternate terminology could be "index" for the actual physical lucene
> >>> index (and also enough of the URL that unambiguously identifies it),
> >>> and then "shard" could be the logical entity.
> >>>
> >>> But I've kind of gotten used to thinking of shards as the actual
> >>> physical queryable things...
> >>>
> >>> -Yonik
> >>> http://www.lucidimagination.com
> >>>
> >>>
> >>
> >>
> >
>

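For illustration only: the mixed URL list Jason describes in the quoted
message (physical http:// entries, logical core:// names, coregroup://
groups) could be expanded into physical URLs roughly as in the Python sketch
below. The function name resolve_shards and both lookup tables are
hypothetical, and the core:// and coregroup:// schemes are only a proposal
from this thread, not something Solr supports.

```python
from urllib.parse import urlsplit


def resolve_shards(entries, logical_cores, core_groups):
    """Expand a mixed list of shard entries into physical URLs.

    entries       -- list of URLs using http(s)://, core:// or coregroup://
    logical_cores -- hypothetical map: logical core name -> physical URL
    core_groups   -- hypothetical map: group name -> list of shard entries
    Order is preserved and duplicate physical URLs are dropped.
    """
    resolved = []
    for entry in entries:
        parts = urlsplit(entry)
        if parts.scheme in ("http", "https"):
            # Already a physical address; keep the protocol as given.
            urls = [entry]
        elif parts.scheme == "core":
            # Logical core name, looked up in the (assumed) name table.
            urls = [logical_cores[parts.netloc]]
        elif parts.scheme == "coregroup":
            # A group expands to its members, which may be logical too.
            urls = resolve_shards(core_groups[parts.netloc],
                                  logical_cores, core_groups)
        else:
            raise ValueError(f"unsupported shard entry: {entry!r}")
        for url in urls:
            if url not in resolved:
                resolved.append(url)
    return resolved
```

With a table mapping megacorename to a physical URL and a group listing its
member entries, a "shards" value such as
["http://host/solr/core1", "core://megacorename",
"coregroup://supergreatcoregroupname"] would come back as a flat list of
physical URLs, which is the "simple list of URLs" the proposal ends with.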


-- 
Ted Dunning, CTO
DeepDyve
