incubator-blur-user mailing list archives

From Ravikumar Govindarajan <ravikumar.govindara...@gmail.com>
Subject Re: Possible to have a META shard?
Date Thu, 10 Jul 2014 17:00:54 GMT
Aaron,

This is a lengthy post. Please bear with me.

We are looking at Blur slightly differently: no MapReduce ops, no immutable
RowId data, etc. Just plain online search, like regular Lucene/Solr/ES.

Our use-case mandates that Documents for a RowId will arrive incrementally.
We don't have the luxury of dropping the whole-row and re-indexing it, as a
given Row will have hundreds of thousands of docs...

A single RowId will always be found in one shard, but spread across
segments. We have modified the Blur sources on both the indexing and search
sides to support this requirement.

In other words, we support an ADD_RECORDS Thrift op on an existing Row.

We are now testing a database-like sharding strategy in Blur:

1. Initially we start with, let's say, 300 shards per table, aka base-shards.
2. Each shard has a fixed size, let's say 16 GB. The client will watch for
    this and spawn a new shard when the size is exceeded. {An alias-shard in
    ES terms}
3. ZK will hold the Base --> List-of-Alias shards mapping.
4. A RowId will be allocated to the shard that has the least number of alias
    shards. This mapping will never change in the lifetime of a Row.
5. An ADD_RECORDS op will go to the latest alias, while DEL/UPDATE will go to
    all aliases + base shards.
6. Once all 300 base-shards have spawned aliases, admins can create new
    base shards on the cluster. Newer RowIds will auto-allocate to the
    freshly created shards.
7. Both horizontal & vertical scaling of shards can be supported easily by
    this approach.

Now all these are possible only if the RowId -> Base-Shard mapping is
maintained externally.
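
The allocation and routing rules in the list above can be sketched roughly as
follows. This is only an illustration in Python, not the modified Blur code:
the ShardMap class, its method names and the alias naming scheme are all
hypothetical, plain dicts stand in for ZK and for the external RowId mapping,
and the size watch from step 2 is left to the caller.

```python
class ShardMap:
    """Hypothetical in-memory stand-in for the ZK and RowId mappings."""

    def __init__(self, base_shards):
        # Base shard -> ordered list of its alias shards (held in ZK in the post).
        self.aliases = {name: [] for name in base_shards}
        # RowId -> base shard; this is the mapping that must live externally.
        self.row_to_base = {}

    def add_base_shard(self, name):
        # Step 6: admins add fresh base shards once the old ones have aliases.
        self.aliases.setdefault(name, [])

    def spawn_alias(self, base):
        # Step 2: called by whoever notices that the latest shard is full.
        alias = f"{base}-alias-{len(self.aliases[base]) + 1}"
        self.aliases[base].append(alias)
        return alias

    def base_for(self, row_id):
        # Step 4: pick the base with the fewest aliases, then never change it.
        if row_id not in self.row_to_base:
            self.row_to_base[row_id] = min(
                self.aliases, key=lambda b: (len(self.aliases[b]), b)
            )
        return self.row_to_base[row_id]

    def targets(self, row_id, op):
        # Step 5: adds go to the newest shard, everything else fans out.
        base = self.base_for(row_id)
        chain = [base] + self.aliases[base]
        return [chain[-1]] if op == "ADD_RECORDS" else chain
```

Ties between equally loaded base shards are broken by name here purely to
keep the sketch deterministic; any tie-breaking rule would do.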

That's why I raised this META shard issue.

--
Ravi





On Thu, Jul 10, 2014 at 6:02 PM, Aaron McCurry <amccurry@gmail.com> wrote:

> It would be nice to understand what you changed in Blur and why so that we
> may be able to incorporate those changes in the code base so you don't have
> to modify on new releases.
>
> Thanks!
>
> Aaron
>
>
> On Tue, Jul 1, 2014 at 12:11 AM, Ravikumar Govindarajan <
> ravikumar.govindarajan@gmail.com> wrote:
>
> > Yeah...
> >
> > We have actually modified Blur to do this... But till now we were using an
> > external system {Redis} to hold the mappings. But I prefer it to be self
> > contained within Blur eco-system and that's why this META shard stuff came
> > up
> >
> > --
> > Ravi
> >
> >
> > On Tue, Jul 1, 2014 at 6:43 AM, Aaron McCurry <amccurry@gmail.com> wrote:
> >
> > > Would a configurable partitioner help with what you are trying to do?
> > >
> > >
> > > On Fri, Jun 27, 2014 at 1:43 PM, Ravikumar Govindarajan <
> > > ravikumar.govindarajan@gmail.com> wrote:
> > >
> > > > Aaron,
> > > >
> > > > We are trying to externalize the RowId-to-Shard mapping, instead of
> > > > using the current hashing algo.
> > > >
> > > > We thought of storing this in a META shard, where lucene's in-memory
> > > > postings format can be used.
> > > >
> > > > Also BlurClient can cache these mappings so there are progressively
> > > > fewer trips to the meta-shard.
> > > >
> > > > --
> > > > Ravi
> > > >
> > > >
> > > >
> > > > On Fri, Jun 27, 2014 at 6:35 PM, Aaron McCurry <amccurry@gmail.com> wrote:
> > > >
> > > > > When you get a chance, let us know what the use case is that you are
> > > > > solving and perhaps we can add a feature to handle it.
> > > > >
> > > > > Aaron
> > > > >
> > > > >
> > > > > On Fri, Jun 27, 2014 at 7:57 AM, Ravikumar Govindarajan <
> > > > > ravikumar.govindarajan@gmail.com> wrote:
> > > > >
> > > > > > Thanks Tim,
> > > > > >
> > > > > > Shall surely do that. That should suffice
> > > > > >
> > > > > > --
> > > > > > Ravi
> > > > > >
> > > > > >
> > > > > > On Fri, Jun 27, 2014 at 3:21 PM, Tim Williams <williamstw@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > On Friday, June 27, 2014, Ravikumar Govindarajan <
> > > > > > > ravikumar.govindarajan@gmail.com> wrote:
> > > > > > >
> > > > > > > > > We have a use-case whereby there is a need for a META shard.
> > > > > > > > > We don't store row-documents here as other normal shards, but
> > > > > > > > > some arbitrary searchable meta-data
> > > > > > > >
> > > > > > > > Is it possible in Blur to do this?
> > > > > > > >
> > > > > > >
> > > > > > > Hi Ravi,
> > > > > > > > Why not just create a "meta" table and put it in there with a
> > > > > > > > 1-1 row-record relationship?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > --tim
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
