incubator-blur-user mailing list archives

From Tim Williams <william...@gmail.com>
Subject Re: Possible to have a META shard?
Date Fri, 11 Jul 2014 12:28:25 GMT
On Thu, Jul 10, 2014 at 1:00 PM, Ravikumar Govindarajan
<ravikumar.govindarajan@gmail.com> wrote:
> Aaron,
>
> This is a lengthy post. Please bear with me...
>
> We are looking at Blur slightly differently: no MapReduce ops, no immutable
> RowId data, etc. Just plain online search like regular Lucene/Solr/ES.
>
> Our use-case mandates that Documents for a RowId will arrive incrementally.
> We don't have the luxury of dropping the whole-row and re-indexing it, as a
> given Row will have hundreds of thousands of docs...
>
> A single RowId will always be found in one shard, but spread across
> segments. We have modified the Blur sources on both the indexing and search
> sides to support this requirement.
>
> In other words, we support the ADD_RECORDS Thrift op on an existing Row.
>
> We actually are now testing a sharding strategy similar to databases in Blur
>
> 1. Initially we start with, let's say, 300 shards per table, aka base-shards
> 2. Each shard has a fixed size, let's say 16 GB. The client will watch for
>     this and spawn a new shard when the size exceeds that limit. {An
>     alias-shard in ES terms}
> 3. ZK will hold the Base --> List-of-Alias shards
> 4. A RowId will be allocated to the shard that has the fewest alias shards.
>     This mapping will never change in the lifetime of a Row
> 5. ADD_RECORDS op will go to the latest alias, while DEL/UPDATE will go to
>     all aliases+base shards.
> 6. Once all 300 base-shards have spawned aliases, admins can create new
>     base shards on the cluster. Newer RowIds will auto-allocate to freshly
>     created shards
> 7. Both horizontal & vertical scaling of shards can be supported easily by
>     this approach
>
> Now all these are possible only if the RowId -> Base-Shard mapping is
> maintained externally.

Hi Ravi,
Can you explain how searching across the records in a row works in this
case?  For example, how would the row query example in the docs [1] behave?

Thanks,
--tim

[1] - http://incubator.apache.org/blur/docs/0.2.2/data-model.html#row_query
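
For readers following the thread, the allocation and routing rules in the
quoted steps 2-5 can be sketched roughly as below. This is only an
illustration under stated assumptions, not Blur code and not Ravi's actual
patch: the class and method names (ShardRouter, report_size, targets), the
alias naming scheme, and the in-memory dicts standing in for ZooKeeper and
for the external RowId -> base-shard map are all hypothetical.

```python
from dataclasses import dataclass, field

# "let's say 16 GB" from the quoted step 2; the exact limit is an assumption.
SHARD_LIMIT_BYTES = 16 * 1024**3


@dataclass
class BaseShard:
    name: str
    # Alias shard names, oldest first (the "Base --> List-of-Alias" entry).
    aliases: list = field(default_factory=list)

    def latest(self):
        """Newest shard in the chain: the last alias, or the base itself."""
        return self.aliases[-1] if self.aliases else self.name


class ShardRouter:
    """Hypothetical in-memory stand-in for the ZK map plus the external
    RowId -> base-shard mapping described in the quoted post."""

    def __init__(self, base_names):
        self.bases = {n: BaseShard(n) for n in base_names}
        self.row_to_base = {}

    def add_base(self, name):
        # Step 6: admins may add fresh base shards later.
        self.bases[name] = BaseShard(name)

    def base_for(self, row_id):
        # Step 4: pick the base with the fewest aliases, once, and never
        # change it. Ties are broken by name only to keep this deterministic.
        if row_id not in self.row_to_base:
            chosen = min(self.bases.values(),
                         key=lambda b: (len(b.aliases), b.name))
            self.row_to_base[row_id] = chosen.name
        return self.bases[self.row_to_base[row_id]]

    def report_size(self, base_name, latest_size_bytes):
        # Step 2: spawn a new alias once the newest shard exceeds the limit.
        base = self.bases[base_name]
        if latest_size_bytes > SHARD_LIMIT_BYTES:
            base.aliases.append(f"{base.name}-alias-{len(base.aliases) + 1}")
        return base.latest()

    def targets(self, op, row_id):
        # Step 5: ADD_RECORDS goes to the newest shard only; every other op
        # (DEL/UPDATE in the post) fans out to the base and all its aliases.
        base = self.base_for(row_id)
        if op == "ADD_RECORDS":
            return [base.latest()]
        return [base.name] + list(base.aliases)


router = ShardRouter(["shard-0", "shard-1"])
first = router.targets("ADD_RECORDS", "row-a")
router.report_size("shard-0", SHARD_LIMIT_BYTES + 1)
after_spill = router.targets("ADD_RECORDS", "row-a")
fan_out = router.targets("DELETE", "row-a")
```

In this sketch the records of "row-a" end up split between the base shard and
its alias, so any operation that needs the whole row has to visit the same
list of shards that DEL/UPDATE does.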
