accumulo-user mailing list archives

From David Medinets <david.medin...@gmail.com>
Subject Re: How does Accumulo compare to HBase
Date Wed, 25 Jun 2014 17:11:09 GMT
Your ideas aren't wrong. I'm just providing alternatives to think
about. If you co-mingle data in the same table then every piece of
software that accesses that table needs to know about the co-mingling.

I hadn't thought about namespaces. That could be a good approach.
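
To make the co-mingling point concrete, here is a minimal Python sketch of the kind of logic every reader of such a table would need. It is not the Accumulo API: visibility is simplified to one label per cell, the timestamps ts0/ts1/ts2 from the example cells quoted further down are modeled as the integers 0, 1, 2, and the `PREFERENCE` order and the `resolve` rule are assumptions made up for illustration.

```python
from collections import namedtuple

Cell = namedtuple("Cell", "row family qualifier visibility timestamp value")

# The three example cells from the quoted message (ts0, ts1, ts2 -> 0, 1, 2).
CELLS = [
    Cell("row 1", "user props", "bob", "alpha", 0, "foo=dee"),
    Cell("row 1", "user props", "bob", "beta", 2, "foo=cats"),
    Cell("row 1", "user props", "bob", "production", 1, "foo=bar"),
]

# Hypothetical rule: the least stable label a user is allowed to read wins.
PREFERENCE = ["alpha", "beta", "production"]


def visible_cells(cells, authorizations):
    """Simplified visibility check: a cell is visible if its single label
    is in the user's set of authorizations."""
    return [c for c in cells if c.visibility in authorizations]


def resolve(cells, authorizations):
    """Application-side choice of one cell among all the visible ones."""
    visible = visible_cells(cells, authorizations)
    for label in PREFERENCE:
        for cell in visible:
            if cell.visibility == label:
                return cell
    return None  # nothing visible to this user


alpha_user = {"alpha", "beta", "production"}
beta_user = {"beta", "production"}
production_user = {"production"}
```

With these definitions the alpha user gets three visible cells for the same row and column, so something like `resolve` has to exist in every program that reads the table.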

On Wed, Jun 25, 2014 at 12:16 PM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
> Hi David,
>
> Having all data in one table can be beneficial.
>
> Imagine you added some new metadata, or added some new indexes. You want to
> beta test your next version using real/production data, while keeping the
> existing version working. With different visibility settings they can work
> on the same table.
>
> But maybe having a replica (or a subset) is a better approach, as it will
> guarantee the safety of the production data...
>
> Jianshi
>
>
>
> On Wed, Jun 25, 2014 at 7:45 PM, David Medinets <david.medinets@gmail.com>
> wrote:
>>
>> Adding the environment name to the table name is one approach. Or use
>> a metadata table to hold the name of the Accumulo table using the
>> environment as part of the key to find the correct table. The second
>> approach can be quite flexible because the lookup key can incorporate
>> any information - like the name of the developer. Thus every developer
>> could have their own table in every environment.
>>
>>
>> On Wed, Jun 25, 2014 at 5:30 AM, Jianshi Huang <jianshi.huang@gmail.com>
>> wrote:
>> > Ah I see. Then I need to control versioning myself. A customized
>> > versioning iterator aware of a/b/prod labels?
>> >
>> > Maybe there's a better way to do it.
>> >
>> > Jianshi
>> >
>> >
>> >
>> > On Wed, Jun 25, 2014 at 4:19 PM, Sean Busbey <busbey@cloudera.com>
>> > wrote:
>> >>
>> >> On Wed, Jun 25, 2014 at 2:52 AM, Jianshi Huang <jianshi.huang@gmail.com>
>> >> wrote:
>> >>>
>> >>> + another 2cents myself
>> >>>
>> >>> I think one innovative way to use the visibility tag is for version
>> >>> control in development. I can set, say, "alpha", "beta", "released"
>> >>> visibility tags on each cell and set up different users for testing and
>> >>> production. It looks like this would simplify testing a lot.
>> >>>
>> >>> i.e.
>> >>> - production user: "production"
>> >>> - beta testing user: "beta" && "production"
>> >>> - alpha testing user: "alpha" && "beta" && "production"
>> >>>
>> >>> BTW, will they be counted as the same record with different versions?
>> >>> Or as different records?
>> >>>
>> >>> Does that make sense?
>> >>>
>> >>>
>> >>
>> >>
>> >> Within Accumulo those will be different cells. In HBase they will be
>> >> different versions of the same cell.
>> >>
>> >> There are tradeoffs for both approaches. In Accumulo, for example, if you
>> >> have
>> >>
>> >> row 1 | user props | bob | alpha      | ts0 | foo=dee
>> >> row 1 | user props | bob | beta       | ts2 | foo=cats
>> >> row 1 | user props | bob | production | ts1 | foo=bar
>> >>
>> >> then with your given user accesses, those users will see multiple cells
>> >> and you'll need application logic to deal with it.
>> >>
>> >>
>> >> --
>> >> Sean
>> >
>> >
>> >
>> >
>> > --
>> > Jianshi Huang
>> >
>> > LinkedIn: jianshi
>> > Twitter: @jshuang
>> > Github & Blog: http://huangjs.github.com/
>
>
>
>
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
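
The lookup-table approach quoted above (a metadata table keyed by environment, optionally including the developer's name) could look roughly like the following Python sketch. Everything here is hypothetical: the dictionary `TABLE_REGISTRY` stands in for the metadata table, the table names are invented, and falling back to the environment-wide table when a developer has no entry of their own is an assumption, not something stated in the thread.

```python
# Hypothetical stand-in for the metadata table:
# (environment, developer or None) -> Accumulo table name.
TABLE_REGISTRY = {
    ("prod", None): "records_prod",
    ("test", None): "records_test",
    ("dev", None): "records_dev",
    ("dev", "alice"): "records_dev_alice",
}


def resolve_table(environment, developer=None):
    """Return the table name for an environment.

    A per-developer entry is preferred when one exists; otherwise the
    environment-wide entry is used (an assumed fallback rule).
    """
    if developer is not None and (environment, developer) in TABLE_REGISTRY:
        return TABLE_REGISTRY[(environment, developer)]
    try:
        return TABLE_REGISTRY[(environment, None)]
    except KeyError:
        raise LookupError(
            f"no table registered for environment {environment!r}"
        ) from None
```

Client code would call `resolve_table("dev", "alice")` once at startup and then use the returned name for all scans and writes, so the environment never has to appear in the data itself.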
