hbase-user mailing list archives

From Andrew Purtell <andrew.purt...@gmail.com>
Subject Re: Append Visibility Labels?
Date Thu, 14 Apr 2016 18:11:57 GMT
> > Actually the old product data doesn't have to die, Benedict. Set
> > VERSIONS > 1 in your schema. The old cell version(s) carrying the old
> > label set will still be there, accessible with a Scan that asks for N
> > versions instead of just the latest. You'll get back a Result with up
> > to N cells to iterate over and figure out how to process and display
> > the information. If you only want the latest, use a Get instead.
>
> Good to know that we haven't killed off the old products! But I'm not sure
> the archaeological approach would scale.

I'm curious if it would get you over your current hump.
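
As a rough illustration of what that approach gives a reader, here is a toy model in Python. It is not HBase client code: the Version tuple, the visible_versions helper, and the one-label-per-cell simplification are all assumptions made for the sketch (real visibility expressions are boolean expressions over labels).

```python
from typing import List, NamedTuple, Set


class Version(NamedTuple):
    """Hypothetical stand-in for one cell version."""
    timestamp: int
    value: str
    label: str  # simplification: a single label, not a full expression


def visible_versions(versions: List[Version], auths: Set[str],
                     max_versions: int) -> List[Version]:
    """Return the retained versions this reader is authorized to see.

    Only the newest `max_versions` versions are considered, modeling a
    column family that keeps (and a read that asks for) that many versions.
    """
    newest_first = sorted(versions, key=lambda v: v.timestamp, reverse=True)
    retained = newest_first[:max_versions]
    return [v for v in retained if v.label in auths]


cells = [
    Version(1, "price=10", "PRODUCT_A"),  # original write
    Version(2, "price=10", "PRODUCT_B"),  # same data, relabeled for a new product
]
old_customer = visible_versions(cells, {"PRODUCT_A"}, max_versions=2)
```

In this model a PRODUCT_A customer still sees the timestamp-1 version as long as two versions are kept; with only one version kept, the relabeling write hides the data from them.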

Something we could consider is providing an operation attribute that tells
core to do what Append and Increment already do: take all tags on the old
value and add them to the tag set of the current value. No plug-in
combiners. Tags are "combined" by core only in the sense of being piled up
in the latest cell. However, this has a bunch of problems:
- Mutations carrying that attribute would now have to read and possibly go
to disk to find any relevant old value
- Coprocessors like the AccessController and VisibilityController must be
taught to handle cases where they find more than one tag when enumerating
the tags on a cell. They should handle this anyway though. I need to check
the code to see what they do (or don't do)
- Tags themselves don't have timestamps. We can try to keep them sorted (by
time) when building lists of them in memory and serializing them.
- Unlikely that one-size-fits-all semantics will satisfy everyone, or anyone

> The generic facility you describe, caveats noted, certainly seems to fit
> our use case - especially if we are talking of combining label
> expressions.
> I guess we'd always use an 'OR' operator to add them. But what if we
> wanted to remove a product/visibility label?

That's a problem with a generic approach. The default 'combiner' for the
visibility label tag type would do one general thing - probably OR. So
we'd want to allow users to supply their own, configurable in CF schema.
I imagine having just one will not be flexible enough, so supply a stack
of them. In implementation, combination should probably happen at
compaction time, because that's when we are iterating over cells anyway
and when expired cells, or cells lying under a tombstone or newer version,
would otherwise be lost - and hey! now we've implemented Accumulo's
iterators in HBase. Why not just do that?
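
To make the "stack of combiners" idea concrete, here is a toy Python sketch. Nothing in it exists in HBase: the Combiner signature, or_combiner, make_revoker, and compact_labels are all hypothetical, and a label expression is simplified to a set of labels that are implicitly OR'd together.

```python
from typing import Callable, FrozenSet, List, Sequence

Labels = FrozenSet[str]
# A combiner sees the labels accumulated from older versions and the
# labels produced so far for the newer version, and returns the result.
Combiner = Callable[[Labels, Labels], Labels]


def or_combiner(older: Labels, newer: Labels) -> Labels:
    """Default behavior: OR the old labels into the new ones."""
    return older | newer


def make_revoker(revoked: Labels) -> Combiner:
    """Build a combiner that strips revoked labels from the result."""
    def revoke(older: Labels, newer: Labels) -> Labels:
        return newer - revoked
    return revoke


def compact_labels(versions_oldest_first: Sequence[Labels],
                   stack: List[Combiner]) -> Labels:
    """Fold all versions into the label set of the surviving latest cell."""
    if not versions_oldest_first:
        return frozenset()
    merged = versions_oldest_first[0]
    for newer in versions_oldest_first[1:]:
        acc = newer
        for combiner in stack:  # run the stack in its configured order
            acc = combiner(merged, acc)
        merged = acc
    return merged


history = [frozenset({"A"}), frozenset({"B"}), frozenset({"C"})]
or_only = compact_labels(history, [or_combiner])
with_revoke = compact_labels(history, [or_combiner,
                                       make_revoker(frozenset({"A"}))])
expression = "|".join(sorted(with_revoke))  # '|' is OR in label expressions
```

With only the OR combiner every label ever written survives compaction; adding a revoker after it in the stack is one way removal of a product label could be expressed.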


On Wed, Apr 13, 2016 at 11:05 AM, <benedict.whittamsmith@thomsonreuters.com>
wrote:

> Yes - it's a capability we would need to efficiently support permissioning.
>
> Good to know that we haven't killed off the old products! But I'm not sure
> the archaeological approach would scale.
>
> The generic facility you describe, caveats noted, certainly seems to fit
> our use case - especially if we are talking of combining label expressions.
>
> I guess we'd always use an 'OR' operator to add them. But what if we
> wanted to remove a product/visibility label?
>
> -----Original Message-----
> From: Andrew Purtell [mailto:andrew.purtell@gmail.com]
> Sent: 13 April 2016 17:23
> To: user@hbase.apache.org
> Subject: Re: Append Visibility Labels?
>
> I think Benedict was asking if it would be possible to add the capability.
>
> Actually the old product data doesn't have to die, Benedict. Set
> VERSIONS > 1 in your schema. The old cell version(s) carrying the old
> label set will still be there, accessible with a Scan that asks for N
> versions instead of just the latest. You'll get back a Result with up
> to N cells to iterate over and figure out how to process and display
> the information. If you only want the latest, use a Get instead.
>
> I think it could be possible to introduce a generic facility for handling
> the case where you have an existing value on the server, that value has
> tags attached, now a new mutation op has arrived with a tag attached _and_
> another op attribute set by the client is asking for any tags on an earlier
> cell version be brought forward. For each tag type there would be a
> registered "combiner" that does what makes sense for its particulars. We do
> this in core for Append and Increment already, but without the notion of
> combination. This is an off the cuff remark, caveat: I haven't spent time
> thinking through implications.
>
> > On Apr 13, 2016, at 8:58 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > There is currently no API for appending Visibility Labels.
> >
> > checkAndPut() only allows you to compare value, not labels.
> >
> > On Wed, Apr 13, 2016 at 8:12 AM,
> > <benedict.whittamsmith@thomsonreuters.com>
> > wrote:
> >
> >> We sell data. A product can be defined as a permission to access data
> >> (at a cell level). Visibility Labels look like a very good candidate
> >> for implementing this model.
> >>
> >> The implementation works well until we create a new product over old
> >> data.
> >> We can set the visibility label for the new product but, whoops, by
> >> applying it to the relevant cells we've overwritten all the existing
> >> labels on those cells, destroying the permissioning of our older
> >> products. What to do?
> >>
> >> One answer would be to append the new visibility label to the
> >> existing label expressions on the cells with an 'OR'. But I'm not
> >> sure that's possible .. yet?
> >>
> >> Thanks,
> >>
> >> Ben
> >>
>
