hbase-issues mailing list archives

From "Jesse Yates (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4605) Add constraints as a top-level feature
Date Mon, 17 Oct 2011 21:11:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129203#comment-13129203 ]

Jesse Yates commented on HBASE-4605:

I like the idea of using a system level coprocessor with a minimal extension interface for
the checks to be performed. For the actual interface, you could even use Predicate from the
google guava lib, or have Constraint just be a named interface that extends Predicate<Put>.
Not critical, but plugging in to a standard interface instead of doing a one-off may enable
future uses...
That is exactly what I was thinking for the "top-level" implementation.
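
A minimal sketch of the named-interface idea, not the actual HBase code. It assumes a stand-in Put class and uses java.util.function.Predicate in place of Guava's Predicate (whose method is apply rather than test). Constraint, ConstraintViolationException, and check are hypothetical names chosen for illustration:

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

public class ConstraintSketch {
    /** Stand-in for HBase's Put, reduced to a row key and one value (assumption). */
    static final class Put {
        final String row;
        final String value;

        Put(String row, String value) {
            this.row = row;
            this.value = value;
        }
    }

    /** Hypothetical named interface: a Constraint is just a Predicate over Puts. */
    interface Constraint extends Predicate<Put> {}

    /** Hypothetical exception that would be reported back to the client. */
    static final class ConstraintViolationException extends RuntimeException {
        ConstraintViolationException(String message) {
            super(message);
        }
    }

    /** What a system-level hook might do before a write: run every constraint. */
    static void check(Put put, Iterable<Constraint> constraints) {
        for (Constraint c : constraints) {
            if (!c.test(put)) {
                throw new ConstraintViolationException(
                        "Put for row '" + put.row + "' rejected by a constraint");
            }
        }
    }

    public static void main(String[] args) {
        // A constraint that rejects Puts with a missing or empty value.
        Constraint nonEmptyValue = p -> p.value != null && !p.value.isEmpty();
        List<Constraint> constraints = Collections.singletonList(nonEmptyValue);

        check(new Put("row1", "v"), constraints); // valid, returns normally
        try {
            check(new Put("row2", ""), constraints); // invalid, throws
        } catch (ConstraintViolationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because Constraint adds no methods of its own, any existing Predicate-style logic can be reused, and a lambda is enough to define a simple check.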

seems like we could make that sufficiently generic to enable both the coprocessors case and
this with just changes to the shell code

Right now coprocessors have a special syntax for loading at the table level, which feels kind
of clunky to do by hand (specifying COPROCESSOR$). I feel like we could definitely make
setting these values easier with a more concrete syntax (like a setCoprocessor method that we
have on the HTableDescriptor now), which should handle the numbering, etc.

So using an abstract version of the stuff from 4554 would definitely help with that. I don't
know if we can just update the shell, though; we would probably need to update the Java
connection as well.

Right now the code for storing things in the conf would be fine; we just need to abstract
it a little bit, so it would look something like:
public void addCoprocessor(String name) {
  addProcessingElement("coprocessor$", name);
}

public void addConstraint(String name) {
  addProcessingElement("constraint$", name);
}

public void addProcessingElement(String tag, String value) {
  ... // all the checking/adding currently in addCoprocessor
}


Since they are just table configuration values, turning them on/off will be relatively painless.
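
To make the abstraction above concrete, here is a rough, runnable sketch. It assumes the table configuration can be modeled as a plain string map. TableConfigSketch, removeProcessingElement, getValues, and the org.example class names are hypothetical; the numbering scheme (first free suffix starting at 1) and the returned key are assumptions, not what HTableDescriptor actually does:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class TableConfigSketch {
    // Stand-in for the table descriptor's key/value configuration (assumption).
    private final Map<String, String> values = new LinkedHashMap<>();

    public String addCoprocessor(String name) {
        return addProcessingElement("coprocessor$", name);
    }

    public String addConstraint(String name) {
        return addProcessingElement("constraint$", name);
    }

    /** Stores value under the first free key tag1, tag2, ... and returns that key. */
    public String addProcessingElement(String tag, String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("value must be non-empty");
        }
        int n = 1;
        while (values.containsKey(tag + n)) {
            n++; // the caller never has to pick the number by hand
        }
        String key = tag + n;
        values.put(key, value);
        return key;
    }

    /** Turning an element off is just removing its configuration key. */
    public boolean removeProcessingElement(String key) {
        return values.remove(key) != null;
    }

    /** Read-only view of the stored configuration. */
    public Map<String, String> getValues() {
        return Collections.unmodifiableMap(values);
    }

    public static void main(String[] args) {
        TableConfigSketch conf = new TableConfigSketch();
        conf.addCoprocessor("org.example.MyObserver");
        String key = conf.addConstraint("org.example.NonEmptyValueConstraint");
        System.out.println(conf.getValues());

        conf.removeProcessingElement(key); // switch the constraint off again
        System.out.println(conf.getValues());
    }
}
```

Keeping both kinds of element behind one addProcessingElement method means the validation and numbering live in a single place, while the shell and the Java client only need thin wrappers.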

Cross-table transactions are a separate can of worms and really go against the whole design
paradigm of HBase (see discussion on dev about this). This would be optimized to do single-table
checking, though people could implement cross-table checks at serious cost (and later
we can build in more optimized mechanisms if it is a common thing people do).

HCatalog schema will be transformable as HBase constraints, adding value to the two of them...

That should be super simple; it would just take a simple tool to create the corresponding
constraints. I would use constraints to enforce things like data sanitization, rather than schema
enforcement (it's the last-ditch barrier to things going into a table properly, since shipping
things across the wire is expensive), which should be done client side, but it could definitely
> Add constraints as a top-level feature
> --------------------------------------
>                 Key: HBASE-4605
>                 URL: https://issues.apache.org/jira/browse/HBASE-4605
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, coprocessors
>    Affects Versions: 0.94.0
>            Reporter: Jesse Yates
>            Assignee: Jesse Yates
> From Jesse's comment on dev:
> {quote}
> What I would like to propose is a simple interface that people can use to implement a
'constraint' (matching the classic database definition). This would help ease of adoption
by helping HBase more easily check that box, help minimize code duplication across organizations,
and lead to easier adoption.
> Essentially, people would implement a 'Constraint' interface for checking keys before
they are put into a table. Puts that are valid get written to the table, but if not, people
can throw an exception that gets propagated back to the client explaining why the put
was invalid.
> Constraints would be set on a per-table basis and the user would be expected to ensure
the jars containing the constraint are present on the machines serving that table.
> Yes, people could roll their own mechanism for doing this via coprocessors each time,
but this would make it easier to do so, so you only have to implement a very minimal interface
and not worry about the specifics.
> {quote}
