From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8693) Implement extensible type API based on serialization primitives
Date Fri, 19 Jul 2013 16:52:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713834#comment-13713834 ]

Nick Dimiduk commented on HBASE-8693:

This {{HDataType}} interface and the two codecs upon which the implementations rely are not
schema management for HBase. {{HDataType}} can be used to manage encoding values into rowkeys,
column qualifiers, or values. Use an instance of {{Struct}}, or don't, in any of those contexts.
The use of {{Struct}} in the order-sensitive context has driven more design thought, but it
generates a {{byte[]}} wherever it's used. Would an example of an Avro, Thrift, or Protobuf
{{HDataType}} implementation help to drive this idea home?
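
To make the "it is all just a {{byte[]}}" point concrete, here is a minimal, self-contained Java sketch. It is not the interface from the attached patch: {{SketchType}}, {{SketchStruct}}, {{ORDERED_INT}}, {{toBytes}}, and {{compareUnsigned}} are hypothetical names chosen for illustration, and the sign-bit-flip integer encoding is a common order-preserving trick assumed here, not something taken from the patch's codecs.

```java
import java.nio.ByteBuffer;

// Hypothetical names throughout; this is NOT the HDataType API from the patch.
final class TypeSketch {

  /** A type knows how to write a value into a buffer and read it back. */
  interface SketchType<T> {
    int encodedLength(T value);
    void encode(ByteBuffer dst, T value);
    T decode(ByteBuffer src);
  }

  /** 32-bit int, big-endian, sign bit flipped so unsigned byte order matches numeric order. */
  static final SketchType<Integer> ORDERED_INT = new SketchType<Integer>() {
    public int encodedLength(Integer v) { return 4; }
    public void encode(ByteBuffer dst, Integer v) { dst.putInt(v ^ Integer.MIN_VALUE); }
    public Integer decode(ByteBuffer src) { return src.getInt() ^ Integer.MIN_VALUE; }
  };

  /** A fixed sequence of member types, encoded back to back. */
  static final class SketchStruct implements SketchType<Object[]> {
    private final SketchType<?>[] fields;

    SketchStruct(SketchType<?>... fields) { this.fields = fields; }

    @SuppressWarnings("unchecked")
    private static SketchType<Object> raw(SketchType<?> t) { return (SketchType<Object>) t; }

    public int encodedLength(Object[] values) {
      int n = 0;
      for (int i = 0; i < fields.length; i++) n += raw(fields[i]).encodedLength(values[i]);
      return n;
    }

    public void encode(ByteBuffer dst, Object[] values) {
      for (int i = 0; i < fields.length; i++) raw(fields[i]).encode(dst, values[i]);
    }

    public Object[] decode(ByteBuffer src) {
      Object[] out = new Object[fields.length];
      for (int i = 0; i < fields.length; i++) out[i] = fields[i].decode(src);
      return out;
    }
  }

  /** Whatever the type, the result is a plain byte[]. */
  static <T> byte[] toBytes(SketchType<T> type, T value) {
    ByteBuffer buf = ByteBuffer.allocate(type.encodedLength(value));
    type.encode(buf, value);
    return buf.array();
  }

  /** Lexicographic comparison treating each byte as unsigned. */
  static int compareUnsigned(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) return d;
    }
    return a.length - b.length;
  }
}
```

The array returned by {{toBytes}} could be handed to a rowkey, a column qualifier, or a cell value alike. Only the order-sensitive contexts care that the unsigned byte order of the encoding matches the logical order of the values.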

My trouble with using the word "schema" for key-values is that this context is too narrow a scope.
Being able to consistently read a value out of a cell does not tell me what the schema of
the database is. HBase provides basic *table* definition management but not *data* definition
management, which is the effective meaning of schema. Phoenix and Kiji both provide a layer of schema
management on top of HBase. Through them you define the logical layout of data in tables,
and you leave to them how that data is physically arranged and encoded. {{HDataType}} provides
an API with which its user can control how data is physically arranged and encoded. Its user
is still left to manage the logical layout, and its meaning to their application, for themselves.

This patch is not schema management. It provides a common set of primitives that other applications
can consume -- be they user applications developed directly against HBase, or Phoenix or Kiji
themselves. The consumers I've had in mind have always been myself and application
developers like me, Hive, Pig, and Phoenix. The primary benefit is that all those applications
gain some level of interoperability through data in HBase. That I was able to read Kiji's
avdl file and in an afternoon understand how {{HDataType}} could be used to make its implementation
simpler and more extensible is validation of its utility.
> Implement extensible type API based on serialization primitives
> ---------------------------------------------------------------
>                 Key: HBASE-8693
>                 URL: https://issues.apache.org/jira/browse/HBASE-8693
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 0.95.2
>         Attachments: 0001-HBASE-8693-Extensible-data-types-API.patch, 0001-HBASE-8693-Extensible-data-types-API.patch,
0001-HBASE-8693-Extensible-data-types-API.patch, 0002-HBASE-8693-example-Use-DataType-API-to-build-regionN.patch,
