hbase-issues mailing list archives

From "Matteo Bertozzi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8693) Implement extensible type API based on serialization primitives
Date Thu, 18 Jul 2013 23:24:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713076#comment-13713076 ]

Matteo Bertozzi commented on HBASE-8693:
----------------------------------------

Thanks for continuing to follow up on my out-of-scope questions.

Again, I think I'm focusing more on the cell-value side than on the key part. The key part
is the one that will benefit from the ordered-bytes work, and it will probably have more
restrictions on how it can evolve, since this code is client side only and you have to deal
with the raw byte sorting of HBase.

{quote}It's quite out of scope for my purposes, but I'm curious what you think about the future
direction with schema. I think the Phoenix and Kiji folk will have some good insights.{quote}

(I'll talk only about cell values here, so the ordered encoding doesn't matter in this
case.)
I want to write my app today with this library.
I'll start off using a Struct, and that's fine until I have to add or remove a field.
So I can add a version/schema id, but now I have the problem that I have to keep all the
schemas around and then project each cell to the schema that I want to use.

Example:
- get row0 -> cell with schema 1
- get row1 -> cell with schema 2
- get row2 -> cell with schema 3
- Now the user/API has to handle these 3 different rows and project each one to a user-provided
schema to get something useful out of them.

In this case, you have to store all the schemas and you have to provide a mapping from each
stored schema to the one that the user wants.
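
To make the cost of this approach concrete, here is a rough Java sketch. Everything in it is hypothetical: the class, the User type, the three schemas, and the use of -1 for a missing age are invented for illustration and are not part of the HBASE-8693 patch. Cells are shown as already-decoded Object[] arrays so that only the projection step is visible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from the HBASE-8693 patch.
class VersionedStructProjection {

  // The schema the application wants to read: (name, age).
  static final class User {
    final String name;
    final int age;

    User(String name, int age) {
      this.name = name;
      this.age = age;
    }
  }

  // One hand-written mapping per stored schema id.
  // Every new schema version needs a new entry here.
  static final Map<Integer, Function<Object[], User>> MAPPINGS = new HashMap<>();

  static {
    // schema 1: [name] (age did not exist yet, so use -1 as a placeholder)
    MAPPINGS.put(1, f -> new User((String) f[0], -1));
    // schema 2: [name, age]
    MAPPINGS.put(2, f -> new User((String) f[0], (Integer) f[1]));
    // schema 3: [id, name, age] (a field was added at the front, so indexes shift)
    MAPPINGS.put(3, f -> new User((String) f[1], (Integer) f[2]));
  }

  // Project a decoded cell, written with the given schema id, to the User schema.
  static User project(int schemaId, Object[] fields) {
    Function<Object[], User> mapping = MAPPINGS.get(schemaId);
    if (mapping == null) {
      throw new IllegalArgumentException("unknown schema id " + schemaId);
    }
    return mapping.apply(fields);
  }

  public static void main(String[] args) {
    // row0, row1, row2 each carry a cell written with a different schema.
    User u0 = project(1, new Object[] {"alice"});
    User u1 = project(2, new Object[] {"bob", 27});
    User u2 = project(3, new Object[] {42L, "carol", 31});
    System.out.println(u0.name + " " + u1.name + " " + u2.name);
  }
}
```

The mapping table grows with every schema change, and positional access means that inserting a field shifts every index after it.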

The other approach, more protobuf-like, is that each field has an id that must be unique. On read
you provide your "read schema" and you load only the fields present in the "read schema".
Note that this can also work with just an API similar to what you have, "getField(field_id)",
where the id is the unique id and not the index.
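
A minimal Java sketch of the field-id idea follows. It is an assumption-laden illustration, not the API from the patch: the FieldIdRecord class, its put/read methods, and the field ids used are all hypothetical, and the cell is modeled as an in-memory map rather than encoded bytes. Only the name getField(field_id) is taken from the discussion above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a cell value modeled as (unique field id -> value).
// The byte-level encoding is omitted on purpose.
class FieldIdRecord {
  private final Map<Integer, Object> fields = new LinkedHashMap<>();

  // Writer side: fields are tagged with their unique id, not their position.
  FieldIdRecord put(int fieldId, Object value) {
    fields.put(fieldId, value);
    return this;
  }

  // Lookup by unique field id; returns null if the writer never set that field.
  Object getField(int fieldId) {
    return fields.get(fieldId);
  }

  // Reader side: keep only the fields named in the read schema and
  // ignore ids the reader does not know about.
  Map<Integer, Object> read(Set<Integer> readSchema) {
    Map<Integer, Object> out = new LinkedHashMap<>();
    for (Map.Entry<Integer, Object> e : fields.entrySet()) {
      if (readSchema.contains(e.getKey())) {
        out.put(e.getKey(), e.getValue());
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // An old writer knew only field 1; a newer writer added field 3.
    FieldIdRecord oldCell = new FieldIdRecord().put(1, "alice");
    FieldIdRecord newCell = new FieldIdRecord().put(1, "bob").put(3, "bob@example.org");

    // The same read schema works for both cells, with no per-version mapping.
    Set<Integer> readSchema = new HashSet<>(Arrays.asList(1, 2));
    System.out.println(oldCell.read(readSchema));
    System.out.println(newCell.read(readSchema));
  }
}
```

Because lookups go through the id rather than the position, adding or removing a field does not require rewriting old cells or maintaining a mapping per struct version.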

Again, I think that your focus at the moment is more on the key side, and my guess is that
the Struct is fine for that.
But this jira says "serialization primitives" without "row-keys" in front, so I assume you
plan to use this for the cell values as well. From what I said above, I don't see
an easy way to evolve my cell data without rewriting it every time or doing "manual" mappings
for each struct version.
                
> Implement extensible type API based on serialization primitives
> ---------------------------------------------------------------
>
>                 Key: HBASE-8693
>                 URL: https://issues.apache.org/jira/browse/HBASE-8693
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 0.95.2
>
>         Attachments: 0001-HBASE-8693-Extensible-data-types-API.patch, 0001-HBASE-8693-Extensible-data-types-API.patch, 0001-HBASE-8693-Extensible-data-types-API.patch, 0002-HBASE-8693-example-Use-DataType-API-to-build-regionN.patch, KijiFormattedEntityId.java
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
