hbase-issues mailing list archives

From "Daniel Vimont (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-17257) Add column-aliasing capability to hbase-client
Date Sun, 11 Dec 2016 01:20:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15738807#comment-15738807
] 

Daniel Vimont edited comment on HBASE-17257 at 12/11/16 1:20 AM:
-----------------------------------------------------------------

Okay, when I Google about for "hbase review board", two different RB implementations come
up: one hosted by Apache (review.apache.org), the other by Cloudera (review.cloudera.org).
I'm starting out presuming that we're using the Apache-hosted one. My Apache account that
already exists for logging into this JIRA system does not work for getting me into review.apache.org,
so I created a new account to get in.

Curious item: when I Googled for "how to post to hbase review board", the first link that
came up was by a certain "Ted Yu" ;) who was running into trouble (back in 2010) getting logged
into Review Board and posting a diff file: http://grokbase.com/t/hbase/dev/10c5bdz188/putting-patch-up-on-review-board

Anyway, using my new account to get into review.apache.org, I selected "New Review Request",
and in the "New Review Request for Pending Change" pane I uploaded my patch, whereupon the
web UI asked me "What is the base directory for this diff?". Supposing the correct answer
to this was the relative directory within the project in which I generated the patch, I entered
"/hbase". Then I received the error, "The specified diff file could not be parsed. Line 43:
No valid separator after the filename was found in the diff header".

So, must I manually alter the patch header to make it acceptable to RB? Or do I need to use
some other git function to generate a differently formatted "diff" file? Or something else
altogether?

BTW, after I get this all sorted out, I will definitely create a new JIRA entry for inserting
basic info about the existence and usage of RB into the HBase Reference Guide!


> Add column-aliasing capability to hbase-client
> ----------------------------------------------
>
>                 Key: HBASE-17257
>                 URL: https://issues.apache.org/jira/browse/HBASE-17257
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client
>    Affects Versions: 2.0.0
>            Reporter: Daniel Vimont
>            Assignee: Daniel Vimont
>              Labels: features
>         Attachments: HBASE-17257-v2.patch, HBASE-17257-v3.patch, HBASE-17257.patch
>
>
> Column aliasing will provide the option for a 1, 2, or 4 byte alias value to be stored
> in each cell of an "alias enabled" column-family, in place of the full-length column-qualifier.
> Aliasing is intended to operate completely invisibly to the end-user developer, with absolutely
> no "awareness" of aliasing required to be coded into a front-end application. No new public
> hbase-client interfaces are to be introduced, and only a few new public methods should need
> to be added to existing interfaces, primarily to allow an administrator to designate that
> a new column-family is to be alias-enabled by setting its aliasSize attribute to 1, 2, or
> 4.
> To facilitate such functionality, new subclasses of HTable, BufferedMutatorImpl, and
> HTableMultiplexer are to be provided. The overriding methods of these new subclasses will
> invoke methods of the new AliasManager class to facilitate qualifier-to-alias conversions
> (for user-submitted Gets, Scans, and Mutations) and alias-to-qualifier conversions (for Results
> returned from HBase) for any Table that has one or more alias-enabled column families. All
> conversion logic will be encapsulated in the new AliasManager class, and all qualifier-to-alias
> mappings will be persisted in a new aliasMappingTable in a new, reserved namespace.
> An informal polling of HBase users at HBaseCon East and at the Strata/Hadoop-World conference
> in Sept. 2016 showed that Column Aliasing could be a popular enhancement to standard HBase
> functionality, due to the fact that full column-qualifiers are stored in each cell, and reducing
> this qualifier storage requirement down to 1, 2, or 4 bytes per cell could prove beneficial
> in terms of reduced storage and bandwidth needs. Aliasing is intended chiefly for column-families
> which are of the "narrow and tall" variety (i.e., that are designed to use relatively few
> distinct column-qualifiers throughout a large number of rows, throughout the lifespan of the
> column-family). A column-family that is set up with an alias-size of 1 byte can contain up
> to 255 unique column-qualifiers; a 2 byte alias-size allows for up to 65,535 unique column-qualifiers;
> and a 4 byte alias-size allows for up to 4,294,967,295 unique column-qualifiers.
> Fuller specifications will be entered into the comments section below. Note that it may
> well not be viable to add aliasing support in the new "async" classes that appear to be currently
> under development.
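
The capacity figures quoted in the description (255, 65,535, and 4,294,967,295) match
2^(8n) - 1 for n = 1, 2, and 4 bytes. The sketch below illustrates that arithmetic, plus a toy
in-memory qualifier-to-alias mapping. It is not the proposed AliasManager: the class and method
names (AliasSketch, ToyAliasMap, aliasFor, qualifierFor) are hypothetical, the reservation of the
zero value is an assumption inferred from the "minus one" in the quoted limits, and nothing here
is persisted to HBase.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the HBASE-17257 patch.
class AliasSketch {

    // Maximum distinct qualifiers for an alias of the given width in bytes.
    // Assumption: one value (zero) is reserved, which yields 2^(8n) - 1.
    static long maxQualifiers(int aliasSize) {
        if (aliasSize != 1 && aliasSize != 2 && aliasSize != 4) {
            throw new IllegalArgumentException("aliasSize must be 1, 2, or 4");
        }
        return (1L << (8 * aliasSize)) - 1;
    }

    // Toy bidirectional map: qualifier string <-> fixed-width big-endian alias bytes.
    static final class ToyAliasMap {
        private final int aliasSize;
        private final Map<String, Long> toAlias = new HashMap<>();
        private final Map<Long, String> toQualifier = new HashMap<>();
        private long next = 1; // assumption: 0 is reserved

        ToyAliasMap(int aliasSize) {
            maxQualifiers(aliasSize); // validates the width
            this.aliasSize = aliasSize;
        }

        // Returns the existing alias for a qualifier, or assigns the next free one.
        byte[] aliasFor(String qualifier) {
            Long alias = toAlias.get(qualifier);
            if (alias == null) {
                if (next > maxQualifiers(aliasSize)) {
                    throw new IllegalStateException("alias space exhausted");
                }
                alias = next++;
                toAlias.put(qualifier, alias);
                toQualifier.put(alias, qualifier);
            }
            return encode(alias);
        }

        // Returns the qualifier for an alias, or null if the alias is unknown.
        String qualifierFor(byte[] aliasBytes) {
            return toQualifier.get(decode(aliasBytes));
        }

        private byte[] encode(long value) {
            byte[] out = new byte[aliasSize];
            for (int i = aliasSize - 1; i >= 0; i--) {
                out[i] = (byte) value;
                value >>>= 8;
            }
            return out;
        }

        private static long decode(byte[] bytes) {
            long value = 0;
            for (byte b : bytes) {
                value = (value << 8) | (b & 0xFF);
            }
            return value;
        }
    }

    public static void main(String[] args) {
        for (int size : new int[] {1, 2, 4}) {
            System.out.println(size + " byte(s): " + maxQualifiers(size));
        }
        ToyAliasMap map = new ToyAliasMap(2);
        byte[] alias = map.aliasFor("temperature");
        System.out.println(alias.length + "-byte alias maps back to: " + map.qualifierFor(alias));
    }
}
```

With a mapping like this, a long qualifier costs only aliasSize bytes per cell, which is where
the storage and bandwidth savings described above would come from.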



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
