hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction
Date Thu, 28 Feb 2013 21:01:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589917#comment-13589917 ]

Nick Dimiduk commented on HBASE-7221:

bq. I must say that the class names in HBASE-7692 are similar to those in HBASE-7221, which
isn’t entirely surprising because HBASE-7221 is the ticket that started this whole rowkey
construction conversation in the first place.

Indeed they are similar, though I cannot say why. In 7692 I simply used what Orderly provides.
Personally, I think {{RowKey}} should not be a part of the name in either API; these are not
just for rowkeys but for anywhere the existing Client API expects a {{byte[]}}.

Personally, I wish the user didn't have to think about any of this. HBase should ship with
its own type system, have convenient constructors for those types based on Java types, and
be done with it. Forcing the user to think about this as being used for a rowkey vs a column
qualifier vs a value is unnecessary cognitive overhead that we *should* avoid. When I want
to use a constant like {{5}} in a SQL expression, I just type {{5}} and the system handles
coercing it into the appropriate system type; that's it. Likewise, I should be able to write
{{Get g = new Get(5).addColumn("c1", 12);}}. But I digress.

bq. But I think that 7692 is mixing terms and is harder to use and understand. 7221 refers
to a RowKey as the “whole key” (e.g., 7221’s FixedLengthRowKey) which is consistent
with that usage in HBase, whereas a part of a RowKey in 7221 is called a RowKeyElement. To
contrast 7692's classnames, is BigDecimalRowKey the whole thing? Or a part of the rowkey?

Agreed. 7692 is designed to solve a different problem than this ticket. It just happens to
also include the feature described here. Both patches use the term "RowKey" in their API,
to their detriment. This confusion is why I don't like {{RowKey}} used in these APIs. In Orderly,
any of the {{*RowKey}} types can be used to create {{byte[]}}s for use in any context. So
yes, {{BigDecimalRowKey}} produces a {{byte[]}} so it can be used as a stand-alone rowkey
or as part of a compound rowkey, as desired. The {{StructRowKey}}, roughly analogous to this
ticket's {{FixedLengthRowKey}}, is just another way to produce a {{byte[]}}.

The Orderly library has the added benefit of providing both fixed-length and variable-length
encodings for the applicable types. It also includes support for specifying serialization
order, a necessary consideration when implementing an HBase schema, something this ticket's
latest patch cannot provide because of its dependency on {{Bytes}}.
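
To make the serialization-order point concrete, here is a minimal sketch in plain Java. It is
not Orderly's encoding and not code from either patch; the class and method names are
hypothetical. It assumes keys are compared as unsigned bytes, as HBase does, and that
{{Bytes.toBytes(long)}} writes big-endian two's complement, which {{plain}} imitates.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of HBase or Orderly. */
final class OrderedLongSketch {

    /** Big-endian two's complement, imitating the layout of Bytes.toBytes(long). */
    static byte[] plain(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    /** Flip the sign bit so unsigned byte-wise comparison matches numeric order. */
    static byte[] ascending(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v ^ Long.MIN_VALUE).array();
    }

    /** Invert every bit of the ascending form to reverse the sort order. */
    static byte[] descending(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(~(v ^ Long.MIN_VALUE)).array();
    }

    public static void main(String[] args) {
        // Arrays.compareUnsigned (Java 9+) compares bytes as unsigned values.
        // Plain encoding: -1 is all 0xFF bytes, so it sorts AFTER 1.
        System.out.println(Arrays.compareUnsigned(plain(-1L), plain(1L)) > 0);
        // Sign bit flipped: -1 now sorts before 1.
        System.out.println(Arrays.compareUnsigned(ascending(-1L), ascending(1L)) < 0);
        // All bits inverted: order is reversed, so -1 sorts after 1 again, by design.
        System.out.println(Arrays.compareUnsigned(descending(-1L), descending(1L)) > 0);
    }
}
```

With the plain layout a scan over signed values would return the negative ones last; the caller
has to choose the ascending or descending form deliberately, which is the kind of control over
serialization order described above.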

bq. The hashing, while it can be added to 7692, was designed in from the get-go with 7221
because that’s the way we recommend folks to build keys. Lars/Ian, as you pointed out earlier
in this ticket there is a reason that you found the 7221 approach familiar even in the first
approach - because it’s similar to what you did internally.

7692 can easily add support for hashing. Further, that support can be mixed with variable-length
components. Otherwise, what we have here is a stylistic approach -- the builder pattern vs
the format-string approach. This is a matter of taste, upon which the 'Client' component owners
should comment.
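
As a rough illustration of mixing a hash with a variable-length component, the sketch below
builds a key from a fixed 16-byte MD5 prefix followed by a terminated UTF-8 string. The layout,
the {{0x00}} terminator, and all names are assumptions made for this example; neither patch
necessarily encodes keys this way.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration; not the API of HBASE-7221 or HBASE-7692. */
final class HashedKeySketch {

    static byte[] md5(byte[] input) {
        try {
            return MessageDigest.getInstance("MD5").digest(input);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Assumed layout: [16-byte MD5 of id][UTF-8 bytes of name][0x00 terminator]. */
    static byte[] key(String id, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        for (byte b : nameBytes) {
            if (b == 0) {
                throw new IllegalArgumentException("name must not contain a NUL byte");
            }
        }
        byte[] hash = md5(id.getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(hash, 0, hash.length);           // fixed-length hashed prefix
        out.write(nameBytes, 0, nameBytes.length); // variable-length component
        out.write(0);                              // terminator marks the end of the string
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(key("user-42", "clicks").length); // 16 + 6 + 1 bytes
    }
}
```

The hashed prefix spreads writes across regions, while the terminated string keeps later
components parseable without a fixed width. Whether this is assembled through a builder or a
format string does not change the resulting bytes.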

bq. Personally, I think this the 7221 approach is easier to understand and use, and still
has safety-nets built-in for length testing on setters.

Again, this is a difference of opinion between you and me, consistent with my initial comments
in this ticket about using the format-string style instead of builders. The "safety-net" is
an implementation detail of any fixed-length encoding, and the patches attached to both
tickets provide it.
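
For reference, a length check of this kind takes only a few lines in any fixed-length builder.
The class below is a hypothetical sketch written for this comment, not code taken from either
patch.

```java
import java.nio.ByteBuffer;

/** Hypothetical fixed-length key builder; names are illustrative only. */
final class FixedLengthKeySketch {
    private final ByteBuffer buf;

    FixedLengthKeySketch(int length) {
        this.buf = ByteBuffer.allocate(length);
    }

    FixedLengthKeySketch add(long v) {
        ensure(Long.BYTES);
        buf.putLong(v);
        return this;
    }

    FixedLengthKeySketch add(byte[] b) {
        ensure(b.length);
        buf.put(b);
        return this;
    }

    /** The "safety net": refuse to write past the declared key length. */
    private void ensure(int needed) {
        if (buf.remaining() < needed) {
            throw new IllegalStateException(
                "need " + needed + " bytes but only " + buf.remaining() + " remain");
        }
    }

    /** Also refuse to hand back a key that has not been fully populated. */
    byte[] getBytes() {
        if (buf.hasRemaining()) {
            throw new IllegalStateException(buf.remaining() + " bytes left unset");
        }
        return buf.array().clone();
    }

    public static void main(String[] args) {
        byte[] key = new FixedLengthKeySketch(16).add(1L).add(2L).getBytes();
        System.out.println(key.length);
    }
}
```
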
> RowKey utility class for rowkey construction
> --------------------------------------------
>                 Key: HBASE-7221
>                 URL: https://issues.apache.org/jira/browse/HBASE-7221
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>            Reporter: Doug Meil
>            Assignee: Doug Meil
>            Priority: Minor
>         Attachments: HBASE_7221.patch, hbase-common_hbase_7221_2.patch, hbase-common_hbase_7221_v3.patch,
hbase-common_hbase_7221_v4.patch, hbase-server_hbase_7221_v5.patch, hbase-server_hbase_7221_v6.patch
> A common question in the dist-lists is how to construct rowkeys, particularly composite
keys.  Put/Get/Scan specifies byte[] as the rowkey, but it's up to you to sensibly populate
that byte-array, and that's where things tend to go off the rails.
> The intent of this RowKey utility class isn't meant to add functionality into Put/Get/Scan,
but rather make it simpler for folks to construct said arrays.  Example:
> {code}
>    RowKey key = RowKey.create(RowKey.SIZEOF_MD5_HASH + RowKey.SIZEOF_LONG);
>    key.addHash(a);
>    key.add(b);
>    byte bytes[] = key.getBytes();
> {code} 

