hbase-dev mailing list archives

From Doğacan Güney (JIRA) <j...@apache.org>
Subject [jira] Commented: (HBASE-899) Support for specifying a timestamp and numVersions on a per-column basis
Date Thu, 25 Sep 2008 14:19:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634489#action_12634489 ]

Doğacan Güney commented on HBASE-899:

> Although in general what this request is asking for is to move some overhead of culling
> results from client side to server side. In general is that a good idea? Region servers are
> quite busy.

I am just worried about having to pass large amounts of data over RPC, only to consistently
discard it. It seems... a bit wasteful :D

And if HBase intends to support a row-wide timestamp range and numVersions, I just don't see
how doing it per column would be any more difficult or slower. A many-column read will already
be done in a read-one-column, merge-the-result-into-the-rest kind of way. So, while reading one
column, the region server just checks what the user specified for that column. (Or maybe I am
missing something :)
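
A minimal sketch of the per-column check described above, in plain Java. Nothing here is an HBase API: ColumnSpec, read and the in-memory row layout (column name -> timestamp -> value) are hypothetical names made up for illustration. It only shows that once columns are read one at a time, applying a different version limit and timestamp range to each column is a small lookup per column.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names are HBase APIs.
class PerColumnReadSketch {

    /** What the caller asks for on one column: a version limit and an inclusive timestamp range. */
    static final class ColumnSpec {
        final int maxVersions;
        final long minTimestamp;
        final long maxTimestamp;

        ColumnSpec(int maxVersions, long minTimestamp, long maxTimestamp) {
            this.maxVersions = maxVersions;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /**
     * Reads each requested column on its own, applies that column's spec,
     * and merges the surviving values into the result (newest first).
     * Assumes minTimestamp <= maxTimestamp for every spec.
     */
    static Map<String, List<String>> read(
            Map<String, TreeMap<Long, String>> row,
            Map<String, ColumnSpec> specs) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, ColumnSpec> entry : specs.entrySet()) {
            ColumnSpec spec = entry.getValue();
            TreeMap<Long, String> versions = row.get(entry.getKey());
            List<String> kept = new ArrayList<>();
            if (versions != null) {
                // Restrict to the requested range, then walk newest-first.
                for (String value : versions
                        .subMap(spec.minTimestamp, true, spec.maxTimestamp, true)
                        .descendingMap()
                        .values()) {
                    if (kept.size() >= spec.maxVersions) {
                        break; // this column's version limit is reached
                    }
                    kept.add(value);
                }
            }
            result.put(entry.getKey(), kept);
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> col1 = new TreeMap<>();
        col1.put(10L, "a");
        col1.put(20L, "b");
        col1.put(30L, "c");
        TreeMap<Long, String> col2 = new TreeMap<>();
        col2.put(10L, "x");
        col2.put(30L, "y");

        Map<String, TreeMap<Long, String>> row = new LinkedHashMap<>();
        row.put("col1:", col1);
        row.put("col2:", col2);

        Map<String, ColumnSpec> specs = new LinkedHashMap<>();
        specs.put("col1:", new ColumnSpec(2, 0L, 25L));            // 2 versions between t1=0 and t2=25
        specs.put("col2:", new ColumnSpec(1, 0L, Long.MAX_VALUE)); // only the newest version

        System.out.println(read(row, specs)); // prints {col1:=[b, a], col2:=[y]}
    }
}
```

In this toy version, the value stored under "col1:" at timestamp 30 and the older value under "col2:" are dropped before the result is built, which is the data that would otherwise cross RPC only to be discarded on the client.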

> Support for specifying a timestamp and numVersions on a per-column basis
> ------------------------------------------------------------------------
>                 Key: HBASE-899
>                 URL: https://issues.apache.org/jira/browse/HBASE-899
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Doğacan Güney
> This is just an idea and it may be better to wait until after the planned API changes. But
> I think it would be useful to support fetching different timestamps and versions for different
> columns.
> Example:
> If a row has 2 columns, "col1:" and "col2:" I want to be able to ask for (during scan
> or read time, doesn't matter) 2 versions of "col1:" (maybe even between timestamps t1 and
> t2) but only 1 version of "col2:". This would be especially handy if during an MR job you
> have to read 2 versions of a small column, but do not want the overhead of reading 2 versions
> of every other column too....
> (Also, the mechanism is already there. I mean, making the changes to support a per-column
> timestamp/numVersions is ridiculously easy :)

