hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-880) Improve the current client API by creating new container classes
Date Fri, 19 Sep 2008 20:28:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632807#action_12632807

stack commented on HBASE-880:

I took a look at this patch with J-D as Virgil.

I need to study it more.  We have to be real careful making fundamental changes to our API
-- as is this patch -- to make sure we get it right.  On my first drive through, it strikes
me that it needs to be more elegant than it is at flush blush.

Meantime, here a few remarks.

Is below intentional:

Index: src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
--- src/java/org/apache/hadoop/hbase/client/HConnectionManager.java	(revision 696049)
+++ src/java/org/apache/hadoop/hbase/client/HConnectionManager.java	(working copy)
@@ -860,10 +860,12 @@
           if (tries == numRetries - 1) {
+            t.printStackTrace();
             throw new RetriesExhaustedException(callable.getServerName(),
                 callable.getRegionName(), callable.getRow(), tries, exceptions);
           if (LOG.isDebugEnabled()) {
+            t.printStackTrace();
             LOG.debug("reloading table servers because: " + t.getMessage());

Could add the exception as argument if wanted rather than printStackTrace.

Don't touch whats under v5.  Thats migration stuff.  Move what it needs under v5 if you want
to change BatchUpdate, etc., or just leave them in place till actual removal and we can worry
about how to migrate then.

Below is synopsis of J-D/Stack back and forth:

[12:33]	<st^ack>	jdcryans: RowGet doesn't have a constructor to set number of versions?
[12:34]	<jdcryans>	st^ack: I wish that we don't have to change the API at each new feature
[12:34]	<jdcryans>	so keeping a small constructor is, I think, the way to go
[12:34]	<jdcryans>	for the rest, setters
[12:35]	<st^ack>	Immutables are nice.
[12:35]	<st^ack>	You have to add setter/getter every time you add a new feature.
[12:36]	<st^ack>	The worry is that the constructor will grow madly?
[12:36]	<jdcryans>	st^ack: yeah
[12:36]	<jdcryans>	and that we have to have every combination for every feature set
[12:43]	<st^ack>	but we have to add a getter/setter for each new attribute, right?
[12:44]	<st^ack>	Why is RowDelete different from RowUpdate?
[12:45]	<jdcryans>	st^ack: yes, new getter/setter, that's what I think is better instead
of g/s + all contructors
[12:46]	<jdcryans>	RowDelete is for deleteAll and deleteFamily, does not have a list
of batchoperations
[12:46]	<jdcryans>	if you think it's bad, I would say that a=b=c so current is bad too
[12:46]	<jdcryans>	current API*
[12:48]	<st^ack>	deleteAll breaks things. usually the operation contains the rows to
operate on. when deleteAll, the operation doesn't supply columns -- presumtpion is all columns.
[12:48]	<st^ack>	Current API doesn't scale, true.
[12:48]	<st^ack>	How do we do write-if-not-modified using this changed API?
[12:49]	<st^ack>	you know, the feature where we only update a value if it hasn't been
[12:49]	<jdcryans>	yeah yeah I was in that conversation
[12:49]	<st^ack>	i.e. take out a lock on row, look at value, then withing same row lock
do update if not changed.
[12:50]	<jdcryans>	that would be another operation I guess
[12:51]	<st^ack>	jdcryans: I don't think I like the fact that our RPC is having primitive
types replaced by more compound types
[12:51]	<jdcryans>	at least that would make another bunch of methods in HTable
[12:51]	<st^ack>	let me read more
[12:52]	<jdcryans>	then it will be hard to batch gets
[12:52]	<st^ack>	sort of.
[12:52]	<st^ack>	Could be row [][]
[12:53]	<st^ack>	column and timestamp same for all
[12:53]	<st^ack>	let me read more
[12:53]	<jdcryans>	st^ack: yeah but's limiting
[12:53]	<jdcryans>	public RowResult getRow(final byte [][] row, final byte [][][] columns,
final long[] ts, final RowLock[] rl) would be ugly too
[12:59]	<st^ack>	On the pattern of setters vs. constructor args, I kinda feel we should
do one or the other. Odd spanning both.
[13:00]	<jdcryans>	st^ack: y
[13:00]	<st^ack>	Regards expanding constructor, is it not the case that it won't be
as bad once the extra args have been moved out of HTable method names instead into Operation
[13:01]	<jdcryans>	st^ack: won't be as bad, true
[13:02]	<st^ack>	I suppose adding a new feature, you'll have to fix the baseline Operation
class and then all of its derivatives. That'll be a bit messy.
[13:02]	<jdcryans>	st^ack: you mean the constructors?
[13:02]	<st^ack>	Yeah, constructors
[13:03]	<jdcryans>	yes that's something I saw too, having to recopy all of them
[13:03]	<jdcryans>	part of why I prefer to keep g/s (but forgot about it since now :P)
[13:04]	<st^ack>	Whats a constructor now? column(s), start-timestamp, end-timestamp,
[13:04]	<st^ack>	Does lock belong inside an Operation?
[13:04]	<st^ack>	Shouldn't lock be outside an operation?
[13:06]	<jdcryans>	st^ack: same goal, remove overloads
[13:06]	<jdcryans>	I see that as one new feature, got the same treatment
[13:07]	<st^ack>	locks span operations though?
[13:07]	<st^ack>	row locks span row operations
[13:07]	<st^ack>	Its different that timestamp, etc.
[13:07]	<jdcryans>	proposed design does not prevent that
[13:09]	<jdcryans>	would have to bundle lock with every operation, pretty much in the
same way as if it was another parameter
[13:12]	<st^ack>	To lock across a set of Operations, you suggest bundling the lock with
each Operation -- setting it into each Operation?
[13:12]	<jdcryans>	yes
[13:12]	<st^ack>	What if user included an Operation in a set for a row that did not
include the lock.
[13:12]	<st^ack>	How should that run over on the server.
[13:13]	<st^ack>	What is difference between RowDelete and RowUpdate (or do you want
me to read the code -- smile)
[13:13]	<jdcryans>	garbage in garbage out for that user
[13:14]	<jdcryans>	I don't copy that last question
[13:14]	<st^ack>	If lock was a distinct param on a method in HTable, then we'd have
to have two versions of every method -- one with lock and one without?
[13:15]	<st^ack>	as we do now.
[13:15]	<st^ack>	I want to know how RowUpdate and RowDelete differ. Why couldn't I just
pass a RowUpdate to a deleteAll?
[13:15]	<jdcryans>	yes
[13:17]	<jdcryans>	deleteAll wouldn't know which column to take
[13:17]	<jdcryans>	if RowUpdate has many BatchOperations
[13:18]	<jdcryans>	RowUpdate == BatchUpdate
[13:20]	<st^ack>	Sorry, I don't follow. I need to study this patch more.

> Improve the current client API by creating new container classes
> ----------------------------------------------------------------
>                 Key: HBASE-880
>                 URL: https://issues.apache.org/jira/browse/HBASE-880
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.19.0
>         Attachments: hbase-880-v1.patch
> The current API does not scale very well. For each new feature, we have to add many methods
to take care of all the overloads. Also, the need to batch row operations (gets, inserts,
deletes) implies that we have to manage some "entities" like we are able to do with BatchUpdate
but not with the other operations. The RowLock should be an attribute of such an entity.
> The scope of this jira is only to replace current API with another feature-compatible
one, other methods will be added in other issues.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message