accumulo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From dlmar...@comcast.net
Subject Re: [VOTE] ACCUMULO-3176
Date Tue, 25 Nov 2014 20:21:47 GMT

The client api will evolve over time, it will always change but likely slowly. As a community
we should be mindful of the amount of work our users will have to endure when upgrading from
one version to another. At the same time, we should not prevent change because it will cause
one user more work over all the others. I don't think that for this specific issue the client
change is that great. 

Regarding the issue around the eventual consistency of property updates, I agree that it is
an issue. However, I think a veto of feature A because there is a defect in feature B sets
a dangerous precedent. Personally, I would rather see property update issue have it's own
JIRA issue. 

----- Original Message -----

From: "Sean Busbey" <busbey@cloudera.com> 
To: "dev@accumulo apache. org" <dev@accumulo.apache.org> 
Sent: Tuesday, November 25, 2014 2:57:35 PM 
Subject: Re: [VOTE] ACCUMULO-3176 

> altering the public API (as a standalone change) 

I am very conservative about changes to our public API. I have had to write 
code that works across Accumulo versions and even when only working with 
the public API is it painful on almost every release. I have also seen 
plenty of cases where deployments of Accumulo lag behind version changes 
because of this same issue. 

I fully support that major version changes are there for us to break APIs. 
That doesn't mean that such breakage doesn't have a high cost that needs to 
be weighed carefully. 

In general, for me to be positive on an API change there needs to be a 
compelling improvement or a correctness issue. For me, that correctness 
issue is the race condition on property updates you mention. 

While I acknowledge that a per-table configured load balancer can't be set 
prior to the initial tablet assignment without this change, I don't think 
that out weighs having to update all table creation code. (this will be the 
case even if we go through a deprecation cycle. the cycle just draws it 
out.) Granted, I have never had cause to use a per-table load balancer that 
was hampered by where the initial tablet was assigned. So I might just not 
know the level of pain that fix would address. 

> implementing a partial fix for the larger problem 

This fix leaves those who need a consistent view of some set of properties 
with only options that involve table creation, either in this changed API 
or via Keith's clone table work around on an offline table. That means if 
you need to alter a constraint or an iterator (that needs to be used 
universally), you have the same negative consequence as some of the 
proposed solutions while also having to change the underlying table id. More 
importantly, I think it leaves people prone to not realizing that their 
change to these things won't be uniformly enforced during 
the propagation period. 

We could mitigate some of the gotchas around when property updates need to 
be strongly consistent with better docs, but that only helps when people 
are at the point of reading docs. I'd rather stop them from self inflicted 
pain directly because people often don't have the time to read docs. I see 
stopping this correctness problem as worth the behavior change something 
like "only update properties when tables are offline" would necessitate. But 
this vote isn't about the merits of that particular alternative. 

I'm willing to consider other options for solving the consistent property 
issue and have tried to work through them in the comments around 
ACCUMULO-3176. I just don't believe the one meets my standard for 
inflicting API pain. 

On Tue, Nov 25, 2014 at 1:16 PM, Josh Elser <josh.elser@gmail.com> wrote: 

> Can I pick your brain some more? (also, I'm sorry if this is already 
> addressed in the JIRA comments somewhere. I'm being lazy) 
> 
> As we know, there is a larger problem of ensuring consistent configuration 
> across all servers in the cluster. I don't think there is any contention 
> here. There are ways we can do this, but no one has done it yet. That's the 
> general problem, and, I believe, what you mean by the "underlying issue". 
> 
> A subset of that problem is configuration updates for a table which is 
> newly created. In my experience, this is often how I get bitten by this 
> race condition -- I create a table, I updated some property (typically set 
> an iterator/combiner/constraint), and then did some insert/scan before the 
> tabletserver I communicated with got my property update. 
> 
> I see this taking what is a technically difficult problem (assumption, 
> since no one has done it yet) and making the problem partially better. In 
> practice for me, this also means that how I often encountered this race 
> condition is addressed. 
> 
> I also don't see the changes that Jenna wrote as a blocker to implementing 
> a "proper" blocking configuration update. 
> 
> Can you clarify your level of concern with this change in: altering the 
> public API (as a standalone change), implementing a partial fix for the 
> larger problem, and the combination of the previous two points? It would be 
> much appreciated. 
> 
> 
> 
> Sean Busbey wrote: 
> 
>> -1 
>> 
>> This change alters our public API while not solving the underlying issue 
>> of 
>> race conditions in property updates. 
>> 
>> On Tue, Nov 25, 2014 at 11:14 AM, Christopher<ctubbsii@apache.org> 
>> wrote: 
>> 
>> Committers, this is a consensus vote on whether or not to include Jenna's 
>>> patch for ACCUMULO-3176 to the 1.7.0-SNAPSHOT (master) branch. 
>>> 
>>> This patch improves the table creation API with the introduction of a 
>>> NewTableConfiguration object (similar to the pattern for 
>>> BatchWriterConfig), which allows us to be flexible on improving table 
>>> creation options in the future without creating many overloaded methods 
>>> (as 
>>> the table creation API has been plagued by in the past). The main goal of 
>>> the patch is to allow table properties to be set on a table at the time 
>>> of 
>>> creation, before any tablets are assigned, but it also lays the 
>>> foundation 
>>> for future table creation improvements. Creating initial table properties 
>>> was already present in the RPC calls, but not exposed in the API. This 
>>> can 
>>> support a number of use cases. 
>>> 
>>> Since an objection was raised by Sean Busbey (and a veto expected), I've 
>>> initiated this vote in lieu of applying the patch under lazy consensus so 
>>> that any veto votes can be justified and addressed here. 
>>> 
>>> Note: there are a few bugs in the Mock implementation of this that I've 
>>> fixed, as well as some minor deprecation warnings and javadoc 
>>> improvements 
>>> I'm adding, please apply your vote under the assumption that those will 
>>> be 
>>> fixed before it will be applied. 
>>> 
>>> Please vote in accordance with the bylaws for consensus voting. 
>>> My vote is +1. 
>>> 
>>> Thanks. 
>>> 
>>> -- 
>>> Christopher L Tubbs II 
>>> http://gravatar.com/ctubbsii 
>>> 
>>> 
>> 
>> 
>> 


-- 
Sean 


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message