accumulo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Christopher Tubbs" <ctubbsii...@apache.org>
Subject Re: Review Request 28873: ACCUMULO-3393 Follow-on work for per-table volume chooser.
Date Wed, 10 Dec 2014 18:36:21 GMT


> On Dec. 10, 2014, 2:54 a.m., Christopher Tubbs wrote:
> > server/base/pom.xml, lines 79-82
> > <https://reviews.apache.org/r/28873/diff/1/?file=787887#file787887line79>
> >
> >     New dependency? Does it add significant value to warrant additional dependency
conflict resolution and packaging complexity?
> 
> Sean Busbey wrote:
>     It's generally better for us to rely on an external dependency that focuses on a
task outside of our main concerns since they'll give more focus on things. I didn't see any
of our current dependencies providing an object pool. If you know of one, please let me know
and I'll switch to it.
>     
>     As for commons-pool in particular, it's a ASF project and it does exactly the task
needed and no more. It'll also be removable once the TODO above its use is done.

This is fine, as far as libraries go. I just didn't see the need for the object pool in the
first place. We can shift discussion to that. (For what it's worth, Guava does have pools,
but not in any released version, as far as I can tell.)


> On Dec. 10, 2014, 2:54 a.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/fs/PerTableVolumeChooser.java,
line 39
> > <https://reviews.apache.org/r/28873/diff/1/?file=787888#file787888line39>
> >
> >     The volatile for lazy-initialization is not needed. We don't suffer from having
this assigned twice, which this code doesn't prevent. Also, it should be initialized to null
here before it is assigned to localConf.
> >     
> >     It should also be ServerConfigurationFactory. The ServerConfiguration type was
only retained internally after Havanki's refactor to avoid changing the balancer API.
> >     
> >     Ideally, we'd have an init method and be able to pass in the AccumuloServerContext,
but this was problematic last I looked, because it created a dependency cycle between the
VolumeManager and ServerConfigurationFactory, because this is called by the VolumeManager
code and that is needed by the ServerConfigurationFactory (so it can't depend on the ServerConfigurationFactory).
> 
> Sean Busbey wrote:
>     The volatile is needed to ensure that Object initialization is properly handled before
the Object is visible to other threads.
>     
>     The default value is already null, what's the advantage of repeating it?
>     
>     If ServerConfiguration isn't intended to be used as a placeholder for the more general
case, can you please file a follow on to mark it @Deprecated with a note?
>     
>     The TODO above the variable calls out the preference for an init and the reason it
can't currently happen. Will filing that ticketa address that part of your comment?

The object (ServerConfigurationFactory) is fully initialized before it is assigned. I'm not
sure what you're referring to. The use of volatile in this context can help (in newer JVMs,
certainly in 1.6 and later) in the double-check idiom for lazy initialization. However, this
is not the idiom being employed here. Nothing in this code prevents the object from being
assigned twice (by another thread), which is what the double-check idiom is for. All this
volatile does is prevent the JVM from caching the current reference.

The suggestion for null is just a nice to have. It makes it clear what the default value is.
It's not important.

ServerConfiguration predates the switch to ServerConfigurationFactory. Yes, more follow-on
work needs to be done to remove it completely. That work was discussed and deferred in https://issues.apache.org/jira/browse/ACCUMULO-2615
and its sub-tasks. In the meantime, I'm simply suggesting to avoid it, as the factory more
explicitly conveys its function (it is used *as* a factory, not as a configuration).

I'm not too concerned about the timeline for adding an init method. If you want to create
a JIRA, go ahead. I'd have to put more thought into that. The main point of that comment was
to empathize/agree that an init method might be helpful. On the other hand, if we don't guarantee
that we can maintain state, it may not be any better than passing stuff in with the environment
object on the choose method.


> On Dec. 10, 2014, 2:54 a.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/fs/PerTableVolumeChooser.java,
line 53
> > <https://reviews.apache.org/r/28873/diff/1/?file=787888#file787888line53>
> >
> >     This change unnecessarily adds a restriction that the initial volume chooser
for a table will be fixed for the lifetime of the TabletServer. This prevents dynamic changes
to a per-table configuration property without justification. This is problematic by itself,
but especially so given that a user is unable to set an initial VolumeChooser for the table
because of your objection to ACCUMULO-3176.
> 
> Sean Busbey wrote:
>     that isn't true. The second if block is used if there already was  chooser. In that
case it checks to see if the property has changed, using hte same idiom the per table balancer
users.

You're right. I missed that.


> On Dec. 10, 2014, 2:54 a.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/fs/PreferredVolumeChooser.java,
line 44
> > <https://reviews.apache.org/r/28873/diff/1/?file=787889#file787889line44>
> >
> >     This is incorrect. This is not an experimental property. It is a custom per-table
property, specific to the PreferredVolumeChooser.
> 
> Sean Busbey wrote:
>     how would you prefer I convey the information that operators can't assume any compatibility
guarantees on its use?
> 
> Josh Elser wrote:
>     I don't get this either. The property is defined to take a comma-separated list of
Volumes (URIs). If this fact changes, we'd have to make a new property to support whatever
we needed it to become. Experimental has been used to define things that are incompletely
implemented -- anything else has been handled by documentation.
> 
> Sean Busbey wrote:
>     I'm trying to convey a similar level of reliability as we do with @Experimental on
GENERAL_VOLUME_CHOOSER and TABLE_VOLUME_CHOOSER. That is, we might change the variable name
or what it holds. Would describing it as a "table property specific to the current implementation
of PreferredVolumeChooser" work?
> 
> Josh Elser wrote:
>     Hrm, I hadn't noticed that GENERAL_VOLUME_CHOOSER was also Experimental so that makes
sense for it to also trickle down to TABLE_VOLUME_CHOOSER. I don't think I mind experimental
moving down the chain to keep all of these related. The PREFERRED_VOLUMES_CUSTOM_KEY by itself
may not be experimental (by the previous definition), but since it relies on other things
which are still marked as such, I could see the reason behind also treating it as experimental.

Sean: Yeah, something along the lines of indicating that it's tied to this implementation
I think would be better. I just don't want it to convey the false idea that custom per-table
properties are experimental.


> On Dec. 10, 2014, 2:54 a.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/fs/PreferredVolumeChooser.java,
line 99
> > <https://reviews.apache.org/r/28873/diff/1/?file=787889#file787889line99>
> >
> >     Do we really need an object pool for a map which holds 0 or 1 element for the
duration of a single, infrequently called method? A better optimization for this case would
be to add an optimized get(String key) method to AccumuloConfiguration / TableConfiguration,
so we don't have to use a map at all.
> 
> Sean Busbey wrote:
>     This code path is in the critical section of tablet splitting, the frequency of which
varies greatly based on system deployment. There's already a TODO for doing the refactoring
you mention, which will allow us to later remove the pooling entirely.
> 
> Josh Elser wrote:
>     I really don't see the creation of a new HashMap as being critical WRT the cost of
waiting for a tablet to close, performing multiple metadata table updates and then waiting
for the tablets to come back online.
> 
> Sean Busbey wrote:
>     Am I correct that we currently don't have any microbenchmarks that would tell us
one way or the other?
> 
> Josh Elser wrote:
>     ditto what I said down below.

As far as I know, we have not done any microbenchmarks on this. Which is why I suggested it
was premature optimization.


> On Dec. 10, 2014, 2:54 a.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/fs/PreferredVolumeChooser.java,
lines 116-122
> > <https://reviews.apache.org/r/28873/diff/1/?file=787889#file787889line116>
> >
> >     This change unnecessarily prevents dynamic changes to preferredVolumes, and
forces the initial property to be the only one possible to read. I would say that this is
a problem by itself, but given that you've objected to ACCUMULO-3176, which would give users
the ability to set initial properties, this is even more problematic.
> 
> Sean Busbey wrote:
>     This change only caches the parsed version of particular volume strings to avoid
calling split on every invocation if the configured volumes haven't changed. How does that
prevent the dynamic configuration?

Again, you're right. I mis-read the code, as a result of it jumping in complexity. I do think
avoiding String.split is premature optimization, though.


> On Dec. 10, 2014, 2:54 a.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/fs/PreferredVolumeChooser.java,
lines 125-141
> > <https://reviews.apache.org/r/28873/diff/1/?file=787889#file787889line125>
> >
> >     How is this implementation better than the simple, readable, higher-level collections-based
implementation?
> 
> Sean Busbey wrote:
>     VC.choose is in the critical path for tablet splitting. The collections based approach
does more allocations of much heavier weight objects. Additionally, since all of these things
only live in the scope of the method those allocations will cause gc pressure during heavy
splitting.
> 
> Josh Elser wrote:
>     bq. will cause gc pressure during heavy splitting.
>     
>     *might* cause gc pressure :). I don't think you can make that assertion without actually
quantifying how much "gc pressure" is introduced WRT the amount of time the other known slow
operations with comprise splitting a tablet actually take.
> 
> Sean Busbey wrote:
>     Am I correct that we currently don't have any microbenchmarks that would tell us
one way or the other?
> 
> Josh Elser wrote:
>     I imagine that you could easily look at how long it takes a split to happen with
and without ingest load via the traces. If we don't have that instrumented already, maybe
Eric or Keith could chime in as I imagine they would be the two who know the most off the
top of their heads. I don't want to force you to start writing microbenchmarks -- my gut reaction
was that back of the envelope calculations would hint that this optimization might not be
with much total improvement.

At the very least, I'd like to see this code split into a separate method, with full unit
test coverage. I have very little confidence by looking at it that it does the set intersection
that the two lines it replaces did. It's not clear to me at all how it's supposed to work.


> On Dec. 10, 2014, 2:54 a.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/fs/PreferredVolumeChooser.java,
line 150
> > <https://reviews.apache.org/r/28873/diff/1/?file=787889#file787889line150>
> >
> >     If you get here, the modifiedOptions had better not be null. Shouldn't it have
numValid entries in it?
> 
> Sean Busbey wrote:
>     If all of the options were valid then we wouldn't have copied anything. ref the comment
"avoid copying if a prefix of the options are all fine".
>     
>     Should I expand the comment to explain the optimization?

Maybe. I guess my main concern is that it's hard to grok what the previous block of String
array manipulation is actually doing. A comment could help. Moving it to a private method
might also.


> On Dec. 10, 2014, 2:54 a.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/fs/VolumeChooser.java, line
21
> > <https://reviews.apache.org/r/28873/diff/1/?file=787891#file787891line21>
> >
> >     N.B.?
> 
> Sean Busbey wrote:
>     It's a phrase that means the reader should pay attention to the bit that follows.
> 
> Mike Drob wrote:
>     https://en.wikipedia.org/wiki/Nota_bene

Interesting. If you could replace this jargon with plain, clear English, I think it would
be helpful.


> On Dec. 10, 2014, 2:54 a.m., Christopher Tubbs wrote:
> > server/base/src/main/java/org/apache/accumulo/server/fs/VolumeManagerImpl.java,
lines 581-586
> > <https://reviews.apache.org/r/28873/diff/1/?file=787893#file787893line581>
> >
> >     +1 for this check, but this should be a separate issue, and also would benefit
1.6 as a sanity-check/bug-prevention measure.
> 
> Sean Busbey wrote:
>     It's not strictly needed in 1.6 because only someone with control of accumulo-site.xml
can set the volume chooser, but I'm happy to file a follow on.

We also allow overriding general system properties in ZooKeeper, so I'm not sure that's exactly
true. I'd have to check which properties we permit to be changed that way. In any case, to
use a particular VolumeChooser, we already require it to be on the classpath, which requires
the same level of privilege as modifying the accumulo-site.xml file, so I'm not sure this
is distinctly different between 1.6 and master.


- Christopher


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28873/#review64511
-----------------------------------------------------------


On Dec. 10, 2014, 9:58 a.m., Sean Busbey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28873/
> -----------------------------------------------------------
> 
> (Updated Dec. 10, 2014, 9:58 a.m.)
> 
> 
> Review request for accumulo and Jenna Huston.
> 
> 
> Bugs: ACCUMULO-3393
>     https://issues.apache.org/jira/browse/ACCUMULO-3393
> 
> 
> Repository: accumulo
> 
> 
> Description
> -------
> 
> ACCUMULO-3393 Follow-on work for per-table volume chooser.
> 
> Still has TODOs for additional tickets I need to file. I'll update the review to remove
once I have them all filed. I think everything marked TODO can wait for a later ticket. Please
comment on relevant section if you think something needs to be done now.
> 
> * docs clean up + code guideline compliance.
> * ensure RandomVolumeChoosers are independent when used per-table.
> * make sure that per-table choosers can keep state the way that global choosers can
> * make sure that a chooser can only pick from the options it is presented.
> * avoid object creation in critical path for Tablet.split.
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/accumulo/core/conf/Property.java 4c2d0b41b3bce32449861da3ac42fa27deb2b182

>   pom.xml 31601a1bf84e19b861e4f48b50824eeb77987b52 
>   server/base/pom.xml c21a168dec2092692f1ee6877c4703ee2d3e3977 
>   server/base/src/main/java/org/apache/accumulo/server/fs/PerTableVolumeChooser.java
7a825c796eb5a25de8c748e2aba642f483697b9a 
>   server/base/src/main/java/org/apache/accumulo/server/fs/PreferredVolumeChooser.java
7ed7bba809a4e5e4b2d472c3327b15adb37251a7 
>   server/base/src/main/java/org/apache/accumulo/server/fs/RandomVolumeChooser.java f2eb2113cb848ed58ac5f41573c6ff2cde9b0a77

>   server/base/src/main/java/org/apache/accumulo/server/fs/VolumeChooser.java f523057b11a2dc42e82010675bb1ac8e3802f96d

>   server/base/src/main/java/org/apache/accumulo/server/fs/VolumeManager.java 890651e92f4c34514cb839b7b9ee9d23ad55070a

>   server/base/src/main/java/org/apache/accumulo/server/fs/VolumeManagerImpl.java dc1be739b634d91992894f8f27c2d9c184bd98cd

>   server/base/src/main/java/org/apache/accumulo/server/fs/VolumeUtil.java 877d01c2233cca086c1ac1539eb81cc282a7ceb4

>   server/base/src/main/java/org/apache/accumulo/server/util/TabletOperations.java c0e1a9b991d61a4dbb127c74ae297f171434e7d5

>   server/base/src/test/java/org/apache/accumulo/server/fs/VolumeManagerImplTest.java
582822a8ccbc398925a2184eed0b9d7fa853f9b4 
> 
> Diff: https://reviews.apache.org/r/28873/diff/
> 
> 
> Testing
> -------
> 
> Ran through VolumeManagerImplTest and VolumeChooserIT 
> 
> 
> Thanks,
> 
> Sean Busbey
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message