accumulo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bill Havanki" <bhava...@clouderagovt.com>
Subject Re: Review Request 17426: ACCUMULO-1948 Tablet constructor no longer leaking this
Date Tue, 04 Feb 2014 15:30:47 GMT


> On Feb. 3, 2014, 6:49 p.m., kturner wrote:
> > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java,
line 219
> > <https://reviews.apache.org/r/17426/diff/4/?file=457847#file457847line219>
> >
> >     this field seems unnecessary

True, I had left it behind from earlier attempts. I'm fine with eliminating it.


> On Feb. 3, 2014, 6:49 p.m., kturner wrote:
> > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java,
line 356
> > <https://reviews.apache.org/r/17426/diff/4/?file=457847#file457847line356>
> >
> >     It seems like this warning could have occurred spuriously before your change
to copy the map.  So your change may be a bug fix, unrelated to fixing the "this" issue.
> >     
> >     This case of tabletReport being null seems like something that should never
happen now.  So maybe remove the null check and should it ever happen let the NPE occur.
> 
> kturner wrote:
>     Actually my comment about tabletReport not being null, assumes a well behaved memory
manager (which is pluggable).   So maybe the warning should stay.

Good point. I'll add a bit to the log message suggesting that the memory manager may not be
behaving.


> On Feb. 3, 2014, 6:49 p.m., kturner wrote:
> > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java,
line 362
> > <https://reviews.apache.org/r/17426/diff/4/?file=457847#file457847line362>
> >
> >     this is not operating on the snapshot you created, but doing that wouldn't make
sense.  Maybe do the following.
> >     
> >     sync(tabletReports){
> >       Tablet other = tabletReports.remove(ext)
> >       if(other != tablet)
> >           tabletReports.put(ext, other)
> >     }
> >     
> >     Another alternative would be to make the key for the map be extent AND "something
unique to tablet".  But I think the syncronized remove block would be good.

After thinking about it, I wonder if any action is necessary here. The goal here is to remove
the closed tablet from the set of tablet reports. However, when a tablet is closed, it removes
itself from the set (Tablet.completeClose() -> TabletResourceManger.close() -> MemoryManagementFramework.tabletClosed()).

At this stage, though, I'm sure it's safer to keep this removal attempt. How about I use your
synchronized block and add a log message - debug? - that the report for the closed tablet
was found and it should already have been removed, but it's getting removed now? 


> On Feb. 3, 2014, 6:49 p.m., kturner wrote:
> > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java,
line 458
> > <https://reviews.apache.org/r/17426/diff/4/?file=457847#file457847line458>
> >
> >     I think tabletResources is unused and can be removed along w/ the methods for
adding and removing.

Ha, you are correct. I'll remove it! I think if we'd noticed this earlier it would have made
eliminating the this leakage a lot more straightforward. :)


> On Feb. 3, 2014, 6:49 p.m., kturner wrote:
> > server/tserver/src/main/java/org/apache/accumulo/tserver/Tablet.java, line 302
> > <https://reviews.apache.org/r/17426/diff/4/?file=457845#file457845line302>
> >
> >     Is there any way we can bail if this code is called by the constructor?

We could maybe introduce a latch or barrier that blocks or aborts the calls until it's cleared.
The barrier would have to be maintained outside of Tablet, that is, the code that constructs
the Tablet has to trip the latch. It's not safe for the Tablet constructor to clear it, even
as its last instruction.

CountDownLatch startingGun = new CountDownLatch(1);
Tablet tablet = new Tablet(startingGun, ...);
startingGun.countDown();  // bang

I'm assuming this would be as a safety measure. Right now I can't see how updateMemoryUsageStats
would be called before tablet construction completes, but admittedly it's complex.

Speaking of this escaping, I noticed as I was walking the call graph for updateMemoryUsageStats
that the MemoryManagementFramework inner class of TabletServerResourceManager starts a couple
of threads in its constructor, which is a no-no [1]. I'll file a follow-up JIRA about that.

[1] http://stackoverflow.com/questions/5623285/why-not-to-start-a-thread-in-the-constructor-how-to-terminate


- Bill


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17426/#review33526
-----------------------------------------------------------


On Jan. 31, 2014, 2:28 p.m., Bill Havanki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17426/
> -----------------------------------------------------------
> 
> (Updated Jan. 31, 2014, 2:28 p.m.)
> 
> 
> Review request for accumulo.
> 
> 
> Bugs: ACCUMULO-1948
>     https://issues.apache.org/jira/browse/ACCUMULO-1948
> 
> 
> Repository: accumulo
> 
> 
> Description
> -------
> 
> Change in the construction of the cross-referencing between Tablet and TabletResourceManager
to avoid the Tablet constructor leaking this.
> 
> 
> Diffs
> -----
> 
>   server/tserver/pom.xml b627de092faa2b5b4dbc5f8b9e4fb7af696ca436 
>   server/tserver/src/main/java/org/apache/accumulo/tserver/Tablet.java 5ea2401acc720c70d58ee04c2dd82dc79fcf2b99

>   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java 9d08c81f9b07a8c5f8a281769fd2914f5c8fc82b

>   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java
e95843775c85ecf90a2b4c7afce91094381da450 
>   server/tserver/src/test/java/org/apache/accumulo/tserver/TabletResourceManagerTest.java
PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/17426/diff/
> 
> 
> Testing
> -------
> 
> Added unit test. Tested on pseudo-cluster with shell activity, randomwalk MultiTable
test.
> 
> 
> Thanks,
> 
> Bill Havanki
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message