accumulo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sean Busbey" <s...@manvsbeard.com>
Subject Re: Review Request 17129: ACCUMULO-2213 Make Tracer more robust in handling batchwriter recovery
Date Tue, 21 Jan 2014 14:50:59 GMT


> On Jan. 21, 2014, 2:40 p.m., Eric Newton wrote:
> > src/server/src/main/java/org/apache/accumulo/server/trace/TraceServer.java, line
132
> > <https://reviews.apache.org/r/17129/diff/1/?file=432898#file432898line132>
> >
> >     What's up with the "XXX"?  What does it mean?

It's a way to flag a message for attention. Some IDEs will highlight it. similar to TODO.

Actually, I think the "there can still be NPEs expected" comments need the XXX but this one
probably doesn't.


> On Jan. 21, 2014, 2:40 p.m., Eric Newton wrote:
> > src/server/src/main/java/org/apache/accumulo/server/trace/TraceServer.java, line
188
> > <https://reviews.apache.org/r/17129/diff/1/?file=432898#file432898line188>
> >
> >     Why does this need a comment?

It's the only place writer is assigned outside of the synchronized protection. I figured it
was worth pointing out that the two can't conflict.


> On Jan. 21, 2014, 2:40 p.m., Eric Newton wrote:
> > src/server/src/main/java/org/apache/accumulo/server/trace/TraceServer.java, line
209
> > <https://reviews.apache.org/r/17129/diff/1/?file=432898#file432898line209>
> >
> >     Why duplicate the same code for two different exception types?
> >

I wanted to make sure we scope our catching minimally to just handle MRE and Runtime, so that
should the methods in the try{} later add additional checked exceptions the compiler will
let us know that this spot needs to be examined.

Once we can use Java 7 features this duplication can go away.


> On Jan. 21, 2014, 2:40 p.m., Eric Newton wrote:
> > src/server/src/main/java/org/apache/accumulo/server/trace/TraceServer.java, line
223
> > <https://reviews.apache.org/r/17129/diff/1/?file=432898#file432898line223>
> >
> >     Why demote these from error to warning?  It's unexpected behavior. If it stays
a warning, don't dump the stack.  It does not inform the user, and just provides more scary
output to read.

Since we retry in all cases, I figured 'warn' was more appropriate. There are a bunch of transient
network errors that cause these to fail and then pick back up and keep working.

Since it doesn't require operator attention to get things going again, I figured error wasn't
needed.

How about stack in the debug log, in case someone wants to investigate further?


- Sean


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17129/#review32360
-----------------------------------------------------------


On Jan. 20, 2014, 7:18 p.m., Sean Busbey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17129/
> -----------------------------------------------------------
> 
> (Updated Jan. 20, 2014, 7:18 p.m.)
> 
> 
> Review request for accumulo and Eric Newton.
> 
> 
> Bugs: ACCUMULO-2213
>     https://issues.apache.org/jira/browse/ACCUMULO-2213
> 
> 
> Repository: accumulo
> 
> 
> Description
> -------
> 
> ACCUMULO-2213 Make writer recovery in the Tracer more robust.
>     
>     * Adds optional logging to TSBW closure so we can better instrument long run tests
to look for problems.
>     * Cleans up writer reseting in the TraceServer, avoids overly broad catching.
>     * tones down log levels in TraceServer to WARN because trace information is transient
and we retry everything.
> 
> 
> Diffs
> -----
> 
>   src/core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchWriter.java
e18f3f96f4e0959b975399921bb7d7958c561220 
>   src/server/src/main/java/org/apache/accumulo/server/trace/TraceServer.java 4d89e9c13f89e71e674875db32cf267e045d6131

> 
> Diff: https://reviews.apache.org/r/17129/diff/
> 
> 
> Testing
> -------
> 
> running on cluster with injected failures now
> 
> 
> Thanks,
> 
> Sean Busbey
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message