accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: BatchWriter woes
Date Fri, 20 Jun 2014 14:32:56 GMT
On Thu, Jun 19, 2014 at 11:57 PM, William Slacum <wilhelm.von.cloud@accumulo.net> wrote:

> I'm finding some ingest jobs I have running in a bit of a sticky sitch:
>
> I have a MapReduce job that reads a table, transforms the entries, creates
> an inverted index, and writes out mutations to two tables. The cluster size
> is in the tens of nodes, and I usually have 32 mappers running.
>
> The batch writer configs are:
> - memory buffer: 128MB
> - max latency: 5 minutes
> - threads: 32
> - timeout: default Long.MAX_VALUE
>
> I know we're on Accumulo 1.5.0 and I believe using CDH 4.5.0, Zookeeper
> 3.3.6.
>
> I'm noticing an ingest pattern of usually ok rates for the cluster (in the
> 100K+ entries per second), but after some time they start to drop off to
> ~10K E/s. Sometimes this happens when a round of compactions kicks off
> (usually major, not minor), sometimes not. Eventually, the mappers will
> time out. We have them set to time out after 10 minutes of not reporting
> status.
>
> I added a bit of probing/profiling, and noticed that there's an
> exponential growth in per-entry processing time in the mapper. The entries
> are of pretty uniform size, so there should not be much variance in the
> times. The times go from single milliseconds, to hundreds of milliseconds,
> to seconds, to minutes.
>
> If I jstack a mapper, it's sitting in TabletServerBatchWriter#waitRTE. It
> should only enter that method if the batch writer has (a) too much data
> buffered or (b) the user requested a flush. I'm inferring that (a) is the
> case, because there is no explicit TabletServerBatchWriter#flush() call.
>
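
For what it's worth, case (a) can be pictured with a toy model. This is only
a sketch of the idea, not the TabletServerBatchWriter code: the class and
method names are made up, and offer() returns false where the real writer
would make the caller wait. If the send threads stop draining the buffer,
every later add has to wait.

```java
/** Toy model of a bounded mutation buffer (hypothetical, not Accumulo code). */
public class BufferBackpressureSketch {
    private final long maxMemory; // e.g. 128MB in the configuration quoted above
    private long buffered;        // bytes currently waiting to be sent

    public BufferBackpressureSketch(long maxMemory) {
        this.maxMemory = maxMemory;
    }

    /** Returns true if the mutation fits; false means the caller would have to wait. */
    public boolean offer(long size) {
        if (buffered + size > maxMemory) {
            return false;
        }
        buffered += size;
        return true;
    }

    /** Called when a send succeeds and frees buffer space. */
    public void drained(long size) {
        buffered = Math.max(0, buffered - size);
    }

    public static void main(String[] args) {
        BufferBackpressureSketch buffer = new BufferBackpressureSketch(100);
        System.out.println(buffer.offer(60)); // fits in the empty buffer
        System.out.println(buffer.offer(60)); // does not fit until something drains
        buffer.drained(60);
        System.out.println(buffer.offer(60)); // fits again after the drain
    }
}
```
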
> We did notice that there was a send thread trying to send to a dead
> server. We can't ssh to the IP it was trying to send to, and have verified
> manually that it's not listed in the current tablet servers. We did notice
> that the master log is reporting that a recovery on a WAL associated with
> that IP is under way. Looking back, the master had been reporting that
> message for about a day and a half. The message was similar to the one
> described in https://issues.apache.org/jira/browse/ACCUMULO-1364 . I do
> not know the significance of this as it relates to my jobs.
>

Do you think it's trying to write to a half-dead server?  Does that server
still have locations in the metadata table?


>
> I did some digging in TabletServerBatchWriter, and the only thing I can
> kind of see happening is that if SendTask#sendMutationsToTabletServer
> receives a TException, it rethrows it as an IOException, then SendTask#send
> will catch that exception and add the mutations to the failures collection.
> Since the timeout is Long.MAX_VALUE, I think it's possible this loop can
> continue forever or until some outside force kills the entire process.
>
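
To illustrate why the timeout matters for a loop like the one described, here
is a small, self-contained model. It is not the real SendTask code: Sender,
Outcome, and retry are hypothetical names, and time is simulated rather than
measured. With a timeout of Long.MAX_VALUE and a server that never accepts
the mutation, the loop never ends on its own; with a finite timeout it gives
up.

```java
/** Toy model of a retry-until-timeout loop (hypothetical, not Accumulo code). */
public class RetryLoopSketch {
    public enum Outcome { SENT, TIMED_OUT, STILL_RETRYING }

    /** Stand-in for one send attempt; returns true if the server accepted it. */
    public interface Sender {
        boolean trySend();
    }

    /**
     * Retries a failed send until it succeeds or the simulated elapsed time
     * reaches timeoutMillis. observedAttempts only bounds how long this demo
     * watches the loop; the modeled loop itself has no such bound.
     */
    public static Outcome retry(Sender sender, long timeoutMillis,
                                long attemptCostMillis, int observedAttempts) {
        long elapsed = 0;
        for (int i = 0; i < observedAttempts; i++) {
            if (sender.trySend()) {
                return Outcome.SENT;
            }
            elapsed += attemptCostMillis; // simulated time spent on the failed attempt
            if (elapsed >= timeoutMillis) {
                return Outcome.TIMED_OUT;
            }
        }
        return Outcome.STILL_RETRYING;
    }

    public static void main(String[] args) {
        Sender deadServer = () -> false; // never accepts anything
        // Effectively unbounded timeout: still retrying after a million attempts.
        System.out.println(retry(deadServer, Long.MAX_VALUE, 1000, 1_000_000));
        // One-minute timeout with one-second attempts: gives up.
        System.out.println(retry(deadServer, 60_000, 1000, 1_000_000));
    }
}
```

If that is what is happening, a finite timeout set with
BatchWriterConfig#setTimeout should at least surface the failure to the
mapper instead of leaving it waiting.
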
> Does this seem coherent? Is there anything else that could cause this?
>
> I'm on the track of converting the code over to using bulk ingest, but I
> think there's an issue with a vanilla BatchWriter that I would just be
> getting around instead of actually fixing.
>
> Also, I'd love to provide logs, but there's a high amount of friction in
> getting them, so I won't be able to deliver on that front.
>
