accumulo-dev mailing list archives

From Keith Turner <>
Subject Re: Accumulo v1.4.1 - ran out of memory and lost data
Date Wed, 30 Jan 2013 16:30:34 GMT
Was this resolved?

On Mon, Jan 28, 2013 at 8:28 AM, David Medinets
<> wrote:
> I had a plain Java program, single-threaded, that read an HDFS
> Sequence File with fairly small Sqoop records (probably under 200
> bytes each). As each record was read a Mutation was created, then
> written via Batch Writer to Accumulo. This program was as simple as it
> gets. Read a record, Write a mutation. The Row Id used YYYYMMDD (a
> date) so the ingest targeted one tablet. The ingest rate was over 150
> million entries per hour for about 19 hours. Everything seemed fine. Over 3.5
> billion entries were written. Then the nodes ran out of memory and
> Accumulo nodes went dead. 90% of the server was lost. And data poofed
> out of existence. Only 800M entries are visible now.
> We restarted the data node processes and the cluster has been running
> garbage collection for over 2 days.
> I did not expect this simple approach to cause an issue. From looking
> at the log files, I think that at least two compactions were being run
> while still ingesting those 176 million entries per hour. The hold
> times started rising and eventually the system simply ran out of
> memory. I have no certainty about this explanation though.
> My current thinking is to re-initialize Accumulo and find some way to
> programmatically monitor the hold time. Then add a delay to the
> ingest process whenever the hold time rises over 30 seconds. Does that
> sound feasible?
> I know there are other approaches to ingest and I might give up this
> method and use another. I was trying to get some kind of baseline for
> analysis reasons with this approach.
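
The throttling idea in the quoted message (pause the ingest whenever the hold time rises over 30 seconds) could be sketched roughly as below. This is only an illustration under stated assumptions: the class HoldTimeThrottle and its holdTimeMillis supplier are hypothetical names, and how the hold time is actually read from Accumulo is deliberately left out. Only the pause-while-above-threshold logic is shown.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper: blocks the ingest thread while the reported
 * hold time is above a threshold. The supplier is a placeholder for
 * whatever mechanism reports the tablet server hold time.
 */
public class HoldTimeThrottle {
    private final LongSupplier holdTimeMillis; // assumed source of hold time, in ms
    private final long thresholdMillis;        // e.g. 30_000 for 30 seconds
    private final long pauseMillis;            // how long to sleep between checks

    public HoldTimeThrottle(LongSupplier holdTimeMillis,
                            long thresholdMillis,
                            long pauseMillis) {
        this.holdTimeMillis = holdTimeMillis;
        this.thresholdMillis = thresholdMillis;
        this.pauseMillis = pauseMillis;
    }

    /**
     * Sleeps in pauseMillis steps until the hold time is at or below
     * the threshold. Returns the number of pauses taken. If the thread
     * is interrupted, the interrupt flag is restored and the method
     * returns early.
     */
    public int awaitBelowThreshold() {
        int pauses = 0;
        while (holdTimeMillis.getAsLong() > thresholdMillis) {
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return pauses;
            }
            pauses++;
        }
        return pauses;
    }
}
```

In the read-a-record, write-a-mutation loop described above, the ingest program would call awaitBelowThreshold() before handing each mutation (or each batch of mutations) to the Batch Writer. Checking once per batch rather than once per record keeps the monitoring overhead low.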
