accumulo-user mailing list archives

From Adam Fuchs <>
Subject Re: Accumulo 1.4 Memory Issues
Date Fri, 03 Aug 2012 11:45:17 GMT

It's possible that calling flush on the BatchWriter after every add of a
small mutation could cause memory usage to spike and the java garbage
collector to fall behind. I'm not sure this is in our standard test cases.
There are two things we should do to move forwards:

1. We should try to get your application working given these constraints.
Is there a reason you're flushing the writer after every mutation? Can you
limit the calls to flush to groups of mutations? Usually, flushing too
frequently is an indicator that you're trying to do a read-modify-write
loop, and there is often a better-performing alternative. If you do need to
flush frequently (e.g. maybe you're using cells as locks) then we'll
probably have to skip to #2.

2. We should try to build a test that replicates this problem in a
simplified environment. Can you share some source code that we can use to
build that test?
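
The suggestion in point 1, limiting flush calls to groups of mutations, could be sketched roughly as below. This is only an illustration under stated assumptions: MutationSink, GroupedFlusher, and CountingSink are hypothetical names invented for this sketch, and MutationSink merely stands in for the add and flush operations of a BatchWriter. It is not the Accumulo API and has not been run against a real instance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two writer operations discussed in this
// thread (add a mutation, flush). This is NOT the Accumulo BatchWriter.
interface MutationSink<M> {
    void add(M mutation);
    void flush();
}

// Wraps a sink and flushes once per group of `groupSize` mutations
// instead of after every single add.
final class GroupedFlusher<M> implements AutoCloseable {
    private final MutationSink<M> sink;
    private final int groupSize;
    private int pending = 0; // mutations added since the last flush

    GroupedFlusher(MutationSink<M> sink, int groupSize) {
        if (groupSize < 1) {
            throw new IllegalArgumentException("groupSize must be >= 1");
        }
        this.sink = sink;
        this.groupSize = groupSize;
    }

    void add(M mutation) {
        sink.add(mutation);
        pending++;
        if (pending >= groupSize) {
            flush();
        }
    }

    // Flushes only when something is actually pending.
    void flush() {
        if (pending > 0) {
            sink.flush();
            pending = 0;
        }
    }

    // Flushes any remainder; the wrapped sink is left open for reuse.
    @Override
    public void close() {
        flush();
    }
}

// In-memory sink that records mutations and counts flush calls, so the
// effect of grouping can be observed without a cluster.
final class CountingSink implements MutationSink<String> {
    final List<String> written = new ArrayList<>();
    int flushCount = 0;

    @Override
    public void add(String mutation) {
        written.add(mutation);
    }

    @Override
    public void flush() {
        flushCount++;
    }
}
```

With a group size of 100, adding 250 mutations and then closing the wrapper results in three flush calls rather than 250. In a real client the same idea would mean keeping one writer open across several small "transactions" and flushing at group boundaries, assuming the application can tolerate that delay before the writes become visible.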

On Aug 2, 2012 1:54 PM, "Matt Parker" <> wrote:

> I'm flushing/closing the writer after every small "transaction" (i.e., a
> node gets updated or inserted, or a link record is moved
> (deleted/reinserted)).
> The client will throw the Out of Memory error, but I can restart the
> client and rerun the same set of operations again. So I would assume the
> TServer is unaffected.
> On Thu, Aug 2, 2012 at 1:49 PM, William Slacum <> wrote:
>> Is it the TServer bombing out or your client or both? How often are you
>> flushing your writer?
>> On Thu, Aug 2, 2012 at 10:34 AM, Matt Parker <> wrote:
>>> For my small test case, I'm storing some basic data in three tables:
>>> nodes - spatial index (id, list of child nodes, whether it's a leaf node)
>>> image metadata - (id, bounding box coordinates, a text string of the
>>> bounding box)
>>> link - linking table that tells which images correspond to specific
>>> nodes.
>>> The image data isn't being stored in Accumulo, yet.
>>> On Thu, Aug 2, 2012 at 1:25 PM, Marc Parisi <> wrote:
>>>> Are you using native maps? If so, are they being used?
>>>> On Thu, Aug 2, 2012 at 1:16 PM, Matt Parker <> wrote:
>>>>> I set up a single-instance Accumulo server.
>>>>> I can load 32K rows of image metadata without issue.
>>>>> I have another set of routines that build a dynamic spatial index,
>>>>> where nodes are inserted/updated/deleted over time.
>>>>> These operations are typically done one at a time, with each
>>>>> BatchWriter closed after use.
>>>>> It loads maybe a couple hundred operations, and then it dies with an
>>>>> OutOfMemory error when trying to close a batchwriter.
>>>>> I tried upping the memory settings on my client and on the tserver, but
>>>>> the results were the same.
>>>>> Outside of Accumulo, I can build the whole index in memory without any
>>>>> special JVM memory settings. I was wondering whether anyone else had
>>>>> run into a similar issue?
