accumulo-dev mailing list archives

From Sean Busbey <bus...@cloudera.com>
Subject Re: Supporting large values
Date Tue, 27 May 2014 20:59:07 GMT
Just to confirm, you haven't changed the default key length constraint
(1MB), nor the default limit on thrift message sizes, right?


On Tue, May 27, 2014 at 3:55 PM, Bill Havanki <bhavanki@clouderagovt.com> wrote:

> No sir. I am seeing general out of heap space messages, nothing about
> direct buffers. One specific example would be while Thrift is writing to a
> ByteArrayOutputStream to send off scan results. (I can get an exact stack
> trace - easily :} - if it would be helpful.) It seems as if there just
> isn't enough heap left, after controlling for what I have so far.
>
> As a clarification of my original email: each row has 100 cells, and each
> cell has a 100 MB value. So, one row would occupy just over 10 GB.
>
>
> On Tue, May 27, 2014 at 4:49 PM, <dlmarion@comcast.net> wrote:
>
> > Are you seeing something similar to the error in
> > https://issues.apache.org/jira/browse/ACCUMULO-2495?
> >
> > ----- Original Message -----
> >
> > From: "Bill Havanki" <bhavanki@clouderagovt.com>
> > To: "Accumulo Dev List" <dev@accumulo.apache.org>
> > Sent: Tuesday, May 27, 2014 4:30:59 PM
> > Subject: Supporting large values
> >
> > I'm trying to run a stress test where each row in a table has 100 cells,
> > each with a value of 100 MB of random data. (This is using Bill Slacum's
> > memory stress test tool). Despite fiddling with the cluster configuration,
> > I always run out of tablet server heap space before too long.
> >
> > Here are the configurations I've tried so far, with valuable guidance from
> > Busbey and madrob:
> >
> > - native maps are enabled, tserver.memory.maps.max = 8G
> > - table.compaction.minor.logs.threshold = 8
> > - tserver.walog.max.size = 1G
> > - Tablet server has 4G heap (-Xmx4g)
> > - table is pre-split into 8 tablets (split points 0x20, 0x40, 0x60, ...),
> >   5 tablet servers are available
> > - tserver.cache.data.size = 256M
> > - tserver.cache.index.size = 40M (keys are small - 4 bytes - in this test)
> > - table.scan.max.memory = 256M
> > - tserver.readahead.concurrent.max = 4 (default is 16)
> >
> > It's often hard to tell where the OOM error comes from, but I have seen it
> > frequently coming from Thrift as it is writing out scan results.
> >
> > Does anyone have any good conventions for supporting large values?
> > (Warning: I'll want to work on large keys (and tiny values) next! :) )
> >
> > Thanks very much
> > Bill
> >
> > --
> > // Bill Havanki
> > // Solutions Architect, Cloudera Govt Solutions
> > // 443.686.9283
> >
> >
>
>
> --
> // Bill Havanki
> // Solutions Architect, Cloudera Govt Solutions
> // 443.686.9283
>



-- 
Sean
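
The settings quoted above can be turned into a rough heap budget. The sketch below is a back-of-envelope estimate only, not something posted in the thread. It assumes that a scan batch keeps accepting whole values until it reaches table.scan.max.memory, that Thrift serialization into a ByteArrayOutputStream briefly holds a second copy of the batch, that up to tserver.readahead.concurrent.max batches are in flight at once, and that "M" and "MB" both mean MiB. The function name scan_heap_estimate and its parameters are hypothetical.

```python
MIB = 1024 * 1024


def scan_heap_estimate(value_bytes, scan_max_memory, concurrent_scans, copies=2):
    """Rough upper bound on heap held by in-flight scan batches.

    Assumption: a batch accepts whole values until it reaches
    scan_max_memory, and serialization holds `copies` copies of it.
    """
    # Whole values needed to reach or pass the batch limit (ceiling division).
    values_per_batch = -(-scan_max_memory // value_bytes)
    batch_bytes = values_per_batch * value_bytes
    return batch_bytes * copies * concurrent_scans


heap = 4096 * MIB            # -Xmx4g
caches = (256 + 40) * MIB    # tserver.cache.data.size + tserver.cache.index.size
scans = scan_heap_estimate(
    value_bytes=100 * MIB,       # one cell value
    scan_max_memory=256 * MIB,   # table.scan.max.memory
    concurrent_scans=4,          # tserver.readahead.concurrent.max
)

print(f"caches:       {caches // MIB} MiB")
print(f"scan batches: {scans // MIB} MiB")
print(f"left over:    {(heap - caches - scans) // MIB} MiB of {heap // MIB} MiB")
```

Under these assumptions the scan path alone could pin about 2400 MiB of a 4096 MiB heap, which would be consistent with the OOM showing up while Thrift writes out scan results. The real figure depends on how the tablet server actually batches and copies data.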
