accumulo-dev mailing list archives

From dlmar...@comcast.net
Subject Re: Supporting large values
Date Tue, 27 May 2014 20:49:40 GMT
Are you seeing something similar to the error in https://issues.apache.org/jira/browse/ACCUMULO-2495?


----- Original Message -----

From: "Bill Havanki" <bhavanki@clouderagovt.com> 
To: "Accumulo Dev List" <dev@accumulo.apache.org> 
Sent: Tuesday, May 27, 2014 4:30:59 PM 
Subject: Supporting large values 

I'm trying to run a stress test where each row in a table has 100 cells, 
each with a value of 100 MB of random data. (This is using Bill Slacum's 
memory stress test tool). Despite fiddling with the cluster configuration, 
I always run out of tablet server heap space before too long. 
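
For context, each cell the test generates looks roughly like the sketch below. 
This is a minimal example against the plain Accumulo client API, not the stress 
tool's actual code; the instance name, table name, and credentials are made up: 

    import java.util.Random;

    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.client.BatchWriterConfig;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.io.Text;

    public class WriteLargeValues {
      public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("inst", "zk1:2181")
            .getConnector("root", new PasswordToken("secret"));

        // Buffer a couple of values client-side before flushing to the tservers.
        BatchWriterConfig cfg = new BatchWriterConfig().setMaxMemory(256 * 1024 * 1024L);
        BatchWriter bw = conn.createBatchWriter("stress", cfg);

        Random rand = new Random();
        Text row = new Text(new byte[] {0x10, 0x00, 0x00, 0x00}); // small (4-byte) row key

        // 100 cells in this row, each carrying 100 MB of random data.
        for (int cq = 0; cq < 100; cq++) {
          byte[] big = new byte[100 * 1024 * 1024];
          rand.nextBytes(big);
          Mutation m = new Mutation(row);
          m.put(new Text("cf"), new Text(String.format("%03d", cq)), new Value(big));
          bw.addMutation(m); // the writer flushes whenever its buffer fills
        }
        bw.close();
      }
    }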

Here are the configurations I've tried so far, with valuable guidance from 
Busbey and madrob (a rough sketch of the table-level setup follows the list): 

- native maps are enabled, tserver.memory.maps.max = 8G 
- table.compaction.minor.logs.threshold = 8 
- tserver.walog.max.size = 1G 
- Tablet server has 4G heap (-Xmx4g) 
- table is pre-split into 8 tablets (split points 0x20, 0x40, 0x60, ...), with 5 
tablet servers available 
- tserver.cache.data.size = 256M 
- tserver.cache.index.size = 40M (keys are small - 4 bytes - in this test) 
- table.scan.max.memory = 256M 
- tserver.readahead.concurrent.max = 4 (default is 16) 
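
In case anyone wants to reproduce this, here's roughly how the table-level 
pieces get applied. It's only a sketch (instance name, table name, and 
credentials are made up); the tserver.* memory and cache settings above come 
from accumulo-site.xml plus a tablet server restart, so they aren't set here: 

    import java.util.TreeSet;

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;
    import org.apache.hadoop.io.Text;

    public class SetupStressTable {
      public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("inst", "zk1:2181")
            .getConnector("root", new PasswordToken("secret"));

        if (!conn.tableOperations().exists("stress")) {
          conn.tableOperations().create("stress");
        }

        // Table-level settings live in ZooKeeper and take effect without a restart.
        conn.tableOperations().setProperty("stress", "table.compaction.minor.logs.threshold", "8");
        conn.tableOperations().setProperty("stress", "table.scan.max.memory", "256M");

        // Pre-split into 8 tablets at 0x20, 0x40, ..., 0xE0 (7 split points)
        // so the load spreads across the 5 tablet servers.
        TreeSet<Text> splits = new TreeSet<Text>();
        for (int b = 0x20; b <= 0xE0; b += 0x20) {
          splits.add(new Text(new byte[] {(byte) b}));
        }
        conn.tableOperations().addSplits("stress", splits);
      }
    }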

It's often hard to tell where the OOM error comes from, but I frequently see it 
coming from Thrift as it writes out scan results. 
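
For what it's worth, the one mitigation I can apply on the read side is to 
shrink the scanner batch size so each scan RPC carries at most one of these 
100 MB values. A sketch (again, names and credentials are made up): 

    import java.util.Map.Entry;

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.Scanner;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.Authorizations;

    public class ScanLargeValues {
      public static void main(String[] args) throws Exception {
        Connector conn = new ZooKeeperInstance("inst", "zk1:2181")
            .getConnector("root", new PasswordToken("secret"));

        Scanner scanner = conn.createScanner("stress", Authorizations.EMPTY);
        // Fetch one key/value pair per server round trip, so a single Thrift
        // response holds one ~100 MB value rather than a whole batch of them.
        scanner.setBatchSize(1);

        long total = 0;
        for (Entry<Key,Value> e : scanner) {
          total += e.getValue().getSize();
        }
        System.out.println("scanned " + total + " bytes");
      }
    }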

Does anyone have any good conventions for supporting large values? 
(Warning: I'll want to work on large keys (and tiny values) next! :) ) 

Thanks very much 
Bill 

-- 
// Bill Havanki 
// Solutions Architect, Cloudera Govt Solutions 
// 443.686.9283 

