accumulo-user mailing list archives

From Josh Elser <>
Subject Re: increase "running scans" in monitor?
Date Tue, 02 Apr 2013 15:06:55 GMT
Hi Marc,

How many tablets are in the table you're running MR over (see the 
monitor)? Might adding some more splits to your table (`addsplits` in 
the Accumulo shell) get you better parallelism?
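For illustration, here are a few ways to add splits from the Accumulo shell (the table name and split points below are hypothetical — pick boundaries that match your actual row keys):

```shell
# Add explicit split points to a table named 'mytable'
root@instance> addsplits -t mytable row_a row_m row_t

# Or load split points from a file, one per line
root@instance> addsplits -t mytable -sf /tmp/splits.txt

# Check how the table is split afterwards
root@instance> getsplits -t mytable
```

With N map slots you generally want at least N tablets spread evenly across the row-key space, since AccumuloRowInputFormat produces roughly one map task per tablet.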

What does your data look like in your table? Lots of small rows? Few 
very large rows?

On 4/2/13 10:56 AM, Marc Reichman wrote:
> Hello,
> I am running an Accumulo-based MR job using the AccumuloRowInputFormat 
> on 1.4.1. The config is more-or-less default, using the native-standalone 
> 3GB template, but with the TServer memory put up to 2GB from its 
> default. accumulo-site.xml has 
> tserver.memory.maps.max at 1G, at 50M, and 
> tserver.cache.index.size at 512M.
> My tables are created with maxversions for all three types (scan, 
> minc, majc) at 1 and compress type as gz.
> I am finding, on an 8-node test cluster with 64 map task slots, that 
> when a job is running, the 'Running Scans' count in the monitor 
> averages roughly 0-4 per tablet server. In the table view, this puts 
> the running scans anywhere from 4-24 on average. I would expect/hope 
> the scan count to be somewhere close to the map task count. To me, 
> this means one of the following:
> 1. There is a configuration setting inhibiting the number of scans 
> from accumulating (excuse the pun) to roughly the same count as my 
> map tasks
> 2. My map task job is cpu-intensive enough to introduce delays between 
> scans and everything is fine
> 3. Some combination of 1/2.
> On an alternate cluster, 40 nodes with 320 task slots, we haven't seen 
> anywhere near full-capacity scanning with map tasks that have the 
> same performance, and the problem seems much worse.
> I am experimenting with some of the readahead configuration variables 
> for the tablet servers in the meantime, but haven't found any smoking 
> guns yet.
> Thank you,
> Marc
> -- 
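For reference, the readahead tuning Marc mentions can be inspected and changed from the shell. This is a sketch assuming the 1.4 `tserver.*` property names (the value shown is illustrative, not a recommendation — confirm the property name and default against your version's documentation):

```shell
# Show current readahead-related properties and their defaults
root@instance> config -f readahead

# Raise the cap on concurrent readahead operations per tablet server
# (illustrative value; takes effect without a tserver restart)
root@instance> config -s tserver.readahead.concurrent.max=32
```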
