accumulo-user mailing list archives

From Ara Ebrahimi <ara.ebrah...@argyledata.com>
Subject Re: hdfs cpu usage
Date Mon, 09 Feb 2015 18:59:58 GMT
Nope. This is for a full table scan. The same config performs well on our on-premises cluster,
while on Google Cloud we see this weird HDFS CPU usage issue.

Ara.

> On Feb 9, 2015, at 9:31 AM, Adam Fuchs <afuchs@apache.org> wrote:
>
> Ara,
>
> What kind of query load are you generating within your batch scanners?
> Are you using an iterator that seeks around a lot? Are you grabbing
> many small batches (only a few keys per range) from the batch scanner?
> As a wild guess, this could be the result of lots of seeks with a low
> cache hit rate, which would induce CPU load in HDFS fetching blocks
> and CPU load in Accumulo decrypting/decompressing those blocks. The
> monitor page will show you seek rates and cache hit rates.
>
> Adam
>
>
> On Sat, Feb 7, 2015 at 8:48 PM, Ara Ebrahimi
> <ara.ebrahimi@argyledata.com> wrote:
>> 2.4.0.2.1.
>>
>> Yeah, it seems like I need to do that. I was hoping to get some advice based on
>> prior experience with the Google Cloud environment.
>>
>> Ara.
>>
>> On Feb 7, 2015, at 11:23 AM, Josh Elser <josh.elser@gmail.com> wrote:
>>
>> What version of Hadoop are you using?
>>
>> Have you considered hooking up a profiler to the Datanode on GCE to see
>> where the time is being spent? That might help shed some light on the
>> situation.
>>
>> Ara Ebrahimi wrote:
>>
>> Hi,
>>
>> We’re seeing some weird behavior from the HDFS daemon in the Google Cloud
>> environment when we use an Accumulo Scanner to sequentially scan a table. Top
>> reports 200-300% CPU usage for the HDFS daemon. Accumulo is also around
>> 500%. iostat %util is low, avgrq-sz is low, rMB/s is low, and there’s lots of
>> free memory. It seems like something causes the HDFS daemon to consume a lot
>> of CPU and not send enough read requests to the disk (SSD actually, so the
>> disk is super fast and vastly under-utilized). The process which sends scan
>> requests to Accumulo is 500% active (using 3 query batch threads and
>> aggressive scan-batch-size/read-ahead-threshold values). So it seems like
>> HDFS is somehow the bottleneck. On another cluster we rarely see the HDFS daemon
>> going over 10% CPU usage. Any idea what the issue could be?
>>
>> Thanks,
>> Ara.
>>
>>
>>
>> ________________________________
>>
>> This message is for the designated recipient only and may contain
>> privileged, proprietary, or otherwise confidential information. If you have
>> received it in error, please notify the sender immediately and delete the
>> original. Any other use of the e-mail by you is prohibited. Thank you in
>> advance for your cooperation.
>>
>> ________________________________