impala-user mailing list archives

From Tim Armstrong <tarmstr...@cloudera.com>
Subject Re: Bottleneck
Date Fri, 01 Sep 2017 16:41:51 GMT
Hi Alexander,
  It's hard to know based on the information available. Query profiles
often provide some clues here. I agree Impala would be able to max out one
of the resources in most circumstances.

On Impala 2.8 and earlier we saw behaviour similar to what you described
when running queries with selective scans on machines with many cores:
https://issues.apache.org/jira/browse/IMPALA-4923 . The bottleneck there
was lock contention during memory allocation - the threads spent a lot of
time asleep waiting to get a shared lock.
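
To illustrate why a contended lock can cap throughput while CPU, memory, disk
and network all stay well below their limits, here is a minimal sketch in
Python. It is a toy model, not Impala code: the function name
`throughput_bound`, its parameters and all the numbers are invented for
illustration, and it assumes a single shared lock that every query holds for a
fixed amount of time.

```python
def throughput_bound(threads: int, total_s: float, locked_s: float) -> float:
    """Upper bound on completed queries per second in a toy model.

    threads  -- number of concurrent worker threads (hypothetical)
    total_s  -- wall time one query needs when nothing else is running
    locked_s -- portion of total_s spent holding one shared lock
    """
    if threads < 1 or total_s <= 0 or not 0 <= locked_s <= total_s:
        raise ValueError("invalid parameters")

    # With no contention, each thread finishes one query every total_s seconds.
    parallel_limit = threads / total_s
    if locked_s == 0:
        return parallel_limit

    # Only one thread can hold the lock at a time, so at most one query
    # can pass through the locked section every locked_s seconds.
    lock_limit = 1.0 / locked_s
    return min(parallel_limit, lock_limit)


# Made-up numbers: 3 s per query, 0.75 s of that under the shared lock.
for n in (1, 4, 16, 64, 256):
    qpm = 60 * throughput_bound(n, total_s=3.0, locked_s=0.75)
    print(f"{n:>3} threads -> at most {qpm:.0f} queries/min")
```

In this model the bound stops growing once the lock is saturated, and the
extra threads spend their time asleep waiting for it, so adding concurrency
raises neither throughput nor resource utilisation. Whether that is what is
happening on a given cluster can only be confirmed from query profiles or
thread stacks.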

On Fri, Sep 1, 2017 at 8:36 AM, Alexander Shoshin <
Alexander_Shoshin@epam.com> wrote:

> Hi,
>
> I am working with Impala trying to find its maximum throughput on my
> hardware. I have a cluster under Cloudera Manager which consists of 7
> machines (1 master node + 6 worker nodes).
>
> I am running queries on Impala using JDBC. The maximum throughput I have
> reached is 80 finished queries per minute. It doesn’t increase no matter
> how many hundreds of concurrent queries I send. The strange thing is that
> none of the resources (memory, CPU, disk read/write, network send/receive)
> has reached its maximum. Each is less than half utilized.
>
> Could you suggest what the bottleneck might be? Could it be some Impala
> setting that limits performance or the maximum number of concurrent
> threads? The mem_limit option for my Impala daemons is about 70% of
> available machine memory.
>
> Thanks,
>
> Alexander
>
