impala-dev mailing list archives

From Jim Apple <jbap...@cloudera.com>
Subject Re: Stress tests stuck - what should I look at?
Date Thu, 17 Nov 2016 01:49:19 GMT
This is an EC2 c3.4xlarge instance with 16 cores and roughly 30GB of RAM. All
cores are mostly idle and 3.5GB of RAM is still free.

On Wed, Nov 16, 2016 at 5:42 PM, Jim Apple <jbapple@cloudera.com> wrote:
> impala.thrift-server.backend.connections-in-use 97 The number of
> active Impala Backend client connections to this Impala Daemon.
> impala.thrift-server.backend.total-connections 97 The total number of
> Impala Backend client connections made to this Impala Daemon over its
> lifetime.
> impala.thrift-server.beeswax-frontend.connections-in-use 64 The number
> of active Beeswax API connections to this Impala Daemon.
> impala.thrift-server.beeswax-frontend.total-connections 208 The total
> number of Beeswax API connections made to this Impala Daemon over its
> lifetime.
> impala.thrift-server.hiveserver2-frontend.total-connections 11 The
> total number of HiveServer2 API connections made to this Impala Daemon
> over its lifetime.
> impala.thrift-server.backend.connection-setup-queue-size 0 The number
> of connections to the Impala Backend Server that have been accepted
> and are waiting to be setup.
> impala.thrift-server.hiveserver2-frontend.connections-in-use 0 The
> number of active HiveServer2 API connections to this Impala Daemon.
>
> 0 queries in flight
> 0 waiting to be closed
>
>
> On Wed, Nov 16, 2016 at 5:32 PM, Alex Behm <alex.behm@cloudera.com> wrote:
>> The mini stress has been prone to hangs in the past due to test bugs.
>>
>> I'd recommend dumping the impala-server metrics and checking whether
>> the impala.thrift-server.beeswax-frontend.connections-in-use
>> is close to 64.
>> Then look at how many queries are actually still in flight. If there are
>> fewer than 64 queries in flight, then it's probably a test bug (because the
>> tests did not yet close their connections despite being done).
>>
>> You can grab http://localhost:25000/metrics?raw&json
>>
>> On Wed, Nov 16, 2016 at 5:29 PM, Tim Armstrong <tarmstrong@cloudera.com>
>> wrote:
>>
>>> It's probably worth looking at the debug pages to see what queries are
>>> active. Probably also worth grabbing stack traces with gdb or core dumps
>>> with gcore. If it's a hang then it's often possible to diagnose from the
>>> backtraces.
>>>
>>> Could also be worth running perf top to see where it's spending time (if
>>> anywhere).
>>>
>>> On Wed, Nov 16, 2016 at 5:19 PM, Jim Apple <jbapple@cloudera.com> wrote:
>>>
>>> > I'm running the EE tests on a machine and it seems to be stuck in the
>>> > stress tests. I have access to the machine for now, but my Jenkins
>>> > install is going to steal it from me when the job is force-timed-out
>>> > in a few hours. What should I look at now to try and understand what
>>> > is happening - in particular, what could be useful to me now but not
>>> > visible in the logs?
>>> >
>>>
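
The check Alex describes in the thread above (compare the Beeswax connections-in-use metric against the number of queries in flight) can be sketched in Python. This is only an illustrative sketch, not part of the Impala test suite: the helper names find_metric, fetch_metrics and looks_like_leaked_connections are hypothetical, the code assumes the metrics JSON contains objects with "name" and "value" keys somewhere in its tree, and the in-flight query count is assumed to be read by hand from the debug pages. The limit of 64 is the figure mentioned in the thread.

```python
import json
from urllib.request import urlopen

# Metric name as quoted in the thread.
BEESWAX_IN_USE = "impala.thrift-server.beeswax-frontend.connections-in-use"


def find_metric(node, name):
    """Search parsed JSON depth-first for a dict whose "name" equals `name`.

    Assumption: each metric is an object with "name" and "value" keys.
    Returns the metric's value, or None if no such object is found.
    """
    if isinstance(node, dict):
        if node.get("name") == name and "value" in node:
            return node["value"]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_metric(child, name)
        if found is not None:
            return found
    return None


def fetch_metrics(url="http://localhost:25000/metrics?raw&json"):
    """Download and parse the metrics page (URL taken from the thread)."""
    with urlopen(url) as resp:
        return json.load(resp)


def looks_like_leaked_connections(connections_in_use, queries_in_flight, limit=64):
    """Heuristic from the thread: the connections are all in use, yet fewer
    than `limit` queries are running, so the tests probably did not close
    their connections."""
    return connections_in_use >= limit and queries_in_flight < limit
```

With the numbers reported in this thread (64 Beeswax connections in use, 0 queries in flight), looks_like_leaked_connections(64, 0) returns True, which matches the test-bug explanation. On a live daemon, find_metric(fetch_metrics(), BEESWAX_IN_USE) would supply the first argument, provided the JSON has the assumed shape.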
