impala-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Piyush Narang <>
Subject Re: Debugging slow Impala hdfs-scans
Date Fri, 08 Dec 2017 21:08:47 GMT
Yeah you’re right, the day filter as of now doesn’t really filter any data.
I tried bumping up a couple of settings, “num_remote_hdfs_io_threads” like you mentioned
as well as, “read_size”. I pushed up io_thread value from 8 to 32, 128 and 256 and I do
see a notable improvement in the query runtime (up to 128). It went from ~1800s for 8 to 1055s
for 32 to 907s for 128. Seems to plateau beyond that. I tried bumping up the read_size to
32MB (default is close to ~8MB) but that didn’t improve things too much.

Do let me know if there are any other settings you can think of that I could try. I’ll try
experimenting with some smaller partitions (where the filter criteria actually is of use ☺)
and digging into the code more.


-- Piyush

From: Tim Armstrong <>
Reply-To: "" <>
Date: Wednesday, December 6, 2017 at 8:00 PM
To: "" <>
Subject: Re: Debugging slow Impala hdfs-scans

Interesting, so day=2017-10-04 doesn't actually filter any data out, right?
--num_hdfs_worker_threads I believe controls the size of the thread pool for HDFS metadata
operations like moves, deletions, etc (the argument name isn't very specific)
The option that would make a difference in your case is --num_remote_hdfs_io_threads, which
controls the number of concurrent remote HDFS operations. The default is 8, which could be
too low if you are doing a lot of remote reads.

On Wed, Dec 6, 2017 at 4:09 PM, Piyush Narang <<>>
Thanks for getting back Mostafa. IO rates while Presto was running seemed to be higher (60-125
MB/s). If it was a slowdown due to GC / HDFS, I’d imagine some of the Presto runs would
be affected but that doesn’t seem to be the case. We’ve run the query on a few occasions
on the same day (fairly close to each other) and also on different days to reduce the likelihood
of the fs cache and while we do see subsequent runs being a little quicker on Presto &
Impala, it always seems like the Impala run is much slower by 2x – 3x. I’ll check out
the GC metrics on our NameNode to be double check and get back with that.

The number of partitions & files is an interesting dimension. The data is partitioned
in this fashion (currently only have one day’s worth of data):
2 * 1 * 24 * 3 * 2 * 2 = 576

Looking at the number of files we end up touching as part of the day=2017-10-04 query, it
is: ~15K. Do you think this is on the high end given we have only 4 machines (with 48 cores
each?). I noticed that there’s a start-up Impala option “-num_hdfs_worker_threads”.
I didn’t find much documentation on this, but wondering if bumping this up would potentially
help? (Could try this out tomorrow).


-- Piyush

From: Mostafa Mokhtar <<>>
Reply-To: "<>" <<>>
Date: Wednesday, December 6, 2017 at 5:51 PM
To: "<>" <<>>
Subject: Re: Debugging slow Impala hdfs-scans

The query profile shows that a lot of time is spent in "- TotalStorageWaitTime: 20h4m", usually
this is an indicator that Impala is waiting on IO from HDFS.

Number of files can also be an issue, I recommend checking GC time for the HDFS NameNode.

The filter is not selective so it is interesting to see that Presto is running the same query
faster, did you observe IO rates while Presto was running or the data was read from the file
system cache?

On Wed, Dec 6, 2017 at 2:01 PM, Piyush Narang <<>>
Hi folks,

I’m trying to debug why the performance of our new Impala setup is performing a bit worse
on the same queries as Presto. We’re running Impala 2.10.0 and starting it up with 225G
of memory (-mem-limit). It runs on the same set of nodes as Presto (4 nodes, 48 cores each,
10G ethernet). We noticed a couple of queries in Impala were fairly slow in the hdfs-scan
stage so we tried to isolate the behavior with a slightly simpler query:
select max(hour + nb_display) from my_large_parquet_table where day = '2017-10-04';

This query takes around 28-30 mins on Impala and seems to run in around 8.5 mins on Presto.
The input table is 11.64TB and uses Parquet (snappy compressed).

Looking at the performance profile (attached in case anyone’s interested), I noticed that
during the query Impala seems to be averaging around 1.1 - 1.2 MB/s per thread and the total
read throughput is around 2.13 MB/s. I tried bumping up the number of scanner threads (from
48 to 96) but that seemed to only help marginally (improved runtimes by ~10s). Running “sar
–n DEV 1” on some of our hosts while the query is running, seems to show that Impala is
reading at a rate of 30-50 MB/s (whereas we see this go up to 60-125 MB/s on our other runs).

We haven’t tweaked our Impala setup much beyond the defaults that come out of the box. I’m
wondering if I’m missing some tuning settings that help improve read rates from HDFS when
Impala is running outside of the datanodes. If anyone has any ideas / suggestions, they’d
be welcome. I am happy to provide more details if needed as well.


-- Piyush

View raw message