impala-user mailing list archives

From Fawze Abujaber <>
Subject Adding impala daemons on servers without local HDFS storage
Date Tue, 03 Apr 2018 17:22:40 GMT
Hi All,

My cluster has reached the point where I don't need more HDFS storage but
do need more processing power. I'm using YARN, Spark and Impala on the
normal nodes for processing.

My questions:

1- How much will data locality impact Impala performance? As far as I know,
Impala relies on data locality for its processing.

2- I have an OS disk with 600GB. Will this be enough for spilling to disk
when needed, or does that depend on other factors? The Impala daemon
memory limit is 35GB.

3- Should I disable *HDFS Short Circuit Read* on these nodes?

I will be happy to get more recommendations on this.

Take Care
Fawze Abujaber
