kudu-user mailing list archives

From Adar Lieber-Dembo <a...@cloudera.com>
Subject Re: clarification on Partitioning Guidelines and CPU cores
Date Thu, 18 Oct 2018 00:06:51 GMT
Hi Boris,

> Also, when they say tablets - I assume this is before replication? so in reality, it
> is number of nodes x cpu cores / replication factor? If this is the case, it is not looking
> good...

No, I think this is post-replication. The underlying assumption is
that you want to maximize parallelism for large tables, and since
Impala only uses one read thread per tablet, that means ensuring the
number of tablets is close to or equal to the overall number of cores.
However, during a scan Impala will choose one of the tablet's replicas
to read from, so you don't need to "reserve" a core for the other
replicas.
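
To make the arithmetic concrete, here is a rough sketch of the two readings. The function names and the example cluster (10 nodes, 16 cores each, replication factor 3) are made up for illustration and do not come from the Kudu docs:

```python
def tablets_matching_cores(num_nodes, cores_per_node):
    """Reading described above: aim for roughly one tablet per core
    in the cluster, with no division by the replication factor,
    since a scan reads only one replica of each tablet."""
    return num_nodes * cores_per_node


def tablets_divided_by_replication(num_nodes, cores_per_node, replication_factor):
    """Reading from the question: reserve a core for every replica,
    so the tablet count shrinks by the replication factor."""
    return (num_nodes * cores_per_node) // replication_factor


# Hypothetical cluster: 10 nodes, 16 cores per node, replication factor 3.
print(tablets_matching_cores(10, 16))             # 160
print(tablets_divided_by_replication(10, 16, 3))  # 53
```

Under the first reading the target stays at about 160 tablets for a large table on that hypothetical cluster, rather than dropping to about 53.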

>> can someone clarify if this recommendation below - does it mean physical or hyper-threaded
>> CPU cores? quite a big difference...

I think this refers to hyper-threaded CPU cores (i.e. a CPU unit
capable of executing an OS thread). But I'd be curious to hear if your
workload is substantially more or less performant either way.
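
If it helps, one quick way to see the logical (hyper-threaded) core count on a node is Python's standard `os.cpu_count()`, which reports logical CPUs rather than physical cores. This is only a sketch; the helper name is made up, and it says nothing about what Kudu or Impala detect themselves:

```python
import os


def logical_cores_on_this_host():
    """Return the number of logical CPUs the OS reports.
    os.cpu_count() may return None if the count cannot be determined,
    so fall back to 1 in that case."""
    return os.cpu_count() or 1


print(logical_cores_on_this_host())
```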
