impala-user mailing list archives

From Jim Apple <jbap...@cloudera.com>
Subject Re: Impala compute incremental stats and insert speed becomes slowly when the partitions and the amount of data is larger
Date Thu, 31 Mar 2016 15:55:17 GMT
bcc:impala-user@, to:user@impala.incubator.apache.org

How many columns do you have? How many impalad nodes are there? How much
memory is your catalog configured to run with?

Incremental stats are expensive to store in the catalog, and may be
expensive to distribute to the impalads.
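
To get a feel for why the column count matters, here is a rough
back-of-envelope sketch. It assumes the figure of roughly 400 bytes of
metadata per column per partition that the Impala documentation gives for
incremental stats; the column counts are hypothetical, since the table's
real schema is not known here, and the function name is made up for
illustration.

```python
# Approximate per-column, per-partition metadata cost of incremental stats.
# The ~400 byte figure is taken from the Impala docs and is only an estimate.
BYTES_PER_COLUMN_PER_PARTITION = 400


def incremental_stats_bytes(partitions: int, columns: int,
                            bytes_per_entry: int = BYTES_PER_COLUMN_PER_PARTITION) -> int:
    """Estimate the catalog memory used by incremental stats for one table."""
    return partitions * columns * bytes_per_entry


if __name__ == "__main__":
    partitions = 50_000  # partition count reported in the original question
    for columns in (10, 100, 500):  # hypothetical column counts
        size = incremental_stats_bytes(partitions, columns)
        print(f"{columns} columns: about {size / 1e9:.1f} GB")
```

Under these assumptions, a table with 100 columns and 50,000 partitions
would need on the order of 2 GB in the catalog for incremental stats alone,
and that metadata also has to reach every impalad.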

http://www.cloudera.com/documentation/enterprise/latest/topics/impala_perf_resources.html
recommends fewer than 30,000 partitions. Also, "compute stats" without
"incremental" may be worth trying.


On Thu, Mar 31, 2016 at 5:41 AM, Qinggang Wang <qinggangwang7@gmail.com>
wrote:

> Hi All,
> We have a table in Impala with about a hundred billion records and fifty
> thousand partitions. When we insert new partitions and run compute
> incremental stats, both the insert and the stats computation have become
> much slower than they were when the table had few partitions and little
> data. Each insert and each compute stats run now takes more than 80
> seconds, whereas neither took more than 2 seconds when the data was small.
> As there are 68 new partitions each day, the inserts and stats computation
> cost a lot of time. Is there any way to solve this?
>
>
> Thanks
