hive-user mailing list archives

From: Dongwon Kim
Subject: Re: Experimental results using TPC-DS (versus Spark and Presto)
Date: Mon, 30 Jan 2017 23:50:35 GMT

Goun: I used ORC just so that all the engines read the same data, and
I usually store data in ORC anyway. I know that this can bias the
results in favor of Hive. I also ran the Spark experiments with
Parquet, and Spark does work better with Parquet, as is commonly
believed (those results are not included, though).

Goden: Oops, it is 128GB of main memory for the master and all the
slaves, of course, since I'm using 80GB on each node.

Gopal: The output of (yarn logs -application $APPID) doesn't contain
any line containing HISTORY, so it doesn't produce an svg file. Should
I turn on some option to get the lines containing HISTORY in the yarn
application logs?
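
For reference, the check described above can be sketched as a small
filter over saved log output. This is only a minimal sketch: the
function name history_lines is hypothetical, and it assumes the output
of (yarn logs -application $APPID) has already been redirected to a
text file. It only reports whether any line contains the substring
HISTORY; it does not produce the diagram.

```python
def history_lines(lines):
    """Return the lines that contain the substring 'HISTORY'.

    `lines` is any iterable of strings, for example an open text file
    holding saved `yarn logs -application $APPID` output (assumed).
    """
    return [line.rstrip("\n") for line in lines if "HISTORY" in line]


# Example usage on a saved log file (the file name is hypothetical):
#
#     with open("app.log", encoding="utf-8", errors="replace") as f:
#         found = history_lines(f)
#     print(len(found), "line(s) containing HISTORY")
```

If the returned list is empty, the logs contain nothing for the
diagram to be built from, which matches the symptom described above.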

2017-01-31 4:47 GMT+09:00 Goden Yao:
> was the master 128MB or 128GB memory?
> On Mon, Jan 30, 2017 at 3:24 AM Gopal Vijayaraghavan wrote:
>> > Hive LLAP shows better performance than Presto and Spark for most
>> > queries, but it shows very poor performance on the execution of query 72.
>> My suspicion will be the inventory x catalog_sales x warehouse join -
>> assuming the column statistics are present and valid.
>> If you could send the explain formatted plans and swimlanes for LLAP, I
>> can probably debug this better.
>> Use the "submitted to <appid>" in this to get the diagram.
>> Cheers,
>> Gopal
> --
> Goden
