hive-user mailing list archives

From Alan Gates <alanfga...@gmail.com>
Subject Re: Spark SQL is not returning records for HIVE transactional tables on HDP
Date Mon, 14 Mar 2016 16:48:40 GMT
I don’t know why you’re seeing Hive on Spark sometimes work with transactional tables and
sometimes not, but note that in general it doesn’t work.  The Spark runtime in Hive does
not send heartbeats to the transaction/lock manager, so the manager will time out any job
that takes longer than the heartbeat interval (5 minutes by default).
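
For reference, this is roughly the loop an execution engine has to run to keep a
transaction's locks alive. A minimal sketch against the metastore client API — the
transaction/lock ids and the fixed iteration count are placeholders for illustration,
not how Hive actually structures this:

    import org.apache.hadoop.hive.conf.HiveConf
    import org.apache.hadoop.hive.metastore.HiveMetaStoreClient

    val client = new HiveMetaStoreClient(new HiveConf())
    val txnId  = client.openTxn("spark-user")  // placeholder user name
    val lockId = 0L                            // would come from client.lock(...)

    // hive.txn.timeout defaults to 300s; heartbeat well inside that window
    // for as long as the job runs (fixed count here only for illustration).
    for (_ <- 1 to 10) {
      client.heartbeat(txnId, lockId)
      Thread.sleep(60 * 1000)
    }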

Alan.

> On Mar 12, 2016, at 00:24, @Sanjiv Singh <sanjiv.is.on@gmail.com> wrote:
> 
> Hi All,
> 
> I am facing this issue on an HDP setup, where compaction is required once on each
transactional table before Spark SQL will fetch its records.
> An Apache setup, on the other hand, doesn't require compaction even once.
> 
> Maybe something gets triggered in the metastore after compaction, after which Spark SQL
starts recognizing the delta files.
>   
> Let me know if you need any other details to get to the root cause.
> 
> Here is the complete scenario to reproduce it:
> 
> hive> create table default.foo(id int) clustered by (id) into 2 buckets STORED AS ORC TBLPROPERTIES ('transactional'='true');
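> 
> (Aside: for the insert and the background compactor to work at all, the standard ACID settings must be in place; the property names below are the stock ones, shown as a reminder rather than anything HDP-specific.)
> 
> hive> SET hive.support.concurrency=true;
> hive> SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> hive> SET hive.compactor.initiator.on=true;
> hive> SET hive.compactor.worker.threads=1;
> 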
> hive> insert into default.foo values(10);
> 
> scala> sqlContext.table("default.foo").count // Gives 0, which is wrong because data is still in delta files
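> 
> At this point the table directory holds only delta files, which you can confirm from the Hive shell (the warehouse path below is a guess; adjust it for your install):
> 
> hive> dfs -ls /apps/hive/warehouse/foo;
> (expect only delta_* subdirectories and no base_* yet)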
> 
> Now run major compaction:
> 
> hive> ALTER TABLE default.foo COMPACT 'MAJOR';
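> 
> (This statement only queues the compaction request; the metastore's compactor then runs it in the background. You can confirm it has finished before re-running the count:)
> 
> hive> SHOW COMPACTIONS;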
> 
> scala> sqlContext.table("default.foo").count // Gives 1
> 
> hive> insert into foo values(20);
> 
> scala> sqlContext.table("default.foo").count // Gives 2, no compaction required.
> 
> Regards
> Sanjiv Singh
> Mob :  +091 9990-447-339

