hive-user mailing list archives

From Gopal Vijayaraghavan <gop...@apache.org>
Subject Re: Hive transactional table with delta files, Spark cannot read and sends error
Date Mon, 01 Aug 2016 21:53:38 GMT


> Spark fails reading this table. What options do I have here?

Would your issue be the same as
https://issues.apache.org/jira/browse/SPARK-13129?


The LLAPContext in Spark can read those tables with ACID semantics (i.e.,
deletes and updates will be reflected correctly).

// hs2_url is the HiveServer2 (LLAP) endpoint
val conn = LlapContext.newInstance(sc, hs2_url);
val df: DataFrame = conn.sql("select * from payees").persist();

Please be aware that this runs entirely in auto-commit mode, so you will
be getting lazy snapshot isolation (hence, persist() is a good idea - it
pins one consistent snapshot of the data).

Although "payees" is just a placeholder, this approach is intended for
tables like it that have multiple consumers. The practical reason to use
this pathway is to apply masking/filtering specific to the accessing user
(e.g., hide amounts entirely, or report them only as coarse ranges like
0-99, 99-999, etc. instead of actual values for compliance audits) without
creating complete copies of the table.
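To make the range-masking idea concrete, here is a minimal, hypothetical sketch of bucketing amounts into coarse ranges (the function name and bucket boundaries are illustrative assumptions, not part of any Hive/LLAP API - in practice this policy would be enforced server-side, per accessing user):

```python
def mask_amount(amount: float) -> str:
    """Map an exact amount to a coarse range label, hiding the real value.

    Bucket boundaries here are illustrative only; a real deployment would
    define them in the masking policy, not in client code.
    """
    buckets = [(0, 99), (100, 999), (1000, 9999)]
    for lo, hi in buckets:
        if lo <= amount <= hi:
            return f"{lo}-{hi}"
    # Anything above the last bucket collapses into a single open-ended range.
    return f">{buckets[-1][1]}"

# An auditor sees only the range, never the exact payee amount.
print(mask_amount(42))     # -> 0-99
print(mask_amount(2500))   # -> 1000-9999
```

The point is that every consumer queries the same table, and only the masked view differs per user - no per-audience copy of the data is needed.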

Cheers,
Gopal


