Hi all,

I need to overwrite the data in a Hive table, and I am using the following code to do so:

val df = sqlContext.sql(mySparkSqlStatement)  // mySparkSqlStatement holds my Spark SQL query string
df.count
df.write.format("orc").mode("overwrite").saveAsTable("foo")  // I also tried insertInto("foo")

"df.count" shows that there are only 452 records in the result,
but "select count(*) from foo" (run in beeline) shows that there are 716 records.

The final table contains more data than expected.
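
A minimal sketch of the comparison described above, in case it helps to reproduce it from a single Spark session instead of switching to beeline. The object name OverwriteCheck and the query against source_table are hypothetical placeholders, not part of my actual job, and the sketch assumes a Spark 2.x build with Hive support and a reachable metastore.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Hypothetical reproduction harness; the names are illustrative only.
object OverwriteCheck {
  def main(args: Array[String]): Unit = {
    // Hive support is needed so that saveAsTable writes to the Hive metastore.
    val spark = SparkSession.builder()
      .appName("overwrite-check")
      .enableHiveSupport()
      .getOrCreate()

    // Placeholder query: substitute the real Spark SQL statement here.
    val df = spark.sql("SELECT * FROM source_table")

    // count() and the write below are separate actions, so the query is
    // executed once for each of them.
    val expected = df.count()

    // Same write as in the post, with the save mode spelled as an enum.
    df.write.format("orc").mode(SaveMode.Overwrite).saveAsTable("foo")

    // Read the table back through the same session and compare the counts.
    val actual = spark.table("foo").count()
    println(s"rows in DataFrame = $expected, rows in table foo = $actual")

    spark.stop()
  }
}
```

If the two numbers printed here agree while beeline still reports a different count, the difference would point at how the table is being read rather than at the write itself; that is only a guess to narrow things down.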

Does anyone know why this happens, and what the correct way is to overwrite data in a Hive table with Spark SQL?

I'm using Spark 2.2.

Thanks

Boying




   