kudu-commits mailing list archives

From granthe...@apache.org
Subject [kudu] 03/03: [backup] Use upsert on restore
Date Fri, 25 Jan 2019 23:16:11 GMT
This is an automated email from the ASF dual-hosted git repository.

granthenke pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit 628322523420c6024f911357dec33626af9a915a
Author: Grant Henke <granthenke@apache.org>
AuthorDate: Thu Jan 24 08:30:20 2019 -0600

    [backup] Use upsert on restore
    
    Changes the restore job to use upserts instead of
    inserts so that Spark task retries do not fail when
    a key already exists.
    
    Change-Id: I9905fe4301db8c06fe4f18318ccb04b4179a1601
    Reviewed-on: http://gerrit.cloudera.org:8080/12268
    Reviewed-by: Adar Dembo <adar@cloudera.com>
    Tested-by: Kudu Jenkins
---
 .../src/main/scala/org/apache/kudu/backup/KuduRestore.scala           | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala b/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
index 695d704..9a0173c 100644
--- a/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
+++ b/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
@@ -70,7 +70,9 @@ object KuduRestore {
      val writeOptions = new KuduWriteOptions(ignoreDuplicateRowErrors = false, ignoreNull = false)
       // TODO: Use client directly for more control?
       // (session timeout, consistency mode, flush interval, mutation buffer space)
-      context.insertRows(df, restoreName, writeOptions)
+
+      // Upsert so that Spark task retries do not fail.
+      context.upsertRows(df, restoreName, writeOptions)
     }
   }
 

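The reasoning in the commit message (a retried Spark task rewrites rows that an earlier attempt already wrote) can be illustrated with a small, self-contained sketch. It does not use the Kudu client or kudu-spark; `ToyTable`, `RetryDemo`, `insertBatch`, and `upsertBatch` are hypothetical names, and the in-memory map only models the assumed semantics: an insert rejects an existing key, while an upsert overwrites it.

```scala
import scala.collection.mutable

// Hypothetical in-memory stand-in for a table keyed by primary key.
// This is NOT the Kudu API; it only models the two write semantics.
final class ToyTable {
  private val rows = mutable.Map.empty[Int, String]

  // Insert: rejects a row whose key is already present.
  def insert(key: Int, value: String): Either[String, Unit] =
    if (rows.contains(key)) Left(s"key $key already exists")
    else {
      rows.update(key, value)
      Right(())
    }

  // Upsert: writes the row, overwriting any existing row with the same key.
  def upsert(key: Int, value: String): Unit =
    rows.update(key, value)

  // Immutable copy of the current contents.
  def snapshot: Map[Int, String] = rows.toMap
}

object RetryDemo {
  // The rows one (hypothetical) task is responsible for writing.
  val batch: Seq[(Int, String)] = Seq(1 -> "a", 2 -> "b")

  // Writes the batch with inserts and returns any per-row errors.
  def insertBatch(table: ToyTable): Seq[String] =
    batch.flatMap { case (k, v) => table.insert(k, v).swap.toOption }

  // Writes the batch with upserts; repeating the call leaves the table unchanged.
  def upsertBatch(table: ToyTable): Unit =
    batch.foreach { case (k, v) => table.upsert(k, v) }
}
```

Calling `RetryDemo.insertBatch` twice on the same `ToyTable` returns one error per row on the second call, which is the failure mode the commit describes. Calling `RetryDemo.upsertBatch` twice leaves the same two rows in place. In this model the retry is safe only because it rewrites identical data; an upsert would also silently overwrite a row that differed.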
