hudi-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1248: Adding delete docs to QuickStart
Date Sat, 18 Jan 2020 21:08:41 GMT
vinothchandar commented on a change in pull request #1248: Adding delete docs to QuickStart
URL: https://github.com/apache/incubator-hudi/pull/1248#discussion_r368248651
 
 

 ##########
 File path: docs/quickstart.md
 ##########
 @@ -109,6 +109,57 @@ Notice that the save mode is now `Append`. In general, always use append mode un
 [Querying](#query) the data again will now show updated trips. Each write operation generates a new [commit](http://hudi.incubator.apache.org/concepts.html) 
 denoted by the timestamp. Look for changes in `_hoodie_commit_time`, `rider`, `driver` fields for the same `_hoodie_record_key`s in previous commit. 
 
+## Delete data {#deletes}
+Delete records for the HoodieKeys passed in. Let's first generate a new batch of inserts and delete the same. Query to verify
+that all records are deleted.
+
+```
+val inserts = convertToStringList(dataGen.generateInserts(10))
+val df = spark.read.json(spark.sparkContext.parallelize(inserts, 2))
+df.write.format("org.apache.hudi").
+    options(getQuickstartWriteConfigs).
+    option(PRECOMBINE_FIELD_OPT_KEY, "ts").
+    option(RECORDKEY_FIELD_OPT_KEY, "uuid").
+    option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
+    option(TABLE_NAME, tableName).
+    mode(Overwrite).
+    save(basePath);
+
+// Fetch the rider value for the batch of records inserted just now
+val roDeleteViewDF = spark.
+    read.
+    format("org.apache.hudi").
+    load(basePath + "/*/*/*/*")
+roDeleteViewDF.registerTempTable("hudi_ro_table")
+spark.sql("select distinct rider from hudi_ro_table").show()
+
+// replace the rider value in below query to a value from above. "rider-213" is first batch and "rider-284" is second batch.
+val ds = spark.sql("select uuid, partitionPath from hudi_ro_table where rider = 'rider-284'")
+
+// issue deletes
 
 Review comment:
   Let's move this section after the incremental query section; deletes will then conclude the flow of writing and reading nicely.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
