From GitBox <...@apache.org>
Subject [GitHub] [spark] HeartSaVioR commented on a change in pull request #26108: [SPARK-26154][SS] Streaming left/right outer join should not return outer nulls for already matched rows
Date Mon, 14 Oct 2019 08:06:00 GMT
HeartSaVioR commented on a change in pull request #26108: [SPARK-26154][SS] Streaming left/right
outer join should not return outer nulls for already matched rows
URL: https://github.com/apache/spark/pull/26108#discussion_r334363117
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala
 ##########
 @@ -270,20 +279,30 @@ case class StreamingSymmetricHashJoinExec(
         // * Getting an iterator over the rows that have aged out on the left side. These rows are
         //   candidates for being null joined. Note that to avoid doing two passes, this iterator
         //   removes the rows from the state manager as they're processed.
-        // * Checking whether the current row matches a key in the right side state, and that key
-        //   has any value which satisfies the filter function when joined. If it doesn't,
-        //   we know we can join with null, since there was never (including this batch) a match
-        //   within the watermark period. If it does, there must have been a match at some point, so
-        //   we know we can't join with null.
+        // * (state format version 1) Checking whether the current row matches a key in the
 
 Review comment:
   I've left the original comment as it is, since it still applies to state format version 1,
and added a comment explaining state format version 2 and the reason for the change.
Please let me know if we'd rather just remove the explanation of state format version 1.
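
For readers of the archive, the version 1 check described in the removed comment can be sketched in a few lines. This is a simplified illustration, not code from the PR: `Row`, `StoredRow`, `nullJoinedV1`, and `nullJoinedV2` are hypothetical names, and the per-row `matched` flag used for version 2 is an assumption, since the excerpt above does not show how version 2 works.

```scala
object OuterNullSketch {
  // Hypothetical stand-ins for join rows; not Spark's actual classes.
  final case class Row(key: Int, value: String)
  // Assumption: version 2 keeps a "matched" flag next to each stored row.
  final case class StoredRow(row: Row, matched: Boolean)

  // Version-1-style check: an aged-out left row is null joined unless some row
  // still present in the right-side state has the same key and passes the filter.
  def nullJoinedV1(
      agedOutLeft: Seq[Row],
      rightState: Seq[Row],
      joinFilter: (Row, Row) => Boolean): Seq[Row] =
    agedOutLeft.filterNot { l =>
      rightState.exists(r => r.key == l.key && joinFilter(l, r))
    }

  // Version-2-style check (assumed): rely on the stored flag, so a match from an
  // earlier batch is remembered even if the right-side row is no longer in state.
  def nullJoinedV2(agedOutLeft: Seq[StoredRow]): Seq[Row] =
    agedOutLeft.filterNot(_.matched).map(_.row)

  def main(args: Array[String]): Unit = {
    val left = Row(1, "L")
    val always: (Row, Row) => Boolean = (_, _) => true
    // The left row matched earlier, but the matching right row has since been evicted.
    println(nullJoinedV1(Seq(left), Seq.empty, always))          // left row is still reported
    println(nullJoinedV2(Seq(StoredRow(left, matched = true))))  // nothing is reported
  }
}
```

In this sketch the version 1 check depends on what is still in the right-side state when the left row ages out, which is how an already matched row can end up with an outer null.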
