spark-reviews mailing list archives

From BryanCutler <>
Subject [GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...
Date Tue, 23 Oct 2018 22:05:20 GMT
Github user BryanCutler commented on a diff in the pull request:
    --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
    @@ -63,7 +65,7 @@ private[spark] object PythonEvalType {
     private[spark] abstract class BasePythonRunner[IN, OUT](
         funcs: Seq[ChainedPythonFunctions],
    -    evalType: Int,
    +    evalTypes: Seq[Int],
    --- End diff --
    I'm not sure the additional complexity this adds is worth it for now, but I'm curious what others think.
    If I understand correctly, the Python worker just takes an index range for bounded windows and the entire range for unbounded ones; it doesn't really care about anything else. So couldn't you just send an index range that encompasses the entire partition for the unbounded case? Then you would only need to define `SQL_WINDOW_AGG_PANDAS_UDF` in the worker and use the same code for both, which would simplify things quite a bit.

