spark-issues mailing list archives

From "Kris Mok (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-26061) Reduce the number of unused UnsafeRowWriters created in whole-stage codegen
Date Wed, 14 Nov 2018 10:11:01 GMT
Kris Mok created SPARK-26061:
--------------------------------

             Summary: Reduce the number of unused UnsafeRowWriters created in whole-stage codegen
                 Key: SPARK-26061
                 URL: https://issues.apache.org/jira/browse/SPARK-26061
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.0, 2.3.2, 2.3.1, 2.3.0
            Reporter: Kris Mok


Reduce the number of unused UnsafeRowWriters created in whole-stage generated code.
They come from CodegenSupport.consume() calling prepareRowVar(), which uses GenerateUnsafeProjection.createCode()
and registers an UnsafeRowWriter as mutable state, regardless of whether the downstream
(parent) operator will use the rowVar.
Even when the downstream doConsume function doesn't use the rowVar (i.e. doesn't put row.code
into its codegen template), the registered UnsafeRowWriter stays there,
which bloats the init function of the generated code.
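
To make the mechanism concrete, here is a toy model in Python, not Spark's actual Scala code. The class ToyCodegenContext and the functions prepare_row_var, consume and parent_ignoring_row are simplified stand-ins invented for illustration; only the behaviour described above (the writer is registered before the parent decides whether to use the row) is taken from this ticket.

```python
class ToyCodegenContext:
    """Hypothetical stand-in for a codegen context: it only records mutable states."""

    def __init__(self):
        self.mutable_states = []

    def add_mutable_state(self, java_type, name):
        # Every registered state would end up in the generated init function.
        self.mutable_states.append((java_type, name))
        return name


def prepare_row_var(ctx):
    """Toy prepareRowVar(): always registers a writer, then returns row code."""
    name = ctx.add_mutable_state(
        "UnsafeRowWriter", f"rowWriter_{len(ctx.mutable_states)}"
    )
    return f"/* row built by {name} */"


def consume(ctx, parent_do_consume):
    """Toy consume(): the row code is prepared before the parent is consulted."""
    row_code = prepare_row_var(ctx)
    return parent_do_consume(row_code)


def parent_ignoring_row(row_code):
    # This parent never embeds row_code in its template.
    return "/* parent code that works on columns only */"


ctx = ToyCodegenContext()
generated = consume(ctx, parent_ignoring_row)
print(generated)
print(ctx.mutable_states)
```

In this model the generated code never mentions the writer, yet the context still carries one registered UnsafeRowWriter state.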

This ticket doesn't track the root issue, but makes it slightly less painful: when the doConsume
function is split out, the prepareRowVar() function is called twice, so it's double the pain
of unused UnsafeRowWriters. This fix simply moves the original call to prepareRowVar() down
into the doConsume split/no-split branch so that we're back to just 1x the pain.
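
The shape of the change can be sketched with another toy model in Python, not the actual patch. The names consume_before, consume_after and count_writers are hypothetical, and the second call inside the split branch stands in for the call made while the split-out doConsume function is built.

```python
def prepare_row_var(states):
    """Toy prepareRowVar(): each call registers one more writer."""
    states.append("UnsafeRowWriter")
    return "row"


def consume_before(states, split):
    """Before the fix: one unconditional call, plus another on the split path."""
    row = prepare_row_var(states)
    if split:
        row = prepare_row_var(states)  # second call for the split-out function
        return f"doConsume_split({row})"
    return f"doConsume_inline({row})"


def consume_after(states, split):
    """After the fix: the call lives inside each branch, so it runs once."""
    if split:
        row = prepare_row_var(states)
        return f"doConsume_split({row})"
    row = prepare_row_var(states)
    return f"doConsume_inline({row})"


def count_writers(consume_fn, split):
    """Run one consume variant and count the writers it registered."""
    states = []
    consume_fn(states, split)
    return len(states)


for split in (False, True):
    print(split, count_writers(consume_before, split), count_writers(consume_after, split))
```

Only the split path differs between the two variants; the no-split path registers a single writer either way.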

To fix the root issue, something that allows the CodegenSupport operators to indicate whether
or not they're going to use the rowVar would be needed. That's a much more elaborate change
so I'd like to just make a minor fix first.
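
One possible shape for such a root fix, purely as a hypothetical sketch in Python: the flag uses_row_var, the class ToyOperator and the function writers_for do not exist in Spark and are assumptions made up here to illustrate an operator declaring up front whether it needs the row.

```python
class ToyOperator:
    """Toy operator; uses_row_var is a hypothetical opt-in flag, not a Spark API."""

    def __init__(self, name, uses_row_var):
        self.name = name
        self.uses_row_var = uses_row_var

    def do_consume(self, row_code):
        if self.uses_row_var:
            return f"{self.name}.process({row_code});"
        return f"{self.name}.process(columns);"


def consume(states, parent):
    """Toy consume(): register a writer only when the parent asks for the row."""
    row_code = None
    if parent.uses_row_var:
        states.append("UnsafeRowWriter")
        row_code = "row"
    return parent.do_consume(row_code)


def writers_for(parent):
    """Count the writers registered while consuming into the given parent."""
    states = []
    consume(states, parent)
    return len(states)


print(writers_for(ToyOperator("columnOnlyParent", uses_row_var=False)))
print(writers_for(ToyOperator("rowParent", uses_row_var=True)))
```

The cost of this design is that every operator would have to declare the flag correctly, which is part of why it is a larger change than the minor fix above.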




