spark-dev mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject Re: Why Dataset.hint uses logicalPlan (= analyzed not planWithBarrier)?
Date Fri, 26 Jan 2018 11:54:28 GMT
Thanks Wenchen --> https://github.com/apache/spark/pull/20405

I'd also like to write a new test where the broadcast hint is specified
with table identifiers, and improve the scaladoc for Dataset.hint to note
that a hint does not have to target the Dataset it is called on but can
target any Dataset in the plan (as long as the table identifier is
resolvable). That would help me understand that part of Spark SQL a little
better (i.e. writing a unit test with logical rules and such).
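
A minimal sketch of the kind of usage such a test could exercise, not the
test itself. The view names t1 and t2, the column names, and the object name
are hypothetical, and it assumes a Spark 2.2+ build on the classpath where
Dataset.hint(name, parameters*) is available. Whether the hint is actually
resolved against the named table is exactly what the test would need to
verify, so no expected output is shown:

```scala
import org.apache.spark.sql.SparkSession

object BroadcastHintByTableName {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("broadcast-hint-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical tables, registered as temp views so their names are resolvable.
    Seq((1, "a"), (2, "b")).toDF("id", "v1").createOrReplaceTempView("t1")
    Seq((1, "x"), (2, "y")).toDF("id", "v2").createOrReplaceTempView("t2")

    val joined = spark.table("t1").join(spark.table("t2"), "id")

    // The hint is called on the joined Dataset but names one table underneath it.
    val hinted = joined.hint("broadcast", "t1")

    // Print the parsed, analyzed, optimized and physical plans for inspection.
    hinted.explain(extended = true)

    spark.stop()
  }
}
```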

Should I file an issue in JIRA for this? Any suggestions on how to do it
the right way?

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
Follow me at https://twitter.com/jaceklaskowski

On Fri, Jan 26, 2018 at 9:08 AM, Wenchen Fan <cloud0fan@gmail.com> wrote:

> Looks like we missed this one, feel free to submit a patch, thanks for
> your finding!
>
> On Fri, Jan 26, 2018 at 3:39 PM, Jacek Laskowski <jacek@japila.pl> wrote:
>
>> Hi,
>>
>> I've just noticed that every time Dataset.hint is used, it triggers
>> execution of logical commands, their unions and hint resolution (among
>> other things that the analyzer does).
>>
>> Why?
>>
>> Why does hint trigger hint resolution (through QueryExecution.analyzed)?
>> [1]
>>
>> Moreover, why not use planWithBarrier instead? [2] Looks like an
>> oversight, doesn't it?
>>
>> [1] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L1219
>>
>> [2] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L195
>>
>> Regards,
>> Jacek Laskowski
>> ----
>> https://about.me/JacekLaskowski
>> Mastering Spark SQL https://bit.ly/mastering-spark-sql
>> Spark Structured Streaming https://bit.ly/spark-structured-streaming
>> Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
>> Follow me at https://twitter.com/jaceklaskowski
>>
>
>
