spark-issues mailing list archives

From "Tim Preece (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-12319) Address endian specific problems surfaced in 1.6
Date Tue, 22 Dec 2015 14:34:46 GMT

    [ https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068176#comment-15068176 ]

Tim Preece commented on SPARK-12319:
------------------------------------

[~marmbrus]
Hi Michael,
I think this may be a problem with the new Dataset API, in particular the new "as" function
of DataFrame, which I see is tagged as Experimental.

When we run the DatasetAggregatorSuite test "typed aggregation: class input with reordering",
the implementation seems to confuse the ordering of the data in the UnsafeRow
(string, int) with the ordering of the schema (int, string). This results in a test case failure
that shows up on BE platforms (although the data is also corrupted on LE platforms).
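
As a rough illustration of that kind of mismatch, here is a minimal Java sketch. It is
not Spark code: the class, the method names and the field names "a" and "b" are all
hypothetical, and a plain Object[] stands in for the UnsafeRow. It only shows how reading
a field by its position in the schema can return the wrong column when the row was
written in a different order.

```java
import java.util.Arrays;
import java.util.List;

class FieldOrderSketch {
    // Assumed for illustration: the row was written as (b: string, a: int).
    static final List<String> ROW_LAYOUT = Arrays.asList("b", "a");
    // Assumed for illustration: the schema declares (a: int, b: string).
    static final List<String> SCHEMA = Arrays.asList("a", "b");

    // Uses the field's position in the schema as the row index,
    // which is only valid when schema order equals row order.
    static Object readByPosition(Object[] row, String field) {
        return row[SCHEMA.indexOf(field)];
    }

    // Looks the field up in the layout the row was actually written with.
    static Object readByName(Object[] row, String field) {
        return row[ROW_LAYOUT.indexOf(field)];
    }

    public static void main(String[] args) {
        Object[] row = {"one", 1};
        System.out.println(readByPosition(row, "a")); // prints "one" (the string column)
        System.out.println(readByName(row, "a"));     // prints 1
    }
}
```

In a typed binary row the positional read would not fail loudly like this; it would
reinterpret bytes of one column as another type, which is one way a platform-dependent
value could appear.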

At the moment I'm not sure how to fix this, so any pointers would be helpful.

> Address endian specific problems surfaced in 1.6
> ------------------------------------------------
>
>                 Key: SPARK-12319
>                 URL: https://issues.apache.org/jira/browse/SPARK-12319
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>         Environment: Problems apparent on BE, LE could be impacted too
>            Reporter: Adam Roberts
>            Priority: Critical
>
> JIRA to cover endian specific problems - since testing 1.6 I've noticed problems with
> DataFrames on BE platforms, e.g. https://issues.apache.org/jira/browse/SPARK-9858
> [~joshrosen] [~yhuai]
> Current progress: using com.google.common.io.LittleEndianDataInputStream and com.google.common.io.LittleEndianDataOutputStream
> within UnsafeRowSerializer fixes three test failures in ExchangeCoordinatorSuite but I'm concerned
> around performance/wider functional implications
> "org.apache.spark.sql.DatasetAggregatorSuite.typed aggregation: class input with reordering"
> fails as we expect "one, 1" but instead get "one, 9" - we believe the issue lies within BitSetMethods.java,
> specifically around: return (wi << 6) + subIndex + java.lang.Long.numberOfTrailingZeros(word);
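
For readers unfamiliar with the expression quoted above, the following is a simplified
Java sketch of a next-set-bit search built around it. It is an assumption-laden stand-in,
not the BitSetMethods.java source: it searches a plain long[] instead of raw memory, and
the class name and the readWord helper are hypothetical. The main method shows why byte
order could matter for such a search: the same eight bytes, decoded as a word in
little-endian versus big-endian order, yield different bit indexes. This illustrates the
general hazard and is not a diagnosis of the failing test.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class BitIndexSketch {
    // Returns the index of the first set bit at or after fromIndex, or -1 if none.
    static int nextSetBit(long[] words, int fromIndex) {
        int wi = fromIndex >> 6;             // index of the 64-bit word
        if (wi >= words.length) {
            return -1;
        }
        int subIndex = fromIndex & 0x3f;     // bit offset inside that word
        long word = words[wi] >>> subIndex;  // drop bits below fromIndex
        if (word != 0) {
            return (wi << 6) + subIndex + Long.numberOfTrailingZeros(word);
        }
        // Nothing left in the first word, so scan the remaining words.
        for (wi = wi + 1; wi < words.length; wi++) {
            word = words[wi];
            if (word != 0) {
                return (wi << 6) + Long.numberOfTrailingZeros(word);
            }
        }
        return -1;
    }

    // Hypothetical helper: decodes eight bytes as one long in the given byte order.
    static long readWord(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getLong();
    }

    public static void main(String[] args) {
        byte[] bytes = new byte[8];
        bytes[0] = 1; // lowest-addressed byte has its low bit set

        long le = readWord(bytes, ByteOrder.LITTLE_ENDIAN);
        long be = readWord(bytes, ByteOrder.BIG_ENDIAN);

        System.out.println(nextSetBit(new long[] {le}, 0)); // prints 0
        System.out.println(nextSetBit(new long[] {be}, 0)); // prints 56
    }
}
```

If a bit set is written byte by byte on one path and read word by word on another, the
two paths agree only when the byte order assumed by the writer matches the one used by
the reader.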




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

