Date: Mon, 5 Oct 2015 20:49:26 +0000 (UTC)
From: "Matt Cheah (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-10877) Assertions fail straightforward DataFrame job due to word alignment

    [ https://issues.apache.org/jira/browse/SPARK-10877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944002#comment-14944002 ]

Matt Cheah commented on SPARK-10877:
------------------------------------

Can you turn off assertions when you spawn the shell? Assertions are off by default for all JVMs but are turned on for unit tests.
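For reference, the default-off behaviour mentioned above can be checked from inside a running JVM. The following is a minimal sketch, not part of Spark; the class name `AssertionStatus` is hypothetical. It relies on the standard idiom of placing an assignment inside an `assert` statement, which executes only when assertions are enabled (for example with the `-ea` JVM flag; `-da` disables them again).

```java
// Hypothetical helper, not part of Spark: reports whether this JVM
// is running the class with assertions enabled.
final class AssertionStatus {
    private AssertionStatus() {}

    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment is a side effect of the assert expression, so it
        // only runs when assertions are switched on for this class.
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
    }
}
```

Running the class with `java -ea AssertionStatus` and then with plain `java AssertionStatus` should show the two states.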

> Assertions fail straightforward DataFrame job due to word alignment
> -------------------------------------------------------------------
>
>                 Key: SPARK-10877
>                 URL: https://issues.apache.org/jira/browse/SPARK-10877
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Matt Cheah
>         Attachments: SparkFilterByKeyTest.scala
>
>
> I have some code that I’m running in a unit test suite, but the code I’m running is failing with an assertion error.
> I have translated the failing JUnit test into a Scala script that I will attach to the ticket. The assertion error is the following:
> {code}
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.AssertionError: lengthInBytes must be a multiple of 8 (word-aligned)
>     at org.apache.spark.unsafe.hash.Murmur3_x86_32.hashUnsafeWords(Murmur3_x86_32.java:53)
>     at org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.hashCode(UnsafeArrayData.java:289)
>     at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.hashCode(rows.scala:149)
>     at org.apache.spark.sql.catalyst.expressions.GenericMutableRow.hashCode(rows.scala:247)
>     at org.apache.spark.HashPartitioner.getPartition(Partitioner.scala:85)
>     at org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1$$anonfun$4$$anonfun$apply$4.apply(Exchange.scala:180)
>     at org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1$$anonfun$4$$anonfun$apply$4.apply(Exchange.scala:180)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> {code}
> However, it turns out that this code actually works normally and computes the correct result if assertions are turned off.
> I traced the code and found that when hashUnsafeWords was called, it was given a byte-length of 12, which clearly is not a multiple of 8.
> However, the job seems to compute correctly regardless of this fact. Of course, I can’t just disable assertions for my unit test.
> A few things we need to understand:
> 1. Why is lengthInBytes equal to 12?
> 2. Is it actually a problem that the byte length is not word-aligned? If so, how should we fix the byte length? If it's not a problem, why is the assertion flagging a false positive?
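To make the arithmetic behind the assertion message concrete, here is a small sketch of the multiple-of-8 check and of the next word-aligned length for a given byte count. This is not Spark's implementation: `WordAlignment` and its methods are hypothetical names, and whether padding the length is the right fix for SPARK-10877 is not established here.

```java
// Hypothetical illustration of 8-byte word alignment; not Spark code.
final class WordAlignment {
    static final int WORD_BYTES = 8;

    private WordAlignment() {}

    // True when the length is a whole number of 8-byte words,
    // which is the condition the failing assertion describes.
    static boolean isWordAligned(int lengthInBytes) {
        return lengthInBytes % WORD_BYTES == 0;
    }

    // Smallest multiple of 8 that is >= lengthInBytes
    // (assumes a non-negative input).
    static int roundUpToWord(int lengthInBytes) {
        return (lengthInBytes + WORD_BYTES - 1) & ~(WORD_BYTES - 1);
    }

    public static void main(String[] args) {
        int len = 12; // the byte length reported in the ticket
        System.out.println(len + " -> aligned=" + isWordAligned(len)
                + ", next aligned length=" + roundUpToWord(len));
    }
}
```

For the length of 12 reported in the ticket, the check fails and the next aligned length is 16, which leaves 4 trailing bytes outside a full word.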