Date: Sat, 30 Sep 2017 04:38:04 +0000 (UTC)
From: "Sean Owen (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Closed] (SPARK-22163) Design Issue of Spark Streaming that Causes Random Run-time Exception

     [ https://issues.apache.org/jira/browse/SPARK-22163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen closed SPARK-22163.
-----------------------------

> Design Issue of Spark Streaming that Causes Random Run-time Exception
> ---------------------------------------------------------------------
>
>                 Key: SPARK-22163
>                 URL: https://issues.apache.org/jira/browse/SPARK-22163
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams, Structured Streaming
>    Affects Versions: 2.2.0
>         Environment: Spark Streaming
> Kafka
> Linux
>            Reporter: Michael N
>            Priority: Critical
>
> Application objects can contain Lists, and those Lists can be modified dynamically. However, the Spark Streaming framework serializes the application's objects asynchronously while the application runs. As a result, a random run-time exception is thrown on a List whenever the framework happens to serialize the application's objects while the application is modifying a List in one of them.
> In fact, there are multiple bugs reported about
> Caused by: java.util.ConcurrentModificationException
> at java.util.ArrayList.writeObject
> that are permutations of the same root cause. So the design issue of the Spark Streaming framework is that it does this serialization asynchronously. Instead, it should either
> 1. Do this serialization synchronously. This is preferred because it eliminates the issue completely. Or
> 2. Allow each application to configure whether this serialization is done synchronously or asynchronously, depending on the nature of the application.
> Also, the Spark documentation should describe the conditions that trigger Spark to do this type of serialization asynchronously, so that applications can work around them until a fix is provided.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
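
The ConcurrentModificationException at java.util.ArrayList.writeObject quoted in the report can be illustrated outside Spark. The following is a minimal sketch in plain Java, not Spark code and not a description of Spark's internals. The class SerializationRaceSketch and the element type MutatingElement are hypothetical names introduced only for this illustration. To make the failure deterministic, the element modifies its owning list from inside its own writeObject hook, which stands in for an application thread that modifies the list while another thread serializes it. The second method shows one possible application-side workaround, assuming the application can hand the serializer a copy: serialize a snapshot taken under the lock that guards the list.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

final class SerializationRaceSketch {

    /** Hypothetical element that mutates its owning list while it is being serialized. */
    static final class MutatingElement implements Serializable {
        private static final long serialVersionUID = 1L;
        // transient: the back-reference itself is not written to the stream
        private final transient List<Object> owner;

        MutatingElement(List<Object> owner) {
            this.owner = owner;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            // Stands in for an application thread that adds to the list mid-serialization.
            owner.add("added during serialization");
        }
    }

    /** Serializes an object with standard Java serialization and returns the bytes. */
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns true if serializing a list that changes during the write fails. */
    static boolean failsWhenListMutatesMidWrite() {
        List<Object> state = new ArrayList<>();
        state.add(new MutatingElement(state));
        try {
            serialize(state);
            return false;
        } catch (ConcurrentModificationException e) {
            // ArrayList.writeObject detected that the list changed while it was written.
            return true;
        }
    }

    /** Returns true if a snapshot of the same list serializes without error. */
    static boolean snapshotSerializes() {
        List<Object> state = new ArrayList<>();
        state.add(new MutatingElement(state));
        List<Object> snapshot;
        synchronized (state) { // writers are assumed to hold this same lock
            snapshot = new ArrayList<>(state);
        }
        // The element still mutates 'state', but the snapshot being written is untouched.
        return serialize(snapshot).length > 0;
    }

    public static void main(String[] args) {
        System.out.println("mutation during write fails: " + failsWhenListMutatesMidWrite());
        System.out.println("snapshot write succeeds: " + snapshotSerializes());
    }
}
```

The snapshot approach only helps where the application controls which object is handed to the serializer. When a framework serializes application objects at times the application cannot observe, avoiding in-place mutation of those objects is the more general form of the same idea.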