Mailing-List: contact dev-help@apex.incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@apex.incubator.apache.org
MIME-Version: 1.0
In-Reply-To: <-1984445410349459728@unknownmsgid>
References: <5d63056d-194f-a5b1-1485-fab7444db2a6@datatorrent.com>
 <-1984445410349459728@unknownmsgid>
From: Bhupesh Chawda
Date: Tue, 17 May 2016 11:34:23 -0700
Subject: Re: Serialization in Apex
To: dev@apex.incubator.apache.org
Content-Type: multipart/alternative; boundary=001a113d0f049a05b905330dfe58
archived-at: Tue, 17 May 2016 18:34:54 -0000

--001a113d0f049a05b905330dfe58
Content-Type: text/plain; charset=UTF-8

As Ram and Sandesh pointed out, we do have the @Bind and @DefaultSerializer
annotations. However, these are tightly coupled with the field in question
and do require modifying external code. Additionally, binding a field to
JavaSerializer may break other systems that serialize that field by other
means.

My point was more about the user having to worry about which serializer to
use and how to serialize objects. For example, I liked the approach that
Storm takes: it falls back to Java serialization automatically when the
target class does not have a default constructor.

Of course, we can explore type-based serialization. But this email was more
about the usability aspect: handling classes that do not have default
constructors in general, not just POJO tuples.

~Bhupesh

On Tue, May 17, 2016 at 9:53 AM, Pramod Immaneni wrote:

> Can we do a test where we hard-code a codec for a POJO and compare its
> performance against Kryo? Thereafter we can dynamically compose a
> codec via pojoutils and inject it.
>
> Thanks
>
> > On May 17, 2016, at 8:16 AM, Vlad Rozov wrote:
> >
> > +1 for type-based serialization. Tuples in most cases are flat
> > records/POJOs, and it should be possible to programmatically construct
> > a codec that will significantly outperform Kryo. It should also reduce
> > the amount of data passed over the wire. I started to look in that
> > direction as well, as Kryo serialization is one of the bottlenecks that
> > limits Apex throughput when operators are deployed into different
> > containers, including the NODE_LOCAL case.
> >
> > Thank you,
> >
> > Vlad
> >
> >> On 5/17/16 07:13, Sandesh Hegde wrote:
> >> If it is possible to serialize, the platform should do it automatically;
> >> it reduces the tribal knowledge required to use the platform. A couple
> >> of months back, I also sent out a similar email.
> >>
> >> Type-based serialization may improve the performance.
> >>
> >>> On Tue, May 17, 2016, 6:06 AM Munagala Ramanath wrote:
> >>>
> >>> Traditionally, we've recommended using
> >>> "@DefaultSerializer(JavaSerializer.class)" or
> >>> "@FieldSerializer.Bind(CustomSerializer.class)" as outlined at
> >>>
> >>> http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception
> >>>
> >>> Can you describe why those approaches are not adequate?
> >>>
> >>> Ram
> >>>
> >>> On Mon, May 16, 2016 at 11:46 PM, Bhupesh Chawda <bhupesh@datatorrent.com>
> >>> wrote:
> >>>
> >>>> Hi All,
> >>>>
> >>>> While working on the integration of Apex with Apache Samoa, I am
> >>>> coming across some scenarios where I have to add default constructors
> >>>> to some external classes to make them Kryo serializable. Although this
> >>>> should be okay, we would like to avoid modifying external classes as
> >>>> far as possible. Some other streaming engines have taken different
> >>>> approaches towards serialization.
> >>>>
> >>>> I looked at the Flink and Storm serialization mechanisms.
> >>>>
> >>>> Storm has a fallback mechanism based on Java serialization. It uses
> >>>> Kryo for serialization for performance reasons. But if a class is not
> >>>> serializable using Kryo, it tries to serialize it using Java
> >>>> serialization. If even that fails, it throws an error. [1]
> >>>>
> >>>> Flink has its own serialization stack, in which it uses a serializer
> >>>> based on the type information known about the data. [2]
> >>>>
> >>>> What does the community think about the current state of serialization
> >>>> in Apex?
> >>>> Is there a need to explore some approaches which could avoid
> >>>> serialization issues such as the one described above? Are there any
> >>>> other approaches one could use?
> >>>>
> >>>> 1. http://storm.apache.org/releases/current/Serialization.html#java-serialization
> >>>> 2. https://cwiki.apache.org/confluence/display/FLINK/Type+System,+Type+Extraction,+Serialization
> >>>>
> >>>> ~Bhupesh

--001a113d0f049a05b905330dfe58--