Date: Thu, 17 Sep 2015 05:06:47 +0000 (UTC)
From: "Lewis John McGibbney (JIRA)"
To: dev@avro.apache.org
Reply-To: dev@avro.apache.org
Mailing-List: contact dev-help@avro.apache.org; run by ezmlm
Subject: [jira] [Commented] (AVRO-813) EOFException is thrown during normal operation
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/AVRO-813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791584#comment-14791584 ]

Lewis John McGibbney commented on AVRO-813:
-------------------------------------------

Hi [~rdblue],
I found my issue. In my application I was not defining any particular serialization definitions, and as a result I was serializing the wrong data with the wrong code!
{code}
<property>
  <name>io.serializations</name>
  <value>org.apache.hadoop.io.serializer.WritableSerialization,org.apache.hadoop.io.serializer.JavaSerialization</value>
  <description>A list of serialization classes that can be used for obtaining serializers and deserializers.</description>
</property>
{code}
Thanks for the reply.
> EOFException is thrown during normal operation
> ----------------------------------------------
>
>                 Key: AVRO-813
>                 URL: https://issues.apache.org/jira/browse/AVRO-813
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.5.0
>            Reporter: Bruno Dumon
>            Assignee: Bruno Dumon
>              Labels: memex
>             Fix For: 1.8.0
>
>         Attachments: avro-813-patch.txt
>
>
> In an application that uses Avro as its RPC mechanism (with the NettyTransceiver, but that's irrelevant), I've noticed in jprofiler that during normal operation quite some time was spent creating EOFExceptions:
> {noformat}
> 5.4% - 2,004 ms org.apache.avro.ipc.generic.GenericResponder.readRequest
> 5.0% - 1,871 ms org.apache.avro.generic.GenericDatumReader.read
> 4.9% - 1,832 ms org.apache.avro.generic.GenericDatumReader.read
> 4.9% - 1,832 ms org.apache.avro.generic.GenericDatumReader.readRecord
> 4.5% - 1,670 ms org.apache.avro.generic.GenericDatumReader.read
> 4.5% - 1,670 ms org.apache.avro.generic.GenericDatumReader.readRecord
> 4.3% - 1,596 ms org.apache.avro.generic.GenericDatumReader.read
> 2.8% - 1,048 ms org.apache.avro.generic.GenericDatumReader.readArray
> 1.3% - 477 ms org.apache.avro.io.ValidatingDecoder.arrayNext
> 1.3% - 471 ms org.apache.avro.io.BinaryDecoder.arrayNext
> 1.3% - 466 ms org.apache.avro.io.BinaryDecoder.doReadItemCount
> 1.3% - 466 ms org.apache.avro.io.BinaryDecoder.readLong
> 1.3% - 466 ms org.apache.avro.io.BinaryDecoder.ensureBounds
> 1.3% - 466 ms org.apache.avro.io.BinaryDecoder$ByteSource.compactAndFill
> 1.3% - 466 ms org.apache.avro.io.BinaryDecoder$InputStreamByteSource.tryReadRaw
> 1.3% - 466 ms org.apache.avro.util.ByteBufferInputStream.read
> 1.3% - 466 ms org.apache.avro.util.ByteBufferInputStream.getBuffer
> 1.3% - 466 ms java.io.EOFException.<init>
> 1.3% - 466 ms java.io.IOException.<init>
> 1.2% - 460 ms java.lang.Exception.<init>
> 1.2% - 460 ms java.lang.Throwable.<init>
> 1.2% - 460 ms java.lang.Throwable.fillInStackTrace
> {noformat}
> These exceptions are produced by the ByteBufferInputStream (which deviates from InputStream's contract of returning -1 at EOF), but are caught higher up by the tryReadRaw method.
> What happens is this:
> The message in question ends with an (empty) array, so the reader tries to read the size of this array in BinaryDecoder.readLong. This calls ensureBounds(10), whose contract is that it should read 10 bytes if they are available, and otherwise be quiet. ensureBounds calls the tryReadRaw method via compactAndFill. It is this method which catches the EOFException, because it only 'tries' to read that many bytes.
> Note that InputStreamByteSource.readRaw (without the 'try' part) does itself check if read < 0 in order to throw EOFException, making the throwing of EOFException in ByteBufferInputStream unnecessary (for this particular usage).
> There was some talk about EOFException in AVRO-392 too, though it seems this particular common case was not mentioned there. When using Avro RPC, or more generally when using Avro to read small messages rather than large files, it seems one can very easily run into this EOFException situation, which hurts performance.
> I'll attach a patch which simply removes the throwing of EOFException in ByteBufferInputStream, but this will likely break other cases which rely on the EOFException being thrown (I haven't investigated this fully).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
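The call path described in the quoted issue (a request for up to 10 bytes that quietly accepts fewer) can be sketched in plain Java. This is a minimal sketch under stated assumptions, not Avro's code: EofSketch, BufferStream, the throwAtEof flag, the exceptionsThrown counter and readShortMessage are hypothetical names made up for illustration. Only the tryReadRaw name and the 10-byte request are taken from the issue text.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

final class EofSketch {

  /** Hypothetical stream over one buffer; signals end of data by throwing or by returning -1. */
  static final class BufferStream extends InputStream {
    private final ByteBuffer data;
    private final boolean throwAtEof;
    int exceptionsThrown = 0; // counts how often end of data cost us an exception

    BufferStream(byte[] bytes, boolean throwAtEof) {
      this.data = ByteBuffer.wrap(bytes);
      this.throwAtEof = throwAtEof;
    }

    @Override
    public int read() throws IOException {
      return data.hasRemaining() ? (data.get() & 0xff) : endOfData();
    }

    @Override
    public int read(byte[] dst, int off, int len) throws IOException {
      if (len == 0) return 0;
      if (!data.hasRemaining()) return endOfData();
      int n = Math.min(len, data.remaining());
      data.get(dst, off, n);
      return n;
    }

    private int endOfData() throws IOException {
      if (throwAtEof) {
        exceptionsThrown++;
        throw new EOFException(); // the behaviour the issue complains about
      }
      return -1; // the usual InputStream end-of-stream signal
    }
  }

  /** Reads up to len bytes and returns how many were actually read; running out of data is not an error. */
  static int tryReadRaw(InputStream in, byte[] dst, int off, int len) throws IOException {
    int total = 0;
    try {
      while (total < len) {
        int n = in.read(dst, off + total, len - total);
        if (n < 0) break; // stream reported end of data with -1
        total += n;
      }
    } catch (EOFException eof) {
      // stream reported end of data by throwing; keep what was read so far
    }
    return total;
  }

  /** Asks for 10 bytes from a 3-byte message; returns {bytesRead, exceptionsThrown}. */
  static int[] readShortMessage(boolean throwAtEof) {
    BufferStream in = new BufferStream(new byte[] {1, 2, 3}, throwAtEof);
    try {
      int read = tryReadRaw(in, new byte[10], 0, 10);
      return new int[] {read, in.exceptionsThrown};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    int[] throwing = readShortMessage(true);
    int[] quiet = readShortMessage(false);
    System.out.println("throwing: read=" + throwing[0] + " exceptions=" + throwing[1]);
    System.out.println("quiet:    read=" + quiet[0] + " exceptions=" + quiet[1]);
  }
}
```

For the 3-byte message, both variants hand the caller the same 3 bytes. The throwing variant additionally constructs one EOFException (and fills in its stack trace) that is immediately discarded, which is the kind of cost the profile above shows.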