avro-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (AVRO-24) benchmark bulk data
Date Wed, 30 Sep 2009 21:33:25 GMT

     [ https://issues.apache.org/jira/browse/AVRO-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting resolved AVRO-24.
------------------------------

       Resolution: Fixed
    Fix Version/s: 1.2.0

I just committed a simple benchmark that uses the new HTTP transport.  It defines a protocol
with two methods: one that sends a ByteBuffer and one that receives a ByteBuffer.
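
As a rough illustration of the shape of such a protocol, here is a minimal Java sketch. It is not the committed code: the names BulkData, LocalBulkData, BulkDataSketch, read and write are hypothetical, and the implementation is purely in-memory, with no Avro serialization or HTTP transport involved. It only shows one method that hands a ByteBuffer to the caller and one that accepts a ByteBuffer from the caller, driven by the same test.count and test.size properties that the benchmark command line uses.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the two-method protocol described above.
// The names BulkData, read, and write are assumptions, not taken from the commit.
interface BulkData {
    ByteBuffer read();            // the server sends a buffer to the caller
    void write(ByteBuffer data);  // the caller sends a buffer to the server
}

// In-memory implementation: no HTTP transport or Avro serialization is involved.
class LocalBulkData implements BulkData {
    private final int size;
    long bytesWritten = 0;

    LocalBulkData(int size) {
        this.size = size;
    }

    @Override
    public ByteBuffer read() {
        // A fresh buffer with `size` bytes remaining.
        return ByteBuffer.allocate(size);
    }

    @Override
    public void write(ByteBuffer data) {
        bytesWritten += data.remaining();
    }
}

class BulkDataSketch {
    // Calls read() `count` times and returns the total number of bytes received.
    static long runReads(BulkData proxy, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += proxy.read().remaining();
        }
        return total;
    }

    // Calls write() `count` times, each with a buffer of `size` bytes.
    static void runWrites(BulkData proxy, int count, int size) {
        for (int i = 0; i < count; i++) {
            proxy.write(ByteBuffer.allocate(size));
        }
    }

    public static void main(String[] args) {
        // Same property names as the benchmark command line; the defaults are arbitrary.
        int count = Integer.getInteger("test.count", 10);
        int size = Integer.getInteger("test.size", 100);
        LocalBulkData impl = new LocalBulkData(size);
        System.out.println("bytes read = " + runReads(impl, count));
        runWrites(impl, count, size);
        System.out.println("bytes written = " + impl.bytesWritten);
    }
}
```

In a real benchmark the BulkData instance would be a client-side proxy that talks to a server over the transport, so each call would pay the serialization and network cost that the numbers below measure.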

Running it on my laptop, making 100,000 requests of 100 bytes each, reports the following:

{noformat}
java -Dtest.count=100000 -Dtest.size=100 org.apache.avro.TestBulkData
READ
seconds = 18
requests/second = 5367
MB = 9
MB/second = 0
WRITE
seconds = 15
requests/second = 6655
MB = 9
MB/second = 0
{noformat}

Increasing the request size to 100,000 bytes lowers the request rate but raises the data rate:

{noformat}
java -Dtest.count=100000 -Dtest.size=100000 org.apache.avro.TestBulkData
READ
seconds = 40
requests/second = 2460
MB = 9536
MB/second = 234
WRITE
seconds = 61
requests/second = 1637
MB = 9536
MB/second = 156
{noformat}
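
The "MB/second = 0" lines in the 100-byte run look like whole-number truncation rather than a failure. The sketch below reproduces the reported figures under two assumptions that the message does not state but that are consistent with its numbers: a megabyte is counted as 1,048,576 bytes, and every reported value is truncated to a whole number. The class and method names (BulkStats, megabytes, megabytesPerSecond, impliedSeconds) are hypothetical, and the elapsed time is inferred from the reported request rate rather than from the truncated "seconds" line.

```java
class BulkStats {
    // Total payload in MB (1 MB taken as 1,048,576 bytes), truncated to a whole number.
    static long megabytes(long count, long size) {
        return (count * size) / (1024L * 1024L);
    }

    // Elapsed time implied by a reported request rate.
    static double impliedSeconds(long count, long requestsPerSecond) {
        return (double) count / requestsPerSecond;
    }

    // Throughput in MB per second, computed from the untruncated values and then truncated.
    static long megabytesPerSecond(long count, long size, double seconds) {
        double mb = (count * size) / (1024.0 * 1024.0);
        return (long) (mb / seconds);
    }

    public static void main(String[] args) {
        long count = 100000;

        // 100-byte requests: about 9.5 MB in total, roughly 0.5 MB/s, which truncates to 0.
        double smallRead = impliedSeconds(count, 5367);
        System.out.println("MB = " + megabytes(count, 100));
        System.out.println("MB/second = " + megabytesPerSecond(count, 100, smallRead));

        // 100,000-byte requests: about 9536 MB in total.
        double largeRead = impliedSeconds(count, 2460);
        double largeWrite = impliedSeconds(count, 1637);
        System.out.println("MB = " + megabytes(count, 100000));
        System.out.println("READ MB/second = " + megabytesPerSecond(count, 100000, largeRead));
        System.out.println("WRITE MB/second = " + megabytesPerSecond(count, 100000, largeWrite));
    }
}
```

Under these assumptions the 100-byte run moves roughly half a megabyte per second, so it measures per-request overhead, while the 100,000-byte run is the one that says something about bulk throughput.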


> benchmark bulk data
> -------------------
>
>                 Key: AVRO-24
>                 URL: https://issues.apache.org/jira/browse/AVRO-24
>             Project: Avro
>          Issue Type: Task
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.2.0
>
>
> It would be good to validate that the RPC wire format is capable of transmitting bulk
> data efficiently.  In particular, to be used for HDFS file access, it must be able to, when
> including file data in an RPC response, or writing file data in an RPC request:
>  - saturate a disk's throughput or a network interface; and
>  - not consume much CPU.
> In other words, Avro's RPC should not be a bottleneck in the transfer of file data from
> a remote disk to an application or vice versa, and moreover it should leave the vast majority
> of the CPU for the application.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

