hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5598) Implement a pure Java CRC32 calculator
Date Fri, 19 Jun 2009 14:58:07 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721804#action_12721804 ]

Owen O'Malley commented on HADOOP-5598:
---------------------------------------

I should have commented on this earlier. I think the right solution is to use a pure Java
implementation if we can get comparable performance in the "normal" case. If we use a C
implementation in libhadoop, it should use DirectByteBuffers and pool those buffers. Furthermore,
that should be a different jira, since there are a lot more issues there.
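
For illustration only, here is a minimal sketch of what a table-driven pure Java CRC32 can
look like. It is not the PureJavaCrc32.java attached to the issue and says nothing about
its performance; the class name SimpleCrc32 and the method compute are hypothetical. It
uses the standard reflected CRC-32 polynomial 0xEDB88320 and checks itself against
java.util.zip.CRC32.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical name; a one-byte-at-a-time table lookup CRC32 with no JNI calls. */
public class SimpleCrc32 {
    // 256-entry lookup table for the reflected CRC-32 polynomial.
    private static final int[] TABLE = new int[256];
    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    /** Returns the CRC32 of buf[off .. off+len) as an unsigned 32-bit value in a long. */
    public static long compute(byte[] buf, int off, int len) {
        int crc = 0xFFFFFFFF;                       // standard initial value
        for (int i = off; i < off + len; i++) {
            crc = (crc >>> 8) ^ TABLE[(crc ^ buf[i]) & 0xFF];
        }
        return (~crc) & 0xFFFFFFFFL;                // final XOR, widened without sign
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        CRC32 reference = new CRC32();              // JDK implementation, for comparison
        reference.update(data, 0, data.length);
        long mine = compute(data, 0, data.length);
        System.out.println(Long.toHexString(mine) + " matches=" + (mine == reference.getValue()));
    }
}
```

Whether something like this is competitive with the JDK class depends on input size and
JVM, which is what the attached benchmark files are meant to measure.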

I'd also veto any code that dynamically switches implementations based on anything other than
whether libhadoop is present (i.e., switching based on the size of the input is going to be
unmaintainable).
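
As a rough sketch of that constraint, the choice can be made once, at class-load time,
from a single boolean. Everything named here is an assumption for illustration:
ChecksumFactory and BitwiseCrc32 are hypothetical, the library name "hadoop" stands in for
libhadoop, and java.util.zip.CRC32 stands in for a native-backed implementation, since no
such Hadoop class is shown in this thread.

```java
import java.util.zip.CRC32;
import java.util.zip.Checksum;

/** Hypothetical factory: picks an implementation once, never per call or per input size. */
public class ChecksumFactory {
    // Evaluated a single time when the class is loaded.
    private static final boolean NATIVE_LOADED = tryLoad("hadoop");

    private static boolean tryLoad(String lib) {
        try {
            System.loadLibrary(lib);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;                           // library absent or not permitted
        }
    }

    public static Checksum create() {
        // Stand-in: java.util.zip.CRC32 plays the role of the native-backed path.
        return NATIVE_LOADED ? new CRC32() : new BitwiseCrc32();
    }

    /** Simple bit-at-a-time CRC32, used here only as the pure Java fallback. */
    static final class BitwiseCrc32 implements Checksum {
        private int crc = 0xFFFFFFFF;

        @Override
        public void update(int b) {
            crc ^= (b & 0xFF);
            for (int k = 0; k < 8; k++) {
                crc = (crc & 1) != 0 ? (crc >>> 1) ^ 0xEDB88320 : crc >>> 1;
            }
        }

        @Override
        public void update(byte[] buf, int off, int len) {
            for (int i = off; i < off + len; i++) {
                update(buf[i]);
            }
        }

        @Override
        public long getValue() {
            return (~crc) & 0xFFFFFFFFL;
        }

        @Override
        public void reset() {
            crc = 0xFFFFFFFF;
        }
    }
}
```

Callers only see java.util.zip.Checksum, so both paths must return identical values for
the same bytes, and there is exactly one condition to reason about when debugging.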

I can upload the code that I wrote for the pure Java version, if you want to see a third implementation.

> Implement a pure Java CRC32 calculator
> --------------------------------------
>
>                 Key: HADOOP-5598
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5598
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Owen O'Malley
>            Assignee: Todd Lipcon
>         Attachments: crc32-results.txt, hadoop-5598-evil.txt, hadoop-5598-hybrid.txt,
>                      hadoop-5598.txt, hadoop-5598.txt, PureJavaCrc32.java, PureJavaCrc32.java,
>                      PureJavaCrc32.java, TestCrc32Performance.java, TestCrc32Performance.java,
>                      TestCrc32Performance.java, TestPureJavaCrc32.java
>
>
> We've seen a reducer writing 200MB to HDFS with replication = 1 spending a long time
> in crc calculation. In particular, it was spending 5 seconds in crc calculation out of a total
> of 6 for the write. I suspect that it is the java-jni border that is causing us grief.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

