hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3205) FSInputChecker and FSOutputSummer should allow better access to user buffer
Date Tue, 03 Nov 2009 18:43:32 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773109#action_12773109 ]

Todd Lipcon commented on HADOOP-3205:
-------------------------------------

Hi Hong,

Thanks for the input. How do others feel about using a separate CRC32 path for the "bulk checksum
checking" in the read path, probably through JNI when available? I had suggested this in HADOOP-6148
and people said it would be unmaintainable. Given that checksum algorithms rarely change and
are easy to verify, I disagree, but would like to have some +1s for this direction before
I spend the time writing the code.
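
For concreteness, here is a minimal pure-Java sketch of what "bulk checksum checking" could look like: one call verifies every 512-byte chunk of a large buffer instead of one call per chunk. The class and method names (BulkChecksumSketch, computeChunkedCrcs, verifyChunkedCrcs) are hypothetical and are not existing Hadoop APIs. A JNI-backed variant would presumably keep the same shape and swap the inner loop for native code.

```java
import java.util.zip.CRC32;

// Illustrative sketch only: these names are assumptions, not Hadoop classes.
class BulkChecksumSketch {

    /** Computes one CRC32 per chunk of buf[off, off + len). The last chunk may be short. */
    static int[] computeChunkedCrcs(byte[] buf, int off, int len, int bytesPerChecksum) {
        int numChunks = (len + bytesPerChecksum - 1) / bytesPerChecksum;
        int[] sums = new int[numChunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < numChunks; i++) {
            int chunkOff = off + i * bytesPerChecksum;
            int chunkLen = Math.min(bytesPerChecksum, off + len - chunkOff);
            crc.reset();
            crc.update(buf, chunkOff, chunkLen);
            sums[i] = (int) crc.getValue(); // CRC32 fits in the low 32 bits
        }
        return sums;
    }

    /** Returns the index of the first chunk whose CRC does not match, or -1 if all match. */
    static int verifyChunkedCrcs(byte[] buf, int off, int len,
                                 int bytesPerChecksum, int[] expected) {
        int[] actual = computeChunkedCrcs(buf, off, len, bytesPerChecksum);
        if (actual.length != expected.length) {
            throw new IllegalArgumentException("expected " + actual.length
                + " checksums but got " + expected.length);
        }
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != expected[i]) {
                return i;
            }
        }
        return -1;
    }
}
```

Returning the index of the failing chunk, rather than a boolean, lets the caller report which 512-byte range was corrupt.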

Regarding the other points in your blog post, they seem to imply that we'd have to change
a lot of the APIs to work with ByteBuffers rather than byte[], potentially all the way down
to the user-facing layer. This would be a big API change. Where is a good place to start,
and what kind of backwards compatibility layer will we need?
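
One possible shape for such a compatibility layer, sketched under the assumption that the ByteBuffer-based method becomes the primary entry point and the existing byte[] signature is kept as a thin wrapper. ByteBufferCompatSketch and both checksum overloads are hypothetical names used for illustration only.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Illustrative sketch only: these names are assumptions, not Hadoop classes.
class ByteBufferCompatSketch {

    /** Hypothetical new-style entry point: checksums the remaining bytes of the buffer. */
    static long checksum(ByteBuffer data) {
        CRC32 crc = new CRC32();
        // duplicate() shares the content but has its own position and limit,
        // so the caller's buffer position is left untouched.
        crc.update(data.duplicate());
        return crc.getValue();
    }

    /** Legacy byte[] signature, kept for existing callers; wraps without copying. */
    static long checksum(byte[] buf, int off, int len) {
        return checksum(ByteBuffer.wrap(buf, off, len));
    }
}
```

ByteBuffer.wrap does not copy the array, so legacy callers pay only for a small wrapper object, while new callers can pass direct buffers straight through.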

> FSInputChecker and FSOutputSummer should allow better access to user buffer
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-3205
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3205
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>
> Implementations of FSInputChecker and FSOutputSummer, such as DFS, do not have access to the
> full user buffer. At any time DFS can access only up to 512 bytes, even though the user usually
> reads with a much larger buffer (often controlled by io.file.buffer.size). An implementation
> that wants to read or write larger chunks of data from the underlying storage therefore has
> to double-buffer the data.
> We could split the changes for FSInputChecker and FSOutputSummer into two separate JIRAs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

