hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3205) Read multiple chunks directly from FSInputChecker subclass into user buffers
Date Thu, 03 Dec 2009 04:12:21 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785171#action_12785171 ]

Todd Lipcon commented on HADOOP-3205:
-------------------------------------

I did some further investigation on this:

- I ran the 1GB cat test under hprof with cpu=times to get accurate invocation counts for
the various read calls. Increasing MAX_CHUNKS from 1 to 16 does exactly what's expected and
reduces the number of calls to readChunks (and thus the underlying input stream reads, etc.)
by exactly a factor of 16. The same is true of 128. The profiled times show no noticeable
differences between these settings, though, because System.arraycopy doesn't get accounted
for by hprof in this mode, for whatever reason. (A back-of-the-envelope sketch of these call
counts follows the list.)

- I imported a copy of the BufferedInputStream source and made BufferedFSInputStream extend
it rather than the java.io one, and added a System.err printout right before the System.arraycopy
inside read1(). When I changed MAX_CHUNKS from 127 to 128, I verified that it correctly avoided
these copies and read directly into the user buffer. So the JIRA's goal of getting rid of
a copy was indeed accomplished. (A sketch of this instrumentation also follows below.)
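
A back-of-the-envelope sketch (plain Java, not Hadoop code) of the invocation counts from
the first point, assuming 512-byte checksum chunks and that each readChunks call pulls in
MAX_CHUNKS chunks at once:

    public class ChunkCallCount {
        public static void main(String[] args) {
            final long totalBytes = 1L << 30;   // the 1GB cat test
            final int chunkSize = 512;          // bytes per checksum chunk
            for (int maxChunks : new int[] {1, 16, 128}) {
                long calls = totalBytes / ((long) chunkSize * maxChunks);
                System.out.printf("MAX_CHUNKS=%-3d -> %,d readChunks calls%n",
                                  maxChunks, calls);
            }
        }
    }

That works out to about 2.1M calls at MAX_CHUNKS=1, 131,072 at 16, and 16,384 at 128 - exactly
the factor-of-16 (and 128) reduction hprof reported.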
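
And a rough sketch of the read1() instrumentation from the second point. This is a simplified
stand-in (hypothetical class name, no mark/reset handling), not the copied JDK source, but
it shows where the printout went and which branch is the copy-free path:

    import java.io.IOException;
    import java.io.InputStream;

    class TracingBufferedInputStream extends InputStream {
        private final InputStream in;
        private final byte[] buf;
        private int pos, count;    // read position and fill level of buf

        TracingBufferedInputStream(InputStream in, int size) {
            this.in = in;
            this.buf = new byte[size];
        }

        @Override
        public int read() throws IOException {
            byte[] one = new byte[1];
            return read(one, 0, 1) == -1 ? -1 : (one[0] & 0xff);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            int avail = count - pos;
            if (avail <= 0) {
                // Requests at least as large as the internal buffer bypass
                // it and read straight into the caller's array - the
                // copy-free path that MAX_CHUNKS=128 hits.
                if (len >= buf.length) {
                    return in.read(b, off, len);
                }
                count = in.read(buf, 0, buf.length);
                pos = 0;
                avail = count;
                if (avail <= 0) {
                    return -1;
                }
            }
            int cnt = Math.min(avail, len);
            System.err.println("read1: copying " + cnt + " bytes"); // the added printout
            System.arraycopy(buf, pos, b, off, cnt);
            pos += cnt;
            return cnt;
        }
    }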

Now the confusing part: eliminating this copy does nothing in terms of performance. Comparing
MAX_CHUNKS=127 against MAX_CHUNKS=128 shows no statistically significant difference in the
speed of catting 1G from RAM.

So my best theory right now for why it's faster is simply that it makes fewer function calls,
each of which does more work in longer loops. That's better for loop unrolling, instruction
cache locality, and avoiding function call overhead. Perhaps it inspires the JIT to work harder
as well - who knows what black magic lurks there :)
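
For what it's worth, here's a toy illustration (not a rigorous benchmark) of that theory:
the same total volume of copying issued as many 512-byte calls versus fewer 64KB ones, so
per-call overhead and short loops dominate the first variant:

    public class CallGranularity {
        static void copyInPieces(byte[] src, byte[] dst, int pieceSize) {
            for (int off = 0; off < src.length; off += pieceSize) {
                System.arraycopy(src, off, dst, off,
                                 Math.min(pieceSize, src.length - off));
            }
        }

        public static void main(String[] args) {
            byte[] src = new byte[64 << 20];    // 64MB payload
            byte[] dst = new byte[src.length];
            for (int piece : new int[] {512, 512 * 128}) {
                long t0 = System.nanoTime();
                for (int i = 0; i < 20; i++) {
                    copyInPieces(src, dst, piece);
                }
                System.out.printf("piece=%d bytes: %.1f ms%n",
                                  piece, (System.nanoTime() - t0) / 1e6);
            }
        }
    }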

I think at this point I've sufficiently investigated this, unless anyone has questions. I'll
make the changes that Eli suggested and upload a new patch.

> Read multiple chunks directly from FSInputChecker subclass into user buffers
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-3205
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3205
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Raghu Angadi
>            Assignee: Todd Lipcon
>         Attachments: hadoop-3205.txt, hadoop-3205.txt, hadoop-3205.txt, hadoop-3205.txt
>
>
> Implementations of FSInputChecker and FSOutputSummer, like DFS, do not have access to the
> full user buffer. At any time DFS can access only up to 512 bytes, even though the user
> usually reads with a much larger buffer (often controlled by io.file.buffer.size). This
> requires implementations to double buffer data if they want to read or write larger chunks
> of data from underlying storage.
> We could separate the changes for FSInputChecker and FSOutputSummer into two separate JIRAs.
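
A minimal sketch (hypothetical names, not the DFS code) of the double buffering the description
refers to: the implementation has to stage one large read from storage and then dole it out
a 512-byte chunk at a time, paying an extra copy per chunk:

    import java.io.IOException;
    import java.io.InputStream;

    class DoubleBufferedReader {
        private final InputStream storage;                  // stand-in for the real DFS read path
        private final byte[] staging = new byte[64 * 1024]; // io.file.buffer.size-ish
        private int stagingPos, stagingLen;

        DoubleBufferedReader(InputStream storage) {
            this.storage = storage;
        }

        // Called by the checksum layer for at most one 512-byte chunk at a time.
        int readChunk(byte[] chunk, int off, int len) throws IOException {
            if (stagingPos >= stagingLen) {
                stagingLen = storage.read(staging);         // one big read from storage
                stagingPos = 0;
                if (stagingLen <= 0) {
                    return -1;
                }
            }
            int n = Math.min(len, stagingLen - stagingPos);
            System.arraycopy(staging, stagingPos, chunk, off, n); // the extra copy
            stagingPos += n;
            return n;
        }
    }

Letting the checker hand the whole user buffer down to the implementation, as this patch does
for reads, removes the staging buffer and that arraycopy.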

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

