hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4190) Read complete block into memory once in BlockScanning and reduce concurrent disk access
Date Thu, 15 Nov 2012 19:31:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498265#comment-13498265 ]

Uma Maheswara Rao G commented on HDFS-4190:
-------------------------------------------

{quote}
it comes out to basically the same speed as large chunked read() calls, and sometimes slower
in multi-threaded environments
{quote}
I am curious to know: have you already written the code for unmapping the mmapped buffers?
(We would have to write a JNI call for that, right? Currently I just used the Cleaner class
from the Sun package to clean them.) If we don't clean them, I have seen bad behavior and,
after some time, OOM errors, since the native memory won't be released until a full GC runs
(or unless you free it directly from C code). I have also tried mmapped buffers in the write
path. With 36 threads I got around a ~20% improvement. I did not try more threads, so I am
not sure whether it degrades beyond that. (Another improvement was involved there: I mapped
the complete block once and appended the packet contents into that buffer, so writing to
consecutive memory locations should help somewhat, I guess, much like using fallocate.)
Cleaning the mmapped buffer is again a heavy operation, so I cleaned the buffers in an
asynchronous thread; that is what gives me the improvement. If I clean them sequentially,
the result is actually negative :( .
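
For reference, the Cleaner-based cleanup looks roughly like this (a minimal sketch; MmapCleaner
and unmapAsync are hypothetical names, and sun.misc.Cleaner / sun.nio.ch.DirectBuffer are
non-public JDK internals, which is exactly why this is not production-safe):

{code}
import java.nio.MappedByteBuffer;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import sun.misc.Cleaner;
import sun.nio.ch.DirectBuffer;

public class MmapCleaner {
    // Single background thread, so unmapping never blocks the write/scan path.
    private final ExecutorService pool = Executors.newSingleThreadExecutor();

    // Forcibly release the native mapping instead of waiting for a full GC.
    static void unmap(MappedByteBuffer buf) {
        if (buf instanceof DirectBuffer) {
            Cleaner cleaner = ((DirectBuffer) buf).cleaner();
            if (cleaner != null) {
                cleaner.clean();
            }
        }
    }

    // Unmapping is itself a heavy operation, so hand it off asynchronously.
    void unmapAsync(final MappedByteBuffer buf) {
        pool.execute(new Runnable() {
            public void run() {
                unmap(buf);
            }
        });
    }
}
{code}

The asynchronous hand-off is what made the difference in my runs; calling the cleanup inline
on the write path put the result in the negative.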

I have one question here:
The fadvise options are global in DFS, right? So when I have random small reads, readahead
may load unnecessary data? (I am not an expert in the internal behavior of the fadvise
options :-) )
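
To make the question concrete, a per-stream hint could look something like this (a minimal
sketch; the exact signature of posixFadviseIfPossible has varied across Hadoop versions, so
treat the call shape here as an assumption):

{code}
import java.io.FileDescriptor;
import java.io.FileInputStream;

import org.apache.hadoop.io.nativeio.NativeIO;

public class FadviseHint {
    // For random small reads, ask the kernel not to read ahead on this
    // descriptor, rather than relying on one global readahead policy.
    static void adviseRandom(FileInputStream in, long len) throws Exception {
        FileDescriptor fd = in.getFD();
        NativeIO.POSIX.posixFadviseIfPossible("random-reader", fd, 0L, len,
            NativeIO.POSIX.POSIX_FADV_RANDOM);
    }
}
{code}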


{quote}
Turns out I was remembering wrong - that's what I get for going on JIRA before I've had any
coffee! I was thinking of HDFS-3529, but that's for the write path. A similar improvement
could be made on the read path to avoid a memcpy for the non-transferTo case.
{quote}
Ok. Yes, we can do something similar.

                
> Read complete block into memory once in BlockScanning and reduce concurrent disk access
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-4190
>                 URL: https://issues.apache.org/jira/browse/HDFS-4190
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 3.0.0
>            Reporter: Uma Maheswara Rao G
>
> When we perform bulk write operations to DFS, we observed that block scanning is one bottleneck for concurrent disk access.
> To see the real load on the disks, keep a single DataNode and a local client flushing data to DFS.
> When we switch off block scanning we have seen a >10% improvement. I will post the real figures in a comment.
> Even though I am doing only write operations, there will implicitly be a read operation for each block due to block scanning. The next scan happens only after 21 days, but one scan happens right after the block is added, and that is the source of the concurrent disk access.
> Another point to note is that in block scanning we also read the block packet by packet. Since we know we have to read and scan the complete block, it may be better to load the complete block once and do checksum verification on that data.
> I tried with memory-mapped buffers: I mapped the complete block once in block scanning and did the checksum verification against that. I have seen a good improvement in the bulk write scenario. (A rough sketch of the idea follows below.)
> But we don't have any API to clean the mapped buffer immediately. In my experiment I just used the Cleaner class from the Sun package; that would not be correct to use in production, so we would have to write a JNI call to unmap that mmapped buffer.
> I am not sure whether I missed something here; please correct me if I missed some points.
> Thoughts?
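> As a rough sketch of the idea (hypothetical names throughout; the real scanner verifies against the stored checksum data via DataChecksum, while this uses CRC32 just for illustration, and the mapping still needs the explicit cleanup discussed above):
> {code}
> import java.io.File;
> import java.io.RandomAccessFile;
> import java.nio.MappedByteBuffer;
> import java.nio.channels.FileChannel;
> import java.util.zip.CRC32;
>
> public class MappedBlockScan {
>     // Map the whole block once and verify it chunk by chunk, instead of
>     // re-reading it from disk packet by packet during the scan.
>     static void scanBlock(File blockFile, int bytesPerChecksum) throws Exception {
>         RandomAccessFile raf = new RandomAccessFile(blockFile, "r");
>         try {
>             FileChannel ch = raf.getChannel();
>             MappedByteBuffer block =
>                 ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
>             byte[] chunk = new byte[bytesPerChecksum];
>             CRC32 crc = new CRC32();
>             while (block.hasRemaining()) {
>                 int n = Math.min(chunk.length, block.remaining());
>                 block.get(chunk, 0, n);
>                 crc.reset();
>                 crc.update(chunk, 0, n);
>                 // ... compare crc.getValue() with the stored checksum here
>             }
>             // The mapping must still be released explicitly (JNI or Cleaner).
>         } finally {
>             raf.close();
>         }
>     }
> }
> {code}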

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
