hadoop-common-dev mailing list archives

From "Pete Wyckoff (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4397) fuse-dfs causes corruptions on multi-threaded access
Date Mon, 13 Oct 2008 18:34:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639156#action_12639156 ]

Pete Wyckoff commented on HADOOP-4397:

Looking at the code, I actually think the others are safe: they all either use only local
variables or, in the case of getattr, the place where it sets globalFS = hdfsConnect ... is
an impossible condition to hit, since globalFs is initialized in dfs_init and never set again.

The only issue is that the mutex is global, whereas the problem is specific to individual
file handles, so the lock is more restrictive than it needs to be: reads on unrelated files
serialize behind one another. That would be a problem when many files are being read at
once, but it may be OK in practice for now.
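
For reference, a rough sketch of what a per-file-handle lock could look like in place of the
single global mutex. None of this is taken from the attached patch or from the fuse-dfs
source: file_handle, handle_open, handle_read, handle_close, fill_fn and demo_fill are all
hypothetical names, and demo_fill only stands in for the real HDFS read. The assumption is
that the shared read buffer lives in per-open-file state, so the mutex can live there too.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical per-open-file state (not the actual fuse-dfs struct). */
typedef struct {
    pthread_mutex_t lock;       /* guards only this handle's buffer fields */
    char           *buf;        /* read buffer owned by this handle */
    size_t          buf_cap;    /* allocated size of buf */
    size_t          buf_len;    /* number of valid bytes in buf */
    off_t           buf_offset; /* file offset that buf[0] corresponds to */
} file_handle;

/* Hypothetical backend read: fills dst with up to cap bytes from offset. */
typedef size_t (*fill_fn)(char *dst, size_t cap, off_t offset);

/* Stand-in backend for illustration: the "file" is a repeating a..z pattern. */
size_t demo_fill(char *dst, size_t cap, off_t offset) {
    for (size_t i = 0; i < cap; i++)
        dst[i] = (char)('a' + (offset + (off_t)i) % 26);
    return cap;
}

file_handle *handle_open(size_t buf_cap) {
    file_handle *fh = calloc(1, sizeof *fh);
    if (fh == NULL)
        return NULL;
    fh->buf = malloc(buf_cap);
    if (fh->buf == NULL) {
        free(fh);
        return NULL;
    }
    if (pthread_mutex_init(&fh->lock, NULL) != 0) {
        free(fh->buf);
        free(fh);
        return NULL;
    }
    fh->buf_cap = buf_cap;
    return fh;
}

void handle_close(file_handle *fh) {
    if (fh == NULL)
        return;
    pthread_mutex_destroy(&fh->lock);
    free(fh->buf);
    free(fh);
}

/* Copy up to size bytes at offset into out. Threads sharing this handle
 * serialize on fh->lock; threads using other handles are not blocked. */
size_t handle_read(file_handle *fh, fill_fn fill, char *out,
                   size_t size, off_t offset) {
    assert(fh != NULL && fill != NULL && out != NULL);
    pthread_mutex_lock(&fh->lock);

    /* Refill when the requested offset falls outside the buffered window. */
    if (offset < fh->buf_offset ||
        offset >= fh->buf_offset + (off_t)fh->buf_len) {
        fh->buf_len = fill(fh->buf, fh->buf_cap, offset);
        if (fh->buf_len > fh->buf_cap)
            fh->buf_len = fh->buf_cap;
        fh->buf_offset = offset;
    }

    size_t skip  = (size_t)(offset - fh->buf_offset);
    size_t avail = fh->buf_len - skip;
    size_t n     = size < avail ? size : avail;
    memcpy(out, fh->buf + skip, n);

    pthread_mutex_unlock(&fh->lock);
    return n;
}
```

The trade-off is one mutex per open file instead of a single static one, in exchange for
readers of different files no longer contending with each other. Two threads reading
different parts of the same file through one handle would still thrash the buffer, as
with the global lock.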

+1 on this patch with the caveat of the above problem


> fuse-dfs causes corruptions on multi-threaded access
> ----------------------------------------------------
>                 Key: HADOOP-4397
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4397
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/fuse-dfs
>    Affects Versions: 0.18.1
>            Reporter: Brian Bockelman
>             Fix For: 0.18.2
>         Attachments: hadoop-4397.patch
> If multiple threads in the same process perform file system reads, then fuse-dfs causes
> various problems due to the per-context buffer.  I've seen this reflected in segmentation
> violations and corruptions.
> I'll attach a proposed patch which takes the "easy way" out - I surround all calls to
> dfs_read with a mutex.  You will obviously get performance degradations through thrashing
> if the threads are reading different parts of the file (but for our application, the
> multi-threaded reads are very, very infrequent).
> If we want to have fuse-dfs writes/reads in 0.19 or 0.20, we'll probably need to do the
> same thing with writes.
> This patch could be easily integrated as it stands, or a more elaborate approach could be
> taken - per-thread buffers maybe?
> Thanks as always for looking into this,
> Brian

