hadoop-common-dev mailing list archives

From "Brian Bockelman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4775) FUSE crashes reliably on 0.19.0
Date Wed, 10 Dec 2008 21:00:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655353#action_12655353 ]

Brian Bockelman commented on HADOOP-4775:
-----------------------------------------

Could this code, at line 857 of fuse_dfs.c, be the problem?

  if (size >= dfs->rdbuffer_size) {
    int num_read;
    int total_read = 0;
    while (size - total_read > 0 &&
           (num_read = hdfsPread(fh->fs, fh->hdfsFH, offset + total_read,
                                 buf + total_read, size - total_read)) > 0) {
      total_read += num_read;
    }
    return total_read;
  } 

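Notice that if hdfsPread fails (returns a negative value), the loop simply exits and the
function still returns total_read as though the read had succeeded, so the error never
reaches the caller.  At the other call sites of hdfsPread, we have: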

      if (num_read < 0) {
        // invalidate the buffer
        fh->bufferSize = 0;
        syslog(LOG_ERR, "Read error - pread failed for %s with return code %d %s:%d",
               path, (int)num_read, __FILE__, __LINE__);
        ret = -EIO;
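
For illustration, here is a minimal sketch of how the large-read loop quoted above could report a failed read instead of returning total_read. This is not the actual fuse_dfs.c code or a proposed patch: read_fully, pread_fn, struct fake_file and fake_pread are hypothetical names, and the callback only stands in for hdfsPread under the assumption that it returns the number of bytes read, 0 at end of file, and a negative value on error. Returning -EIO even when some bytes were already read is also an assumption, chosen to match the other call sites.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for hdfsPread: bytes read, 0 at EOF, < 0 on error. */
typedef ssize_t (*pread_fn)(void *ctx, off_t offset, void *buf, size_t size);

/* Read up to `size` bytes, looping over short reads.
 * Returns the byte count, or -EIO if any underlying read fails. */
static ssize_t read_fully(pread_fn do_pread, void *ctx, off_t offset,
                          char *buf, size_t size)
{
    size_t total_read = 0;

    while (total_read < size) {
        ssize_t num_read = do_pread(ctx, offset + (off_t)total_read,
                                    buf + total_read, size - total_read);
        if (num_read < 0)
            return -EIO;        /* propagate the failure to the caller */
        if (num_read == 0)
            break;              /* end of file */
        total_read += (size_t)num_read;
    }
    return (ssize_t)total_read;
}

/* In-memory fake backend, for illustration only. */
struct fake_file {
    const char *data;
    size_t len;
    size_t max_chunk;   /* largest read served per call */
    int fail_after;     /* fail once this many calls succeeded; -1 = never */
    int calls;
};

static ssize_t fake_pread(void *ctx, off_t offset, void *buf, size_t size)
{
    struct fake_file *f = ctx;
    size_t n;

    if (f->fail_after >= 0 && f->calls >= f->fail_after)
        return -1;              /* simulated read error */
    f->calls++;
    if (offset < 0 || (size_t)offset >= f->len)
        return 0;               /* past the end: EOF */
    n = f->len - (size_t)offset;
    if (n > size)
        n = size;
    if (n > f->max_chunk)
        n = f->max_chunk;
    memcpy(buf, f->data + offset, n);
    return (ssize_t)n;
}
```

With the fake backend, a read that fails partway through comes back as -EIO rather than as a short but apparently successful byte count, which is the behavior the quoted loop is missing.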


> FUSE crashes reliably on 0.19.0
> -------------------------------
>
>                 Key: HADOOP-4775
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4775
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/fuse-dfs
>            Reporter: Brian Bockelman
>            Priority: Critical
>         Attachments: fuse_lotsofmem_bt.txt, fuse_lotsofmem_pmap.txt
>
>
> Every morning I come in and find many nodes which have developed the dreaded "Transport
> endpoint not connected" error overnight.  This has only started after the 0.19.0 upgrade.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.