hadoop-common-dev mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HADOOP-6849) Have LocalDirAllocator.AllocatorPerContext.getLocalPathForWrite fail more meaningfully
Date Wed, 30 Jul 2014 21:54:42 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HADOOP-6849.

    Resolution: Fixed

A lot of this has been changed. Closing this as stale.

> Have LocalDirAllocator.AllocatorPerContext.getLocalPathForWrite fail more meaningfully
> --------------------------------------------------------------------------------------
>                 Key: HADOOP-6849
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6849
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.20.2
>            Reporter: Steve Loughran
>            Priority: Minor
> A stack trace of a failing reduce made its way to me:
> {code}
> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for file:/mnt/data/dfs/data/mapred/local/taskTracker/jobcache/job_201007011427_0001/attempt_201007011427_0001_r_000000_1/output/map_96.out
>       at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>       at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2434)
> {code}
> We're probably running out of HDD space; if not, it's a configuration problem. Either way, some more hints in the exception would be handy.
> # Include the size of the output file looked for if known
> # Include the list of dirs examined and their reason for rejection (not found or, if not enough room, available space).
> This would make it easier to diagnose problems after the event, with nothing but emailed logs for diagnostics.
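
The two suggestions in the quoted description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Hadoop code: `LocalDirDiagnostics`, `describeFailure`, and `SIZE_UNKNOWN` are hypothetical names, and the sketch relies on plain `java.io.File` rather than any Hadoop class.

```java
import java.io.File;
import java.util.List;

/** Hypothetical helper; not part of Hadoop. */
public class LocalDirDiagnostics {
    /** Assumed sentinel meaning the caller does not know the output size. */
    static final long SIZE_UNKNOWN = -1;

    /**
     * Builds an error message that names the requested size (if known) and,
     * for every directory examined, why it was unusable.
     */
    static String describeFailure(String pathStr, long size, List<File> dirs) {
        StringBuilder sb = new StringBuilder(
            "Could not find any valid local directory for ").append(pathStr);
        if (size != SIZE_UNKNOWN) {
            sb.append(" (requested size: ").append(size).append(" bytes)");
        }
        sb.append("; directories examined:");
        for (File dir : dirs) {
            sb.append("\n  ").append(dir.getPath()).append(": ");
            if (!dir.exists()) {
                sb.append("not found");
            } else if (!dir.isDirectory()) {
                sb.append("not a directory");
            } else {
                long usable = dir.getUsableSpace();
                if (size != SIZE_UNKNOWN && usable < size) {
                    sb.append("not enough room, ");
                }
                sb.append(usable).append(" bytes available");
            }
        }
        return sb.toString();
    }
}
```

The resulting string could be passed as the exception message, so that the emailed log alone shows which directories were tried and how much space each had.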
