hadoop-common-user mailing list archives

From Leo Alekseyev <dnqu...@gmail.com>
Subject How to manually retrieve data for a particular input split / how to get block ID
Date Mon, 25 Oct 2010 00:52:42 GMT
I am attempting to debug some tasks that are failing on a particular
input (the tasks hang until they time out and die).  By examining the
$TMP/mapred/local/taskTracker/jobcache/ directory for the offending
task and looking inside split.dta, I see the following input split
location:

hdfs://namenode-rd.imageshack.us:9000/user/hive/warehouse/img833_input/00034.tab417:10:32

Everything before the 417:10:32 part is just a path to a
file in HDFS.  How do I use "417:10:32" to find the
address of the particular block, and how can I dump the block using
the hadoop shell into a file? I assume there's a direct mapping between
this and the block ID values that I see when I browse to that file in
the HDFS web UI (e.g., 8691049584976946484:


