hadoop-common-user mailing list archives

From Boyu Zhang <boyuzhan...@gmail.com>
Subject Corrupted input data to map
Date Fri, 15 Oct 2010 21:02:08 GMT
Hi all,

I am running a program whose input is 1 million lines of data; among those, 5 or 6
lines are corrupted. The corruption looks like this: in a position where a float
number such as 3.4 is expected, something like 3.4.5.6 appears instead. So when
the map runs, it throws a "multiple points" number-format exception.

My question is: the map tasks that hit the exception are marked as failed, but
what happens to the data processed by the same map task before the exception?
Does it reach the reduce tasks, or is it discarded as garbage? Thank you very
much; any help is appreciated.
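[As an aside, one common way to avoid the failed-task question entirely is to guard the parse inside the map function and skip the handful of malformed records. A minimal sketch, not from the original message; the method and field names here are hypothetical, and in a real Hadoop mapper one would also increment a counter for skipped records:]

```java
public class SafeParse {
    // Returns the parsed float, or null if the field is malformed.
    // In Java, Float.parseFloat("3.4.5.6") throws a NumberFormatException
    // whose message mentions "multiple points", matching the error above.
    static Float parseOrNull(String field) {
        try {
            return Float.parseFloat(field);
        } catch (NumberFormatException e) {
            // In a real mapper, record the skip, e.g.:
            // context.getCounter("input", "bad_records").increment(1);
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("3.4"));     // a well-formed record
        System.out.println(parseOrNull("3.4.5.6")); // a corrupted record -> null
    }
}
```

[With this guard in place, the mapper simply emits nothing for the 5 or 6 bad lines, and no map task fails in the first place.]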

Boyu
