hadoop-common-user mailing list archives

From madhu phatak <phatak....@gmail.com>
Subject InputFormat for a big file
Date Fri, 17 Dec 2010 15:58:30 GMT
I have a very large file, 1.4 GB in size. Each line of the file is a number.
I want to find the sum of all those numbers.
I wanted to use NLineInputFormat as the InputFormat, but it sends only one line
to each Mapper, which is very inefficient.
Can you guide me in writing an InputFormat that splits the file
into multiple splits, so that each mapper can read multiple
lines from its split?
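
To make the request concrete, here is a minimal sketch in plain Java of the
behaviour being asked for. It has no Hadoop dependency and does not use any
Hadoop API. The names SplitSum, sumSplit and sumWithSplits are hypothetical and
chosen only for illustration. The sketch assumes one integer per line, a file
small enough to hold in a byte array, and the rule that a line belongs to the
split containing its first byte.

```java
import java.nio.charset.StandardCharsets;

public class SplitSum {

    // Index of the first '\n' at or after 'from', or data.length if there is none.
    static int nextNewline(byte[] data, int from) {
        for (int i = from; i < data.length; i++) {
            if (data[i] == '\n') {
                return i;
            }
        }
        return data.length;
    }

    // Plays the role of one mapper: sums every line whose first byte
    // lies in the byte range [start, end).
    static long sumSplit(byte[] data, int start, int end) {
        int pos = start;
        // If the split begins in the middle of a line, that line belongs
        // to the previous split, so skip past its terminating newline.
        if (start > 0 && data[start - 1] != '\n') {
            pos = nextNewline(data, start) + 1;
        }
        long sum = 0;
        while (pos < end && pos < data.length) {
            int lineEnd = nextNewline(data, pos);
            String line = new String(data, pos, lineEnd - pos,
                    StandardCharsets.US_ASCII).trim();
            if (!line.isEmpty()) {
                sum += Long.parseLong(line);
            }
            // A line that starts in this split is read to its end,
            // even if it runs past the split boundary.
            pos = lineEnd + 1;
        }
        return sum;
    }

    // Plays the role of the whole job: cut the input into fixed-size
    // byte ranges, sum each range, then add up the partial sums.
    static long sumWithSplits(byte[] data, int splitSize) {
        long total = 0;
        for (int start = 0; start < data.length; start += splitSize) {
            int end = Math.min(start + splitSize, data.length);
            total += sumSplit(data, start, end);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = "1\n2\n3\n4\n5\n".getBytes(StandardCharsets.US_ASCII);
        // Split boundaries fall mid-file, yet each line is counted exactly once.
        System.out.println(sumWithSplits(data, 3)); // prints 15
    }
}
```

Each call to sumSplit handles many lines and returns a single partial sum, so
only one value per split has to be combined at the end.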

