avro-user mailing list archives

From Deepak Nettem <deepaknet...@gmail.com>
Subject Mapper Only Avro, Read from Local File System
Date Thu, 15 Mar 2012 16:47:00 GMT

I have a use case in which I need to write a mapper-only job that reads a file
from local disk and writes it to HDFS in Avro serialized format. (I want to do
this because I want the mapper instances to download data from somewhere onto
the local FS and then load that data into HDFS.)

1. The job won't have any HDFS input path or output path.
2. I want to be able to set the number of mappers depending on my internet
bandwidth, so the number of mappers shouldn't be calculated based on

Any suggestions on how to do this? I would really appreciate any example

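To make point 2 concrete, here is a minimal sketch in plain Java (no Hadoop
or Avro classes) of fixing the number of mappers up front and dividing the
download work among them. It assumes the list of sources is known before the
job starts. The names DownloadPlan and partition and the example URLs are
hypothetical and are not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: decides which download sources each mapper handles.
class DownloadPlan {

    // Splits the sources round-robin into at most numMappers groups.
    // numMappers is chosen by the user (for example, from available
    // bandwidth) and not derived from any input file size.
    static List<List<String>> partition(List<String> sources, int numMappers) {
        if (numMappers <= 0) {
            throw new IllegalArgumentException("numMappers must be positive");
        }
        // Never create more groups than there are sources.
        int groups = Math.min(numMappers, sources.size());
        List<List<String>> plan = new ArrayList<>();
        for (int i = 0; i < groups; i++) {
            plan.add(new ArrayList<>());
        }
        for (int i = 0; i < sources.size(); i++) {
            plan.get(i % groups).add(sources.get(i));
        }
        return plan;
    }

    public static void main(String[] args) {
        // Placeholder URLs, for illustration only.
        List<String> sources = Arrays.asList(
                "http://example.com/a", "http://example.com/b",
                "http://example.com/c", "http://example.com/d",
                "http://example.com/e");
        List<List<String>> plan = partition(sources, 2);
        for (int i = 0; i < plan.size(); i++) {
            System.out.println("mapper " + i + " -> " + plan.get(i));
        }
    }
}
```

In a real job, each group could back one input split, since Hadoop runs one
map task per split returned by the job's InputFormat. The sketch only covers
the planning step, not the download or the Avro output.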
