hadoop-hdfs-user mailing list archives

From Mapred Learn <mapred.le...@gmail.com>
Subject Sequence file format in python and serialization
Date Thu, 02 Jun 2011 07:06:23 GMT
I have a question about using the sequence file input format with the Hadoop
streaming jar, with mappers and reducers written in Python.

If I use a sequence file as the input format for the streaming jar and write the
mappers in Python, can I handle serialization and deserialization in the
mapper/reducer code? For example, if the sequence file's values contain complex
data types, can I deserialize them in Python and run the MapReduce job using the
streaming jar?

Thanks in advance,
