avro-user mailing list archives

From Fengyun RAO <raofeng...@gmail.com>
Subject Can Avro serialize and compress in parallel?
Date Mon, 06 Jan 2014 02:30:20 GMT
Hi, all

I have some IIS log files whose format depends on the "#Fields" line inside
the log, which makes the files non-splittable and unsuitable for an MR job. So
I want to preprocess them into Avro files. Transforming each line into an Avro
record is simple and fast, but the serialization and compression are too
slow.

Is there a way to serialize and compress in parallel while still writing
sequentially? In principle I could even split the records across several
files, which could be serialized and compressed in parallel, but I can't find
a way to combine them.
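
To make the pattern concrete, here is a rough sketch in plain Java. It does not
use Avro at all: UTF-8 encoding plus deflate stands in for Avro serialization
and the codec, the length-prefixed block layout is made up for illustration,
and the names ParallelBlockWriter, compressBlock and writeAll are hypothetical.
It only shows batches being compressed on a thread pool while a single thread
writes the results in their original order.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.DeflaterOutputStream;

public class ParallelBlockWriter {

    // Encode and compress one batch of lines into a standalone block.
    // Stand-in for "serialize to Avro and apply a codec" (assumption).
    static byte[] compressBlock(List<String> lines) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buf)) {
            for (String line : lines) {
                out.write(line.getBytes(StandardCharsets.UTF_8));
                out.write('\n');
            }
        }
        return buf.toByteArray();
    }

    // Compress all batches on a thread pool, then write the blocks
    // sequentially, in submission order, each prefixed by its length.
    // Returns the number of blocks written.
    static int writeAll(List<List<String>> batches, OutputStream sink, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Note: this queues every batch up front; a real tool would
            // bound the number of in-flight batches to limit memory use.
            List<Future<byte[]>> pending = new ArrayList<>();
            for (List<String> batch : batches) {
                pending.add(pool.submit(() -> compressBlock(batch)));
            }
            DataOutputStream out = new DataOutputStream(sink);
            int blocks = 0;
            for (Future<byte[]> f : pending) {
                byte[] block = f.get();      // waits for this batch only
                out.writeInt(block.length);  // 4-byte big-endian length
                out.write(block);
                blocks++;
            }
            out.flush();
            return blocks;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> batches = new ArrayList<>();
        batches.add(Arrays.asList("line 1", "line 2"));
        batches.add(Arrays.asList("line 3"));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int blocks = writeAll(batches, sink, 2);
        System.out.println("wrote " + blocks + " blocks, " + sink.size() + " bytes");
    }
}
```

Because the futures are consumed in the order they were submitted, the output
order matches the input order even if a later batch finishes compressing first.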

Any suggestions? Thanks!
