hadoop-common-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Skipping those gosh darn 0-byte files
Date Tue, 22 Jul 2014 20:39:25 GMT
I have two processes. One writes sequence files directly to HDFS; the
other is a Hive table that reads these files.

Everything works well, except that I am only flushing the files
periodically. The SequenceFile input format gets angry when it encounters
0-byte sequence files.

I was considering a flush and sync on the first record write. I was also
thinking I should be able to hack the sequence file input format to skip
0-byte files and not throw an exception on readFully(), which it sometimes does.
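
A rough sketch of the second idea, dropping zero-length files before any reader opens them. It uses plain java.nio on local files rather than Hadoop itself, and SkipEmptyFiles and nonEmpty are made-up names for illustration. In a real job the same length check would presumably live in a SequenceFileInputFormat subclass that overrides listStatus() and discards entries whose FileStatus.getLen() is 0; that mapping is an assumption and is not exercised here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipEmptyFiles {
    // Hypothetical helper: keep only files that contain at least one byte.
    // In Hadoop the equivalent test would be FileStatus.getLen() > 0.
    static List<Path> nonEmpty(List<Path> candidates) throws IOException {
        List<Path> kept = new ArrayList<>();
        for (Path p : candidates) {
            if (Files.size(p) > 0) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("seq");
        // An empty file, like one the writer has created but not yet flushed.
        Path empty = Files.createFile(dir.resolve("part-00000"));
        // A file that has at least some bytes on disk.
        Path full = Files.write(dir.resolve("part-00001"),
                                new byte[] {'S', 'E', 'Q', 6});
        // Only the non-empty file should survive the filter.
        System.out.println(nonEmpty(Arrays.asList(empty, full)));
    }
}
```

Filtering at listing time means the empty file never becomes an input split, so the reader never reaches the header read that fails. It does nothing for a file that has some bytes but a truncated record, which would still need handling in the reader.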

Anyone ever tackled this?
