hadoop-mapreduce-user mailing list archives

From Dmitry Sivachenko <trtrmi...@gmail.com>
Subject Re: Writing output from streaming task without dealing with key/value
Date Wed, 10 Sep 2014 18:28:38 GMT

On 10 Sep 2014, at 22:19, Rich Haase <rdhaase@gmail.com> wrote:

> You can write a custom output format

Any clues how this can be done?

> , or you can write your mapreduce job in Java and use a NullWritable as Susheel recommended.
> grep (and every other *nix text processing command) I can think of would not be limited
> by a trailing tab character.  It's even quite easy to strip away that tab character if you
> don't want it during the post processing steps you want to perform with *nix commands.

The problem is that the line itself contains a TAB in the middle, so there will be no extra
trailing TAB at the end.
So it is not that simple.
You never know whether a TAB comes from the original line or is an extra TAB added by TextOutputFormat.
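
To make the ambiguity concrete, here is a minimal sketch in Python. It is only a simplified model of the behaviour discussed in this thread, not Hadoop code: it assumes streaming splits each output line at the first TAB into key and value, and that TextOutputFormat then writes key, TAB, value. The function names are hypothetical and chosen for illustration.

```python
# Simplified model (an assumption, not actual Hadoop code):
# streaming splits a line at the first TAB into (key, value),
# and TextOutputFormat writes key + TAB + value.

def split_key_value(line):
    # Everything before the first TAB is the key; the rest is the value.
    # A line without a TAB becomes the key, with an empty value.
    key, _, value = line.partition("\t")
    return key, value

def text_output_format(key, value):
    # Key and value are joined by a TAB, even when the value is empty.
    return key + "\t" + value

def roundtrip(line):
    return text_output_format(*split_key_value(line))

for original in ["foo", "foo\t", "foo\tbar"]:
    print(repr(original), "->", repr(roundtrip(original)))
```

Under this model, "foo" and "foo<TAB>" both come out as "foo<TAB>", while "foo<TAB>bar" comes out unchanged, so the output alone does not say whether a trailing TAB was in the original line.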
