beam-user mailing list archives

From Jesse Anderson <je...@smokinghand.com>
Subject Writing Out List<String>
Date Fri, 20 May 2016 04:00:23 GMT
I'm trying to write out a List<String> with TextIO.Write. The only
supported type is String. I ended up writing an anonymous coder.

I want to check whether there is a coder I couldn't find that would just
take an object and write out its .toString().

I tried this:
orderedList.apply(TextIO.Write.withCoder(ListCoder.of(StringDelegateCoder.of(String.class))).to("output/result"));

But a VarInt is encoded along with everything. I'm looking for a coder that
writes out only the UTF-8 bytes.

This functionality would be similar to Hadoop's TextOutputFormat, which just
calls .toString() on each value before writing it out.
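
To make the desired behavior concrete, here is a minimal plain-Java sketch of
the byte-level output I have in mind. It is not a Beam coder; the class and
method names (ToStringLineWriter, writeLine) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Beam: shows the TextOutputFormat-like
// behavior of writing toString() as UTF-8 followed by a newline.
class ToStringLineWriter {
    // Writes the value's string form as raw UTF-8 bytes plus one '\n',
    // with no length prefix of any kind.
    static void writeLine(Object value, OutputStream out) throws IOException {
        out.write(String.valueOf(value).getBytes(StandardCharsets.UTF_8));
        out.write('\n');
    }

    public static void main(String[] args) throws IOException {
        List<String> orderedList = Arrays.asList("a", "b");
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeLine(orderedList, buffer);
        // List.toString() renders as "[a, b]", so this prints "[a, b]" and a newline.
        System.out.print(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
    }
}
```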

In the anonymous coder I wrote, I hit a weird issue. This code just writes
out a bunch of "\n". Yes, value is populated with data.
          dataOutputStream.writeUTF(value);
          dataOutputStream.writeUTF("\n");

This code works:
          byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
          dataOutputStream.write(bytes);
          dataOutputStream.writeUTF("\n");

I took this from the string coder. What's odd is that DataOutputStream's
writeUTF should work too. Is there a reason why it doesn't?
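
For comparison, here is a small standalone sketch of what the two calls put on
the wire. Per the JDK documentation, writeUTF emits a two-byte length followed
by modified UTF-8, while write(byte[]) emits only the bytes given. Whether that
difference explains the output above is an assumption on my part; the class and
method names (WriteUtfDemo, viaWriteUtf, viaRawBytes) are made up.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class comparing the two ways of writing a String.
class WriteUtfDemo {
    // writeUTF: two-byte big-endian length, then the modified UTF-8 bytes.
    static byte[] viaWriteUtf(String value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(value);
        }
        return buffer.toByteArray();
    }

    // write(byte[]): only the UTF-8 bytes, with no length prefix.
    static byte[] viaRawBytes(String value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(value.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(viaWriteUtf("abc"))); // [0, 3, 97, 98, 99]
        System.out.println(Arrays.toString(viaRawBytes("abc"))); // [97, 98, 99]
        System.out.println(Arrays.toString(viaWriteUtf("\n")));  // [0, 1, 10]
    }
}
```

Note that writeUTF("\n") also carries the two-byte prefix, so the "working"
version still writes extra bytes before each newline.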

Thanks,

jesse
