flink-user mailing list archives

From Elliot West <tea...@gmail.com>
Subject Strategies for reading structured file formats as POJO DataSets
Date Thu, 05 Mar 2015 09:18:51 GMT

As a new Flink user, I wondered whether there are any existing approaches or
practices for reading file formats such as CSV, TSV, etc. as DataSets of
POJOs? My current approach can be illustrated with a contrived example:

// Simulating a TSV file DataSet
DataSet<String> tsvRatings = env.fromElements("category-1\t10");

// Mapping to a POJO
DataSet<Rating> ratings = tsvRatings.map(line -> {
  String[] elements = line.split("\t");
  return new Rating(elements[0], Integer.parseInt(elements[1]));
});

While such a mapping could be implemented in a more general form, I'm keen
to avoid wheel reinvention and therefore wonder if there are already good
ways of doing this?
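
For illustration, a "more general form" of the mapping above might look roughly like the following plain-Java sketch. It does not use any Flink API. `DelimitedLineParser` is a hypothetical helper, and the `Rating` fields (`category`, `score`) are assumptions, since the original class is not shown.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical POJO standing in for the Rating class used above;
// the field names are assumptions.
class Rating {
  public String category;
  public int score;

  public Rating() {}

  public Rating(String category, int score) {
    this.category = category;
    this.score = score;
  }
}

// Hypothetical helper: splits a line on a literal delimiter, checks the
// field count, and hands the fields to a caller-supplied factory.
class DelimitedLineParser<T> {
  private final String delimiterRegex;
  private final int expectedFields;
  private final Function<String[], T> factory;

  DelimitedLineParser(String delimiter, int expectedFields,
                      Function<String[], T> factory) {
    // Quote the delimiter so characters such as '|' are not treated as regex.
    this.delimiterRegex = Pattern.quote(delimiter);
    this.expectedFields = expectedFields;
    this.factory = factory;
  }

  T parse(String line) {
    // A limit of -1 keeps trailing empty fields instead of dropping them.
    String[] fields = line.split(delimiterRegex, -1);
    if (fields.length != expectedFields) {
      throw new IllegalArgumentException(
          "Expected " + expectedFields + " fields but found " + fields.length);
    }
    return factory.apply(fields);
  }
}
```

With this sketch, the lambda body above would reduce to a call such as parser.parse(line), where parser is built as new DelimitedLineParser<>("\t", 2, f -> new Rating(f[0], Integer.parseInt(f[1]))). To call it inside a Flink map function, the parser and its factory would also have to be serializable, which this sketch does not address.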

Thanks - Elliot.
