flink-user mailing list archives

From Saliya Ekanayake <esal...@gmail.com>
Subject Re: Reading Binary Data (Matrix) with Flink
Date Wed, 20 Jan 2016 14:16:45 GMT
Thank you. I saw readHadoopFile, but I was not sure how it could be used for
the following, which is what I need. The logic of the code requires an
entire row to operate on, so in our current implementation with P tasks,
each task reads a rectangular block of (N/P) x N from the matrix. Is
this possible with readHadoopFile? Also, the file may not be in HDFS, so is
it possible to refer to the local disk when doing this?
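
To make the access pattern concrete, here is a minimal sketch in plain Java
(no Flink API involved, and not our actual code). It assumes the matrix is
stored row-major as big-endian 2-byte shorts on local disk; the names
RowBlockReader, rowRange, and readRows are made up for illustration. Each of
the P tasks computes its row range and reads only those complete rows.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: illustrates the (N/P) x N block read, not a Flink input format.
final class RowBlockReader {

    /** Returns {firstRow, rowCount} for a task; the first (n % p) tasks get one extra row. */
    static int[] rowRange(int n, int p, int task) {
        int base = n / p;
        int extra = n % p;
        int first = task * base + Math.min(task, extra);
        int count = base + (task < extra ? 1 : 0);
        return new int[] {first, count};
    }

    /** Reads rowCount complete rows, starting at firstRow, from an n x n matrix of shorts. */
    static short[][] readRows(Path file, int n, int firstRow, int rowCount) throws IOException {
        // Assumption: row-major layout, so row r starts at byte offset r * n * 2.
        long offset = (long) firstRow * n * Short.BYTES;
        int length = Math.toIntExact((long) rowCount * n * Short.BYTES);
        ByteBuffer buf = ByteBuffer.allocate(length).order(ByteOrder.BIG_ENDIAN);

        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = offset;
            while (buf.hasRemaining()) {
                int read = channel.read(buf, position);
                if (read < 0) {
                    throw new EOFException("File is shorter than the requested block");
                }
                position += read;
            }
        }

        buf.flip();
        ShortBuffer shorts = buf.asShortBuffer();
        short[][] rows = new short[rowCount][n];
        for (short[] row : rows) {
            shorts.get(row); // fills one complete row and advances the view
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        int n = 4;
        int p = 2;

        // Write a small test matrix where element (i, j) holds i * n + j.
        ByteBuffer out = ByteBuffer.allocate(n * n * Short.BYTES);
        for (int k = 0; k < n * n; k++) {
            out.putShort((short) k);
        }
        Path file = Files.createTempFile("matrix", ".bin");
        Files.write(file, out.array());

        for (int task = 0; task < p; task++) {
            int[] range = rowRange(n, p, task);
            short[][] block = readRows(file, n, range[0], range[1]);
            System.out.println("task " + task + ": rows starting at " + range[0]
                    + ", block has " + block.length + " rows of " + block[0].length + " values");
        }
        Files.delete(file);
    }
}
```

What I am asking is whether Flink can give each parallel task this kind of
row-aligned block, whether through readHadoopFile or some other input format.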

Thank you

On Wed, Jan 20, 2016 at 1:31 AM, Chiwan Park <chiwanpark@apache.org> wrote:

> Hi Saliya,
>
> You can use Hadoop input formats in Flink via the readHadoopFile
> method. The method returns a DataSet of type Tuple2<Key, Value>.
> Note that the Flink equivalent of a MapReduce job is composed of map,
> groupBy, and reduceGroup transformations.
>
> > On Jan 20, 2016, at 3:04 PM, Suneel Marthi <smarthi@apache.org> wrote:
> >
> > I guess you are looking for Flink's BinaryInputFormat, to be able to read
> > blocks of data from HDFS.
> >
> >
> https://ci.apache.org/projects/flink/flink-docs-release-0.10/api/java/org/apache/flink/api/common/io/BinaryInputFormat.html
> >
> > On Wed, Jan 20, 2016 at 12:45 AM, Saliya Ekanayake <esaliya@gmail.com>
> wrote:
> > Hi,
> >
> > I am trying to use Flink to perform a parallel batch operation on an NxN
> > matrix represented as a binary file. Each (i,j) element is stored as a Java
> > Short value. In a typical Hadoop MapReduce program, each map task
> > reads a block of rows of this matrix, performs a computation on that
> > block, and emits the result to the reducer.
> >
> > How is this done in Flink? I am new to Flink and couldn't find a binary
> > reader so far. Any help is greatly appreciated.
> >
> > Thank you,
> > Saliya
> >
>
> Regards,
> Chiwan Park
>
>


-- 
Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
Cell 812-391-4914
http://saliya.org
