hadoop-common-user mailing list archives

From "Dan Tamowski" <tamowsk...@gmail.com>
Subject Hadoop For Image Analysis/Vectorization
Date Fri, 21 Mar 2008 15:29:47 GMT

Forgive me if I am missing something in the documentation, but nothing is
jumping out at me.

I am exploring the use of Hadoop for image analysis and/or image
vectorization and have a few questions. I anticipate that there will be a
large collection of image files as input with an equal number of output
files. All files will be in raw binary format and are independent of each
other. What I am trying to figure out is:

-Does Hadoop/MR offer a clean abstraction for both consuming and producing a
large number of files? (I know it can handily consume a large number of
files, but all examples of output seem to form a single file)
-Does Hadoop provide the input/output formats relevant to this or would I
have to create my own? (e.g., non-splittable binary input and binary output)
-Is this problem even well suited to Hadoop in the first place? This type of
job may only need the map phase, and not the reduce phase, so maybe I'm
looking in the wrong place.

Thank you for your time. Also, I only subscribe to the digest, so if you have
questions for me regarding this, please cc me at tamowski.d@gmail.com.

