hadoop-common-user mailing list archives

From Tom Deutsch <tdeut...@us.ibm.com>
Subject Re: Hadoop and image processing?
Date Thu, 03 Mar 2011 15:00:59 GMT
Like Brian, I'd suggest it depends on what you are doing with the 
images, but we have used Hadoop specifically for this purpose in several 
solutions we built for advanced image processing. Both the ability to 
scale out to large data volumes and (in our case) the compute needed for 
image classification were well suited to Hadoop.


------------------------------------------------
Tom Deutsch
Program Director
CTO Office: Information Management
Hadoop Product Manager / Customer Exec
IBM
3565 Harbor Blvd
Costa Mesa, CA 92626-1420
tdeutsch@us.ibm.com




Brian Bockelman <bbockelm@cse.unl.edu>
03/03/2011 06:42 AM
Please respond to: common-user@hadoop.apache.org

To: common-user@hadoop.apache.org
cc:
Subject: Re: Hadoop and image processing?

On Mar 3, 2011, at 1:23 AM, nigelsandever@btconnect.com wrote:

> How applicable would Hadoop be to the processing of thousands of large 
> (60-100MB) 3D image files accessible via NFS, using a 100+ machine 
> cluster?
> 
> Does the idea have any merit at all?
> 

It may be a good idea.  If you think the above is a viable architecture 
for data processing, then you likely don't "need" Hadoop because your 
problem is small enough, or you spent way too much money on your NFS 
server.

Whether or not you "need" Hadoop for data scalability - petabytes of data 
moved at gigabytes a second - is a small aspect of the question.

Hadoop is a good data processing platform in its own right.  Traditional 
batch systems tend to have very Unix-friendly APIs for data processing 
(you'll find yourself writing Perl scripts that create text submit files, 
shell scripts, and C code), but appear clumsy to "modern developers" (this 
is speaking as someone who lives and breathes batch systems).  Hadoop has 
"nice" Java APIs and is Java developer friendly, has a lot of data 
processing concepts built in compared to batch systems, and extends OK to 
other languages.
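
To illustrate the "other languages" point, here is a minimal sketch of a 
mapper that could be run under Hadoop Streaming, which feeds input records 
to a script on stdin and reads tab-separated key/value lines from stdout. 
It is not from the original thread. It assumes each input record is the 
path of one image file visible to the worker (for example over NFS), and 
process_image() is a hypothetical stand-in that only reports the file size 
where real image analysis would go.

```python
#!/usr/bin/env python3
import os
import sys


def process_image(path):
    # Hypothetical placeholder for real image analysis:
    # here we only return the file size in bytes.
    return os.path.getsize(path)


def map_line(line):
    """Map one input record (an image path) to a 'key<TAB>value' string."""
    path = line.strip()
    if not path:
        return None  # skip blank records
    try:
        return f"{path}\t{process_image(path)}"
    except OSError:
        # Emit a marker rather than failing the whole task on one bad file.
        return f"{path}\tERROR"


def run(stdin=sys.stdin, stdout=sys.stdout):
    # Hadoop Streaming delivers one record per line on stdin and
    # collects whatever the script writes to stdout.
    for line in stdin:
        record = map_line(line)
        if record is not None:
            stdout.write(record + "\n")


# Only consume stdin when input is actually piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    run()
```

A script like this would typically be passed to the streaming jar that 
ships with your Hadoop distribution via its -mapper option, with -input 
pointing at a text file that lists one image path per line. The exact jar 
location depends on the installation.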

If you write your image processing in Java, it would be silly to not 
consider Hadoop.  If you currently run a bag full of shell scripts and C++ 
code, it's a tougher decision to make.

Brian
