hadoop-mapreduce-user mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: Using HADOOP for Processing Videos
Date Wed, 21 Sep 2011 17:41:02 GMT
Dr. Rajen Bhatt,

I don't know of any example code that will split a video file for Hadoop.  If you write some,
please start a GitHub project or something like it and let us know, because I can see
it being extremely useful for others wanting to do multimedia processing.

You might want to look at ffmpeg (libavformat/libavcodec).  I am not sure how this would work
completely, but in the splitter you could open the file, get its total length in
time or frames, and divide that up per mapper.  Then each mapper would have libavformat/libavcodec
open the file again and seek to its own start time or frame.  I am not sure how efficient
libavformat/libavcodec are at seeking.  It would be bad if they actually read the entire file
to get to a given point.  It would also be good if the splitter could find
the byte offset into the file for a given time or frame, so that you can give hints to Hadoop
about where to place each mapper.  This also assumes that you can set up something to bridge
HDFS to libavformat, probably using JNI.
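
To make the "divide it up per mapper" step concrete, here is a minimal sketch in plain Java.
It only does the arithmetic: given a total frame count, it produces one contiguous frame range
per mapper.  The class and method names (FrameRangeSplitter, FrameRange, split) are made up for
illustration and are not part of Hadoop or ffmpeg.  It assumes the total frame count has already
been obtained some other way, and it does not touch the video file at all.  With compressed
formats a decoder can usually only start cleanly at a keyframe, so real ranges would probably
have to be adjusted to keyframe boundaries; that is not handled here.

```java
import java.util.ArrayList;
import java.util.List;

public class FrameRangeSplitter {

    /** Half-open frame range [start, end) assigned to one mapper (hypothetical type). */
    public static final class FrameRange {
        public final long start;
        public final long end;

        FrameRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /**
     * Divides totalFrames into at most numMappers contiguous ranges whose
     * sizes differ by at most one frame. Empty ranges are never emitted.
     */
    public static List<FrameRange> split(long totalFrames, int numMappers) {
        if (totalFrames < 0 || numMappers <= 0) {
            throw new IllegalArgumentException("totalFrames must be >= 0 and numMappers > 0");
        }
        List<FrameRange> ranges = new ArrayList<>();
        long base = totalFrames / numMappers;   // frames every mapper gets
        long extra = totalFrames % numMappers;  // first 'extra' mappers get one more
        long start = 0;
        for (int i = 0; i < numMappers; i++) {
            long length = base + (i < extra ? 1 : 0);
            if (length == 0) {
                break; // fewer frames than mappers: stop instead of emitting empty ranges
            }
            ranges.add(new FrameRange(start, start + length));
            start += length;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Example: 1000 frames spread over 3 mappers.
        for (FrameRange range : split(1000, 3)) {
            System.out.println(range);
        }
    }
}
```

In a real job each range would be carried by an input split, and the mapper would ask the
decoder to seek to the start of its range before reading.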

--Bobby Evans

On 9/20/11 11:12 PM, "Rajen Bhatt (RBEI/EST1)" <Rajen.Bhatt@in.bosch.com> wrote:

Dear Bobby:
Thanks for your reply.
My problem is that I have just one file, which is very large, typically in terabytes or petabytes,
and I want to split it, feed the pieces to mappers, and process them on worker nodes. If somebody
has an example class available, it would help the student.
Thanks and Regards,

Dr. Rajen Bhatt
(Corporate Research @ Robert Bosch, India)
Off: +91-80-4191-6699
Mob: +91-9901241005

From: Robert Evans [mailto:evans@yahoo-inc.com]
Sent: Tuesday, 20. September 2011 11:32 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Using HADOOP for Processing Videos

Another thing to think about is that you may not need to split the videos at all.  If you
have lots of video files instead of a few big ones, and each can more or less be processed
independently, then you can use something like NLineInputFormat (not necessarily NLineInputFormat
itself, but something like it) to process each video separately.  You would have to write the code
to read in the video file, but there are APIs to do that, like OpenCV.  This is what I did
in the past to train and score machine learned classifiers on image and video files using
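
As a rough illustration of the "one video per mapper" idea, the sketch below takes a list of
video paths (one per line, as they might appear in a text file fed to the job) and groups them
into chunks of N lines, so that each chunk could be handed to a separate mapper.  This is plain
Java showing the grouping only, not Hadoop's NLineInputFormat implementation.  The class name
PathListSplitter, the method group, and the example paths are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PathListSplitter {

    /**
     * Groups the given lines into consecutive chunks of linesPerSplit lines.
     * The last chunk may be shorter. With linesPerSplit = 1, every video
     * path ends up in its own chunk.
     */
    public static List<List<String>> group(List<String> paths, int linesPerSplit) {
        if (linesPerSplit <= 0) {
            throw new IllegalArgumentException("linesPerSplit must be > 0");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < paths.size(); i += linesPerSplit) {
            int end = Math.min(i + linesPerSplit, paths.size());
            // Copy the sublist so each chunk is independent of the input list.
            chunks.add(new ArrayList<>(paths.subList(i, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Hypothetical paths; in practice these would be read from a listing file.
        List<String> paths = Arrays.asList(
                "/videos/cam1.mp4", "/videos/cam2.mp4", "/videos/cam3.mp4");
        for (List<String> chunk : group(paths, 1)) {
            System.out.println(chunk);
        }
    }
}
```

Each mapper would then open the video named in its chunk with whatever reading API is
available and process it independently of the others.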

--Bobby Evans

On 9/19/11 11:54 PM, "Swathi V" <swathiv@zinniasystems.com> wrote:
This link might help you...
example <http://musicmachinery.com/2011/09/04/how-to-process-a-million-songs-in-20-minutes/>

On Tue, Sep 20, 2011 at 9:52 AM, Rajen Bhatt (RBEI/EST1) <Rajen.Bhatt@in.bosch.com> wrote:
Dear MapReduce User Groups:
We want to process a large amount of video (typically 30 days of stored video, around
1TB in size) using Hadoop.
Can somebody point me to code samples or classes which can take video files in their original
compressed format (H.264, MPEG-4) and then process them using Mappers?
Thanks and Regards,

Dr. Rajen Bhatt
(Corporate Research @ Robert Bosch, India)
Off: +91-80-4191-6699
Mob: +91-9901241005
