hadoop-common-user mailing list archives

From "Usman Waheed" <usm...@opera.com>
Subject Re: Extracting data from HDFS and displaying stats to a webpage
Date Thu, 09 Jul 2009 06:26:10 GMT
Thanks Christophe, Amr and Ted for your recommendations.

> Hey Usman, your second approach is on the right track. You don't want
> to have your end users interacting directly with HDFS. The latency is
> too high, and it wasn't designed for this.
> OTOH, running a "script" (a mapreduce, streaming, pig or hive job) on
> a regular basis and populating a database table is common practice and
> a great way to provide interactive access to summary/stats data. You
> can use the DBOutputFormat to make this even easier. You'll find
> DBOutputFormat and other database tools like Sqoop in Cloudera's
> Distro.
> Cheers,
> Christophe
> On Wed, Jul 8, 2009 at 3:26 PM, Usman Waheed<usmanw@opera.com> wrote:
>> Hi All,
>> Is there a recommended way on how to extract data from HDFS and perform some
>> computations on the data in order to display the results on a webpage. One
>> thing that comes to my mind is to write simple CGI perl scripts that extract
>> the data from HDFS and perform computational work on the data before sending
>> the results to the browser.
>> or
>> Maybe run some scripts in the background that summarize the data in HDFS and
>> insert into a DB table. Can then write a web GUI that interacts with the DB
>> table and displays the desired stats with graphs using ploticus. Our data
>> set in HDFS will eventually grow so speed will be important.
>> Thanks,
>> Usman
>> --
>> Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
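
For illustration only, here is a minimal sketch of the second approach discussed above: a background script summarizes records and writes the totals to a database table, and the web page reads only that table. Everything specific in it is an assumption, not something from this thread: the tab-separated "date, payload" record layout, the daily_stats table, the function names, and the use of SQLite in place of whatever database is actually deployed. In practice the input lines would come out of HDFS (for example piped from "hadoop fs -cat"), or the summary would be produced by a MapReduce, Pig or Hive job as Christophe suggests.

```python
import sqlite3
from collections import Counter


def summarize(lines):
    """Count records per day.

    Assumes a hypothetical tab-separated layout: 'YYYY-MM-DD<TAB>payload'.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        day = line.split("\t", 1)[0]
        counts[day] += 1
    return counts


def store(conn, counts):
    """Write the per-day totals to a summary table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_stats ("
        "day TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
    )
    # INSERT OR REPLACE lets the job be re-run without duplicating rows.
    conn.executemany(
        "INSERT OR REPLACE INTO daily_stats (day, hits) VALUES (?, ?)",
        sorted(counts.items()),
    )
    conn.commit()


def fetch_stats(conn):
    """What the web front end would call: a cheap read of the summary table."""
    return conn.execute(
        "SELECT day, hits FROM daily_stats ORDER BY day"
    ).fetchall()


# Stand-in for data read from HDFS; these records are made up.
sample = [
    "2009-07-08\tGET /a",
    "2009-07-08\tGET /b",
    "2009-07-09\tGET /a",
]

conn = sqlite3.connect(":memory:")  # a real setup would use a shared DB
store(conn, summarize(sample))
rows = fetch_stats(conn)
print(rows)
```

The point of the split is that the expensive pass over the raw data runs on a schedule, while each page view touches only a small, indexed table, so response time should not grow with the size of the data set in HDFS.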