hadoop-general mailing list archives

From Christian Schäfer <syrious3...@yahoo.de>
Subject Re: Hadoop Case Studies with interactive applications...an antagonism?
Date Fri, 19 Aug 2011 18:44:22 GMT
Hi Bobby,

thanks for the information provided :)

I'm glad there are some ways to use Hadoop + HBase. I was a bit afraid I 
had to discard that mighty tool (in my project).

As I'm still at the beginning of learning Hadoop, I have one basic question: 
Is every query I send via Hive to HBase executed in the background as a 
map/reduce job, or does it work in another (more efficient) way? (I know RTFM 
would be an appropriate answer, but I have searched and have not found the 
"answer" yet.)

The Mesos and Storm stuff looks interesting. I will take it into account for my 
evaluation if possible.

Somehow I think Pig + Hive + Cloudera tools will be implemented later because of 
their proven technology, high-level languages, tooling and the possibility of 
getting support.

But I will check out Spark and Storm, as they seem to have some interesting 
concepts :)

regards
Christian







________________________________
From: Robert Evans <evans@yahoo-inc.com>
To: "general@hadoop.apache.org" <general@hadoop.apache.org>
Sent: Friday, 19 August 2011, 17:35:08
Subject: Re: Hadoop Case Studies with interactive applications...an antagonism?

Christian,

Hadoop is best for batch processing because it is optimized for that use case.  
It is not that it cannot handle small jobs, but those jobs tend to be somewhat 
slower than on other systems and less consistent in their processing time than 
some use cases really need.  You can get around this somewhat by 
over-provisioning your grid.

If you want to monitor sensor data, Hadoop should be able to handle it, 
so long as your SLAs are not extremely tight.  This is especially true as the 
size of your data grows.  You might want to look at HBase.  It can be very fast 
and interactive, and because it stores the data in HDFS you can process it with 
Map/Reduce if you need to.  There are also a number of interactive/fast 
processing solutions on top of HDFS that are either available now or should be 
soon, once MRv2 stabilizes some more.  Look at Spark, which is part of the Mesos 
project at Berkeley (www.mesosproject.org).  Another thing to look at is Hive or 
Pig if you want to be able to query the data with a higher-level language.
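
To make the batch side of this concrete, here is a minimal sketch in plain 
Python of the map/reduce pattern applied to sensor readings. It is not Hadoop 
or HBase API code; the names READINGS, map_reading, reduce_max, run_job and 
sensors_over, the sample values and the threshold are all hypothetical and 
only illustrate the idea of grouping by sensor and then checking a limit.

```python
from collections import defaultdict

# Hypothetical sample data: (sensor_id, timestamp, value).
READINGS = [
    ("s1", 1, 20.5),
    ("s1", 2, 21.0),
    ("s2", 1, 79.0),
    ("s2", 2, 85.0),
]


def map_reading(reading):
    """Map step: emit one (key, value) pair per reading, keyed by sensor."""
    sensor_id, _timestamp, value = reading
    yield sensor_id, value


def reduce_max(sensor_id, values):
    """Reduce step: collapse all values of one sensor to their maximum."""
    return sensor_id, max(values)


def run_job(readings):
    """Simulate the shuffle: group mapped pairs by key, then reduce each group."""
    groups = defaultdict(list)
    for reading in readings:
        for key, value in map_reading(reading):
            groups[key].append(value)
    return dict(reduce_max(key, values) for key, values in groups.items())


def sensors_over(maxima, threshold):
    """Return the sorted ids of sensors whose maximum exceeds the threshold."""
    return sorted(s for s, peak in maxima.items() if peak > threshold)


maxima = run_job(READINGS)
alerts = sensors_over(maxima, 80.0)  # hypothetical alert threshold
```

In a real cluster the map and reduce steps would run in parallel over the whole 
data set, which is why a job like this suits periodic batch checks better than 
very tight response times.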

Another solution that looks very interesting, once it is released as open source, 
is Storm: 
http://engineering.twitter.com/2011/08/storm-is-coming-more-details-and-plans.html
It looks like it could be modified a bit to run under YARN (MRv2), and then you 
can store your modules' state in HBase.  That would complement Hadoop's MapReduce 
processing very nicely and do a lot of what you are looking at doing in real 
time.

--Bobby

On 8/19/11 8:06 AM, "Christian Schäfer" <syrious3000@yahoo.de> wrote:

Hi Hadoopians,

I'm a noob in Hadoop (what a rhyme) and have some questions relating to the
white papers posted on cloudera.com, as follows:


  In IQT Quarterly: "Hadoop: Scalable, Flexible Data Storage and Analysis" by
Mike Olson

    I see a contradiction when comparing the case studies with the following
pros & cons of Hadoop.

    pros: Hadoop (M/R) is mostly used in batch operation (running minutes or
hours to complete)
    cons: Hadoop (M/R) is not usable for interactive applications

    and the OpenPDC case study, where it is used for monitoring and for
reacting quickly:
        "Close monitoring and rapid response to changes in the state of the grid
allow utilities to minimize or prevent blackouts,"

    another case study from "Ten Common Hadoopable Problems - Real-World Hadoop
Use Cases":
        "Fast detection allows the bank to protect itself from considerable
losses."

If there is a better non-commercial place to ask these questions, please let me
know.

Background: I intend to set up a system for another domain where lots of
sensor data needs to be stored
and queried to implement monitoring and detect problem situations.

kind regards
Christian