hadoop-mapreduce-user mailing list archives

From Abhishek Shivkumar <abhisheksgum...@gmail.com>
Subject Re: Compare Hadoop and Pig Map\Reduce
Date Tue, 31 Jul 2012 17:13:18 GMT
Hi Manoj,

   Pig is basically a data-flow language for high-level operations such as
summarization and basic analysis on data residing in HDFS. Its language is
called Pig Latin. It gives your HDFS data a data-warehouse kind of
perspective, and lets you do a data analysis job by writing simple scripts.

   Pig Latin is easy to learn, and one doesn't necessarily need to know
MapReduce to write and run it. It is important to note that when Pig
scripts are run, they are translated into MapReduce jobs internally. So,
in the end, you are still using MapReduce under the hood.
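
To give a rough idea of what "translated into MapReduce jobs" means, here
is a plain-Python sketch (no Hadoop or Pig involved) of the map, shuffle and
reduce phases behind a simple group-and-count. In Pig Latin this would be
roughly a LOAD, a GROUP ... BY and a FOREACH ... GENERATE COUNT. The function
names and the sample records are made up for illustration, and the plan Pig
actually generates is more involved than this.

```python
from collections import defaultdict

def map_phase(records):
    """Emit a (key, 1) pair for every input record."""
    for user, _page in records:
        yield user, 1

def shuffle_phase(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

def count_by_user(records):
    """Chain the three phases into one group-and-count job."""
    return reduce_phase(shuffle_phase(map_phase(records)))

# Hypothetical sample data: (user, page) visit records.
visits = [("alice", "/home"), ("bob", "/home"), ("alice", "/cart")]
print(count_by_user(visits))  # {'alice': 2, 'bob': 1}
```

With Pig you only describe the grouping and the count; the three phases
above are what you would otherwise write and wire together yourself.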

    On the other hand, you write MapReduce directly for a job that is not
simple to express as a Pig Latin script. For this, you will need to design
the MapReduce job yourself: deciding how many reducers you need, and
designing the combiner, partitioner and grouping class, mostly to address
performance concerns.
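
As a small illustration of two of these design decisions, here is a
plain-Python sketch (again, not Hadoop code) of a combiner that
pre-aggregates counts on the map side and a partitioner that decides which
reducer receives each key. The CRC32-based hash is an assumption chosen so
the result is stable across runs; as far as I know, Hadoop's default
HashPartitioner does the same kind of thing with the key's hashCode modulo
the number of reduce tasks. All names and data below are hypothetical.

```python
import zlib
from collections import Counter

def partition(key, num_reducers):
    """Pick a reducer index for a key using a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def combine(pairs):
    """Pre-aggregate (key, count) pairs before they leave the mapper."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return list(totals.items())

def route(pairs, num_reducers):
    """Place each (key, count) pair in the bucket of its reducer."""
    buckets = [[] for _ in range(num_reducers)]
    for key, count in pairs:
        buckets[partition(key, num_reducers)].append((key, count))
    return buckets

# Hypothetical output of a single mapper.
map_output = [("alice", 1), ("bob", 1), ("alice", 1), ("alice", 1)]
combined = combine(map_output)   # fewer pairs to send over the network
buckets = route(combined, 2)     # one bucket per reducer
print(combined)
print(buckets)
```

The combiner cuts down what is shuffled between machines, and the
partitioner controls how evenly the keys are spread over the reducers. Pig
makes these choices for you, which is convenient but gives you less control.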

    Of course, it is easier to run jobs using Pig scripts, but it may not
be possible to write everything in Pig.

Hope this helps.

Thank you!

With Regards,
Abhishek S


On Tue, Jul 31, 2012 at 10:37 PM, Manoj Babu <manoj444@gmail.com> wrote:

> Hi,
>
> It would be great if any of you could compare Pig and Hadoop MapReduce.
> When should we go for Hadoop, and when for Pig?
> I love to program in Java, but people (even my boss) argue that the same
> can easily be achieved in Pig with very few lines of code.
> I am new to Hadoop development. Could you kindly provide the pros and cons?
>
> Cheers!
> Manoj.
>
>
>
