hive-user mailing list archives

From Mohan Krishna <>
Subject Re: Question
Date Thu, 04 Dec 2014 03:25:09 GMT
Hi Gabriel/Bill,
I understand that Pig and Hive are alternatives to hand-written MapReduce
that help users of big data systems who don't have any Java knowledge. Since
MapReduce handles both structured and unstructured data, I just wanted to
know which of Pig and Hive handles unstructured data and which one handles
structured data.

Also, please clarify the points below.

1) As Hive resembles SQL in how it processes data, can you please let me know
what the main differences between Hive and SQL are?

2) Why did Hive come into the picture when we already have SQL in the market?

I would be glad if these questions could be answered at the earliest. Thanks.

On Thu, Dec 4, 2014 at 8:19 AM, Gabriel Eisbruch <> wrote:

> Mohan,
>    I think that depends on your use case and how comfortable you are with
> the technologies. I am not a Pig expert, but as Bill said, Hive and Pig are
> ways to make data processing easier. You can write MapReduce to do any
> Hive or Pig task and more; it is definitely more versatile, but also harder
> to use. Because of that, I prefer to say that we should use the right
> tool for the right problem. If you are comfortable with SQL, your data can
> be mapped to a Hive schema, and your processing can be expressed in SQL,
> Hive could be a great tool. If you feel better with scripting, Pig could be
> a great tool. Otherwise, if you need to do more complex processing,
> MapReduce, Spark, or another framework could be a better fit.
> Gabriel.
> 2014-12-03 23:02 GMT-03:00 Mohan Krishna <>:
>> Thank you, Gabriel.
>> Your answers are very useful to me. Thanks a lot.
>> On Thu, Dec 4, 2014 at 7:29 AM, Bill Busch <> wrote:
>>> MapReduce can be used for both structured and unstructured data. Hive
>>> is a storage and retrieval mechanism (e.g. a database). The trouble with
>>> an RDBMS is that you either have to parse the unstructured data into a
>>> structured row/column format OR store it as an object. Both approaches
>>> have issues, in performance and semantically. Hence, there is a whole
>>> world of NoSQL databases out there that are not row-column structured.
>>> These databases can handle more schema-less/unstructured objects and will
>>> allow you to manipulate your information more elegantly.
>>>      I would check out the Wikipedia page on NoSQL databases and focus on
>>> key-value, columnar, or document databases.
>>> ------------------------------
>>> Date: Thu, 4 Dec 2014 07:06:16 +0530
>>> Subject: Re: Question
>>> From:
>>> To:
>>> Thanks, Gabriel, for the prompt response.
>>> I see online blogs saying MapReduce is for unstructured data, Pig is for
>>> semi-structured data, and Hive is only for structured data. Can you please
>>> explain whether this is accurate?
>>> Thanks in advance.
>>> On Thu, Dec 4, 2014 at 6:56 AM, Gabriel Eisbruch <> wrote:
>>> Hi Mohan,
>>>    We are using Hive for unstructured (or semi-structured) data by using
>>> map columns. For example, we use standard columns for fixed data and map
>>> columns for dynamic data.
>>> Gabriel.
>>> 2014-12-03 22:19 GMT-03:00 Mohan Krishna <>:
>>> Is Hive only for structured data, or does it handle unstructured data
>>> as well?
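
The layout Gabriel describes, standard columns for fixed data plus a map
column for dynamic data, can be sketched in plain Python. This is only an
illustration of the idea, not Hive code: the EventRow type, the field names,
and the 'id|type|k=v,k=v' input format are all hypothetical. The attrs dict
plays the role that a Hive MAP<STRING,STRING> column would play.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class EventRow:
    # Fixed data: fields every record has, like ordinary typed columns.
    event_id: int
    event_type: str
    # Dynamic data: arbitrary key/value pairs, like a map column.
    attrs: Dict[str, str] = field(default_factory=dict)


def parse_line(line: str) -> EventRow:
    """Parse a hypothetical 'id|type|k1=v1,k2=v2' record into a row."""
    event_id, event_type, raw_attrs = line.split("|", 2)
    attrs: Dict[str, str] = {}
    for pair in raw_attrs.split(","):
        if pair:  # skip empty segments so records with no attributes work
            key, _, value = pair.partition("=")
            attrs[key] = value
    return EventRow(int(event_id), event_type, attrs)


rows = [
    parse_line("1|click|page=home,browser=firefox"),
    parse_line("2|purchase|sku=A17"),
]

# Filter on a dynamic key; rows that lack the key are simply skipped.
home = [r.event_id for r in rows if r.attrs.get("page") == "home"]
```

Each row keeps the same fixed fields, while the set of keys in attrs can
differ from row to row without any schema change.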
