hive-user mailing list archives

From Gabriel Eisbruch <>
Subject Re: Question
Date Thu, 04 Dec 2014 02:49:05 GMT
   I think that depends on your use case and how comfortable you are with
each technology. I am not a Pig expert, but as Bill said, Hive and Pig exist
to make data processing easier. You can write MapReduce to do any Hive or
Pig task and more; it is definitely more versatile, but also harder to use.
Because of that, I prefer to say that we should use the right tool for the
right problem. If you are comfortable with SQL, your data can be mapped to a
Hive schema, and your processing can be expressed in SQL, Hive could be a
great tool. If you prefer scripting, Pig could be a great tool. Otherwise,
if you need more complex processing, MapReduce, Spark, or something else
could be a better fit.


2014-12-03 23:02 GMT-03:00 Mohan Krishna <>:

> Thank you, Gabriel.
> Your answers are very useful to me. Thanks a lot.
> On Thu, Dec 4, 2014 at 7:29 AM, Bill Busch <> wrote:
>> MapReduce can be used for both structured and unstructured data. Hive is
>> a storage and retrieval mechanism (e.g. a database). The trouble with an
>> RDBMS is that you either have to parse the unstructured data into a
>> structured row/column format OR store it as an object. Both approaches
>> have performance and semantic issues. Hence, a whole world of NoSQL
>> databases has been developed that are not row-column structured. These
>> databases can handle more schema-less/unstructured objects and will
>> allow you to manipulate your information more elegantly.
>>      I would check out the Wikipedia page on NoSQL databases and focus on
>> Key-Value, Columnar, or Document databases.
>> ------------------------------
>> Date: Thu, 4 Dec 2014 07:06:16 +0530
>> Subject: Re: Question
>> From:
>> To:
>> Thanks Gabriel for the prompt response
>> I see online blogs saying MapReduce is for unstructured data, Pig is for
>> semi-structured data, and Hive is only for structured data. Can you
>> please justify this?
>> Thanks in advance
>> On Thu, Dec 4, 2014 at 6:56 AM, Gabriel Eisbruch <> wrote:
>> Hi Mohan,
>>    We are using Hive for unstructured (or semi-structured) data by using
>> map columns. For example, we use standard columns for fixed data and map
>> columns for dynamic data.
>> Gabriel.
>> 2014-12-03 22:19 GMT-03:00 Mohan Krishna <>:
>> Is Hive only for structured data, or does it handle unstructured data as
>> well?
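
The layout Gabriel describes in the thread (standard columns for fixed data, a map column for dynamic data) can be sketched roughly as follows. This is only an illustration in Python, not HiveQL and not his actual schema: the `Event` class, its field names, the `attr_values` helper, and the sample rows are all hypothetical. In Hive the dynamic part would be a column of type MAP, read with a lookup such as `attrs['browser']`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Event:
    # Fixed data: fields every record has, like ordinary typed columns.
    event_id: int
    event_type: str
    # Dynamic data: free-form key/value pairs, like a MAP<STRING,STRING> column.
    attrs: Dict[str, str] = field(default_factory=dict)


def attr_values(rows: List[Event], key: str) -> List[Optional[str]]:
    """Return one dynamic attribute per row, or None where the key is absent."""
    return [row.attrs.get(key) for row in rows]


# Hypothetical sample rows: each one carries a different set of dynamic keys.
rows = [
    Event(1, "click", {"browser": "firefox", "page": "/home"}),
    Event(2, "purchase", {"sku": "A-17"}),
]

# Rows without the requested key yield None instead of failing.
browsers = attr_values(rows, "browser")
```

The point of the design is that new dynamic keys can appear in the data without changing the table definition, while the fixed columns stay strongly typed and easy to filter on.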
