hive-user mailing list archives

From Sanjay Subramanian
Subject Re: Compression in Hive
Date Tue, 11 Jun 2013 23:06:53 GMT
1. We use LZO compression in our MR jobs, which create LZO files (these are NOT sequence files)
 that serve as the feeder files for Hive
2. Then we use the Hive data (LZO files) to run aggregation reports

Hope this helps
Good luck

From: "Ravi Mummulla (BIG DATA)"
Date: Monday, June 10, 2013 6:14 AM
Subject: RE: Compression in Hive

Documentation is here
The performance overhead is trivial for larger amounts of data but becomes proportionally more
noticeable as the data size gets smaller. The gain typically comes from reduced data transfer
between nodes and fewer disk reads/writes. Again, the larger the data size, the greater the gain.
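
To make the size effect concrete, here is a rough sketch in Python. It is only an illustration under several assumptions: it uses zlib from the standard library as a stand-in for LZO or any Hadoop codec, the sample row is made up, and repeating one identical row compresses far better than real data would. The point is just that a tiny input barely shrinks (the codec's fixed overhead dominates), while a large input shrinks a lot.

```python
import zlib


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(payload)) / len(payload)


# Hypothetical tab-separated log row; not taken from any real table.
row = b"2013-06-10\tuser_42\tpage_view\t/products/widgets\n"

small = row             # a single row: fixed codec overhead dominates
large = row * 100_000   # many rows: plenty of redundancy to exploit

small_ratio = compression_ratio(small)
large_ratio = compression_ratio(large)

print(f"small: {len(small):>9} bytes, compressed/original = {small_ratio:.3f}")
print(f"large: {len(large):>9} bytes, compressed/original = {large_ratio:.3f}")
```

Fewer bytes on disk and over the network is where the gain shows up; the CPU time spent compressing and decompressing is the overhead, and it would need to be measured on your own data and codec.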


From: Sachin Sudarshana
Sent: Sunday, June 9, 2013 11:04 PM
Subject: Compression in Hive


I have been testing the usefulness of compression in Hive, and I have a general question.

I would like to know whether there are any particular cases where compression in Hive
actually proves useful when running MR jobs.

Any pointers/examples would really be useful!

Thank you,
