hadoop-user mailing list archives

From Robert Rapplean <robert.rappl...@trueffect.com>
Subject RE: Viewing snappy compressed files
Date Wed, 22 May 2013 15:32:03 GMT
Thanks! This shortcuts my current process considerably, and should take the pressure off for
the short term. I'd still like to be able to analyze the data in a Python script without having
to make a local copy, but that can wait.

Best,

Robert Rapplean
Senior Software Engineer
303-872-2256  direct  | 303.438.9597  main | www.trueffect.com

From: Sanjay Subramanian [mailto:Sanjay.Subramanian@wizecommerce.com]
Sent: Tuesday, May 21, 2013 11:56 AM
To: user@hadoop.apache.org
Subject: Re: Viewing snappy compressed files

+1 Thanks Rahul-da

Or you can use
hdfs dfs -text /path/to/dir/on/hdfs/part-r-00000.snappy | less
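
For reading the data from a Python script without keeping a local copy, one option is to run the command above as a subprocess and consume its output line by line. This is only a minimal sketch: the helper names hdfs_text_cmd and stream_lines are made up for illustration, and it assumes the hdfs client is on the PATH and that the decoded output is line-oriented text.

```python
import subprocess
from typing import Iterator, List


def hdfs_text_cmd(path: str) -> List[str]:
    """Build the argv for `hdfs dfs -text <path>` (the command from this thread)."""
    return ["hdfs", "dfs", "-text", path]


def stream_lines(cmd: List[str]) -> Iterator[str]:
    """Run cmd and yield its stdout one line at a time, without buffering it all."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        # Always release the pipe and reap the child, even if the caller stops early.
        proc.stdout.close()
        ret = proc.wait()
    if ret != 0:
        raise subprocess.CalledProcessError(ret, cmd)
```

A caller would loop over stream_lines(hdfs_text_cmd("/path/to/dir/on/hdfs/part-r-00000.snappy")) and process each record as it arrives, so nothing is written to local disk.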


From: Rahul Bhattacharjee <rahul.rec.dgp@gmail.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Tuesday, May 21, 2013 9:52 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: Viewing snappy compressed files

I haven't tried this with Snappy, but you can try using hadoop fs -text <path>

On Tue, May 21, 2013 at 8:28 PM, Robert Rapplean <robert.rapplean@trueffect.com> wrote:
Hey, there. My Google skills have failed me, and I hope someone here can point me in the right
direction.


We're storing data on our Hadoop cluster in Snappy compressed format. When we pull a raw file
down and try to read it, however, the Snappy libraries don't know how to read the files. They
tell me that the stream is missing the snappy identifier. I tried inserting 0xff 0x06 0x00
0x00 0x73 0x4e 0x61 0x50 0x70 0x59 into the beginning of the file, but that didn't do it.

Can someone point me to resources for figuring out how to uncompress these files without going
through Hadoop?
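
A likely reason the stream identifier did not help: files written through Hadoop's Snappy codec are generally not in the Snappy framing format. As far as I know, Hadoop's block compressor stream wraps raw Snappy data in its own layout: a 4-byte big-endian uncompressed block size, followed by one or more chunks, each a 4-byte big-endian compressed length plus raw Snappy bytes. The sketch below walks that assumed layout. The function name is hypothetical, the raw decompressor is passed in by the caller, and none of this applies to SequenceFiles or other container formats.

```python
import struct
from typing import Callable, Iterator


def iter_hadoop_snappy_blocks(
    data: bytes, decompress: Callable[[bytes], bytes]
) -> Iterator[bytes]:
    """Yield decompressed blocks from a buffer in the assumed Hadoop block layout."""
    pos = 0
    while pos < len(data):
        # 4-byte big-endian size of this block once uncompressed.
        (block_size,) = struct.unpack_from(">I", data, pos)
        pos += 4
        produced = 0
        parts = []
        while produced < block_size:
            # 4-byte big-endian length of the next compressed chunk.
            (chunk_len,) = struct.unpack_from(">I", data, pos)
            pos += 4
            chunk = data[pos:pos + chunk_len]
            if len(chunk) != chunk_len:
                raise ValueError("truncated chunk")
            pos += chunk_len
            plain = decompress(chunk)
            if not plain:
                # Guard against looping forever on malformed input.
                raise ValueError("chunk decompressed to nothing")
            produced += len(plain)
            parts.append(plain)
        yield b"".join(parts)
```

With the python-snappy package installed, passing snappy.uncompress as the decompress argument and joining the yielded blocks should recover the original bytes, provided the file really uses this layout.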

________________________________________
Robert Rapplean
Senior Software Engineer
303-872-2256  direct  | 303.438.9597  main | www.trueffect.com

