hive-user mailing list archives

From Sanjay Subramanian <Sanjay.Subraman...@wizecommerce.com>
Subject Re: Snappy with Hive
Date Thu, 23 May 2013 16:49:00 GMT
Thanks Bejoy. I tracked down the issue: there was an earlier table (with an LZO definition) that
I had not dropped and recreated, so feeding Snappy input to that table was causing the problems.
Regards
sanjay

From: bejoy_ks@yahoo.com
Reply-To: user@hive.apache.org, bejoy_ks@yahoo.com
Date: Thursday, May 23, 2013 7:31 AM
To: user@hive.apache.org
Subject: Re: Snappy with Hive

Hi

Please find responses below.

Do I have to give some INPUTFORMAT directive to make the Hive table read Snappy codec files?
For example, for LZO it is
STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"

Bejoy: No custom input format is required. Add the Snappy codec to io.compression.codecs.
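
A minimal sketch of what that could look like from inside a Hive session. The assumptions here are loud ones: io.compression.codecs is usually set cluster-wide in core-site.xml, whether a session-level override is honored depends on the setup, and the codec list shown is illustrative rather than taken from this thread. Hadoop picks the codec from the file extension, so the MR output files would need to end in .snappy.

```sql
-- Show the current codec list (SET with only a property name prints its value).
SET io.compression.codecs;

-- Illustrative value only: keep the codecs your cluster already lists
-- and append SnappyCodec to them.
SET io.compression.codecs=org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.SnappyCodec;

-- No INPUTFORMAT clause on the table; a plain delimited text table
-- (the one from the original question) is queried as usual.
SELECT hbase_pk FROM outpdir_header_hive_only LIMIT 10;
```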

QUESTION 2
For Hive scripts that will read Snappy files and write Snappy files to Hive tables, are the
following settings enough?
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;

Bejoy: That should be fine. If you run into any issues, add mapred.output.compress=true as well.
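
Put together, a script along these lines is a reasonable starting point. It is a sketch only: the table names snappy_source and snappy_target are hypothetical placeholders, not tables from this thread, and the native Snappy library is assumed to be installed on the cluster nodes.

```sql
-- Compress the final output of Hive queries.
SET hive.exec.compress.output=true;
-- Fallback suggested above, in case the output still comes out uncompressed.
SET mapred.output.compress=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
-- The compression type (NONE, RECORD or BLOCK) applies to SequenceFile output.
SET mapred.output.compression.type=BLOCK;

-- Hypothetical tables: read from one table and write Snappy-compressed
-- files into another.
INSERT OVERWRITE TABLE snappy_target
SELECT * FROM snappy_source;
```
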
Regards
Bejoy KS

Sent from remote device, Please excuse typos
________________________________
From: Sanjay Subramanian <Sanjay.Subramanian@wizecommerce.com>
Date: Tue, 21 May 2013 23:30:09 +0000
To: user@hive.apache.org
ReplyTo: user@hive.apache.org
Subject: Snappy with Hive

Hi guys

QUESTION 1
I have an MR job that creates Snappy Codec Output files.
My table definition is as follows
CREATE EXTERNAL TABLE IF NOT EXISTS outpdir_header_hive_only (
  hbase_pk STRING, header_servername_donotquery STRING, header_date_donotquery STRING,
  header_id STRING, header_hbpk STRING, header_channelId INT,
  header_searchAnnotation STRING, header_continuedSearchFlag INT, header_prodLow INT,
  header_prodTotal INT, header_sort INT, header_view INT,
  header_adNodes INT, header_spellingSuggestion STRING, header_queryType INT,
  header_nodeId INT, header_pinpointPtitleId INT, header_firedSearchRules STRING,
  header_rbAbsentSellers INT, header_shuffled INT, header_searchSessionId STRING,
  header_normalizationFlag STRING, header_relatedItemResultCount INT, header_unrankedSelectedPtitleIds INT,
  header_normKeyword STRING, header_kplEntry INT, header_isSaved STRING,
  header_rawProfileScore DOUBLE, header_normalizedProfileScore INT, header_scorerInfo STRING,
  header_contextNode INT, header_fbId STRING, norm_stem_keyword STRING,
  attrs_origNodeId INT, attrs_mfrId INT, attrs_sellerId INT,
  attrs_otherAttrs STRING, attrs_ptitleId INT, cached_date STRING,
  cached_recordId STRING, cached_visitorId STRING, cached_visit_id STRING,
  cached_appStyle STRING, cached_publisherId INT, cached_IP STRING,
  cached_source STRING, cached_refkw STRING, cached_pixeled INT,
  cached_searchRefineAttrImps STRING, cached_pageType STRING, cached_zipCode STRING,
  cached_zipType STRING, cached_perpage INT
)
PARTITIONED BY (header_date STRING, header_servername STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

Do I have to give some INPUTFORMAT directive to make the Hive table read Snappy codec files?
For example, for LZO it is
STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"


QUESTION 2
For Hive scripts that will read Snappy files and write Snappy files to Hive tables, are the
following settings enough?
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;

Thanks

sanjay

CONFIDENTIALITY NOTICE
======================
This email message and any attachments are for the exclusive use of the intended recipient(s)
and may contain confidential and privileged information. Any unauthorized review, use, disclosure
or distribution is prohibited. If you are not the intended recipient, please contact the sender
by reply email and destroy all copies of the original message along with any attachments,
from your computer system. If you are the intended recipient, please be advised that the content
of this message is subject to access, review and disclosure by the sender's Email System Administrator.
