Return-Path: X-Original-To: apmail-hadoop-mapreduce-user-archive@minotaur.apache.org Delivered-To: apmail-hadoop-mapreduce-user-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 110EA17B78 for ; Fri, 1 May 2015 08:07:55 +0000 (UTC) Received: (qmail 19767 invoked by uid 500); 1 May 2015 08:07:48 -0000 Delivered-To: apmail-hadoop-mapreduce-user-archive@hadoop.apache.org Received: (qmail 19657 invoked by uid 500); 1 May 2015 08:07:47 -0000 Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hadoop.apache.org Delivered-To: mailing list user@hadoop.apache.org Received: (qmail 19646 invoked by uid 99); 1 May 2015 08:07:47 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 01 May 2015 08:07:47 +0000 X-ASF-Spam-Status: No, hits=3.4 required=5.0 tests=FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_REPLY,HTML_MESSAGE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: message received from 54.191.145.13 which is an MX secondary for user@hadoop.apache.org) Received: from [54.191.145.13] (HELO mx1-us-west.apache.org) (54.191.145.13) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 01 May 2015 08:07:42 +0000 Received: from mail-ie0-f178.google.com (mail-ie0-f178.google.com [209.85.223.178]) by mx1-us-west.apache.org (ASF Mail Server at mx1-us-west.apache.org) with ESMTPS id 1ADAE2AB71 for ; Fri, 1 May 2015 07:10:37 +0000 (UTC) Received: by iedfl3 with SMTP id fl3so95099496ied.1 for ; Fri, 01 May 2015 00:10:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=vNgtqFKgMjV0AOEm234kSain3pCVSW4twjHymiO2saA=; b=jNhRuyw7/IlMin4yC1Lfy9OdLQo5fGDoyxd+tK904otckXArkXSgjbiq6CZvpVtMmi X6+pSuWzJwm9uNKJYMCJTEUYh0pmJNmNoCZ7BQXrQZtn3v6zk18YbGhfqsswCEVWoKW8 NJtsj9/JR6yJURf0Nsj234fV2m/PTZ0qDnE3HtmmeTBGag/HYVNcTEzvMPAO6UN7EEl/ UfA8jWSdg9Djpfbp2VntOhL0mTRmYkY64lzEBm96nW29JqRS2FHBdV4H5OQv1LyRIpzo +tUVNOnOnDdA/g7Hb0v/SntbXrZQlViPkLRpRCrYV02NvDWnCoQcUzmWITaytOxa5qST 79CA== MIME-Version: 1.0 X-Received: by 10.50.79.164 with SMTP id k4mr8571356igx.23.1430464236600; Fri, 01 May 2015 00:10:36 -0700 (PDT) Received: by 10.64.86.227 with HTTP; Fri, 1 May 2015 00:10:36 -0700 (PDT) In-Reply-To: References: Date: Fri, 1 May 2015 12:40:36 +0530 Message-ID: Subject: Re: how to load data From: Nitin Pawar To: user@hadoop.apache.org Content-Type: multipart/alternative; boundary=089e01294784afc47a0514ffe865 X-Virus-Checked: Checked by ClamAV on apache.org --089e01294784afc47a0514ffe865 Content-Type: text/plain; charset=UTF-8 Jay can you give first 3 lines of your gz file On Fri, May 1, 2015 at 10:53 AM, Kumar Jayapal wrote: > Alex, > > > I followed the same steps as mentioned in the site. Once I load data into > table which is create below > > > > Created table CREATE TABLE raw (line STRING) PARTITIONED BY (FISCAL_YEAR > smallint, FISCAL_PERIOD smallint) > STORED AS TEXTFILE; > > and loaded it with data. > > LOAD DATA LOCAL INPATH '/tmp/weblogs/20090603-access.log.gz' INTO TABLE > raw; > > > > when I say select * from raw it shows all null values. > > > NULLNULLNULLNULLNULLNULLNULLNULL > NULLNULLNULLNULLNULLNULLNULLNULL > NULLNULLNULLNULLNULLNULLNULLNULL > NULLNULLNULLNULLNULLNULLNULLNULL > Why is not show showing the actual data in file. will it show once I load > it to parque table? > > Please let me know if I am doing anything wrong. > > I appreciate your help. > > > Thanks > jay > > > > Thank you very much for you help Alex, > > > On Wed, Apr 29, 2015 at 3:43 PM, Alexander Pivovarov > wrote: > >> 1. Create external textfile hive table pointing to /extract/DBCLOC and >> specify CSVSerde >> >> if using hive-0.14 and newer use this >> https://cwiki.apache.org/confluence/display/Hive/CSV+Serde >> if hive-0.13 and older use https://github.com/ogrodnek/csv-serde >> >> You do not even need to unzgip the file. hive automatically unzgip data >> on select. >> >> 2. run simple query to load data >> insert overwrite table >> select * from >> >> On Wed, Apr 29, 2015 at 3:26 PM, Kumar Jayapal >> wrote: >> >>> Hello All, >>> >>> >>> I have this table >>> >>> >>> CREATE TABLE DBCLOC( >>> BLwhse int COMMENT 'DECIMAL(5,0) Whse', >>> BLsdat string COMMENT 'DATE Sales Date', >>> BLreg_num smallint COMMENT 'DECIMAL(3,0) Reg#', >>> BLtrn_num int COMMENT 'DECIMAL(5,0) Trn#', >>> BLscnr string COMMENT 'CHAR(1) Scenario', >>> BLareq string COMMENT 'CHAR(1) Act Requested', >>> BLatak string COMMENT 'CHAR(1) Act Taken', >>> BLmsgc string COMMENT 'CHAR(3) Msg Code') >>> PARTITIONED BY (FSCAL_YEAR smallint, FSCAL_PERIOD smallint) >>> STORED AS PARQUET; >>> >>> have to load from hdfs location /extract/DBCLOC/DBCL0301P.csv.gz to >>> the table above >>> >>> >>> Can any one tell me what is the most efficient way of doing it. >>> >>> >>> Thanks >>> Jay >>> >> >> > -- Nitin Pawar --089e01294784afc47a0514ffe865 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Jay can you give first 3 lines of your gz file=C2=A0
=

On Fri, May 1, 20= 15 at 10:53 AM, Kumar Jayapal <kjayapal17@gmail.com> wrot= e:
Alex,

<= div>
I followed the same steps as mentioned in the site. Once= I load data into table which is create below


=

Creat= ed table =C2=A0CREATE TABLE raw (line STRING)= =C2=A0PARTITIONED BY (FISCAL_YEAR =C2=A0smallint, = FISCAL_PERIOD smallint)
STORED AS=C2=A0TEXTFILE;
<= div style=3D"font-size:12.8000001907349px">
and loaded it with data.

= LOAD DATA LOCAL INPATH=C2=A0'/t= mp/weblogs/20090603-access.log.gz'=C2=A0INTO TABLE raw;



when I say select * from raw it shows a= ll null values.

=

NULLNULLNULLNULLNULLNULLNULLNULL
NU= LLNULLNULLNULLNULLNULLNULLNULL

NULLNULLNULLNULLNULLNULLNULLNULL

NULLNULLNULLNULLNULLNULLNULLNUL= L
Why is not show showing the actual data in file.= will it show once I load it to parque table?

Please let= me know if I am doing anything wrong.

I appreciate your h= elp.


Thanks
jay=C2=A0
=



Thank you very much= for you help Alex,

On Wed, Apr 29, 2015 at 3:43 = PM, Alexander Pivovarov <apivovarov@gmail.com> wrote:
=
1. Create external textfile hive table pointing to <= span style=3D"font-size:12.8px">/ex= tract/DBCLOC and specify CSV= Serde

if using hive-0.14 and newer use this https://cwiki.apache.org/confluence/display/Hive/CSV+Serde<= br>
if hive-0.13 and older use https://github.com/ogrodnek/csv-serde
<= br>You do not even need to unzgip the file. hive automatically unzgip data = on select.

2. run simple query to load data
insert ov= erwrite table <orc_table>
select * from <csv_table>

= On Wed, Apr 29, 2015 at 3:26 PM, Kumar Jayapal <kjayapal17@gmail.com> wrote:
Hello All,
<= br>

I have this table=


CREATE =C2=A0TABLE DBCLOC( =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0
= =C2=A0 =C2=A0BLwhse int COMMENT 'DECIMAL(5,0) Whse', =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0
=C2=A0 =C2=A0BLsda= t string COMMENT 'DATE Sales Date', =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0
=C2=A0 =C2=A0BLreg_num smallint COMM= ENT 'DECIMAL(3,0) Reg#', =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0
=C2=A0 =C2=A0BLtrn_num int COMMENT 'DECIMAL(5,0) Trn#', =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0
=C2=A0 =C2=A0BLscnr st= ring COMMENT 'CHAR(1) Scenario', =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0
=C2=A0 =C2=A0BLareq string COMMENT 'CHAR(= 1) Act Requested', =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0=C2=A0
=C2= =A0 =C2=A0BLatak string COMMENT 'CHAR(1) Act Taken', =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0
<= div>=C2=A0 =C2=A0BLmsgc string= COMMENT 'CHAR(3) Msg Code') =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0
PARTITIONED BY (FSCAL_YEAR =C2=A0smallint, FSCAL_= PERIOD smallint)
STORED AS PARQUET;

have = to load from hdfs location =C2=A0/extract/DBCL= OC/DBCL0301P.csv.gz to = the table above=C2=A0


Can any= one tell me what is the most efficient way of doing it.
<= br>

Thanks
= Jay





--
Nitin Pawar
--089e01294784afc47a0514ffe865--