hadoop-common-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nitin Pawar <nitinpawar...@gmail.com>
Subject Re: how to load data
Date Fri, 01 May 2015 13:08:46 GMT
as Akex suggested, Please use row format in your query like
CREATE  TABLE DBCLOC(....) ROW FORMAT DELIMITED   FIELDS TERMINATED BY ','
and give it a try

On Fri, May 1, 2015 at 6:33 PM, Kumar Jayapal <kjayapal17@gmail.com> wrote:

> 106,"2003-02-03",20,2,"A","2","2","037"
> 106,"2003-02-03",20,3,"A","2","2","037"
> 106,"2003-02-03",8,2,"A","2","2","037"
>
>
>
>
>
>
>
>
>
>
> Thanks
> Jay
>
> On Fri, May 1, 2015 at 12:10 AM, Nitin Pawar <nitinpawar432@gmail.com>
> wrote:
>
>> Jay can you give first 3 lines of your gz file
>>
>> On Fri, May 1, 2015 at 10:53 AM, Kumar Jayapal <kjayapal17@gmail.com>
>> wrote:
>>
>>> Alex,
>>>
>>>
>>> I followed the same steps as mentioned in the site. Once I load data
>>> into table which is create below
>>>
>>>
>>>
>>> Created table  CREATE TABLE raw (line STRING) PARTITIONED BY
>>> (FISCAL_YEAR  smallint, FISCAL_PERIOD smallint)
>>> STORED AS TEXTFILE;
>>>
>>> and loaded it with data.
>>>
>>> LOAD DATA LOCAL INPATH '/tmp/weblogs/20090603-access.log.gz' INTO TABLE
>>> raw;
>>>
>>>
>>>
>>> when I say select * from raw it shows all null values.
>>>
>>>
>>> NULLNULLNULLNULLNULLNULLNULLNULL
>>> NULLNULLNULLNULLNULLNULLNULLNULL
>>> NULLNULLNULLNULLNULLNULLNULLNULL
>>> NULLNULLNULLNULLNULLNULLNULLNULL
>>> Why is not show showing the actual data in file. will it show once I
>>> load it to parque table?
>>>
>>> Please let me know if I am doing anything wrong.
>>>
>>> I appreciate your help.
>>>
>>>
>>> Thanks
>>> jay
>>>
>>>
>>>
>>> Thank you very much for you help Alex,
>>>
>>>
>>> On Wed, Apr 29, 2015 at 3:43 PM, Alexander Pivovarov <
>>> apivovarov@gmail.com> wrote:
>>>
>>>> 1. Create external textfile hive table pointing to /extract/DBCLOC and
>>>> specify CSVSerde
>>>>
>>>> if using hive-0.14 and newer use this
>>>> https://cwiki.apache.org/confluence/display/Hive/CSV+Serde
>>>> if hive-0.13 and older use https://github.com/ogrodnek/csv-serde
>>>>
>>>> You do not even need to unzgip the file. hive automatically unzgip data
>>>> on select.
>>>>
>>>> 2. run simple query to load data
>>>> insert overwrite table <orc_table>
>>>> select * from <csv_table>
>>>>
>>>> On Wed, Apr 29, 2015 at 3:26 PM, Kumar Jayapal <kjayapal17@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello All,
>>>>>
>>>>>
>>>>> I have this table
>>>>>
>>>>>
>>>>> CREATE  TABLE DBCLOC(
>>>>>    BLwhse int COMMENT 'DECIMAL(5,0) Whse',
>>>>>    BLsdat string COMMENT 'DATE Sales Date',
>>>>>    BLreg_num smallint COMMENT 'DECIMAL(3,0) Reg#',
>>>>>    BLtrn_num int COMMENT 'DECIMAL(5,0) Trn#',
>>>>>    BLscnr string COMMENT 'CHAR(1) Scenario',
>>>>>    BLareq string COMMENT 'CHAR(1) Act Requested',
>>>>>    BLatak string COMMENT 'CHAR(1) Act Taken',
>>>>>    BLmsgc string COMMENT 'CHAR(3) Msg Code')
>>>>> PARTITIONED BY (FSCAL_YEAR  smallint, FSCAL_PERIOD smallint)
>>>>> STORED AS PARQUET;
>>>>>
>>>>> have to load from hdfs location  /extract/DBCLOC/DBCL0301P.csv.gz to
>>>>> the table above
>>>>>
>>>>>
>>>>> Can any one tell me what is the most efficient way of doing it.
>>>>>
>>>>>
>>>>> Thanks
>>>>> Jay
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Nitin Pawar
>>
>
>


-- 
Nitin Pawar

Mime
View raw message