hadoop-mapreduce-user mailing list archives

From Alexander Pivovarov <apivova...@gmail.com>
Subject Re: how to load data
Date Fri, 01 May 2015 06:50:57 GMT
If your file is a CSV file, then the CREATE TABLE statement should specify
CSVSerde. Look at the examples under the links I sent you.
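
A rough sketch of what that could look like for the DBCLOC table quoted
below, assuming Hive 0.14 or newer (where OpenCSVSerde is built in). The
staging table name dbcloc_csv and the partition values are placeholders I
made up for illustration, not something taken from your data. Note that
OpenCSVSerde reads every column as STRING, so the casts happen on insert.

```sql
-- Staging table over the gzipped CSV; dbcloc_csv is a hypothetical name.
-- OpenCSVSerde treats every column as STRING regardless of declared type.
CREATE EXTERNAL TABLE dbcloc_csv (
  BLwhse    STRING,
  BLsdat    STRING,
  BLreg_num STRING,
  BLtrn_num STRING,
  BLscnr    STRING,
  BLareq    STRING,
  BLatak    STRING,
  BLmsgc    STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION '/extract/DBCLOC';

-- Copy into the Parquet table, casting to the target column types.
-- The partition values (2015, 3) are placeholders; use the real ones.
INSERT OVERWRITE TABLE DBCLOC PARTITION (FSCAL_YEAR = 2015, FSCAL_PERIOD = 3)
SELECT
  CAST(BLwhse AS INT),
  BLsdat,
  CAST(BLreg_num AS SMALLINT),
  CAST(BLtrn_num AS INT),
  BLscnr,
  BLareq,
  BLatak,
  BLmsgc
FROM dbcloc_csv;
```

If the CSV has a header row or uses a delimiter other than a comma, the
table definition would need extra properties; I have not seen the file, so
treat this as a starting point only.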

On Thu, Apr 30, 2015 at 10:23 PM, Kumar Jayapal <kjayapal17@gmail.com>
wrote:

> Alex,
>
>
> I followed the steps described on the site and loaded data into the
> table created below.
>
>
>
> Created table:
>
> CREATE TABLE raw (line STRING)
> PARTITIONED BY (FISCAL_YEAR smallint, FISCAL_PERIOD smallint)
> STORED AS TEXTFILE;
>
> and loaded it with data.
>
> LOAD DATA LOCAL INPATH '/tmp/weblogs/20090603-access.log.gz' INTO TABLE raw;
>
>
>
> When I run select * from raw, it shows all NULL values.
>
>
> NULL NULL NULL NULL NULL NULL NULL NULL
> NULL NULL NULL NULL NULL NULL NULL NULL
> NULL NULL NULL NULL NULL NULL NULL NULL
> NULL NULL NULL NULL NULL NULL NULL NULL
> Why is it not showing the actual data in the file? Will it show once I load
> it into the Parquet table?
>
> Please let me know if I am doing anything wrong.
>
> I appreciate your help.
>
>
> Thanks
> jay
>
>
>
> Thank you very much for your help, Alex.
>
>
> On Wed, Apr 29, 2015 at 3:43 PM, Alexander Pivovarov <apivovarov@gmail.com>
> wrote:
>
>> 1. Create external textfile hive table pointing to /extract/DBCLOC and
>> specify CSVSerde
>>
>> if using hive-0.14 and newer use this
>> https://cwiki.apache.org/confluence/display/Hive/CSV+Serde
>> if hive-0.13 and older use https://github.com/ogrodnek/csv-serde
>>
>> You do not even need to gunzip the file. Hive automatically decompresses
>> the data on select.
>>
>> 2. run simple query to load data
>> insert overwrite table <orc_table>
>> select * from <csv_table>
>>
>> On Wed, Apr 29, 2015 at 3:26 PM, Kumar Jayapal <kjayapal17@gmail.com>
>> wrote:
>>
>>> Hello All,
>>>
>>>
>>> I have this table
>>>
>>>
>>> CREATE  TABLE DBCLOC(
>>>    BLwhse int COMMENT 'DECIMAL(5,0) Whse',
>>>    BLsdat string COMMENT 'DATE Sales Date',
>>>    BLreg_num smallint COMMENT 'DECIMAL(3,0) Reg#',
>>>    BLtrn_num int COMMENT 'DECIMAL(5,0) Trn#',
>>>    BLscnr string COMMENT 'CHAR(1) Scenario',
>>>    BLareq string COMMENT 'CHAR(1) Act Requested',
>>>    BLatak string COMMENT 'CHAR(1) Act Taken',
>>>    BLmsgc string COMMENT 'CHAR(3) Msg Code')
>>> PARTITIONED BY (FSCAL_YEAR  smallint, FSCAL_PERIOD smallint)
>>> STORED AS PARQUET;
>>>
>>> I have to load data from the HDFS location /extract/DBCLOC/DBCL0301P.csv.gz
>>> into the table above.
>>>
>>>
>>> Can anyone tell me the most efficient way of doing it?
>>>
>>>
>>> Thanks
>>> Jay
>>>
>>
>>
>
