hadoop-mapreduce-user mailing list archives

From Kumar Jayapal <kjayapa...@gmail.com>
Subject Re: how to load data
Date Fri, 01 May 2015 05:23:42 GMT
Alex,


I followed the same steps as mentioned on the site. I loaded the data into
the table created below.



CREATE TABLE raw (line STRING)
PARTITIONED BY (FISCAL_YEAR smallint, FISCAL_PERIOD smallint)
STORED AS TEXTFILE;

I then loaded it with data:

LOAD DATA LOCAL INPATH '/tmp/weblogs/20090603-access.log.gz' INTO TABLE raw;



When I run select * from raw, it shows all NULL values:


NULLNULLNULLNULLNULLNULLNULLNULL
NULLNULLNULLNULLNULLNULLNULLNULL
NULLNULLNULLNULLNULLNULLNULLNULL
NULLNULLNULLNULLNULLNULLNULLNULL
Why is it not showing the actual data in the file? Will it show once I load
it into the Parquet table?

Please let me know if I am doing anything wrong.

I appreciate your help.


Thanks
jay



Thank you very much for your help, Alex.


On Wed, Apr 29, 2015 at 3:43 PM, Alexander Pivovarov <apivovarov@gmail.com>
wrote:

> 1. Create an external textfile Hive table pointing to /extract/DBCLOC and
> specify CSVSerde.
>
> If using hive-0.14 or newer, use this:
> https://cwiki.apache.org/confluence/display/Hive/CSV+Serde
> If using hive-0.13 or older, use https://github.com/ogrodnek/csv-serde
>
> You do not even need to gunzip the file. Hive automatically gunzips the data on
> select.
>
> 2. Run a simple query to load the data:
> insert overwrite table <orc_table>
> select * from <csv_table>
>
> On Wed, Apr 29, 2015 at 3:26 PM, Kumar Jayapal <kjayapal17@gmail.com>
> wrote:
>
>> Hello All,
>>
>>
>> I have this table
>>
>>
>> CREATE  TABLE DBCLOC(
>>    BLwhse int COMMENT 'DECIMAL(5,0) Whse',
>>    BLsdat string COMMENT 'DATE Sales Date',
>>    BLreg_num smallint COMMENT 'DECIMAL(3,0) Reg#',
>>    BLtrn_num int COMMENT 'DECIMAL(5,0) Trn#',
>>    BLscnr string COMMENT 'CHAR(1) Scenario',
>>    BLareq string COMMENT 'CHAR(1) Act Requested',
>>    BLatak string COMMENT 'CHAR(1) Act Taken',
>>    BLmsgc string COMMENT 'CHAR(3) Msg Code')
>> PARTITIONED BY (FSCAL_YEAR  smallint, FSCAL_PERIOD smallint)
>> STORED AS PARQUET;
>>
>> I have to load the file from the HDFS location /extract/DBCLOC/DBCL0301P.csv.gz
>> into the table above.
>>
>>
>> Can anyone tell me the most efficient way of doing it?
>>
>>
>> Thanks
>> Jay
>>
>
>
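
For readers following the thread, the two steps Alex describes might look
roughly like the HiveQL below. This is only a sketch under assumptions and was
not posted in the thread: the staging table name dbcloc_csv is hypothetical,
the partition values 2015 and 3 are placeholders, and the CSV is assumed to
have no header row and to list its columns in the same order as DBCLOC. The
SerDe class is the one shipped with Hive 0.14 and newer; it reads every column
as a string, which is why the second statement casts.

```sql
-- Step 1: external text table over the existing gzipped CSV (Hive 0.14+).
-- dbcloc_csv is a hypothetical staging-table name.
-- OpenCSVSerde exposes every column as STRING, so all columns are declared STRING.
CREATE EXTERNAL TABLE dbcloc_csv (
  BLwhse    STRING,
  BLsdat    STRING,
  BLreg_num STRING,
  BLtrn_num STRING,
  BLscnr    STRING,
  BLareq    STRING,
  BLatak    STRING,
  BLmsgc    STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION '/extract/DBCLOC';

-- Step 2: rewrite the rows into the partitioned Parquet table.
-- The partition values 2015 and 3 are placeholders, not taken from the thread.
INSERT OVERWRITE TABLE DBCLOC PARTITION (FSCAL_YEAR = 2015, FSCAL_PERIOD = 3)
SELECT
  CAST(BLwhse AS INT),
  BLsdat,
  CAST(BLreg_num AS SMALLINT),
  CAST(BLtrn_num AS INT),
  BLscnr,
  BLareq,
  BLatak,
  BLmsgc
FROM dbcloc_csv;
```

The same partitioning rule applies to LOAD DATA: on Hive releases of that
period, loading into a partitioned table such as raw generally needs an
explicit PARTITION (...) clause naming a value for each partition column.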
