hive-user mailing list archives

From no jihun <jees...@gmail.com>
Subject Add partition data to an external ORC table.
Date Thu, 11 Feb 2016 18:48:29 GMT
Hello.

I would like to know whether the following is possible.

Suppose there is a table created by:

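-- date_string is declared only as the partition column;
-- Hive rejects a column that is repeated in PARTITIONED BY.
CREATE EXTERNAL TABLE test (
  message STRING)
PARTITIONED BY (date_string STRING)
STORED AS ORC
LOCATION '/message';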

With this table I will never add rows with an 'insert' statement.
Instead I want to:
#1. Add each day's data directly to the partition's location on HDFS,
  e.g. /message/20160212
  (with $ hadoop fs -put).
#2. Add the partition every morning:
ALTER TABLE test
ADD PARTITION (date_string='20160212')
LOCATION '/message/20160212';
#3. Query the added data.
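
A minimal sketch of steps #1 and #2 as a shell script, assuming the table name and base path from the DDL above. The function names (partition_dir, add_partition_sql, daily_commands) and the file name data.orc are hypothetical, and the script only prints the commands rather than running them. It does not cover how the ORC file itself is produced, which is the open question here.

```shell
#!/bin/sh
# Table name and base path taken from the CREATE TABLE statement above.
TABLE=test
BASE=/message

# Build the HDFS directory for one day's partition, e.g. /message/20160212.
partition_dir() {
  printf '%s/%s\n' "$BASE" "$1"
}

# Build the ALTER TABLE statement that registers that directory as a partition.
add_partition_sql() {
  printf "ALTER TABLE %s ADD PARTITION (date_string='%s') LOCATION '%s';\n" \
    "$TABLE" "$1" "$(partition_dir "$1")"
}

# Print the commands for one day (dry run); nothing is executed here.
# $1 = day such as 20160212, $2 = local file to upload (hypothetical name).
daily_commands() {
  day=$1
  file=$2
  printf 'hadoop fs -mkdir -p %s\n' "$(partition_dir "$day")"
  printf 'hadoop fs -put %s %s/\n' "$file" "$(partition_dir "$day")"
  printf 'hive -e "%s"\n' "$(add_partition_sql "$day")"
}

daily_commands 20160212 data.orc
```

Piping the printed lines to sh would run them on a machine where the hadoop and hive commands are available.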

In this scenario, how can I prepare the ORC-formatted data for
step #1? When the storage format is textfile I just need to copy the raw
file to the partition directory, but with an ORC table I don't think it
is that easy.

The raw application log is in JSON format, and each day may have 1M JSON rows.

I already do this job on my cluster with a textfile table, not ORC. Now I am
trying to change the table format.

Any advice would be great.
Thanks.
