incubator-hcatalog-user mailing list archives

From Timothy Potter <thelabd...@gmail.com>
Subject Re: HCatStorer and appending to partition
Date Wed, 27 Mar 2013 21:29:32 GMT
Hi Christian,

We do something similar, but afaik there's no way to append to an existing
partition. I'm surprised the write isn't failing when the partition already
exists. We either use a more granular partition scheme or rewrite the entire
partition each time.
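
For the more granular scheme, here is a rough sketch of what I mean. It is
untested against HDP 1.2.1, and the table name ntp_hcat_batched, the extra
partition column batch_id, and the BATCH_ID parameter are made up for
illustration. The idea is that every run writes to its own new partition, so
nothing ever has to be appended to an existing one:

```pig
-- Assumes a hypothetical table ntp_hcat_batched with the same columns as
-- ntp_hcat, but PARTITIONED BY (request_date STRING, batch_id STRING).
-- BATCH_ID is a hypothetical Pig parameter, unique per run, e.g.:
--   pig -useHCatalog -param BATCH_ID=201303272129 store_ntp.pig

-- Tag every row of the existing 'complete' relation with this run's batch id.
with_batch = FOREACH complete GENERATE *, '$BATCH_ID' AS batch_id;

-- Same store call as before; the partition values come from the data itself.
STORE with_batch INTO 'ntp_hcat_batched' USING org.apache.hcatalog.pig.HCatStorer();
```

Queries that filter on request_date alone would still read all the batches for
that day. The cost is many small partitions, so you may want to compact them
by rewriting a whole day once it is complete.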

Cheers,
Tim

On Wed, Mar 27, 2013 at 3:07 PM, Christian <engrean@gmail.com> wrote:

> Hi,
>
> I am trying to run a Pig job every few minutes that should end up using
> HCat's automatic partitioning to store the data in the correct directory
> (/apps/hive/warehouse/ntp_hcat/request_date=2013-03-27/)
>
> I've set the partition column and I can successfully write data and it
> goes to the correct place. The problem I am having is that every time I run
> the job, it is deleting the existing data in the directory (partition).
>
> My store call is simply:
>
> STORE complete INTO 'ntp_hcat' USING org.apache.hcatalog.pig.HCatStorer();
>
> My table definition in Hive is:
>
> CREATE TABLE ntp_hcat(
>     year INT,
>     month INT,
>     day INT,
>     date_time STRING,
>     hour INT,
>     minute INT,
>     second INT,
>     seconds_in_day BIGINT,
>     ip STRING,
>     method STRING,
>     path STRING,
>     original_path STRING,
>     is_static_resource STRING,
>     is_page STRING,
>     status INT,
>     referrer_host STRING,
>     referrer STRING,
>     original_referrer STRING,
>     agent STRING,
>     content_length BIGINT,
>     response_time FLOAT,
>     web_server STRING,
>     app_server STRING,
>     session_id STRING,
>     sold_to_party_num STRING,
>     customer_name STRING,
>     login_id STRING,
>     employee_id STRING,
>     first_name STRING,
>     last_name STRING,
>     session_start_date STRING,
>     browser STRING,
>     browser_version STRING,
>     is_slow_response STRING)
> COMMENT 'This is the ntp apache requests table'
> partitioned by (request_date string)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
> STORED AS TEXTFILE;
>
> I am using HDP 1.2.1. What am I doing wrong?
>
> Thank you,
> Christian
>
