hive-user mailing list archives

From bichonfrise74 <bichonfris...@gmail.com>
Subject Apache Log Date Format
Date Fri, 06 May 2011 22:47:48 GMT
Hi,

I am using the following table definition to load Apache logs into Hadoop
via Hive (my version is 0.4.1).

CREATE TABLE apache_log (
  ...
  logdate STRING,
  ...
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*)
\\[(\\w+\/\\w+\/\\w+)\:(\\d+:\\d+:\\d+) ...
...

The date arrives in this format: dd/mmm/yyyy.
I would like to load the data using this date format instead:
yyyy-mmm-dd.
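
For reference, here is a minimal sketch of the conversion itself in plain
Java, outside Hive. It assumes the patterns behave as in
java.text.SimpleDateFormat, where the month is written with an uppercase M
(lowercase m means minutes), so dd/mmm/yyyy above corresponds to the pattern
dd/MMM/yyyy. The class and method names (LogDateConverter, convert) are
hypothetical and are not part of Hive.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper, not part of Hive: reformats an Apache log date.
public class LogDateConverter {

    // Parses a date such as "06/May/2011" and re-renders it with outPattern.
    static String convert(String apacheDate, String outPattern) {
        // Locale.ENGLISH so that "May" parses regardless of the JVM default locale.
        SimpleDateFormat in = new SimpleDateFormat("dd/MMM/yyyy", Locale.ENGLISH);
        in.setLenient(false); // reject impossible dates such as 32/May/2011
        SimpleDateFormat out = new SimpleDateFormat(outPattern, Locale.ENGLISH);
        try {
            Date parsed = in.parse(apacheDate);
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable log date: " + apacheDate, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("06/May/2011", "yyyy-MMM-dd")); // 2011-May-06
        System.out.println(convert("06/May/2011", "yyyy-MM-dd"));  // 2011-05-06
    }
}
```

The numeric form yyyy-MM-dd sorts chronologically as a plain string, which
may matter if the value is later used as a partition key.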

1. Has anyone loaded the date in a different format before?
2. Also, how do you specify in the create table statement above that the
partition is the logdate?
3. When I try to convert the old date into Unix time with this query, Hive
complains:

hive> select from_unixtime( unix_timestamp( logdate, 'dd/MMM/yyyy')) from
apache_log;
FAILED: Error in semantic analysis: line 1:7 Function Argument Type Mismatch
from_unixtime: Looking for UDF "from_unixtime" with parameters [class
org.apache.hadoop.io.LongWritable]

Has anyone encountered these issues before?

Thanks.
