hive-user mailing list archives

From bichonfrise74 <bichonfris...@gmail.com>
Subject Apache Web Log Question
Date Fri, 18 Mar 2011 22:40:22 GMT
Hi,

I am trying to use this:

add jar ../build/contrib/hive_contrib.jar;

CREATE TABLE apachelog (
  host STRING,
  identity STRING,
  user STRING,
  time STRING,
  request STRING,
  status STRING,
  size STRING,
  referer STRING,
  agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\".*\") ([^ \"]*|\".*\"))?",
  "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
)
STORED AS TEXTFILE;


And it works great. My problem is how to query the table by the time
column, since its values still have this format: [01/Mar/2011:13:01:10 -0700]

So, I do not know how to execute these kinds of queries:

1. select time, count(*) from apachelog where time < '01/Mar/2010';
2. select hours, count(*) from apachelog group by hours;
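
For reference, here is a rough, untested sketch of how queries of this kind might be written. It assumes the data is in the apachelog table created above and that every time value looks like [01/Mar/2011:13:01:10 -0700]. The format pattern and the character positions are assumptions derived from that one sample value, and the alias t is only a placeholder.

```sql
-- Sketch only: assumes the apachelog table above and a time column
-- formatted like [01/Mar/2011:13:01:10 -0700].

-- 1. Count requests logged before 1 March 2010.
--    unix_timestamp(string, pattern) takes a Java SimpleDateFormat pattern;
--    the square brackets are not pattern letters, so they match literally.
SELECT count(*)
FROM apachelog
WHERE unix_timestamp(time, '[dd/MMM/yyyy:HH:mm:ss Z]')
      < unix_timestamp('2010-03-01 00:00:00');

-- 2. Count requests per hour of day, using the hour as written in the log.
--    Characters 14-15 of the time string hold the hour (substr is 1-based).
SELECT hours, count(*)
FROM (
  SELECT substr(time, 14, 2) AS hours
  FROM apachelog
) t
GROUP BY hours;
```

The Z in the pattern accounts for the -0700 offset in the log line, while the literal '2010-03-01 00:00:00' is read in the time zone Hive runs in, so the cutoff may need adjusting. Rows whose time field is just "-" do not match the pattern and may need to be filtered out separately.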
