hadoop-common-user mailing list archives

From Guy Doulberg <Guy.Doulb...@conduit.com>
Subject RE: Creating a hive table for a custom log
Date Sun, 18 Sep 2011 06:16:51 GMT
If it makes more sense, you could also store your lines with the default serde, and extract
the fields you intend to query using a UDF.

For example, you could use parse_url(string urlString, string partToExtract [, string keyToExtract])
to parse the URL parts.
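
To make the idea concrete, here is a rough sketch of the extraction such a UDF would have to do on your sample lines. It is plain Python, not Hive code, and only illustrates looking parameters up by key so that their order does not matter. The names LOG_LINE and parse_log_line, and the choice of the last path segment as the "action", are my own assumptions based on the sample lines, not something Hive provides.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout, taken from the sample lines in this thread:
#   [timestamp] "METHOD request-target PROTOCOL"
LOG_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\]\s+"(?P<method>\S+)\s+(?P<target>\S+)\s+(?P<proto>[^"]+)"'
)

def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    parts = urlsplit(match.group("target"))
    # Assumption: the last non-empty path segment is the action (e.g. "action1").
    segments = [s for s in parts.path.split("/") if s]
    action = segments[-1] if segments else None
    # parse_qs returns a list per key; keep the first value of each parameter.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    row = {"time": match.group("time"), "action": action}
    row.update(params)  # parameters are looked up by name, so order is irrelevant
    return row

if __name__ == "__main__":
    sample = ('[31/Aug/2011:00:10:41 +0000] "GET /client/action2/'
              '?transaction_id=9022&ts=1314749223525&user_id=9048871793100'
              '&item2=6045&item1=271&environment=2 HTTP/1.1"')
    print(parse_log_line(sample))
```

In Hive the equivalent lookup would be the three-argument form of parse_url with 'QUERY' as the part to extract and the parameter name as the key. Note that your lines contain only a path and query string rather than a full URL, so check how parse_url handles that input before relying on it.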

Good luck

-----Original Message-----
From: Raimon Bosch [mailto:raimon.bosch@gmail.com] 
Sent: Friday, September 16, 2011 10:36 PM
To: core-user@hadoop.apache.org
Subject: Re: Creating a hive table for a custom log

Any Ideas? 

The most common approach would be writing your own serde and plugging it into
Hive, like:


But I'm wondering if there is some work already done in this area.

Raimon Bosch wrote:
> Hi,
> I'm trying to create a table similar to apache_log, but I want to avoid
> writing my own map-reduce task because I don't want to store my HDFS files
> twice.
> So if you're working with log lines like this:
> [31/Aug/2011:00:10:41 +0000] "GET
> /client/action1/?transaction_id=8002&user_id=871793100001248&ts=1314749223525&item1=271&item2=6045&environment=2
> HTTP/1.1"
> [31/Aug/2011:00:10:41 +0000] "GET
> /client/action1/?transaction_id=9002&ts=1314749223525&user_id=9048871793100&item2=6045&item1=271&environment=2
> HTTP/1.1"
> [31/Aug/2011:00:10:41 +0000] "GET
> /client/action2/?transaction_id=9022&ts=1314749223525&user_id=9048871793100&item2=6045&item1=271&environment=2
> HTTP/1.1"
> Keeping in mind that the parameters could appear in different orders, which
> would be the best strategy to create this table? Write my own
> org.apache.hadoop.hive.contrib.serde2? Is there any resource already
> implemented that I could use to perform this task?
> In the end the objective is to convert all the parameters into fields and use
> the "action" as the type. With this big table I will be able to run my
> queries, my joins or my views.
> Any ideas?
> Thanks in Advance,
> Raimon Bosch.

View this message in context: http://old.nabble.com/Creating-a-hive-table-for-a-custom-log-tp32379849p32481457.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
