hadoop-hdfs-user mailing list archives

From James Bond <bond.b...@gmail.com>
Subject Re: how to write custom log loader and store in JSON format
Date Mon, 06 Jul 2015 04:26:49 GMT
I am not sure about Pig, but it's easily achievable in MapReduce. We had a
similar requirement: we had to convert logs from RFC 5424 syslog format into
JSON, and we have an MR job that does this for us. We chose MR mainly for
error handling, such as dealing with missing fields in some records and
removing blacklisted fields (like SSN), which we felt was easier to do in
MR than in Pig.
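Not from the original thread, but a minimal standalone Java sketch of the kind of record-to-JSON mapping such an MR job's mapper would perform. The field positions, field names, and the `LogToJson` class are assumptions based on the sample line in the question below, not the poster's actual code; a real job would do this inside a `Mapper` and handle the error cases (missing fields, blacklisted keys) mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LogToJson {

    // Convert one comma-delimited log record into a JSON string.
    // Assumed positions (from the sample line): 0=message, 2=date,
    // 3=time, 4=flag, 5=machine, last field=&-delimited key=value pairs.
    public static String toJson(String line) {
        String[] fields = line.split(",", -1);

        // Parse the trailing key_1=value&KEY_2=... segment.
        // LinkedHashMap keeps insertion order; a duplicate key
        // (e.g. KEY_10 in the sample) keeps its last value.
        Map<String, String> kv = new LinkedHashMap<>();
        for (String pair : fields[fields.length - 1].split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) continue; // skip malformed pairs
            kv.put(pair.substring(0, eq), pair.substring(eq + 1));
        }

        StringBuilder sb = new StringBuilder();
        sb.append("{\"Message1\":\"").append(fields[0])
          .append("\",\"date\":\"").append(fields[2])
          .append("\",\"Time\":\"").append(fields[3])
          .append("\",\"E\":\"").append(fields[4])
          .append("\",\"machine\":\"").append(fields[5])
          .append("\",\"data\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : kv.entrySet()) {
            if (!first) sb.append(',');
            first = false;
            sb.append('"').append(e.getKey()).append("\":\"")
              .append(e.getValue()).append('"');
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        String line = "Message,NIL,2015-07-01,22:58:53.66,E,host.example.com,"
                + "12,0xd6,BIZ,Componentname,0,0.0,key_1=value&KEY_2=1111";
        System.out.println(toJson(line));
    }
}
```

In a real mapper you would emit this string as the output value and route records that fail to parse to a counter or side output, which is exactly the error handling that made MR attractive here.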


On Sat, Jul 4, 2015 at 10:50 AM, Divya Gehlot <divya.htconex@gmail.com> wrote:

> Hi,
> I am new to Pig and I have a log file in the format below:
> (Message,NIL,2015-07-01,22:58:53.66,E,xxxxxxxxxx.xxx.xxxxx.xxx,12,0xd6,BIZ,Componentname,0,0.0,key_1=value&KEY_2=1111&KEY_3=VALUE&KEY_4=AU&KEY_5=COMPANY&KEY_6=VALUE&KEY_7=12222222&KEY_8=VALUE&KEY_9=VALUE&KEY_10=VALUE&KEY_10=VALUE)
> for which I need to write a Pig script and store the output in the JSON format below:
> {Message1:Message,date:2015-07-01,Time:22:58:53.66,E:E,machine
> :xxxxxxxxxx.xxx.xxxxx.xxx,data:{key_1:value,key_2:value,key_3:value,key_3:value,key_3:value,key_5:value.....}
> }
> Can somebody help me in writing a custom loader?
> I would really appreciate your help.
> Thanks,
