hive-user mailing list archives

From Mark Grover <mgro...@oanda.com>
Subject Re: Efficient ways to parse xml from hive column(for selection/filters based on xml node values)
Date Fri, 23 Dec 2011 15:50:12 GMT
You might want to take a look at this:
https://cwiki.apache.org/Hive/languagemanual-xpathudf.html
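
That page covers Hive's built-in XPath UDFs (xpath_string, xpath_int and so on), which take an XML string and an XPath expression, so you can select and filter on node values without writing your own UDF. If it helps, here is a rough sketch for trying out path expressions locally before putting them into a query. It uses Python's xml.etree.ElementTree, which only supports a subset of XPath and takes paths relative to the root element, so treat it as an approximation of what the Hive functions do. The XML layout, node names, and the table and column names in the comment are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payloads, standing in for values of the Hive column.
SAMPLE = ("<order><id>42</id><customer><country>CA</country></customer>"
          "<total>19.99</total></order>")
OTHER = ("<order><id>43</id><customer><country>US</country></customer>"
         "<total>5.00</total></order>")


def node_text(xml_text, path):
    """Return the text of the first node matching path, or None if absent."""
    root = ET.fromstring(xml_text)
    node = root.find(path)  # path is relative to the root element
    return node.text if node is not None else None


def matches(xml_text, filters):
    """True when every path in filters has the expected text value."""
    return all(node_text(xml_text, p) == v for p, v in filters.items())


# Roughly what this Hive query would do (table/column names are assumptions):
#   SELECT xpath_string(xml_col, '/order/id') FROM orders
#   WHERE xpath_string(xml_col, '/order/customer/country') = 'CA';
rows = [SAMPLE, OTHER]
selected = [node_text(r, "id") for r in rows
            if matches(r, {"customer/country": "CA"})]
print(selected)
```

Note that each call here re-parses the XML, much as each UDF call in a query evaluates against the column value separately, so with 6 or 7 filter nodes it may be worth measuring the cost on your own data.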


Mark Grover, Business Intelligence Analyst
OANDA Corporation 

www: oanda.com www: fxtrade.com 
e: mgrover@oanda.com 

"Best Trading Platform" - World Finance's Forex Awards 2009. 
"The One to Watch" - Treasury Today's Adam Smith Awards 2009. 


----- Original Message -----
From: "ravikumar visweswara" <talk2hadoop@gmail.com>
To: user@hive.apache.org
Sent: Friday, December 23, 2011 10:35:59 AM
Subject: Efficient ways to parse xml from hive column(for selection/filters based on xml node values)

Hello All, 

One of my Hive columns holds text data in XML format. What are the efficient ways to parse
the XML and query on certain node values? Business users' select/filter queries are based
on 6 or 7 nodes in the XML. Is there any built-in support, or are there supporting libraries,
for this in Hive?
I have used a SerDe for unstructured log parsing, but wanted to check the most efficient way
to do this without writing specific UDFs to parse the XML.

Could some of you share your experiences and best practices? 

Thanks and Regards 
R 
