hive-user mailing list archives

From Tim Robertson
Subject A GIS contains() for Hive?
Date Fri, 16 Mar 2012 09:21:54 GMT
Hi all,

I need to perform a lot of "point in polygon" checks and want to use Hive
(currently I mix Hive, Sqoop and PostGIS in an Oozie workflow to do this).

In an ideal world, I would like to create a Hive table from a Shapefile
containing polygons, and then do the likes of the following:

  SELECT, FROM points p, polygons pp WHERE pp.contains(geom,

Has anyone done anything along these lines?

Alternatively, I could write a UDF that reads the shapefile into memory
and effectively does a map-side join, using something like a slab
decomposition technique.  It is more limited but would meet my needs,
allowing e.g.:

  SELECT contains(,p.lng, '/data/shapefiles/countries.shp') FROM
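
For illustration, here is a minimal sketch of the point-in-polygon test
that such a UDF would have to run for each row. It uses plain ray casting
(the even-odd rule), not slab decomposition, and it is written in Python
rather than as an actual Hive UDF. The polygon is assumed to be already
loaded as an in-memory list of (lng, lat) vertices; the argument layout
and the sample square are assumptions made for this sketch only.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (lng, lat); assumed vertex layout


def contains(polygon: List[Point], lng: float, lat: float) -> bool:
    """Even-odd ray-casting test for a simple polygon (no holes).

    Casts a horizontal ray from the point towards +lng and counts how
    many polygon edges it crosses; an odd count means "inside".
    Points exactly on an edge are not treated specially.
    """
    inside = False
    j = len(polygon) - 1  # index of the previous vertex
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does this edge straddle the horizontal line through the point?
        if (yi > lat) != (yj > lat):
            # lng at which the edge crosses that horizontal line
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lng < x_cross:
                inside = not inside
        j = i
    return inside


# Hypothetical sample polygon: a 10 x 10 square
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
print(contains(square, 5.0, 5.0))
print(contains(square, 15.0, 5.0))
```

Each call is linear in the number of vertices, which is why an index
such as slab decomposition becomes attractive when there are many points
and many polygons.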

Before I start, I thought I'd ask, as I suspect people are doing this
kind of thing on Hive by now (thinking FB and user profiling by political
boundaries etc.).

I'd love to hear from anyone who's investigated this or could provide any

