Subject: Re: Spatial data posting in HBase
From: Michael Segel
Date: Sat, 12 Oct 2013 13:33:03 -0500
To: user@hbase.apache.org

Adrien,

In terms of efficiency...

A general solution that can be applied to all problems in all areas is going to be best.

Geohash gets ugly when you're around the equator. You can have two points literally a couple of km apart that have two very different geohashes.

So if you tile the globe, then depending on the size of the tile, you calculate the tile and its surrounding tiles (if necessary), and then sweep through the data to find your object.

I'm not suggesting you shouldn't use geohash, just that it's not going to be the most efficient.

Note that the downside to tiling is that if you're building a geospatial index, your data volume explodes because you are storing references to the data at different tile levels. It's a trade-off.

On Oct 12, 2013, at 2:34 AM, Adrien Mogenet wrote:

> Michael, don't you think Geohashes can be satisfying and well-suited for
> many cases anyway? Searching in a bounding box or arbitrary polygon is not
> that heavy with Geohash, even on edge conditions. The biggest risk IMHO is
> to have to deal with tons of invalid extra points if the geohash query is
> not accurate enough and your points distribution is very sparse so that
> many points will be found in a geohash despite they don't respond to your
> query criteria.
>
> However, if your query embeds enough bits of precision, Geohashes offer
> some nice guarantees for distributed databases and your queries should
> remain efficient enough.
>
> Another worst case of course is to look for K-NN since Geohash is not a
> real longest-common-prefix algorithm but once again, if your points
> distribution is approximately well balanced, this works not that bad
> without doing lots of recursive queries or fetching tons of useless data
> (but I do agree looking into your tiles would probably be more appropriate
> in that case).
>
> I'm planning to write an article on that points, so further technical
> arguments are welcome :-}
>
> On Thu, Oct 10, 2013 at 7:51 PM, Michael Segel wrote:
>
>> HBase in Action goes through great depth of showing you how you could
>> implement GIS information in HBase.
>>
>> Unfortunately there are issues with Geohash and edge conditions which make
>> it difficult to use when you're dealing with data on an edge of a quadrant.
>>
>> A better way would be to create a point (geospatial point object) and
>> store it in a single column.
>> (This goes beyond the example of what's in the book.) And then index the
>> data by tiles.
>>
>> The downside is that you end up creating a lot more data...
>>
>> Take a look at some of the stuff Boris Lublinsky published on InfoQ. There
>> are also other articles on the net....
>>
>> On Oct 9, 2013, at 1:35 PM, Otis Gospodnetic wrote:
>>
>>> The point is that there are options (multiple different hammers) if
>>> HBase support for geospatial is not there or doesn't meet OP's needs.
>>>
>>> Otis
>>> --
>>> Solr & ElasticSearch Support -- http://sematext.com/
>>> Performance Monitoring -- http://sematext.com/spm
>>>
>>> On Wed, Oct 9, 2013 at 11:14 AM, Michael Segel wrote:
>>>> And Solr has what to do with storing data in HBase?
>>>>
>>>> I guess its true... if all you have is a hammer...
>>>>
>>>> The point I was raising was that geohash isn't the most efficient way
>>>> to go when you look at the problem at a global level...
>>>>
>>>> On Oct 9, 2013, at 9:34 AM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:
>>>>
>>>>> Consider using Solr, which provides a lot of geospatial search support.
>>>>>
>>>>> Otis
>>>>> Solr & ElasticSearch Support
>>>>> http://sematext.com/
>>>>> On Sep 24, 2013 8:29 AM, "cto" wrote:
>>>>>
>>>>>> Hi ,
>>>>>>
>>>>>> I am very new in HBase. Could you please let me know , how to insert
>>>>>> spatial
>>>>>> data (Latitude / Longitude) in HBase using Java .
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://apache-hbase.679495.n3.nabble.com/Spatial-data-posting-in-HBase-tp4051123.html
>>>>>> Sent from the HBase User mailing list archive at Nabble.com.
>
> --
> Adrien Mogenet
> http://www.borntosegfault.com

The opinions expressed here are mine, while they may reflect a cognitive thought, that is purely accidental.
Use at your own risk.

Michael Segel
michael_segel (AT) hotmail.com
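
For anyone following the thread, the two ideas argued over above (geohashes diverging across the equator, and the "tile plus surrounding tiles" lookup) can be sketched in a few lines of Python. This is only an illustration and not code from the thread or from HBase in Action: the function names, the 1-degree tile size and the sample coordinates are all assumptions, the geohash encoder is the standard public algorithm written out by hand, and the tile grid ignores wrap-around at the antimeridian and the poles.

```python
import math

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=8):
    """Encode a point by interleaving longitude/latitude bisection bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, value = [], 0, 0
    use_lon = True  # geohash starts with a longitude bit
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | int(bit)
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def shared_prefix_len(a, b):
    """Number of leading characters two geohashes have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def tile_of(lat, lon, tile_deg=1.0):
    """Hypothetical tile key: (row, col) on a fixed-size degree grid."""
    return (math.floor(lat / tile_deg), math.floor(lon / tile_deg))


def tiles_to_scan(lat, lon, tile_deg=1.0):
    """The point's tile plus its eight neighbours (no wrap-around handling)."""
    row, col = tile_of(lat, lon, tile_deg)
    return [(row + dr, col + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


if __name__ == "__main__":
    # Two points roughly 220 m apart, one on each side of the equator.
    north = geohash_encode(0.001, 10.0)
    south = geohash_encode(-0.001, 10.0)
    print(north, south, shared_prefix_len(north, south))

    # The tile lookup still covers both points, since the southern point's
    # tile is one of the neighbours of the northern point's tile.
    print(tile_of(-0.001, 10.0) in tiles_to_scan(0.001, 10.0))
```

The two sample points share no geohash prefix at all, so a prefix scan around one will never reach the other, while the nine-tile lookup finds both at the cost of scanning more rows. In an HBase table the tile key would presumably become part of the row key of the index, which is where the extra data volume mentioned above comes from.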