cassandra-user mailing list archives

From malsmith <malsm...@treehousesystems.com>
Subject Re: How to perform queries on Cassandra?
Date Sat, 10 Apr 2010 00:01:00 GMT


It's sort of an interesting problem. In an RDBMS, one relatively simple
approach would be to calculate a rectangle that is X km by Y km with User
1's location at the center.  For a 10 km search, the rectangle runs from
(UserX - 10KmX, UserY - 10KmY) to (UserX + 10KmX, UserY + 10KmY).

Then you could query the database for all other users whose coordinates
fall inside it: curUserX > UserX-10KmX and curUserX < UserX+10KmX and
curUserY > UserY-10KmY and curUserY < UserY+10KmY
* Note that 10KmX and 10KmY are really a translation from kilometers to
degrees of latitude and longitude (which you can find with a Google search)

With the right indexes this query actually runs pretty well.   
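
As a rough illustration of that rectangle test, here is the same predicate
applied to an in-memory list instead of a database table. The UserLocation
class and the BoxFilter.UsersInBox name are made up for this sketch, the
deltas are assumed to be already converted to degrees, and it ignores
rectangles that wrap around the 180 degree meridian.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical record of one user's position, in degrees.
class UserLocation
{
    public string Name;
    public double Lat;
    public double Lon;
}

static class BoxFilter
{
    // Keep the users strictly inside the rectangle centered on (lat, lon).
    // deltaLat and deltaLon are the half-sizes of the rectangle in degrees.
    public static List<string> UsersInBox(IEnumerable<UserLocation> users,
        double lat, double lon, double deltaLat, double deltaLon)
    {
        return users
            .Where(u => u.Lat > lat - deltaLat && u.Lat < lat + deltaLat
                     && u.Lon > lon - deltaLon && u.Lon < lon + deltaLon)
            .Select(u => u.Name)
            .ToList();
    }
}
```

The rectangle is only a cheap first pass: points near its corners are
farther away than the search distance, so a real distance check on the
survivors would still be needed.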

Translating that to Cassandra seems a bit complex at first, but you
could try something like pre-calculating a grid with the right
resolution (like a square of 5 km per side) and assigning every user to a
particular grid ID.  That way you just calculate which grid ID User1 is
in, then do a direct key lookup to get a list of the users in that same
grid ID.
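
A minimal sketch of how such a grid ID could be computed, assuming cells
that are square in degrees rather than in kilometers. The GridSketch class,
the 0.045 degree cell size (about 5 km of latitude) and the "row:col" key
format are all assumptions for illustration, not anything Cassandra defines.

```csharp
using System;

static class GridSketch
{
    // Hypothetical cell size in degrees; 0.045 degrees is about 5 km of latitude.
    public const double CellDegrees = 0.045;

    // Map a coordinate to a grid ID string that could serve as a row key.
    // Offsetting by 90 and 180 keeps the row and column numbers non-negative.
    public static string GridId(double lat, double lon)
    {
        long row = (long)Math.Floor((lat + 90.0) / CellDegrees);
        long col = (long)Math.Floor((lon + 180.0) / CellDegrees);
        return row + ":" + col;
    }
}
```

For example, GridSketch.GridId(40.01, -74.01) gives "2889:2355". A user
close to a cell edge has neighbors in the adjacent cells, so in practice
you would probably look up the eight surrounding grid IDs as well.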

A second approach would be to have two column families: one that maps a
latitude to a list of users who are at that latitude and a second that
maps a longitude to a list of users who are at that longitude.  You could
do the same rectangle calculation as above, then do a get_slice range
lookup to get a list of users from the range of latitudes and a second
list from the range of longitudes.  You would then need to do an in-memory
nested loop to find the users that are in both lists.  This second
approach could cause some trouble depending on where you search and how
many users you really have, since some latitudes and longitudes have a
great many people in them.
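
The in-memory step could look something like the sketch below. It swaps the
nested loop for a hash set lookup, which does the same job with less work
when the lists are large. The inputs are assumed to be plain user ID
strings already pulled out of the two range lookups; the Cassandra calls
themselves are not shown, and the SliceIntersection name is made up.

```csharp
using System.Collections.Generic;

static class SliceIntersection
{
    // Return the users that appear in both range results, in the order
    // they appear in the longitude list.
    public static List<string> InBoth(IEnumerable<string> byLatitude,
                                      IEnumerable<string> byLongitude)
    {
        var latSet = new HashSet<string>(byLatitude);
        var result = new List<string>();
        foreach (var user in byLongitude)
        {
            // Remove returns true only the first time a user is found,
            // so duplicates in the longitude list are reported once.
            if (latSet.Remove(user))
            {
                result.Add(user);
            }
        }
        return result;
    }
}
```

This does not help with the underlying problem that either list can be
huge on its own; it only makes the final merge cheaper.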

So, it seems some version of a chunking / grid ID thing would be the
better approach.  If you let people zoom in or zoom out, you could
just have a separate column family for each level of zoom.
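
One way the zoom idea might be laid out, purely as a sketch: each zoom
level gets its own cell size and its own column family name. The halving
scheme, the "UsersByGrid_Z" naming and the ZoomGrid class are assumptions
for illustration only.

```csharp
using System;

static class ZoomGrid
{
    // Hypothetical scheme: zoom 0 uses 1-degree cells and each further
    // zoom level halves the cell size.
    public static double CellDegrees(int zoom)
    {
        return 1.0 / (1 << zoom);
    }

    // Hypothetical naming convention, one column family per zoom level.
    public static string ColumnFamily(int zoom)
    {
        return "UsersByGrid_Z" + zoom;
    }

    // Grid ID of a coordinate at the given zoom level.
    public static string GridId(double lat, double lon, int zoom)
    {
        double cell = CellDegrees(zoom);
        long row = (long)Math.Floor((lat + 90.0) / cell);
        long col = (long)Math.Floor((lon + 180.0) / cell);
        return row + ":" + col;
    }
}
```

Each user would then be written once per zoom level, trading extra writes
for a single key lookup at whatever zoom the reader is using.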


I'm stuck on a stopped train so -- here is even more code:

// Approximate miles per degree of latitude, in 10-degree bands.
static Decimal GetLatitudeMiles(Decimal lat)
{
    lat = Math.Abs(lat);
    Decimal f = 68.99M;
    if (lat >= 0.0M && lat < 10.0M) { f = 68.71M; }
    else if (lat >= 10.0M && lat < 20.0M) { f = 68.73M; }
    else if (lat >= 20.0M && lat < 30.0M) { f = 68.79M; }
    else if (lat >= 30.0M && lat < 40.0M) { f = 68.88M; }
    else if (lat >= 40.0M && lat < 50.0M) { f = 68.99M; }
    else if (lat >= 50.0M && lat < 60.0M) { f = 69.12M; }
    else if (lat >= 60.0M && lat < 70.0M) { f = 69.23M; }
    else if (lat >= 70.0M && lat < 80.0M) { f = 69.32M; }
    else if (lat >= 80.0M) { f = 69.38M; }

    return f;
}


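// Size of one degree at the first result's position, in miles.
Decimal MilesPerDegreeLatitude = GetLatitudeMiles(zList[0].Latitude);
// Math.Cos expects radians, so convert the latitude from degrees first.
// 24900 is roughly the Earth's circumference in miles.
Decimal MilesPerDegreeLongitude = ((Decimal) Math.Abs(Math.Cos((Double)
zList[0].Latitude * Math.PI / 180.0))) * 24900.0M / 360.0M;
Decimal dRadius = 10.0M;  // ten miles
Decimal deltaLat = dRadius / MilesPerDegreeLatitude;
Decimal deltaLong = dRadius / MilesPerDegreeLongitude;

// Corners of the search rectangle: the Top* fields hold the smaller
// values and the Bottom* fields the larger ones.
ps.TopLatitude = zList[0].Latitude - deltaLat;
ps.TopLongitude = zList[0].Longitude - deltaLong;
ps.BottomLatitude = zList[0].Latitude + deltaLat;
ps.BottomLongitude = zList[0].Longitude + deltaLong;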



On Fri, 2010-04-09 at 16:30 -0700, Paul Prescod wrote: 

> 2010/4/9 Onur AKTAS <onur.aktas@live.com>:
> > ...
> > I'm trying to find out how do you perform queries with calculations on the
> > fly without inserting the data as calculated from the beginning.
> > Lets say we have latitude and longitude coordinates of all users and we have
> >  Distance(from_lat, from_long, to_lat, to_long) function which
> > gives distance between lat/longs pairs in kilometers.
> 
> I'm not an expert, but I think that it boils down to "MapReduce" and "Hadoop".
> 
> I don't think that there's any top-down tutorial on those two words,
> you'll have to research yourself starting here:
> 
>  * http://en.wikipedia.org/wiki/MapReduce
> 
>  * http://hadoop.apache.org/
> 
>  * http://wiki.apache.org/cassandra/HadoopSupport
> 
> I don't think it is all documented in any one place yet...
> 
>  Paul Prescod


