incubator-cassandra-user mailing list archives

From dir dir <sikerasa...@gmail.com>
Subject Re: How to perform queries on Cassandra?
Date Sat, 10 Apr 2010 03:00:49 GMT
Does Cassandra have a default query language, such as SQL in an RDBMS
or Object Query in an OODBMS?  Thank you.

Dir.

On Sat, Apr 10, 2010 at 7:01 AM, malsmith <malsmith@treehousesystems.com> wrote:

>
>
> It's sort of an interesting problem. In an RDBMS, one relatively simple
> approach would be to calculate a rectangle that is X km by Y km with User 1's
> location at the center.  So the rectangle runs from (UserX - 10KmX,
> UserY - 10KmY) to (UserX + 10KmX, UserY + 10KmY).
>
> Then you could query the database for all other users whose coordinates
> satisfy curUserX > UserX - 10KmX and curUserX < UserX + 10KmX and
> curUserY > UserY - 10KmY and curUserY < UserY + 10KmY.
> * Note that 10KmX and 10KmY are really a translation from kilometers to
> degrees of latitude and longitude (which you can find with a Google search).
>
> With the right indexes this query actually runs pretty well.
>
> Translating that to Cassandra seems a bit complex at first, but you could
> try something like pre-calculating a grid with the right resolution (like a
> square of 5 km per side) and assigning every user to a particular grid ID.  That
> way you just calculate which grid ID User 1 is in, then do a direct key lookup
> to get a list of the users in that same grid ID.
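
A minimal sketch of the grid-ID idea above, under stated assumptions: the cell size of 0.045 degrees (roughly 5 km of latitude), the GeoGrid class, and the "row:col" key format are all hypothetical choices for illustration, not part of any Cassandra API. Cells defined in degrees also get narrower east to west away from the equator, so this is only an approximation of a 5 km square.

```csharp
using System;

static class GeoGrid
{
    // Assumed cell size in degrees: about 5 km of latitude at ~111 km per degree.
    public const double CellDegrees = 0.045;

    // Maps a coordinate to a cell ID of the form "row:col".
    // The ID would serve as the row key under which the cell's users are stored.
    public static string GridId(double latitude, double longitude)
    {
        // Math.Floor keeps negative coordinates in their own cells
        // instead of folding them into cell 0 as truncation would.
        long row = (long)Math.Floor(latitude / CellDegrees);
        long col = (long)Math.Floor(longitude / CellDegrees);
        return row + ":" + col;
    }
}
```

Two users get the same ID exactly when they fall in the same cell, so a user near a cell edge may have close neighbors in an adjacent cell; looking up the eight surrounding cell IDs as well would be one way to cover that case.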
>
> A second approach would be to have two column families -- one that maps a
> latitude to a list of users who are at that latitude and a second that maps
> a longitude to a list of users who are at that longitude.  You could do the
> same rectangle calculation as above, then do a get_slice range lookup to get
> a list of users from the range of latitudes and a second list from the range
> of longitudes.
> You would then need to do an in-memory nested loop to find the list of users
> that are in both lists.  This second approach could cause some trouble
> depending on where you search and how many users you really have -- some
> latitudes and longitudes have many, many people in them.
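
The in-memory step of the second approach could be sketched as follows. This assumes the two range lookups have already returned plain lists of user IDs; the UserIntersection class and InBoth method are hypothetical names. It also swaps the nested loop for a hash set, which gives the same result but with work roughly proportional to the combined list length rather than to the product of the two lengths.

```csharp
using System;
using System.Collections.Generic;

static class UserIntersection
{
    // Returns the users that appear in both lists, in the order
    // they occur in the longitude list.
    public static List<string> InBoth(IEnumerable<string> byLatitude,
                                      IEnumerable<string> byLongitude)
    {
        var remaining = new HashSet<string>(byLatitude);
        var result = new List<string>();
        foreach (var user in byLongitude)
        {
            // Remove returns true only the first time a user is found,
            // so each matching user is emitted once.
            if (remaining.Remove(user))
            {
                result.Add(user);
            }
        }
        return result;
    }
}
```

This does not remove the concern about crowded latitudes and longitudes: both lists still have to be fetched in full before they can be intersected.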
>
> So, it seems some version of a chunking / grid ID scheme would be the better
> approach.   If you let people zoom in or zoom out, you could just have
> different column families for each level of zoom.
>
>
> I'm stuck on a stopped train so -- here is even more code:
>
> // Approximate miles per degree of latitude, in 10-degree bands.
> static Decimal GetLatitudeMiles(Decimal lat)
> {
>     lat = Math.Abs(lat);
>     Decimal f = 68.99M;
>     if (lat >= 0.0M && lat < 10.0M) { f = 68.71M; }
>     else if (lat >= 10.0M && lat < 20.0M) { f = 68.73M; }
>     else if (lat >= 20.0M && lat < 30.0M) { f = 68.79M; }
>     else if (lat >= 30.0M && lat < 40.0M) { f = 68.88M; }
>     else if (lat >= 40.0M && lat < 50.0M) { f = 68.99M; }
>     else if (lat >= 50.0M && lat < 60.0M) { f = 69.12M; }
>     else if (lat >= 60.0M && lat < 70.0M) { f = 69.23M; }
>     else if (lat >= 70.0M && lat < 80.0M) { f = 69.32M; }
>     else if (lat >= 80.0M) { f = 69.38M; }
>
>     return f;
> }
>
>
> Decimal MilesPerDegreeLatitude = GetLatitudeMiles(zList[0].Latitude);
> // Math.Cos expects radians, so convert the latitude from degrees first.
> Double latRadians = (Double) zList[0].Latitude * Math.PI / 180.0;
> Decimal MilesPerDegreeLongitude = ((Decimal) Math.Abs(Math.Cos(latRadians)))
>     * 24900.0M / 360.0M;
> dRadius = 10.0M;  // ten miles
> Decimal deltaLat = dRadius / MilesPerDegreeLatitude;
> Decimal deltaLong = dRadius / MilesPerDegreeLongitude;
>
> // Bounding box around the first entry in zList.
> ps.TopLatitude = zList[0].Latitude - deltaLat;
> ps.TopLongitude = zList[0].Longitude - deltaLong;
> ps.BottomLatitude = zList[0].Latitude + deltaLat;
> ps.BottomLongitude = zList[0].Longitude + deltaLong;
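
The box computed above is a rectangle, so its corners lie farther from the center than the radius. A Distance(from_lat, from_long, to_lat, to_long) function like the one in the original question could be applied to the candidates to trim those corners. The following is one possible sketch using the haversine formula; the Geo class, the DistanceKm name, and the choice of a 6371 km mean Earth radius are assumptions made here, not something from the thread.

```csharp
using System;

static class Geo
{
    // Mean Earth radius in kilometers (spherical approximation).
    const double EarthRadiusKm = 6371.0;

    // Great-circle distance in kilometers between two points given in degrees,
    // computed with the haversine formula.
    public static double DistanceKm(double fromLat, double fromLong,
                                    double toLat, double toLong)
    {
        double dLat = ToRadians(toLat - fromLat);
        double dLong = ToRadians(toLong - fromLong);
        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(ToRadians(fromLat)) * Math.Cos(ToRadians(toLat))
                 * Math.Sin(dLong / 2) * Math.Sin(dLong / 2);
        // Clamp to 1.0 to guard against rounding just above the Asin domain.
        return 2 * EarthRadiusKm * Math.Asin(Math.Min(1.0, Math.Sqrt(a)));
    }

    static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }
}
```

Candidates from the box lookup would then be kept only if DistanceKm from the searching user is within the chosen radius, converted to kilometers if the radius is given in miles.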
>
>
>
>
> On Fri, 2010-04-09 at 16:30 -0700, Paul Prescod wrote:
>
> 2010/4/9 Onur AKTAS <onur.aktas@live.com>:
> > ...
> > I'm trying to find out how you perform queries with calculations on the
> > fly, without inserting the data as calculated from the beginning.
> > Let's say we have latitude and longitude coordinates of all users and we have
> > a Distance(from_lat, from_long, to_lat, to_long) function which
> > gives the distance between lat/long pairs in kilometers.
>
> I'm not an expert, but I think that it boils down to "MapReduce" and "Hadoop".
>
> I don't think that there's any top-down tutorial on those two words,
> you'll have to research yourself starting here:
>
>  * http://en.wikipedia.org/wiki/MapReduce
>
>  * http://hadoop.apache.org/
>
>  * http://wiki.apache.org/cassandra/HadoopSupport
>
> I don't think it is all documented in any one place yet...
>
>  Paul Prescod
>
>
>
