incubator-cassandra-user mailing list archives

From Lucifer Dignified <vineetdan...@gmail.com>
Subject Re: How to perform queries on Cassandra?
Date Sun, 11 Apr 2010 07:54:35 GMT
Benjamin, I quite agree with you, but what about duplicate usernames?
Suppose I am not using unique names such as email IDs. If usernames can be
duplicated, we cannot use them as the row key, so what should the solution
be? I am thinking of using an incremental numeric ID as the key and storing
each value as both the column name and the column value in the column family.

Example:
User1 has the password 123456.

Cassandra structure:

1 as key
           user1 - column name
           value - user1
           123456 - column name
           value - 123456

I'm thinking of doing it this way for my application; this way I can run
different sorts of queries too. Any feedback on this is welcome.
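
One way to picture the numeric-ID layout together with the "one CF per query" pattern from the quoted reply is sketched below. It is only an in-memory model built on plain Java maps, not the Cassandra client API. The class name UserStoreSketch, the CF name UsernameIndex, and the fixed column names "username" and "password" are assumptions made for illustration; the second map plays the role of an index CF that maps a (possibly duplicated) username to the IDs that carry it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for two column families; this is NOT the Cassandra client API.
class UserStoreSketch {
    // "Users" CF: row key = numeric id, columns = column name -> value.
    private final Map<Long, Map<String, String>> users = new TreeMap<>();
    // "UsernameIndex" CF (hypothetical name): row key = username, columns = user ids.
    private final Map<String, List<Long>> usernameIndex = new TreeMap<>();
    private long nextId = 1;

    // Writes the user row and keeps the index row in step with it.
    long addUser(String username, String password) {
        long id = nextId++;
        Map<String, String> columns = new TreeMap<>();
        columns.put("username", username);
        columns.put("password", password);
        users.put(id, columns);
        // A duplicate username simply adds another id to the same index row.
        usernameIndex.computeIfAbsent(username, k -> new ArrayList<>()).add(id);
        return id;
    }

    // Query "which ids have this username": one key lookup in the index.
    List<Long> idsForUsername(String username) {
        return usernameIndex.getOrDefault(username, new ArrayList<>());
    }

    // Query "password for this id": row key lookup, then column lookup.
    String passwordForId(long id) {
        Map<String, String> columns = users.get(id);
        return columns == null ? null : columns.get("password");
    }
}
```

With this shape, looking up by username costs two key lookups (index row, then user row), and the application has to update both rows on every write, since nothing in the model does that automatically.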

On Sun, Apr 11, 2010 at 1:13 PM, Benjamin Black <b@b3k.us> wrote:

> You would have a Column Family, not a column for that; let's call it
> the Users CF.  You'd use username as the row key and have a column
> called 'password'.  For your example query, you'd retrieve row key
> 'usr2', column 'password'.  The general pattern is that you create CFs
> to act as indices for each query you want to perform.  There is no
> equivalent to a relational store to perform arbitrary queries.  You
> must structure things to permit the queries of interest.
>
>
> b
>
> On Sat, Apr 10, 2010 at 8:34 PM, dir dir <sikerasakti@gmail.com> wrote:
> > I have already read the API specification. Honestly, I do not understand
> > how to use it, because there are no examples.
> >
> > For example I have a column like this:
> >
> > UserName    Password
> > usr1                abc
> > usr2                xyz
> > usr3                opm
> >
> > Suppose I want to query the user's password using SQL in an RDBMS:
> >
> >       Select Password From Users Where UserName = 'usr2';
> >
> > Now I want to get the password using an OODBMS, db4o Object Query, and Java:
> >
> >      ObjectSet<Users> QueryResult = db.query(new Predicate<Users>()
> >      {
> >             public boolean match(Users myUser)
> >             {
> >                  // Compare string contents with equals(), not ==
> >                  return "usr2".equals(myUser.getUserName());
> >             }
> >      });
> >
> > After we get the Users instance in QueryResult, we can get usr2's
> > password.
> >
> > How do we perform this query using the Cassandra API and Java?
> > Would you tell me, please?  Thank you.
> >
> > Dir.
> >
> >
> > On Sat, Apr 10, 2010 at 11:06 AM, Paul Prescod <paul@prescod.net> wrote:
> >>
> >> No. Cassandra has an API.
> >>
> >> http://wiki.apache.org/cassandra/API
> >>
> >> On Fri, Apr 9, 2010 at 8:00 PM, dir dir <sikerasakti@gmail.com> wrote:
> >> > Does Cassandra have a default query language, such as SQL in an RDBMS
> >> > or Object Query in an OODBMS?  Thank you.
> >> >
> >> > Dir.
> >> >
> >> > On Sat, Apr 10, 2010 at 7:01 AM, malsmith
> >> > <malsmith@treehousesystems.com>
> >> > wrote:
> >> >>
> >> >>
> >> >> It's sort of an interesting problem - in an RDBMS one relatively simple
> >> >> approach would be to calculate a rectangle that is X km by Y km with
> >> >> User 1's location at the center.  So the rectangle is UserX-10KmX ,
> >> >> UserY-10KmY to UserX+10KmX , UserY+10KmY
> >> >>
> >> >> Then you could query the database for all other users where each user
> >> >> considered has curUserX > UserX-10KmX and curUserX < UserX+10KmX
> >> >> and curUserY > UserY-10KmY and curUserY < UserY+10KmY
> >> >> * Note that 10KmX and 10KmY are really a translation from kilometers
> >> >> to degrees of latitude and longitude (which you can find with a Google
> >> >> search)
> >> >>
> >> >> With the right indexes this query actually runs pretty well.
> >> >>
> >> >> Translating that to Cassandra seems a bit complex at first - but you
> >> >> could try something like pre-calculating a grid with the right
> >> >> resolution (like a square of 5 km per side) and assigning every user
> >> >> to a particular grid ID.  That way you just calculate which grid ID
> >> >> User1 is in, then do a direct key lookup to get a list of the users in
> >> >> that same grid ID.
> >> >>
> >> >> A second approach would be to have two column families -- one that
> >> >> maps a latitude to a list of users who are at that latitude and a
> >> >> second that maps a longitude to a list of users who are at that
> >> >> longitude.  You could do the same rectangle calculation above, then do
> >> >> a get_slice range lookup to get a list of users from the range of
> >> >> latitudes and a second list from the range of longitudes.  You would
> >> >> then need to do an in-memory nested loop to find the list of users
> >> >> that are in both lists.  This second approach could cause some trouble
> >> >> depending on where you search and how many users you really have --
> >> >> some latitudes and longitudes have many, many people in them.
> >> >>
> >> >> So, it seems some version of a chunking / grid ID thing would be the
> >> >> better approach.  If you let people zoom in or zoom out - you could
> >> >> just have different column families for each level of zoom.
> >> >>
> >> >>
> >> >> I'm stuck on a stopped train so -- here is even more code:
> >> >>
> >> >> static Decimal GetLatitudeMiles(Decimal lat)
> >> >> {
> >> >>     // Approximate miles per degree of latitude, by 10-degree band.
> >> >>     lat = Math.Abs(lat);
> >> >>     Decimal f = 68.99M;
> >> >>     if (lat >= 0.0M && lat < 10.0M) { f = 68.71M; }
> >> >>     else if (lat >= 10.0M && lat < 20.0M) { f = 68.73M; }
> >> >>     else if (lat >= 20.0M && lat < 30.0M) { f = 68.79M; }
> >> >>     else if (lat >= 30.0M && lat < 40.0M) { f = 68.88M; }
> >> >>     else if (lat >= 40.0M && lat < 50.0M) { f = 68.99M; }
> >> >>     else if (lat >= 50.0M && lat < 60.0M) { f = 69.12M; }
> >> >>     else if (lat >= 60.0M && lat < 70.0M) { f = 69.23M; }
> >> >>     else if (lat >= 70.0M && lat < 80.0M) { f = 69.32M; }
> >> >>     else if (lat >= 80.0M) { f = 69.38M; }
> >> >>
> >> >>     return f;
> >> >> }
> >> >>
> >> >>
> >> >> Decimal MilesPerDegreeLatitude = GetLatitudeMiles(zList[0].Latitude);
> >> >> // Math.Cos expects radians, so convert the latitude from degrees first.
> >> >> Double latRadians = (Double) zList[0].Latitude * Math.PI / 180.0;
> >> >> Decimal MilesPerDegreeLongitude =
> >> >>     ((Decimal) Math.Abs(Math.Cos(latRadians))) * 24900.0M / 360.0M;
> >> >> dRadius = 10.0M;  // ten miles
> >> >> Decimal deltaLat = dRadius / MilesPerDegreeLatitude;
> >> >> Decimal deltaLong = dRadius / MilesPerDegreeLongitude;
> >> >>
> >> >> ps.TopLatitude = zList[0].Latitude - deltaLat;
> >> >> ps.TopLongitude = zList[0].Longitude - deltaLong;
> >> >> ps.BottomLatitude = zList[0].Latitude + deltaLat;
> >> >> ps.BottomLongitude = zList[0].Longitude + deltaLong;
> >> >>
> >> >>
> >> >>
> >> >> On Fri, 2010-04-09 at 16:30 -0700, Paul Prescod wrote:
> >> >>
> >> >> 2010/4/9 Onur AKTAS <onur.aktas@live.com>:
> >> >> > ...
> >> >> > I'm trying to find out how you perform queries with calculations
> >> >> > on the fly without inserting the data as calculated from the
> >> >> > beginning.  Let's say we have latitude and longitude coordinates of
> >> >> > all users and we have a
> >> >> >  Distance(from_lat, from_long, to_lat, to_long) function which
> >> >> > gives the distance between lat/long pairs in kilometers.
> >> >>
> >> >> I'm not an expert, but I think that it boils down to "MapReduce" and
> >> >> "Hadoop".
> >> >>
> >> >> I don't think that there's any top-down tutorial on those two words,
> >> >> you'll have to research yourself starting here:
> >> >>
> >> >>  * http://en.wikipedia.org/wiki/MapReduce
> >> >>
> >> >>  * http://hadoop.apache.org/
> >> >>
> >> >>  * http://wiki.apache.org/cassandra/HadoopSupport
> >> >>
> >> >> I don't think it is all documented in any one place yet...
> >> >>
> >> >>  Paul Prescod
> >> >>
> >> >
> >> >
> >
> >
>
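
The grid-ID idea in malsmith's quoted message can be sketched roughly as follows. This is an in-memory illustration using plain Java collections, not Cassandra calls. The class name GridIndexSketch, the "row:col" key format, and the choice of a fixed cell size in degrees are all assumptions; a cell of fixed size in degrees is not a true square in kilometers, because a degree of longitude shrinks toward the poles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a column family keyed by grid cell (hypothetical layout).
class GridIndexSketch {
    private final double cellDegrees; // cell size in degrees, an assumed resolution
    private final Map<String, List<String>> usersByCell = new HashMap<>();

    GridIndexSketch(double cellDegrees) {
        this.cellDegrees = cellDegrees;
    }

    // Row key for the cell containing a point, formatted as "row:col".
    String cellKey(double lat, double lon) {
        // floor() keeps negative coordinates in the correct cell.
        long row = (long) Math.floor(lat / cellDegrees);
        long col = (long) Math.floor(lon / cellDegrees);
        return row + ":" + col;
    }

    // Assign the user to a cell at write time.
    void addUser(String user, double lat, double lon) {
        usersByCell.computeIfAbsent(cellKey(lat, lon), k -> new ArrayList<>()).add(user);
    }

    // Read path: compute the cell key, then do one direct key lookup.
    List<String> usersInSameCell(double lat, double lon) {
        return usersByCell.getOrDefault(cellKey(lat, lon), new ArrayList<>());
    }
}
```

A user standing near a cell edge will not see neighbours just across the boundary, so a real lookup would probably also read the adjacent cells and then filter by actual distance.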
