incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tamas Marki <tma...@gmail.com>
Subject Re: Storing and querying IP ranges in Cassandra
Date Tue, 01 Nov 2011 16:59:01 GMT
It's unpredictable, there could be hundreds of thousands, even millions
(but unlikely).

Tamas

On Tue, Nov 1, 2011 at 5:38 PM, Zach Richardson <j.zach.richardson@gmail.com
> wrote:

> How many total ranges to you expect to have long term?
>
> On Tue, Nov 1, 2011 at 11:17 AM, Tamas Marki <tmarki@gmail.com> wrote:
> > Hello,
> >
> > I'm new to the list and also to Cassandra. I found it when I was
> searching
> > for something to replace our busy mysql server.
> >
> > One of the things we use the server for is filtering IPs based on a list
> of
> > IP ranges. These ranges can be small and big, and there are about 50k of
> > them in the database.
> >
> > In mysql this is pretty quick: they are stored as integers, and the query
> > basically looks like (say ip is the ip we want to find the all the ranges
> > for):
> >
> > select range from rangelist where ip_start<=ip and ip_end>=ip;
> >
> > I tried to move this schema to Cassandra, but it turned out to be very
> slow,
> > even with indexes on both columns. Since I also had to have an EQ
> expression
> > in the query, I added an indexed text field which was the same for all
> rows,
> > so the query in cassandra was something like this:
> >
> > select range from rangelist where type='ip' and ip_start<=ip and
> ip_end>=ip;
> >
> > This was very slow, and I imagine it is because it has to scan through
> all
> > the rows, making the index useless.
> >
> > The second thing I tried was to just expand the ranges and store
> individual
> > IPs as the keys to a column family. This is very fast to query, but the
> > problem is that I now have over 2.7 million rows, because some of the
> ranges
> > are quite large.
> >
> > As the number of ranges could change, this method could be a problem -
> > imagine we add a whole A-class range, it would explode into millions of
> > rows.
> >
> > My question is, is there a more sane way to store this information, while
> > still being able to find all the IP ranges that have the given IP in
> them?
> >
> > I've been only dealing with Cassandra for a week or two, so I don't know
> > about the inner details of what can be done, but I do have programming
> > experience and am not afraid to get my hands dirty, in case it can be
> solved
> > by writing some extension to Cassandra.
> >
> > Looking forward to any suggestions.
> >
> > Thanks,
> > Tamas
> >
> >
>



-- 
Tamas Marki

Mime
View raw message