incubator-couchdb-user mailing list archives

From John <john.logs...@netdev.co.uk>
Subject Large lists of data
Date Sat, 24 Jul 2010 09:37:56 GMT
Hi 

I'm currently evaluating CouchDB as a candidate to replace the relational databases used
in our telecom applications.
For most of our data I can see a good fit, and we already expose our service provisioning as
JSON over REST, so we're well positioned for a migration.
One area that concerns me though is whether this technology is suitable for our list data.
An example of this is Mobile Number Portability, where we have millions of rows of data representing
ported numbers with some attributes against each.

We use the standard relational approach to this and have an entries table that has a foreign
key reference to a parent list.

On our web services we do something like this:

Create a List:

PUT /cie-rest/provision/accounts/netdev/lists/mylist
{ "type": "NP"}

Add a row to a list:

PUT /cie-rest/provision/accounts/netdev/lists/mylist/entries/0123456789
{ "status":"portedIn", "operatorId":1234}

If we want to add a lot of rows we just POST a document to the list.

The list data is used when processing calls and it requires a fast lookup on the entries table
which is obviously indexed.

Anyway, I'd be interested in getting some opinions on:

1) Is couchdb the *right* technology for this job? (I know it can do it!)

2) I presume that the relationship I currently have in my relational database would remain
the same for CouchDB, i.e. the entry document would reference the list document, but maybe there's a
better way to do this?

3) Number portability requires 15-minute, 1-hour and daily syncs with a central number portability
database. This can result in bulk updates of thousands of numbers. I'm concerned with how
long it takes to build a CouchDB index and to incrementally update it when the number of changes
(adds/removes) is large.
What does this mean for the availability of the number? i.e. is the entry in the db but
unavailable to the application because its entry in the index hasn't been built yet?

4) Telephone numbers like btrees, so the index building should be quite fast and efficient,
I would have thought, but does someone have anything more concrete in terms of how long it would
typically take? I think that the bottleneck is the disk I/O and therefore it may be vastly
different between my laptop and one of our beefy production servers, but again I'd be interested
in other people's experience.
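
To make questions 2 and 3 a bit more concrete, here is a minimal sketch of one possible document shape and of a bulk-update payload. The `_id` scheme (list name, a colon, then the number) and the `type` and `list` field names are assumptions made up for illustration, as is the revision string in the usage lines; `_id`, `_rev`, `_deleted` and the `_bulk_docs` endpoint are CouchDB's own. The sketch only builds the JSON and does not talk to a server.

```python
import json


def entry_id(list_name, number):
    """Hypothetical _id scheme: list name and number joined with a colon."""
    return f"{list_name}:{number}"


def make_entry(list_name, number, status, operator_id, rev=None):
    """Build an entry document that refers back to its parent list."""
    doc = {
        "_id": entry_id(list_name, number),
        "type": "entry",      # assumed field, used to tell entries from lists
        "list": list_name,    # assumed field, the reference to the parent list
        "status": status,
        "operatorId": operator_id,
    }
    if rev is not None:
        doc["_rev"] = rev     # needed when updating an existing document
    return doc


def make_deletion(list_name, number, rev):
    """Build the stub that removes an entry as part of a bulk request."""
    return {"_id": entry_id(list_name, number), "_rev": rev, "_deleted": True}


def bulk_docs_body(docs):
    """Serialise a batch as the JSON body for POST /<db>/_bulk_docs."""
    return json.dumps({"docs": list(docs)})
```

A sync batch mixing an add and a remove might then be assembled like this:

```python
batch = [
    make_entry("mylist", "0123456789", "portedIn", 1234),
    make_deletion("mylist", "0123456780", "1-abc"),  # placeholder revision
]
body = bulk_docs_body(batch)
```

If the number is part of the `_id` as assumed here, a lookup during call processing could be a plain GET of the document by id, which does not go through a view and so would not wait on a view index being built. Whether that fits depends on what other queries the entries need.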

Bit of a long one, so thanks if you've read it to this point! There's a lot to like with CouchDB
(especially the replication for our use case), so I'm hoping that what I've asked above is feasible!

Thanks

John


