incubator-couchdb-user mailing list archives

From Manjunath Somashekhar <manjunath_somashek...@yahoo.com>
Subject Suggestions on optimizing document look up
Date Wed, 01 Apr 2009 19:34:44 GMT

Hi all,

Buoyed by the response I got to my previous mail (Suggestions on View performance optimization/improvement),
I am asking another question, this time about optimizing document lookup by _id.

Let us say we have a db containing a million documents, each with an _id generated by us [1..1000000].
If we have to fetch all the documents one by one (assuming the lookup code will receive random
inputs from [1..1000000]), what would work best?

As of now, what we are doing is a simple lookup like:

def getDocById(self, doc_id):
    # couchdb-python: each db[doc_id] is one HTTP GET for that document
    return self.db[doc_id]

Doing a million lookups this way takes about 50-60 minutes on my laptop. Is there a better
way of doing the same? I thought of fetching a bunch of keys in one go, caching them (LRU style),
and checking the cache before hitting the db (see the sketch below), but given that the input id
varies randomly over [1..1000000], it has not been a great success.
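For concreteness, here is roughly what that attempt looks like: a minimal sketch, assuming
couchdb-python, where passing keys plus include_docs=True to view('_all_docs', ...) fetches
the whole batch in a single request. The class and parameter names are made up for illustration:

import collections

class CachedLookup(object):
    """Sketch: bulk-fetch a batch of ids, keep an LRU cache of the docs."""

    def __init__(self, db, batch_size=1000, cache_size=10000):
        self.db = db                               # couchdb-python Database
        self.batch_size = batch_size
        self.cache_size = cache_size
        self.cache = collections.OrderedDict()     # doc_id -> doc, LRU order

    def getDocById(self, doc_id):
        if doc_id in self.cache:
            doc = self.cache.pop(doc_id)
            self.cache[doc_id] = doc               # mark most recently used
            return doc
        # Cache miss: fetch the whole batch containing this id in one request.
        start = ((int(doc_id) - 1) // self.batch_size) * self.batch_size + 1
        keys = [str(k) for k in range(start, start + self.batch_size)]
        for row in self.db.view('_all_docs', keys=keys, include_docs=True):
            if row.doc is not None:
                self.cache[row.id] = row.doc
                if len(self.cache) > self.cache_size:
                    self.cache.popitem(last=False)  # evict least recently used
        return self.cache.get(doc_id)

The batching only pays off when later lookups land in an already-fetched batch, which is
exactly what random access over a million ids defeats, hence the disappointing results.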

Any thoughts? Ideas? Suggestions?

Environment details:
CouchDB - 0.9.0a757326
Erlang - 5.6.5
Linux kernel - 2.6.24-23-generic #1 SMP Mon Jan 26 00:13:11 UTC 2009 i686 GNU/Linux
Ubuntu distribution
Centrino dual-core, 4 GB RAM laptop

Thanks
Manju

