incubator-couchdb-user mailing list archives

From Daniel Gonzalez <gonva...@gonvaled.com>
Subject Re: general question about couch performance
Date Thu, 17 Jan 2013 23:13:14 GMT
Also, to improve view performance, it is better to use a short,
monotonically increasing id. This is what I am using for one of my
databases with millions of documents:

# BaseConverter is defined elsewhere (not shown in this message); it is
# expected to provide from_decimal(n) for a given alphabet.
class MonotonicalID:

    def __init__(self, cnt=0):
        self.cnt = cnt
        self.base62 = BaseConverter(
            'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz')
        # This alphabet is better for couchdb, since its characters are
        # listed in Unicode Collation Algorithm order
        self.base64_couch = BaseConverter(
            '-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ')

    def get(self):
        # Return the id for the current counter value, then advance it
        res = self.base64_couch.from_decimal(self.cnt)
        self.cnt += 1
        return res

Doing this will:
- save space in the database, since the ids start out short: the id is used
in lots of internal data structures in couchdb, so keeping it short will
save lots of space in a big database
- speed up certain operations, since the ids are ordered (in the couchdb
sense)

Drawback: you can only do this if you are in control of the IDs (you know
that nobody else is going to be generating IDs)
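
A self-contained sketch of the generator above, for anyone who wants to try it.
The BaseConverter here is a minimal stand-in written for this example (the
original helper is not shown in this message), and the optional `width`
argument is an addition, not part of the class above. It left-pads ids with
the first character of the alphabet, because variable-length ids only sort in
counter order among ids of the same length (64 encodes to '@-', which sorts
before '0' in this alphabet).

```python
class BaseConverter:
    """Stand-in converter: non-negative integers <-> strings over an alphabet."""

    def __init__(self, digits):
        self.digits = digits
        self.base = len(digits)

    def from_decimal(self, n):
        if n < 0:
            raise ValueError("n must be non-negative")
        if n == 0:
            return self.digits[0]
        out = []
        while n:
            n, rem = divmod(n, self.base)
            out.append(self.digits[rem])
        # Digits were collected least-significant first
        return "".join(reversed(out))

    def to_decimal(self, s):
        n = 0
        for ch in s:
            n = n * self.base + self.digits.index(ch)
        return n


# Same 64-character alphabet as in the class above
COUCH_ALPHABET = "-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ"


class MonotonicalID:
    def __init__(self, cnt=0, width=None):
        self.cnt = cnt
        self.width = width  # hypothetical option: fixed id length, or None
        self.conv = BaseConverter(COUCH_ALPHABET)

    def get(self):
        res = self.conv.from_decimal(self.cnt)
        if self.width is not None:
            # Pad with the "zero" digit so all ids have the same length
            res = res.rjust(self.width, COUCH_ALPHABET[0])
        self.cnt += 1
        return res


if __name__ == "__main__":
    gen = MonotonicalID(width=4)
    print([gen.get() for _ in range(3)])
```

With width=4 this gives 64**4 (about 16.7 million) distinct ids; the counter
has to be persisted somewhere between runs so that ids are never reused.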

On Thu, Jan 17, 2013 at 8:00 PM, Mark Hahn <mark@hahnca.com> wrote:

> Thanks for the tips.  Keep them coming.
>
> I'm going to try everything I can.  If I find anything surprising I'll let
> everyone know.
>
>
> On Thu, Jan 17, 2013 at 4:54 AM, Daniel Gonzalez <gonvaled@gonvaled.com>
> wrote:
>
> > Are you doing single writes or batch writes?
> > I managed to improve the write performance by collecting the documents
> > and sending them in a single access.
> > The same applies for read accesses.
> >
> > On Wed, Jan 16, 2013 at 9:17 PM, Mark Hahn <mark@hahnca.com> wrote:
> >
> > > My couchdb is seeing a typical request rate of about 100/sec when it is
> > > maxed out.  This is typically 10 reads/write.  This is disappointing.
> > > I was hoping for 3 to 5 ms per op, not 10 ms.  What performance numbers
> > > are others seeing?
> > >
> > > I have 35 views with only 50 to 100 entries per view.  My db is less
> > > than a gigabyte with a few thousand active docs.
> > >
> > > I'm running on a medium ec2 instance with ephemeral disk.  I assume I
> > > am IO bound as the cpu is not maxing out.
> > >
> > > How much worse would this get if the db also had to handle replication
> > > between multiple servers?
> > >
> >
>
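
For reference, the batch writes mentioned in the quoted message above could be
sketched as follows. This is only an illustration using the Python standard
library and CouchDB's `_bulk_docs` endpoint; the server URL, database name and
batch size are assumptions made for the example, not values from this thread.

```python
import json
import urllib.request


def build_bulk_request(base_url, db, docs):
    """Build (without sending) a POST request for CouchDB's _bulk_docs."""
    body = json.dumps({"docs": list(docs)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{db}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_save(base_url, db, docs, batch_size=500):
    """Send docs in batches; batch_size=500 is an arbitrary example value."""
    results = []
    for batch in chunked(docs, batch_size):
        req = build_bulk_request(base_url, db, batch)
        with urllib.request.urlopen(req) as resp:
            # _bulk_docs answers with one status entry per submitted document
            results.extend(json.load(resp))
    return results


if __name__ == "__main__":
    # Assumed local server and database name, for illustration only
    docs = [{"_id": str(i), "value": i} for i in range(1000)]
    print(len(bulk_save("http://localhost:5984", "mydb", docs)))
```

Reads can be batched in a similar way by POSTing a {"keys": [...]} body to
`_all_docs` with include_docs=true, instead of fetching documents one by one.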
