couchdb-user mailing list archives

From Mark Hahn <m...@hahnca.com>
Subject Re: general question about couch performance
Date Thu, 17 Jan 2013 23:20:42 GMT
> you can only do this if you are in control of the IDs

This wouldn't work with multiple servers replicating, would it?


On Thu, Jan 17, 2013 at 3:15 PM, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:

> And here you have BaseConverter:
>
> """
> Convert numbers from base 10 integers to base X strings and back again.
>
> Sample usage:
>
> >>> base20 = BaseConverter('0123456789abcdefghij')
> >>> base20.from_decimal(1234)
> '31e'
> >>> base20.to_decimal('31e')
> 1234
> """
>
> class BaseConverter(object):
>     decimal_digits = "0123456789"
>
>     def __init__(self, digits):
>         self.digits = digits
>
>     def from_decimal(self, i):
>         return self.convert(i, self.decimal_digits, self.digits)
>
>     def to_decimal(self, s):
>         return int(self.convert(s, self.digits, self.decimal_digits))
>
>     @staticmethod
>     def convert(number, fromdigits, todigits):
>         # Based on http://code.activestate.com/recipes/111286/
>         # A leading '-' on the input is treated as a minus sign.
>         if str(number)[0] == '-':
>             number = str(number)[1:]
>             neg = 1
>         else:
>             neg = 0
>
>         # make an integer out of the number
>         x = 0
>         for digit in str(number):
>             x = x * len(fromdigits) + fromdigits.index(digit)
>
>         # create the result in base 'len(todigits)'
>         if x == 0:
>             res = todigits[0]
>         else:
>             res = ""
>             while x > 0:
>                 digit = x % len(todigits)
>                 res = todigits[digit] + res
>                 x //= len(todigits)   # integer division (avoids float precision loss)
>             if neg:
>                 res = '-' + res
>         return res
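>
> For example, a quick length check (just a sketch, with a base-62 alphabet of my own choosing):
>
> base62 = BaseConverter('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>                        'abcdefghijklmnopqrstuvwxyz')
> print(base62.from_decimal(1000000))   # '4C92' -- 4 chars instead of 7 digits
> print(base62.to_decimal('4C92'))      # 1000000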
>
>
> On Fri, Jan 18, 2013 at 12:13 AM, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
>
> > Also, in order to improve view performance, it is better if you use a
> > short and monotonically increasing id: this is what I am using for one of
> > my databases with millions of documents:
> >
> > class MonotonicalID:
> >
> >     def __init__(self, cnt=0):
> >         self.cnt = cnt
> >         self.base62 = BaseConverter(
> >             'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz')
> >         # This alphabet is better for couchdb, since it matches the
> >         # Unicode Collation Algorithm ordering
> >         self.base64_couch = BaseConverter(
> >             '-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ')
> >
> >     def get(self):
> >         res = self.base64_couch.from_decimal(self.cnt)
> >         self.cnt += 1
> >         return res
> >
> > Doing this will:
> > - save space in the database, since the id starts small: take into account
> > that the id is used in lots of internal data structures in couchdb, so
> > making it short will save lots of space in a big database
> > - keep the ids ordered (in the couchdb sense), which will speed up certain
> > operations
> >
> > Drawback: you can only do this if you are in control of the IDs (you know
> > that nobody else is going to be generating IDs).
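> >
> > For example (a minimal sketch; the ids shown are what the couch-friendly
> > alphabet above produces for the first few counter values):
> >
> > gen = MonotonicalID()
> > print([gen.get() for _ in range(5)])   # ['-', '@', '0', '1', '2']
> > # each new document then gets the next short id, e.g.
> > doc = {'_id': gen.get(), 'type': 'event'}   # 'type' is just a made-up field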
> >
> > On Thu, Jan 17, 2013 at 8:00 PM, Mark Hahn <mark@hahnca.com> wrote:
> >
> >> Thanks for the tips.  Keep them coming.
> >>
> >> I'm going to try everything I can.  If I find anything surprising I'll
> >> let everyone know.
> >>
> >>
> >> On Thu, Jan 17, 2013 at 4:54 AM, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
> >>
> >> > Are you doing single writes or batch writes?
> >> > I managed to improve the write performance by collecting the documents
> >> > and sending them in a single request.  The same applies for read accesses.
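> >> >
> >> > Something along these lines (just a sketch: it assumes the requests
> >> > library and a local CouchDB with a hypothetical database called mydb):
> >> >
> >> > import json
> >> > import requests
> >> >
> >> > COUCH = 'http://localhost:5984/mydb'
> >> >
> >> > # batch write: one POST to _bulk_docs instead of one PUT per document
> >> > docs = [{'_id': 'doc%05d' % i, 'value': i} for i in range(1000)]
> >> > r = requests.post(COUCH + '/_bulk_docs',
> >> >                   data=json.dumps({'docs': docs}),
> >> >                   headers={'Content-Type': 'application/json'})
> >> > r.raise_for_status()
> >> >
> >> > # batch read: one POST to _all_docs with the wanted keys
> >> > keys = ['doc%05d' % i for i in range(0, 1000, 10)]
> >> > r = requests.post(COUCH + '/_all_docs?include_docs=true',
> >> >                   data=json.dumps({'keys': keys}),
> >> >                   headers={'Content-Type': 'application/json'})
> >> > rows = r.json()['rows']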
> >> >
> >> > On Wed, Jan 16, 2013 at 9:17 PM, Mark Hahn <mark@hahnca.com> wrote:
> >> >
> >> > > My couchdb is seeing a typical request rate of about 100/sec when it
> >> > > is maxed out.  This is typically 10 reads per write.  This is
> >> > > disappointing.  I was hoping for 3 to 5 ms per op, not 10 ms.  What
> >> > > performance numbers are others seeing?
> >> > >
> >> > > I have 35 views with only 50 to 100 entries per view.  My db is less
> >> > > than a gigabyte with a few thousand active docs.
> >> > >
> >> > > I'm running on a medium ec2 instance with ephemeral disk.  I assume
> >> > > I am IO bound as the cpu is not maxing out.
> >> > >
> >> > > How much worse would this get if the db also had to handle
> >> > > replication between multiple servers?
> >> > >
> >> >
> >>
> >
> >
>
