couchdb-user mailing list archives

From Daniel Gonzalez <gonva...@gonvaled.com>
Subject Re: general question about couch performance
Date Thu, 17 Jan 2013 23:15:24 GMT
And here is BaseConverter:

"""
Convert numbers from base 10 integers to base X strings and back again.

Sample usage:

>>> base20 = BaseConverter('0123456789abcdefghij')
>>> base20.from_decimal(1234)
'31e'
>>> base20.to_decimal('31e')
1234
"""

class BaseConverter(object):
    decimal_digits = "0123456789"

    def __init__(self, digits):
        self.digits = digits

    def from_decimal(self, i):
        return self.convert(i, self.decimal_digits, self.digits)

    def to_decimal(self, s):
        return int(self.convert(s, self.digits, self.decimal_digits))

    @staticmethod
    def convert(number, fromdigits, todigits):
        # Based on http://code.activestate.com/recipes/111286/
        # Strip a leading minus sign and remember it for the result.
        number = str(number)
        neg = number.startswith('-')
        if neg:
            number = number[1:]

        # Parse the digit string into an integer, reading it in base
        # len(fromdigits).
        x = 0
        for digit in number:
            x = x * len(fromdigits) + fromdigits.index(digit)

        # Build the result in base len(todigits), least significant digit
        # first. divmod keeps the arithmetic in integers, so large numbers
        # do not lose precision through float division.
        if x == 0:
            res = todigits[0]
        else:
            res = ""
            while x > 0:
                x, digit = divmod(x, len(todigits))
                res = todigits[digit] + res
            if neg:
                res = '-' + res
        return res


On Fri, Jan 18, 2013 at 12:13 AM, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:

> Also, to improve view performance, it is better to use short,
> monotonically increasing IDs. This is what I am using for one of my
> databases with millions of documents:
>
> class MonotonicalID:
>
>     def __init__(self, cnt=0):
>         self.cnt = cnt
>         self.base62 = BaseConverter(
>             'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz')
>         # This alphabet is better for couchdb, since it follows the order
>         # of the Unicode Collation Algorithm
>         self.base64_couch = BaseConverter(
>             '-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ')
>
>     def get(self):
>         res = self.base64_couch.from_decimal(self.cnt)
>         self.cnt += 1
>         return res
>
> Doing this will:
> - save space in the database, since the IDs start small. Keep in mind
> that the ID is used in lots of internal data structures in CouchDB, so
> keeping it short saves a lot of space in a big database
> - speed up certain operations, since the IDs are ordered (in the CouchDB
> sense)
>
> Drawback: you can only do this if you are in control of the IDs (you know
> that nobody else is going to be generating IDs)
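
One caveat about the ordering, as a minimal sketch rather than part of the original class: IDs from the counter grow in length, and under digit-by-digit comparison a longer ID can sort before a shorter one. Padding to a fixed width avoids that. The `encode` and `alphabet_key` helpers and the width of 4 are hypothetical, and the sketch only checks ordering against the alphabet itself; it assumes, as the message does, that CouchDB collates in that order.

```python
# Alphabet copied from the message above; everything else is hypothetical.
COUCH_ALPHABET = '-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ'
RANK = {ch: i for i, ch in enumerate(COUCH_ALPHABET)}


def encode(n, width=0):
    """Encode a non-negative integer in base 64, left-padded to `width` digits."""
    if n < 0:
        raise ValueError("n must be non-negative")
    base = len(COUCH_ALPHABET)
    digits = []
    while True:
        n, rem = divmod(n, base)
        digits.append(COUCH_ALPHABET[rem])
        if n == 0:
            break
    # Pad with the alphabet's zero digit so every ID has the same length.
    return ''.join(reversed(digits)).rjust(width, COUCH_ALPHABET[0])


def alphabet_key(doc_id):
    """Sort key that compares IDs digit by digit in alphabet order."""
    return [RANK[ch] for ch in doc_id]


plain = [encode(n) for n in range(200)]
padded = [encode(n, width=4) for n in range(200)]

# Without padding, encode(64) == '@-' sorts before encode(63) == 'Z'.
print(sorted(plain, key=alphabet_key) == plain)    # False
# With a fixed width, alphabet order matches counter order.
print(sorted(padded, key=alphabet_key) == padded)  # True
```

A width of 4 covers 64**4 = 16,777,216 IDs. The cost is that the earliest IDs are no longer the shortest ones.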
>
> On Thu, Jan 17, 2013 at 8:00 PM, Mark Hahn <mark@hahnca.com> wrote:
>
>> Thanks for the tips.  Keep them coming.
>>
>> I'm going to try everything I can.  If I find anything surprising I'll let
>> everyone know.
>>
>>
>> On Thu, Jan 17, 2013 at 4:54 AM, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
>>
>> > Are you doing single writes or batch writes?
>> > I managed to improve write performance by collecting the documents and
>> > sending them in a single request.
>> > The same applies for read accesses.
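
For reference, batching in CouchDB usually means the `_bulk_docs` endpoint for writes and a POST to `_all_docs` with a `keys` list for reads. The following is a rough sketch that only builds the two requests; the server URL, database name, and helper names are assumptions for illustration.

```python
import json
import urllib.request

# Hypothetical server URL and database name, for illustration only.
COUCH_URL = "http://localhost:5984"
DB_NAME = "mydb"


def bulk_write_request(docs):
    """Build one POST to _bulk_docs that carries every document in `docs`."""
    body = json.dumps({"docs": docs}).encode("utf-8")
    return urllib.request.Request(
        f"{COUCH_URL}/{DB_NAME}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def bulk_read_request(doc_ids):
    """Build one POST to _all_docs that fetches every ID in `doc_ids`."""
    body = json.dumps({"keys": doc_ids}).encode("utf-8")
    return urllib.request.Request(
        f"{COUCH_URL}/{DB_NAME}/_all_docs?include_docs=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


docs = [{"_id": f"doc-{i}", "value": i} for i in range(3)]
write_req = bulk_write_request(docs)
read_req = bulk_read_request([d["_id"] for d in docs])

# To actually send a request (needs a running CouchDB):
# with urllib.request.urlopen(write_req) as resp:
#     print(json.load(resp))
```

Each request replaces one HTTP round trip per document with a single round trip for the whole batch, which is where the saving comes from.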
>> >
>> > On Wed, Jan 16, 2013 at 9:17 PM, Mark Hahn <mark@hahnca.com> wrote:
>> >
>> > > My couchdb is seeing a typical request rate of about 100/sec when it is
>> > > maxed out.  This is typically 10 reads/write.  This is disappointing.  I
>> > > was hoping for 3 to 5 ms per op, not 10 ms.  What performance numbers are
>> > > others seeing?
>> > >
>> > > I have 35 views with only 50 to 100 entries per view.  My db is less than a
>> > > gigabyte with a few thousand active docs.
>> > >
>> > > I'm running on a medium ec2 instance with ephemeral disk.  I assume I am IO
>> > > bound as the cpu is not maxing out.
>> > >
>> > > How much worse would this get if the db also had to handle replication
>> > > between multiple servers?
>> > >
>> >
>>
>
>
