incubator-couchdb-user mailing list archives

From Daniel Gonzalez <gonva...@gonvaled.com>
Subject Re: Size of couchdb documents
Date Thu, 15 Mar 2012 15:14:53 GMT
Hi Matthieu,

This really seems to help. I am now using a base62-encoded, monotonically
increasing integer, which means my doc_id goes from "0" onwards, using the
alphabet:

ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz
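For illustration, a minimal sketch of such an encoder in Python, assuming
the alphabet above is used as-is with the most significant digit first. The
names encode_base62 and decode_base62 are hypothetical, not taken from any
library. Note that with this alphabet the integer 0 maps to the first
symbol, "A".

```python
# Alphabet exactly as listed above; position in the string = digit value.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz"


def encode_base62(n: int, alphabet: str = ALPHABET) -> str:
    """Encode a non-negative integer, most significant digit first."""
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    base = len(alphabet)
    if n == 0:
        return alphabet[0]
    digits = []
    while n:
        n, rem = divmod(n, base)
        digits.append(alphabet[rem])
    return "".join(reversed(digits))


def decode_base62(s: str, alphabet: str = ALPHABET) -> int:
    """Inverse of encode_base62 (hypothetical helper for round-trips)."""
    base = len(alphabet)
    n = 0
    for ch in s:
        n = n * base + alphabet.index(ch)
    return n


if __name__ == "__main__":
    # Print the first few ids and one value past the single-symbol range.
    for i in (0, 1, 61, 62, 3843, 3844):
        print(i, encode_base62(i))
```

Ids produced this way grow in length slowly: one symbol covers 62 values,
two symbols cover 3844, and so on.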

I am now getting about 3000 docs/s, more or less stable, and the size of my
documents has decreased from 3 KB to 0.4 KB.
I am not sure whether these metrics will worsen as the database grows, but
my feeling is that the situation has improved a lot just by changing the
doc_id.

I have one more question. Is the alphabet I have shown above "ordered" as
far as CouchDB is concerned?
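One way to make the question concrete is to check the alphabet against a
candidate ordering. The sketch below assumes plain byte-wise (ASCII)
comparison; whether CouchDB compares doc ids that way is an assumption
here, not something established in this thread. It also shows that
variable-length ids do not sort numerically as strings unless they are
padded to a fixed width. The helper name is_bytewise_sorted is hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz"


def is_bytewise_sorted(alphabet: str) -> bool:
    """True if the symbols appear in ascending byte (ASCII) order."""
    encoded = [ch.encode("ascii") for ch in alphabet]
    return encoded == sorted(encoded)


# The same 62 symbols rearranged into ascending ASCII order:
# digits first, then uppercase, then lowercase.
ASCII_ORDERED = "".join(sorted(ALPHABET))

if __name__ == "__main__":
    print(is_bytewise_sorted(ALPHABET))       # uppercase comes before digits here
    print(ASCII_ORDERED)
    print(is_bytewise_sorted(ASCII_ORDERED))

    # Independent of the alphabet: ids of different lengths compare
    # symbol by symbol, so a longer id can sort before a shorter one.
    print("10" < "2")                          # string comparison, not numeric

    # Left-padding to a fixed width with the lowest symbol restores
    # numeric order under byte-wise comparison (width 4 is arbitrary).
    print("2".rjust(4, "0") < "10".rjust(4, "0"))
```

Under the byte-wise assumption, the alphabet as listed is not in ascending
order, because "Z" has a higher byte value than "0".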

Thanks,
Daniel

On Thu, Mar 15, 2012 at 3:09 PM, Matthieu Rakotojaona <
matthieu.rakotojaona@gmail.com> wrote:

> On Thu, Mar 15, 2012 at 3:00 PM, Daniel Gonzalez <gonvaled@gonvaled.com>
> wrote:
> > I understand the overheads that you are referring to, but it still
> > shocks me that CouchDB needs 8 times as much space to store the data.
> >
> > Are there any guidelines on what to do/avoid in order to get a lower
> > overhead ratio?
>
> I got surprisingly good results when changing the _id design. I advise
> you to follow what is written on this page:
> http://wiki.apache.org/couchdb/Performance#File_size
>
> Basically:
> - use shorter _ids
> - use sequential _ids. If you cannot (e.g. because you have multiple
> disconnected parts that will have to merge often and that would cause
> too many clashes), you can use CouchDB's own semi-sequential generated
> UUIDs. Yes, UUIDs contradict the first point.
>
>
> --
> Matthieu RAKOTOJAONA
>
