incubator-couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: Size of couchdb documents
Date Fri, 16 Mar 2012 14:51:13 GMT
Any reason you can't use the built-in, default UUID algorithm that
produces collision-resistant but sequential values?

B.
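
For readers unfamiliar with the "sequential" UUID setting mentioned above, the idea can be sketched roughly as follows. This is an illustration in Python, not CouchDB's Erlang implementation: the class name `SequentialIds` is hypothetical, and the increment range is an assumption. The general shape (a 26-hex-character random prefix followed by a 6-hex-character counter that grows by random steps, with a fresh prefix on overflow) follows the description in the CouchDB configuration documentation.

```python
import secrets


class SequentialIds:
    """Hypothetical sketch: random prefix plus an increasing 6-hex-digit suffix."""

    MAX_SUFFIX = 0xFFFFFF  # largest value that fits in 6 hex characters
    MAX_STEP = 0xFFE       # assumed upper bound for the random increment

    def __init__(self):
        self._reset()

    def _reset(self):
        # 13 random bytes give 26 hex characters of prefix.
        self.prefix = secrets.token_hex(13)
        self.counter = secrets.randbelow(self.MAX_STEP) + 1

    def next_id(self):
        doc_id = f"{self.prefix}{self.counter:06x}"
        # Advance by a random positive step so ids stay ordered
        # but the next value is not trivially predictable.
        self.counter += secrets.randbelow(self.MAX_STEP) + 1
        if self.counter > self.MAX_SUFFIX:
            # Suffix overflowed: start over with a new random prefix.
            self._reset()
        return doc_id


gen = SequentialIds()
ids = [gen.next_id() for _ in range(1000)]
```

Within one prefix the ids sort in generation order, which is the property being recommended here; the random prefix is what keeps collisions between independent generators unlikely.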

On 16 March 2012 14:41, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
>>
>> If memory serves, the database's by_id tree uses Erlang term sorting for
>> collation instead of ICU. ICU is of course the default collation option
>> for MR views. Regards,
>>
>> Adam
>
>
> That is interesting. I will try to confirm that, because that would
> mean that the dictionary that I am using now:
>
> "-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ"
>
> which is ICU ordered, would not be optimal for the doc_ids. Can you
> tell me what an "Erlang term order" base64 dictionary would look like?
>
> Anyway, I am curious: I understand that the size of the doc_id is going
> to have a big impact on the performance and size of the database, since
> the doc_id is going to be present in a lot of internal structures. What I
> do not fully understand is why the *ordering* of doc_ids when inserting
> documents into the database is going to have any effect on insert speed
> or view generation. In my naive view of CouchDB, the documents are
> just written to a big file system file as they are POSTed to CouchDB,
> in the order that they arrive. How would the doc_id order affect this
> process?
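
On the dictionary question quoted above: Erlang compares binaries byte by byte, so if doc ids are collated as binaries (as Adam suggests, from memory), the matching dictionary is simply the same 64 characters sorted by byte value. A small sketch in Python, where the helper names `byte_ordered` and `encode` are hypothetical and the fixed-width encoding is an assumption about how the ids are built:

```python
# The ICU-ordered dictionary quoted in the message above.
ICU_ALPHABET = "-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ"


def byte_ordered(alphabet):
    """Return the same characters sorted by their encoded byte values."""
    return "".join(sorted(alphabet, key=lambda ch: ch.encode("utf-8")))


def encode(n, alphabet, width):
    """Encode a non-negative integer as a fixed-width string in the given alphabet."""
    base = len(alphabet)
    digits = []
    for _ in range(width):
        n, rem = divmod(n, base)
        digits.append(alphabet[rem])
    if n:
        raise ValueError("value does not fit in the requested width")
    # Digits were produced least significant first.
    return "".join(reversed(digits))


BYTE_ALPHABET = byte_ordered(ICU_ALPHABET)
doc_ids = [encode(n, BYTE_ALPHABET, 4) for n in range(5000)]
```

Here `BYTE_ALPHABET` comes out as "-0123456789@ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz", and increasing integers encoded with it at a fixed width produce ids that are also increasing under byte-wise comparison.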
