couchdb-dev mailing list archives

From "Robert Newson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-465) Produce sequential, but unique, document id's
Date Thu, 20 Aug 2009 09:00:14 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745358#action_12745358 ]

Robert Newson commented on COUCHDB-465:
---------------------------------------


I like most of Paul's changes, though I thought we'd agreed on IRC to change the default to
sequential, and I'd still like to see that happen.

I would also like to see a way to detect very quick delete/create scenarios, though I don't
know if database uuids are the only solution there. If a global _changes feed would emit "deleted"
and "created" events for databases in the correct order, then couchdb-lucene could work correctly
without database uuids.

Antony's suggestion of a fourth algorithm, where the prefix is completely static, is simple
enough to add. This patch allows the deployer to decide how much he cares about predictability
and server origin, so I don't see a reason not to add it. It is distinct from the sequential
algorithm, though. The prefix there is only used for around 8000 ids and is then never reused,
and there is no correlation between prefix and origin server.



> Produce sequential, but unique, document id's
> ---------------------------------------------
>
>                 Key: COUCHDB-465
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-465
>             Project: CouchDB
>          Issue Type: Improvement
>            Reporter: Robert Newson
>         Attachments: couch_uuids.patch, uuid_generator.patch
>
>
> Currently, if the client does not specify an id (POST'ing a single document or using
> _bulk_docs) a random 16 byte value is created. This kind of key is particularly brutal on
> b+tree updates and the append-only nature of couchdb files.
> Attached is a patch to change this to a two-part identifier. The first part is a random
> 12 byte value and the remainder is a counter. The random prefix is rerandomized when the counter
> reaches its maximum. The rollover in the patch is at 16 million but can obviously be changed.
> The upshot is that the b+tree is updated in a better fashion, which should lead to performance
> benefits.
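
The two-part identifier described in the issue can be sketched roughly as follows. This is
only an illustration in Python, not the code in the attached patches: the class name
SequentialIdGenerator, the hex encoding, and the 3-byte counter width (chosen so that the
rollover lands at 2^24, about 16 million) are all assumptions made for the example.

```python
import os


class SequentialIdGenerator:
    """Hypothetical sketch: random prefix followed by an incrementing counter."""

    def __init__(self, prefix_bytes=12, counter_bytes=3):
        # Both widths are assumptions; a 3-byte counter rolls over at 16,777,216.
        self.prefix_bytes = prefix_bytes
        self.counter_bytes = counter_bytes
        self.rollover = 256 ** counter_bytes
        self._new_prefix()

    def _new_prefix(self):
        # Pick a fresh random prefix and restart the counter at zero.
        self.prefix = os.urandom(self.prefix_bytes).hex()
        self.counter = 0

    def next_id(self):
        # Rerandomize the prefix once the counter has reached its maximum.
        if self.counter >= self.rollover:
            self._new_prefix()
        width = self.counter_bytes * 2  # hex digits needed for the counter
        doc_id = f"{self.prefix}{self.counter:0{width}x}"
        self.counter += 1
        return doc_id


if __name__ == "__main__":
    gen = SequentialIdGenerator()
    for _ in range(3):
        print(gen.next_id())
```

Ids that share a prefix sort in the order they were generated, so consecutive inserts touch
neighbouring parts of the b+tree instead of random ones.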

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

