couchdb-dev mailing list archives

From Jason Smith <jason.h.sm...@gmail.com>
Subject Re: NPM, CouchDB and big attachments
Date Wed, 27 Nov 2013 14:14:18 GMT
Damien used to say, "there are vitamins and there are pain pills."

Bigcouch is a vitamin[1]: a long-term fix to the general health and
robustness of the system.

npm needs a pain pill. And it is going to get one. Why do I respect the
Node.js community? Certainly not because of the language! No, because they
get things done and move quickly. I expect a fix for this problem to be in
production before we could even wrap up a discussion about architectural
changes.

Fortunately, this fix will be right where it should be: in the application.
There is nothing wrong with storing URLs as document data and having the
client fetch those itself, as long as you understand the trade-offs, which
npm does.
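
A minimal sketch of the "store a URL, let the client fetch it" approach described above. Everything here is an assumption for illustration, not npm's actual schema or client: the document shape, the field names `tarball_url` and `sha256`, the example host, and the helper names are all hypothetical. The checksum stored in the document is one way to handle the main trade-off, which is that the database no longer vouches for the bytes the client ends up with.

```python
import hashlib
from urllib.request import urlopen

# Hypothetical document shape: the tarball lives outside the database, and the
# document only records where to find it and what its checksum should be.
EXAMPLE_DOC = {
    "_id": "example-package",
    "versions": {
        "1.0.0": {
            "tarball_url": "https://tarballs.example.invalid/example-package-1.0.0.tgz",
            "sha256": hashlib.sha256(b"example tarball bytes").hexdigest(),
        }
    },
}


def http_fetch(url):
    """Default fetcher: download the body at `url` over HTTP(S)."""
    with urlopen(url) as response:
        return response.read()


def fetch_tarball(doc, version, fetch=http_fetch):
    """Fetch the tarball that `doc` points to and verify it against the
    checksum stored in the document. `fetch` is injectable so the logic can
    be exercised without a network."""
    entry = doc["versions"][version]
    data = fetch(entry["tarball_url"])
    digest = hashlib.sha256(data).hexdigest()
    if digest != entry["sha256"]:
        raise ValueError(
            "checksum mismatch for %s@%s" % (doc["_id"], version)
        )
    return data
```

With this shape, replicating the database moves only small JSON documents; the client calls `fetch_tarball(doc, "1.0.0")` and has to cope on its own with the URL being unreachable or the content not matching.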

[1]: I am aware that there is not a shred of evidence that multivitamin
supplements improve the health of normal people. But you know what I mean.



On Wed, Nov 27, 2013 at 6:59 PM, Robert Newson <rnewson@apache.org> wrote:

> I think NPM mostly struggles with disk issues (all attachments in the
> same file, it's 100G) and replication (a document with lots of
> attachments has to be transferred fully in the same connection without
> interruption or else it starts over).
>
> Both of these are fixable without taking the extreme measure of moving
> the attachments out of couchdb entirely. That would pretty much
> eliminate the point of using CouchDB for this registry. That's a
> perfectly reasonable thing for the registry owners to do but changing
> CouchDB is going too far. I've previously advocated for "external"
> attachments, whether that's a file-per-attachment or a separate .att
> file of all attachments. I've since recanted; it's not compelling
> enough to compensate for the extra failure conditions (the .couch file
> exists but the .att file is gone, say).
>
> For the actual problems, the bigcouch merge will bring sharding (a
> q=10 database would consist of ten 10G files, each individually
> compactable, can be hosted on different machines, etc). CouchDB 1.5.0
> improved replication behaviour around attachments but there's
> definitely more work to be done. Particularly, we could make
> attachment replication resumable. Currently, if we replicate 99.9% of
> a large attachment, lose our connection, and resume, we'll start over
> from byte 0. This is why, elsewhere, there's a suggestion of 'one
> attachment per document'. That is a horrible and artificial constraint
> just to work around replicator deficiencies. We should encourage sane
> design (related attachments together in the same document) and fix the
> bugs that prevent heavy users from following it.
>
> B.
>
>
> On 27 November 2013 07:27, Benoit Chesneau <bchesneau@gmail.com> wrote:
> > On Wed, Nov 27, 2013 at 8:26 AM, Benoit Chesneau <bchesneau@gmail.com> wrote:
> >
> >>
> >>
> >>
> >> On Wed, Nov 27, 2013 at 8:14 AM, Alexander Shorin <kxepal@gmail.com> wrote:
> >>
> >>> http://blog.nodejs.org/2013/11/26/npm-post-mortem/
> >>>
> >>> > Move attachments out of CouchDB: Work has begun to move the package
> >>> > tarballs out of CouchDB and into Joyent's Manta service. Additionally,
> >>> > MaxCDN has generously offered to provide CDN services for npm, once
> >>> > the tarballs are moved out of the registry database. This will help
> >>> > improve delivery speed, while dramatically reducing the file system
> >>> > I/O load on the CouchDB servers. Work is progressing slowly, because
> >>> > at each stage in the plan, we are making sure that current
> >>> > replication users are minimally impacted.
> >>>
> >>> I wonder whether this is down to non-optimal I/O in CouchDB, and
> >>> whether issue 769 could fix it?
> >>>
> >>> https://issues.apache.org/jira/browse/COUCHDB-769
> >>>
> >>> There is an alpha patch attached. Maybe it's a good time to push it
> >>> forward? What is left to do for it?
> >>>
> >>> --
> >>> ,,,^..^,,,
> >>>
> >>
> >> I would say a better internal API. I am also interested in working on
> >> that.
> >>
> >
> > also +1
>
