couchdb-dev mailing list archives

From: Garren Smith <gar...@apache.org>
Subject: Re: [DISCUSSION] Limiting the allowed size for documents
Date: Fri, 08 Jul 2016 14:08:15 GMT
Great discussion, Robert. I agree setting a hard limit is a good idea.

In terms of fixing the indexer to support larger documents: I would
rather see us set a max size limit and then make sure the indexer can
always handle that size. After that we can make incremental improvements
to the indexers to support larger sizes with each release.

I would think an approach like that would be more helpful to the user.
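
To make that concrete: if I remember our ini layout correctly, a hard
cap would be a one-line setting in local.ini. A minimal sketch, assuming
the option lands in the [couchdb] section under the name already used in
default.ini (the 1 MiB value is an arbitrary example, not a
recommendation):

    [couchdb]
    ; Reject any document larger than this many bytes (sketch only).
    max_document_size = 1048576

Writes above the cap would presumably be rejected up front (a 413 seems
natural), so the indexer never sees a document bigger than the size we
have promised to handle.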

On Friday, 08 July 2016, Alexander Shorin <kxepal@gmail.com> wrote:

> On Fri, Jul 8, 2016 at 12:44 AM, Robert Kowalski <rok@kowalski.gd> wrote:
> > Couch 1.x and Couch 2.x will choke as soon as the indexer tries to
> > process a too-large document that was added. The indexing stops and
> > you have to manually remove the doc. In the best case you have built
> > an automatic process around this, so the document is removed
> > automatically instead of by a human.
>
> An automatic process for removing stored data in production? You must
> be kidding (:
>
> Limiting the document size here sounds like the wrong way to solve the
> indexer issue of not being able to handle such documents. Two solutions
> come to mind:
>
> - The indexer ignores big documents, making enough noise to help the
> user notice the problem;
> - The indexer is fixed to handle big documents.
>
> From the user's side, the second option is the only right one, because
> it's my data, I put it into the database, I trust the database to be
> able to process it, and it shouldn't fail me.
>
> What should the user do when he hits the limit and cannot store the
> document because the indexer is buggy, but he needs this data to be
> processed? He becomes very annoyed, because he needs that data as-is,
> and any attempt to split it into multiple documents may be impossible
> (since we don't have cross-document links or transactions). What's the
> next step for him? Change the database, for sure.
>
> I think that the indexer argument is quite weak and strange. A stronger
> one is about cutting off the possibility of uploading bloated data when
> by design there are sane boundaries for the stored data. If all your
> documents average 1 MiB and your database receives data from the whole
> world, you would want to explicitly drop anomalies of dozens or
> hundreds of MiB, because that's not the data you're working with.
>
> See also: https://github.com/apache/couchdb-chttpd/pull/114 - Tony Sun
> made some attempts to add such a limit to CouchDB.
>
> There are a couple of problems with actually implementing such a limit
> in a predictable and lightweight way, because we have awesome _update
> functions (; But I believe that all of them can be overcome.
>
> --
> ,,,^..^,,,
>
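
To illustrate Alexander's last point about _update functions: an update
handler runs server-side and can expand a tiny request body into an
arbitrarily large document, so a limit checked only against the incoming
request size would be easy to blow past. A minimal sketch of such a
handler (illustrative only, not taken from the linked PR; the design doc
and function names are made up); it would live in a design document
under "updates" and be invoked via POST to
/db/_design/example/_update/inflate:

    // Illustrative only: a small request body produces a large document.
    function(doc, req) {
      // req.body might be just a few bytes, e.g. "1048576"...
      var n = parseInt(req.body, 10) || 0;
      var chunk = [];
      for (var i = 0; i < n; i++) {
        chunk.push("x");
      }
      // ...but the document we return for storage carries n bytes of
      // padding, so any size check has to run after the function is
      // evaluated, not just on the request.
      var newDoc = { _id: req.uuid, padding: chunk.join("") };
      return [newDoc, { body: "stored " + n + " bytes of padding\n" }];
    }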
