incubator-couchdb-user mailing list archives

From Daniel Gonzalez <gonva...@gonvaled.com>
Subject Re: bulk update failing when document has attachments?
Date Wed, 11 Dec 2013 13:18:22 GMT
Funny that you do not need to escape it. The spec does list \/ among the escapes:

char
any-Unicode-character-
    except-"-or-\-or-
    control-character
\"
\\
\/
\b
\f
\n
\r
\t
\u four-hex-digits
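
For reference, the grammar above allows \/ as an escape, but it also allows a bare / (it is neither ", \, nor a control character), so both spellings are valid JSON. A minimal sketch using Python's standard json module, added only as an illustration and not taken from the CouchDB docs, suggests that both decode to the same string:

```python
import json

# json.dumps leaves the forward slash unescaped by default.
encoded = json.dumps({"content_type": "image/png"})

# A JSON parser accepts the escaped spelling too; both decode identically.
plain = json.loads('{"content_type": "image/png"}')
escaped = json.loads('{"content_type": "image\\/png"}')

print(encoded)
print(plain == escaped)
```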

Anyway, my problem has been solved. I am not escaping anything in the
content_type: the json library is probably doing that. What I needed to do
was attach real base64-encoded data, and that fixed it.
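
For anyone finding this thread later, here is a rough sketch of what "real base64 encoded data" means in practice: split the RFC 2397 data-URI header off the payload and keep only the base64 part. The function name attachment_from_data_uri is made up for this example, it assumes Python 3 (the traceback in this thread comes from Python 2.7), and it only handles ";base64" data URIs, so treat it as a starting point rather than a complete RFC 2397 parser.

```python
import base64
import binascii


def attachment_from_data_uri(uri):
    """Build a CouchDB inline-attachment dict from a base64 data URI.

    Hypothetical helper for illustration; layout per RFC 2397:
    data:[<mediatype>][;base64],<data>
    """
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")
    parts = header.split(";")
    if parts[-1] != "base64":
        raise ValueError("only base64 data URIs are handled here")
    # Extra parameters such as ;charset=... are ignored in this sketch.
    content_type = parts[0] or "text/plain"
    # Drop stray whitespace/newlines, then check the payload really decodes.
    payload = "".join(payload.split())
    try:
        base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid base64") from exc
    return {"content_type": content_type, "data": payload}


# "aGVsbG8=" is the base64 of "hello", as in Robert's curl example.
doc = {
    "_attachments": {
        "image.png": attachment_from_data_uri("data:image/png;base64,aGVsbG8=")
    }
}
```

The resulting doc can then go into the list sent to _bulk_docs. The validation step is there because the original failure came from sending the whole data URI, prefix included, as the attachment data.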




On Wed, Dec 11, 2013 at 2:15 PM, Robert Newson <rnewson@apache.org> wrote:

> ➜  ~  curl localhost:5984/db1/_bulk_docs
> -Hcontent-type:application/json -d
> '{"docs":[{"_attachments":{"foo":{"data":"aGVsbG8="}}}]}'
>
> [{"ok":true,"id":"b5d2060479624e483a8fe4747f001dbe","rev":"1-12c665c499a525a3a1a9ad35c90604a1"}]
>
> ➜  ~  curl localhost:5984/db1/b5d2060479624e483a8fe4747f001dbe
>
> {"_id":"b5d2060479624e483a8fe4747f001dbe","_rev":"1-12c665c499a525a3a1a9ad35c90604a1","_attachments":{"foo":{"content_type":"application/octet-stream","revpos":1,"digest":"md5-XUFAKrxLKna5cZ2REBfFkg==","length":5,"stub":true}}}
>
> ➜  ~  curl localhost:5984/db1/b5d2060479624e483a8fe4747f001dbe/foo
> hello%
>
> Maybe you left escaped newlines in your base64 input?
>
> B.
>
>
> On 11 December 2013 13:11, Robert Newson <rnewson@apache.org> wrote:
> > http://json.org/string.gif talks about escaping the backslash, not the
> > forward slash. The PDF page 194 talks about escaping the forward slash
> > within a RegExp statement in JavaScript, which is not JSON.
> >
> > B.
> >
> >
> > On 11 December 2013 12:58, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
> >> It is not an artifact: I am taking that from the couchdb documentation.
> >>
> >> And according to Alexander Shorin forward slashes **really** need to be
> >> escaped in json. But it is not me who must do that, but the library
> >> converting the python objects to couchdb, so that can not be my problem.
> >>
> >> Now I am left with a badarg exception, which I cannot relate to my input
> >> data:
> >>
> >> Exception > Problems updating list of documents (length = 1): (500,
> >> ('badarg', '46'))
> >>
> >> What does that '46' mean?
> >>
> >>
> >> On Wed, Dec 11, 2013 at 1:47 PM, Robert Newson <rnewson@apache.org> wrote:
> >>
> >>> I think your "image\/png" is just an artifact of your printing method,
> >>> you don't need to escape the forward slash in content_type, see
> >>> example below;
> >>>
> >>>
> >>>
> {"_id":"doc1","_rev":"1-96e2a6c78b8bfb227e79e1fbb16873f9","_attachments":{"att1":{"content_type":"image/png","revpos":1,"digest":"md5-XUFAKrxLKna5cZ2REBfFkg==","length":5,"stub":true}}}
> >>>
> >>> B.
> >>>
> >>>
> >>> On 11 December 2013 12:35, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
> >>> > That would work *only* for that prefix (data:image/png;base64,), or any
> >>> > prefix which happens to have the same length. Not very robust.
> >>> >
> >>> > I just discovered that the data coming from the front-end comes in data-uri
> >>> > format (rfc2397). This should handle any rfc2397 prefix:
> >>> > http://stackoverflow.com/a/20518589/647991 (maybe buggy, just implemented).
> >>> >
> >>> > Another question: even after removing the data-uri prefix, I am still
> >>> > getting problems. I think my content type is not right.
> >>> >
> >>> > Must content_type be escaped? That is:
> >>> >
> >>> > 'content_type': 'image/png', -> 'content_type': 'image\/png',
> >>> >
> >>> > The only reference I see to that is an example here:
> >>> > http://wiki.apache.org/couchdb/HTTP_Document_API#Inline_Attachments
> >>> >
> >>> > But no real explanation of why. It seems no other strings must be escaped
> >>> > for couchdb. The only requirement that couchdb seems to impose on top of
> >>> > json is that the data in the attachment must be in base64 format.
> >>> >
> >>> > But now it seems that the content_type must escape the slashes (/). Why? It
> >>> > does not seem to be a json feature: slashes are fine in any json string. So
> >>> > what is that?
> >>> >
> >>> > I would like to know the specification for the format expected for
> >>> > content_type. Does that have a name? I am calling it "escaped mediatype".
> >>> > Is it part of a more generic escaping process expected by couchdb, or is only
> >>> > the content_type affected? Is there an official name for that?
> >>> >
> >>> >
> >>> > On Wed, Dec 11, 2013 at 12:18 PM, Johannes Jörg Schmidt <
> >>> > schmidt@netzmerk.com> wrote:
> >>> >
> >>> >> data.slice(22)
> >>> >>
> >>> >> 2013/12/11 Daniel Gonzalez <gonvaled@gonvaled.com>:
> >>> >> > Thanks, I just realized this. The base64 is coming from the
> >>> >> > JavaScript frontend (a file chosen in a form). So I need to remove the
> >>> >> > prefix "data:image/png;base64,".
> >>> >> > Not sure how to do this without rolling my own regexes though.
> >>> >> >
> >>> >> >
> >>> >> > On Wed, Dec 11, 2013 at 12:01 PM, Alexander Shorin <kxepal@gmail.com> wrote:
> >>> >> >
> >>> >> >> Hi,
> >>> >> >>
> >>> >> >> _attachments data should be a valid base64-encoded string, while you have:
> >>> >> >> > ...
> >>> >> >>
> >>> >> >> Chars : and , are invalid for base64.
> >>> >> >> --
> >>> >> >> ,,,^..^,,,
> >>> >> >>
> >>> >> >>
> >>> >> >> On Wed, Dec 11, 2013 at 2:49 PM, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
> >>> >> >> > Hi,
> >>> >> >> >
> >>> >> >> > (SO reference: http://stackoverflow.com/q/20516980/647991. I post there
> >>> >> >> > because formatting makes things much easier to read, replies / comments are
> >>> >> >> > well organized, and the up/downvote mechanism works)
> >>> >> >> >
> >>> >> >> > I am performing the following operation:
> >>> >> >> >
> >>> >> >> > 1. Prepare some documents: `docs = [ doc1, doc2, ... ]`. The documents
> >>> >> >> > *may* have attachments
> >>> >> >> > 2. I `POST` to `_bulk_docs` the list of documents
> >>> >> >> > 3. I get an `Exception > Problems updating list of documents (length = 1):
> >>> >> >> > (500, ('badarg', '58'))`
> >>> >> >> >
> >>> >> >> > My `bulk_docs` is (in this case just one):
> >>> >> >> >
> >>> >> >> >     [   {   '_attachments': {   'image.png': {   'content_type': 'image/png',
> >>> >> >> >                                                  'data': '...'}},
> >>> >> >> >             '_id': '08b8fc66-cd90-47a1-9053-4f6fefabdfe3',
> >>> >> >> >             '_rev': '15-ff3d0e8baa56e5ad2fac4937264fb3f6',
> >>> >> >> >             'docmeta': {   'created': '2013-10-01 14:48:24.311257',
> >>> >> >> >                            'updated': [   '2013-10-01 14:48:24.394157',
> >>> >> >> >                                           '2013-12-11 08:19:47.271812',
> >>> >> >> >                                           '2013-12-11 08:25:05.662546',
> >>> >> >> >                                           '2013-12-11 10:38:56.116145']},
> >>> >> >> >             'org_id': 45345,
> >>> >> >> >             'outputs_id': None,
> >>> >> >> >             'properties': {   'auto-t2s': False,
> >>> >> >> >                               'content_type': 'image/png',
> >>> >> >> >                               'lang': 'es',
> >>> >> >> >                               'name': 'dfasdfasdf',
> >>> >> >> >                               'text': 'erwerwerwrwerwr'},
> >>> >> >> >             'subtype': 'voicemail-st',
> >>> >> >> >             'tags': ['RRR-ccc-dtjkqx'],
> >>> >> >> >             'type': 'recording'}]
> >>> >> >> >
> >>> >> >> > This is the detailed exception:
> >>> >> >> >
> >>> >> >> >     Traceback (most recent call last):
> >>> >> >> >       File "portal_support_ut.py", line 470, in test_UpdateDoc
> >>> >> >> >         self.ps.UpdateDoc(self.org_id, what, doc_id, new_data)
> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/ps/complex_ops.py", line 349, in UpdateDoc
> >>> >> >> >         success, doc = database.UpdateDoc(doc_id, new_data)
> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/updater.py", line 38, in UpdateDoc
> >>> >> >> >         res = self.SaveDoc(doc_id, doc)
> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/saver.py", line 88, in SaveDoc
> >>> >> >> >         else      : self.bulk_append(doc, flush, update_revision)
> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/bulker.py", line 257, in bulk_append
> >>> >> >> >         if force_send or flush or not self.timer.use_timer : self.BulkSend(show_progress=True)
> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/bulker.py", line 144, in BulkSend
> >>> >> >> >         results = self.UpdateDocuments(self.bulk)
> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/bulker.py", line 67, in UpdateDocuments
> >>> >> >> >         results = self.db.update(bulkdocs)
> >>> >> >> >       File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/client.py", line 764, in update
> >>> >> >> >         _, _, data = self.resource.post_json('_bulk_docs', body=content)
> >>> >> >> >       File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 527, in post_json
> >>> >> >> >         **params)
> >>> >> >> >       File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 546, in _request_json
> >>> >> >> >         headers=headers, **params)
> >>> >> >> >       File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 542, in _request
> >>> >> >> >         credentials=self.credentials)
> >>> >> >> >       File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 398, in request
> >>> >> >> >         raise ServerError((status, error))
> >>> >> >> >     ServerError: (500, ('badarg', '58'))
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > What does that `badarg` mean? Is it possible to send attachments when doing
> >>> >> >> > `_bulk_docs`?
> >>> >> >> >
> >>> >> >> > Thanks,
> >>> >> >> > Daniel
> >>> >> >>
> >>> >>
> >>>
>
