incubator-couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: bulk update failing when document has attachments?
Date Wed, 11 Dec 2013 13:27:46 GMT
A forward slash is a
Unicode-character-except-"-or-\-or-control-character. The picture does
show that you *can* escape a forward slash with \/, but the 'any' track
allows an unescaped forward slash. It's not news that JSON (while
ostensibly simple) is not well-defined, though. I suggest we all just
have some cake.

B.
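
The point about the optional \/ escape can be checked with a short Python sketch. It uses only the standard-library json module; the variable names are illustrative and do not come from the thread.

```python
import json

# "\/" is an optional escape: both spellings decode to the same string.
escaped = json.loads('"image\\/png"')
plain = json.loads('"image/png"')

# Python's encoder leaves the forward slash unescaped when serializing.
encoded = json.dumps({"content_type": "image/png"})

print(escaped == plain)
print(encoded)
```

Since a JSON parser treats both forms as the same string, the escaping of content_type should not affect what the server receives.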

On 11 December 2013 13:18, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
> Funny that you do not need to escape it. The spec says you should:
>
> char
> any-Unicode-character-
>     except-"-or-\-or-
>     control-character
> \"
> \\
> \/
> \b
> \f
> \n
> \r
> \t
> \u four-hex-digits
>
> Anyway, my problem has been solved. I am not escaping anything in the
> content_type: the json library is probably doing that. What I need to do
> is to attach real base64 encoded data, which has solved my problem.
>
>
>
>
> On Wed, Dec 11, 2013 at 2:15 PM, Robert Newson <rnewson@apache.org> wrote:
>
>> ➜  ~  curl localhost:5984/db1/_bulk_docs -Hcontent-type:application/json -d '{"docs":[{"_attachments":{"foo":{"data":"aGVsbG8="}}}]}'
>>
>> [{"ok":true,"id":"b5d2060479624e483a8fe4747f001dbe","rev":"1-12c665c499a525a3a1a9ad35c90604a1"}]
>>
>> ➜  ~  curl localhost:5984/db1/b5d2060479624e483a8fe4747f001dbe
>>
>> {"_id":"b5d2060479624e483a8fe4747f001dbe","_rev":"1-12c665c499a525a3a1a9ad35c90604a1","_attachments":{"foo":{"content_type":"application/octet-stream","revpos":1,"digest":"md5-XUFAKrxLKna5cZ2REBfFkg==","length":5,"stub":true}}}
>>
>> ➜  ~  curl localhost:5984/db1/b5d2060479624e483a8fe4747f001dbe/foo
>> hello%
>>
>> Maybe you left escaped newlines in your base64 input?
>>
>> B.
>>
>>
>> On 11 December 2013 13:11, Robert Newson <rnewson@apache.org> wrote:
>> > http://json.org/string.gif talks about escaping the backslash, not the
>> > forward slash. The PDF page 194 talks about escaping forward slash within
>> > a RegExp statement in Javascript, which is not JSON.
>> >
>> > B.
>> >
>> >
>> > On 11 December 2013 12:58, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
>> >> It is not an artifact: I am taking that from the couchdb documentation.
>> >>
>> >> And according to Alexander Shorin forward slashes **really** need to be
>> >> escaped in json. But it is not me who must do that, but the library
>> >> converting the python objects to couchdb, so that can not be my problem.
>> >>
>> >> Now I am left with a badarg exception, which I can not relate to my
>> >> input data:
>> >>
>> >> Exception > Problems updating list of documents (length = 1): (500,
>> >> ('badarg', '46'))
>> >>
>> >> What does that '46' mean?
>> >>
>> >>
>> >> On Wed, Dec 11, 2013 at 1:47 PM, Robert Newson <rnewson@apache.org> wrote:
>> >>
>> >>> I think your "image\/png" is just an artifact of your printing method,
>> >>> you don't need to escape the forward slash in content_type, see
>> >>> example below;
>> >>>
>> >>> {"_id":"doc1","_rev":"1-96e2a6c78b8bfb227e79e1fbb16873f9","_attachments":{"att1":{"content_type":"image/png","revpos":1,"digest":"md5-XUFAKrxLKna5cZ2REBfFkg==","length":5,"stub":true}}}
>> >>>
>> >>> B.
>> >>>
>> >>>
>> >>> On 11 December 2013 12:35, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
>> >>> > That would work *only* for that prefix (data:image/png;base64,), or any
>> >>> > prefix which happens to have the same length. Not very robust.
>> >>> >
>> >>> > I just discovered that the data coming from the front-end comes in data-uri
>> >>> > format (rfc2397). This should handle any rfc2397 prefix:
>> >>> > http://stackoverflow.com/a/20518589/647991 (maybe buggy, just implemented).
>> >>> >
>> >>> > Another question: even after removing the data-uri prefix, I am still
>> >>> > getting problems. I think my content type is not right.
>> >>> >
>> >>> > Must content_type be escaped? That is:
>> >>> >
>> >>> > 'content_type': 'image/png', -> 'content_type': 'image\/png',
>> >>> >
>> >>> > The only reference I see to that is an example here:
>> >>> > http://wiki.apache.org/couchdb/HTTP_Document_API#Inline_Attachments
>> >>> >
>> >>> > But no real explanation of why. It seems no other strings must be escaped
>> >>> > for couchdb. The only requirement that couchdb seems to impose on top of
>> >>> > json is that the data in the attachment must be in base64 format.
>> >>> >
>> >>> > But now it seems that the content_type must escape the slashes (/). Why? It
>> >>> > does not seem to be a json feature: slashes are fine in any json string. So
>> >>> > what is that?
>> >>> >
>> >>> > I would like to know the specification for the format expected for
>> >>> > content_type. Does that have a name? I am calling it "escaped mediatype".
>> >>> > Is it part of a more generic escaping process expected by couchdb, or
>> >>> > only the content_type is affected? Is there an official name for that?
>> >>> >
>> >>> >
>> >>> > On Wed, Dec 11, 2013 at 12:18 PM, Johannes Jörg Schmidt <schmidt@netzmerk.com> wrote:
>> >>> >
>> >>> >> data.slice(22)
>> >>> >>
>> >>> >> 2013/12/11 Daniel Gonzalez <gonvaled@gonvaled.com>:
>> >>> >> > Thanks, I just realized about this. The base64 is coming from the
>> >>> >> > javascript frontend (chose file in a form). So I need to remove the
>> >>> >> > prefix "data:image/png;base64,".
>> >>> >> > Not sure how to do this without rolling my own regexes though.
>> >>> >> >
>> >>> >> > On Wed, Dec 11, 2013 at 12:01 PM, Alexander Shorin <kxepal@gmail.com> wrote:
>> >>> >> >
>> >>> >> >> Hi,
>> >>> >> >>
>> >>> >> >> _attachments data should be valid base64 encoded string, while you have:
>> >>> >> >> > ...
>> >>> >> >>
>> >>> >> >> Chars : and , are invalid for base64.
>> >>> >> >> --
>> >>> >> >> ,,,^..^,,,
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> On Wed, Dec 11, 2013 at 2:49 PM, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:
>> >>> >> >> > Hi,
>> >>> >> >> >
>> >>> >> >> > (SO reference: http://stackoverflow.com/q/20516980/647991. I post there
>> >>> >> >> > because formatting makes things much easier to read, replies / comments are
>> >>> >> >> > well organized, and the up/downvote mechanism works)
>> >>> >> >> >
>> >>> >> >> > I am performing the following operation:
>> >>> >> >> >
>> >>> >> >> > 1. Prepare some documents: `docs = [ doc1, doc2, ... ]`. The documents
>> >>> >> >> > *may* have attachments
>> >>> >> >> > 2. I `POST` to `_bulk_docs` the list of documents
>> >>> >> >> > 3. I get an `Exception > Problems updating list of documents (length = 1):
>> >>> >> >> > (500, ('badarg', '58'))`
>> >>> >> >> >
>> >>> >> >> > My `bulk_docs` is (in this case just one):
>> >>> >> >> >
>> >>> >> >> >     [   {   '_attachments': {   'image.png': {   'content_type': 'image/png',
>> >>> >> >> >                                                  'data': '...'}},
>> >>> >> >> >             '_id': '08b8fc66-cd90-47a1-9053-4f6fefabdfe3',
>> >>> >> >> >             '_rev': '15-ff3d0e8baa56e5ad2fac4937264fb3f6',
>> >>> >> >> >             'docmeta': {   'created': '2013-10-01 14:48:24.311257',
>> >>> >> >> >                            'updated': [   '2013-10-01 14:48:24.394157',
>> >>> >> >> >                                           '2013-12-11 08:19:47.271812',
>> >>> >> >> >                                           '2013-12-11 08:25:05.662546',
>> >>> >> >> >                                           '2013-12-11 10:38:56.116145']},
>> >>> >> >> >             'org_id': 45345,
>> >>> >> >> >             'outputs_id': None,
>> >>> >> >> >             'properties': {   'auto-t2s': False,
>> >>> >> >> >                               'content_type': 'image/png',
>> >>> >> >> >                               'lang': 'es',
>> >>> >> >> >                               'name': 'dfasdfasdf',
>> >>> >> >> >                               'text': 'erwerwerwrwerwr'},
>> >>> >> >> >             'subtype': 'voicemail-st',
>> >>> >> >> >             'tags': ['RRR-ccc-dtjkqx'],
>> >>> >> >> >             'type': 'recording'}]
>> >>> >> >> >
>> >>> >> >> > This is the detailed exception:
>> >>> >> >> >
>> >>> >> >> >     Traceback (most recent call last):
>> >>> >> >> >       File "portal_support_ut.py", line 470, in test_UpdateDoc
>> >>> >> >> >         self.ps.UpdateDoc(self.org_id, what, doc_id, new_data)
>> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/ps/complex_ops.py", line 349, in UpdateDoc
>> >>> >> >> >         success, doc = database.UpdateDoc(doc_id, new_data)
>> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/updater.py", line 38, in UpdateDoc
>> >>> >> >> >         res = self.SaveDoc(doc_id, doc)
>> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/saver.py", line 88, in SaveDoc
>> >>> >> >> >         else      : self.bulk_append(doc, flush, update_revision)
>> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/bulker.py", line 257, in bulk_append
>> >>> >> >> >         if force_send or flush or not self.timer.use_timer : self.BulkSend(show_progress=True)
>> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/bulker.py", line 144, in BulkSend
>> >>> >> >> >         results = self.UpdateDocuments(self.bulk)
>> >>> >> >> >       File "/home/gonvaled/projects/new-wavilon-portal/python_modules/wav/cdb/core/bulker.py", line 67, in UpdateDocuments
>> >>> >> >> >         results = self.db.update(bulkdocs)
>> >>> >> >> >       File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/client.py", line 764, in update
>> >>> >> >> >         _, _, data = self.resource.post_json('_bulk_docs', body=content)
>> >>> >> >> >       File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 527, in post_json
>> >>> >> >> >         **params)
>> >>> >> >> >       File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 546, in _request_json
>> >>> >> >> >         headers=headers, **params)
>> >>> >> >> >       File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 542, in _request
>> >>> >> >> >         credentials=self.credentials)
>> >>> >> >> >       File "/home/gonvaled/.virtualenvs/python2.7.3-wavilon1/local/lib/python2.7/site-packages/couchdb/http.py", line 398, in request
>> >>> >> >> >         raise ServerError((status, error))
>> >>> >> >> >     ServerError: (500, ('badarg', '58'))
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > What does that `badarg` mean? Is it possible to send attachments when doing
>> >>> >> >> > `_bulk_docs`?
>> >>> >> >> >
>> >>> >> >> > Thanks,
>> >>> >> >> > Daniel
>> >>> >> >>
>> >>> >>
>> >>>
>>
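
The resolution reached in this thread (strip the RFC 2397 data-URI prefix so that only real base64 ends up in the attachment's data field) could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the code from the linked Stack Overflow answer: the helper name split_data_uri is hypothetical, only base64 data URIs are handled, and the payload "aGVsbG8=" is the placeholder from the curl example above, not a real PNG.

```python
import base64
import json


def split_data_uri(uri):
    """Split a base64 RFC 2397 data URI into (media_type, base64_payload).

    Hypothetical helper for illustration; it rejects non-base64 data URIs.
    """
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")
    params = header.split(";")
    if "base64" not in params[1:]:
        raise ValueError("only base64 data URIs are handled here")
    # RFC 2397 defaults to text/plain when the media type is omitted.
    media_type = params[0] or "text/plain"
    # Raises binascii.Error if the payload contains non-base64 characters.
    base64.b64decode(payload, validate=True)
    return media_type, payload


# Placeholder input: the payload decodes to b"hello", not to image data.
uri = "data:image/png;base64,aGVsbG8="
media_type, data = split_data_uri(uri)

# Document with an inline attachment, wrapped for a POST to _bulk_docs.
doc = {"_attachments": {"image.png": {"content_type": media_type, "data": data}}}
body = json.dumps({"docs": [doc]})
print(body)
```

Splitting on the first comma rather than slicing a fixed number of characters avoids depending on the length of one particular prefix, which was the robustness concern raised about data.slice(22).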
