incubator-couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: i have a bulk insert problem about invalid json
Date Tue, 10 Jan 2012 11:59:51 GMT
I think the only problem here was the BOM, though sending very large
_bulk_docs bodies is not advisable as it is memory intensive on the
server.
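
For anyone hitting the same error, here is a minimal sketch of stripping a leading UTF-8 byte order mark before posting. The helper name strip_bom is hypothetical, and the sample payload is the one from the failing request quoted below; nothing here is part of CouchDB or curl.

```python
import codecs
import json


def strip_bom(raw: bytes) -> bytes:
    """Return raw without a leading UTF-8 byte order mark, if one is present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


# Same JSON as in the failing request, prefixed with the BOM bytes EF BB BF.
with_bom = codecs.BOM_UTF8 + b'{"docs":[{"adi": "zeko"}]}'
clean = strip_bom(with_bom)

# Once the BOM is gone the body is ordinary JSON.
payload = json.loads(clean.decode("utf-8"))
```

Opening the file in Python with encoding="utf-8-sig" should have the same effect, since that codec skips a leading BOM when decoding.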

There is no limit to the size of an HTTP request body, but curl will
read the whole file into memory with -d or --data-binary (though not with
the -T option). CouchDB can restrict the size of a document PUT, but the
shipped default is so generous (4 GiB!) that you are more likely to
encounter other problems before you hit it (notably, running out of
RAM).
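
To keep each request small, the documents can be grouped into several _bulk_docs bodies on the client. The following is only a sketch under assumptions: the function name bulk_bodies and the max_bytes parameter are made up for illustration, and the byte budget is whatever suits your server's memory, not a CouchDB setting.

```python
import json
from typing import Iterable, Iterator

EMPTY_BODY_SIZE = len('{"docs":[]}')  # bytes used by the wrapper alone


def bulk_bodies(docs: Iterable[dict], max_bytes: int) -> Iterator[bytes]:
    """Yield _bulk_docs request bodies, each about max_bytes long at most.

    A single document larger than max_bytes still gets a body of its own.
    """
    batch = []
    size = EMPTY_BODY_SIZE
    for doc in docs:
        encoded = json.dumps(doc, ensure_ascii=False)
        cost = len(encoded.encode("utf-8")) + 1  # +1 for the separating comma
        if batch and size + cost > max_bytes:
            # Adding this document would exceed the budget: flush first.
            yield ('{"docs":[' + ",".join(batch) + "]}").encode("utf-8")
            batch, size = [], EMPTY_BODY_SIZE
        batch.append(encoded)
        size += cost
    if batch:
        yield ('{"docs":[' + ",".join(batch) + "]}").encode("utf-8")
```

Each yielded body can then be sent as its own POST to /dbname/_bulk_docs, so neither the client nor the server has to hold the full data set in memory at once.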

B.


On 10 January 2012 11:20, Zekeriya KOÇ <zekzekus@gmail.com> wrote:
> Thanks for all replies.
>
> The first problem was the BOM character. After that I split my files into
> chunks smaller than 30 MB, and it started to work.
>
> There is a request size limit, isn't there?
>
> Again, thanks for all the replies.
>
> 2012/1/10 CGS <cgsmcmlxxv@gmail.com>
>
>>
>> Oh, I forgot to write the solution, in case it's not obvious. Just divide
>> the docs among multiple cURL invocations and it will work. Don't
>> worry, you still use the power of the bulk operation (I had an insertion
>> rate of about 5-6 kdocs/s on a not-that-great server, even though I had to
>> send more requests at the same time).
>>
>> CGS
>>
>>
>>
>>
>> On 01/10/2012 11:45 AM, CGS wrote:
>>
>>> Hi,
>>>
>>> With 255000 documents in one session, you exceed the number of
>>> characters allowed either for a command-prompt command or for an HTTP
>>> request (if not both). The session truncates the command, so your JSON is
>>> incomplete. That gave me the same response in the past.
>>>
>>> CGS
>>>
>>>
>>>
>>>
>>> On 01/10/2012 10:11 AM, Zekeriya KOÇ wrote:
>>>
>>>> Sorry for subjectless message!!!
>>>>
>>>> Hello,
>>>>
>>>> My problem: I am trying to insert approximately 255000 documents into a
>>>> CouchDB instance with the bulk docs API. I always get an invalid JSON
>>>> error.
>>>>
>>>> So I am trying to reproduce the problem with just one document, because
>>>> the error is raised whether I use a large file or a file with just one
>>>> document.
>>>>
>>>> my system:
>>>> couchdb: on an ubuntu server 10.04
>>>> client: windows 7 with cygwin curl
>>>>
>>>> $ curl -X GET http://admin:ad...@10.81.2.100:5984
>>>> {"couchdb":"Welcome","version":"1.1.0","vendor":{"version":"1.2.0","name":"Couchbase","url":"http://www.couchbase.com/"}}
>>>>
>>>> $ curl -d @test.txt -H "Content-Type:application/json" -X POST http://admin:ad...@10.81.2.100:5984/dbmerkez/_bulk_docs
>>>> {"error":"bad_request","reason":"invalid UTF-8 JSON:<<\"\ufeff{\\\"docs\\\":[{\\\"adi\\\": \\\"zeko\\\"}]}\">>"}
>>>>
>>>> now i copy the content of test.txt and paste it to my command line:
>>>> $ curl -d '{"docs":[{"adi": "zeko"}]}' -H "Content-Type:application/json" -X POST http://admin:ad...@10.81.2.100:5984/dbmerkez/_bulk_docs
>>>> [{"id":"74a5d37e71215e2095d00f90a00007ac","rev":"1-111c10804ee9f2b8384ab95ef66268e0"}]
>>>>
>>>> As you can see, the same content gives an invalid JSON error when sent
>>>> from a file, but inserts fine when passed directly on the command line.
>>>>
>>>> My text file is encoded in UTF-8.
>>>>
>>>> I am so close to giving up. I have been fighting with this for hours. If I
>>>> cannot insert initial data into my instance, I cannot test the replication
>>>> cases.
>>>>
>>>> please help!!
>>>>
>>>>
>>>
>>
>
>
> --
> Zekeriya "zekUs" KOÇ - http://zekzekus.com/
