incubator-couchdb-user mailing list archives

From CGS <cgsmcml...@gmail.com>
Subject Re: i have a bulk insert problem about invalid json
Date Tue, 10 Jan 2012 11:58:12 GMT

There is certainly a limit, but there are two factors you have to consider:
1. The limit on the number of characters in an HTTP request (for example,
see
http://stackoverflow.com/questions/2659952/maximum-length-of-http-get-request);
2. The shell command line under Linux/Cygwin has a maximum number of
characters (it depends on the Linux flavor).

Under CentOS 6, I was able to send 800 documents per cURL instance (each
document being a few simple key-value pairs including _id), but not 1000.
At 1000 documents I got a shell error. Nevertheless, this test is not
conclusive, because I used CentOS 6 for both the client and the CouchDB
server and I don't know the exact length of the command.
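
Splitting the documents into smaller batches, as suggested earlier in the thread, can be sketched as follows. This is only an illustration under assumptions: the documents are already loaded as a Python list, `bulk_batches` is a hypothetical helper name, and the batch size of 800 is just the figure that worked in the test above, not a CouchDB limit.

```python
import json

def bulk_batches(docs, batch_size=800):
    """Yield JSON request bodies for _bulk_docs, each with at most batch_size docs."""
    for start in range(0, len(docs), batch_size):
        # Each body has the {"docs": [...]} shape used in the thread.
        yield json.dumps({"docs": docs[start:start + batch_size]})

# Example data (made up): 2000 small documents.
docs = [{"_id": str(i), "adi": "zeko"} for i in range(2000)]
bodies = list(bulk_batches(docs))
sizes = [len(json.loads(body)["docs"]) for body in bodies]
print(sizes)  # [800, 800, 400]
```

Each body could then be sent in its own POST to the database's _bulk_docs URL, so every request stays small while still using the bulk API.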

CGS




On 01/10/2012 12:20 PM, Zekeriya KOÇ wrote:
> Thanks for all replies.
>
> The first problem was the BOM character. After that I split my files into
> chunks smaller than 30 MB, and it started to work.
>
> There is a request size limit, isn't there?
>
> Again, thanks for all the replies.
>
> 2012/1/10 CGS<cgsmcmlxxv@gmail.com>
>
>> Oh, I forgot to write the solution, in case it's not obvious. Just divide
>> the docs among multiple instances of cURL and it will work. Don't
>> worry, you still use the power of the bulk operation (I had an insertion
>> rate of about 5-6 kdocs/s on a not-that-great server, even though I had
>> to send more requests at the same time).
>>
>> CGS
>>
>>
>>
>>
>> On 01/10/2012 11:45 AM, CGS wrote:
>>
>>> Hi,
>>>
>>> With 255000 documents in one session, you exceed the number of
>>> characters allowed either for a shell command or for an HTTP request (if
>>> not both). The session truncates the command, so your JSON is
>>> incomplete. That gave me the same response in the past.
>>>
>>> CGS
>>>
>>>
>>>
>>>
>>> On 01/10/2012 10:11 AM, Zekeriya KOÇ wrote:
>>>
>>>> Sorry for subjectless message!!!
>>>>
>>>> Hello,
>>>>
>>>> my problem: i am trying to insert approximately 255000 documents into a
>>>> couchdb instance with the bulk docs api. i always get an invalid json
>>>> error.
>>>>
>>>> so i am trying to test the problem with just one document, because
>>>> the error occurs whether i use a large file or a file with just one
>>>> document.
>>>>
>>>> my system:
>>>> couchdb: on an ubuntu server 10.04
>>>> client: windows 7 with cygwin curl
>>>>
>>>> $ curl -X GET http://admin:ad...@10.81.2.100:5984
>>>> {"couchdb":"Welcome","version":"1.1.0","vendor":{"version":"1.2.0","name":"Couchbase","url":"http://www.couchbase.com/"}}
>>>>
>>>> $ curl -d @test.txt -H "Content-Type:application/json" -X POST http://admin:ad...@10.81.2.100:5984/dbmerkez/_bulk_docs
>>>> {"error":"bad_request","reason":"invalid UTF-8 JSON:<<\"\ufeff{\\\"docs\\\":[{\\\"adi\\\": \\\"zeko\\\"}]}\">>"}
>>>>
>>>> now i copy the content of test.txt and paste it to my command line:
>>>> $ curl -d '{"docs":[{"adi": "zeko"}]}' -H "Content-Type:application/json" -X POST http://admin:ad...@10.81.2.100:5984/dbmerkez/_bulk_docs
>>>> [{"id":"74a5d37e71215e2095d00f90a00007ac","rev":"1-111c10804ee9f2b8384ab95ef66268e0"}]
>>>>
>>>> as you can see, the same content gives an invalid json error when sent
>>>> from a file, but inserts fine directly from the command line.
>>>>
>>>> my text file is encoded in utf-8.
>>>>
>>>> i am so close to giving up. i have been fighting with this for hours.
>>>> if i cannot insert initial data into my instance, i cannot test the
>>>> replication cases.
>>>>
>>>> please help!!
>>>>
>>>>
>
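
The `\ufeff` at the start of the rejected body in the original message is a UTF-8 byte order mark (BOM), which is what the first problem turned out to be. A minimal sketch of stripping it before sending, assuming the file content has been read as bytes; `strip_bom` is a hypothetical helper name and the sample bytes only simulate a file saved as "UTF-8 with BOM":

```python
import codecs
import json

def strip_bom(raw: bytes) -> bytes:
    """Return raw without a leading UTF-8 byte order mark, if one is present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw

# Simulated file content: BOM followed by the one-document body from the thread.
raw = codecs.BOM_UTF8 + b'{"docs":[{"adi": "zeko"}]}'
clean = strip_bom(raw)

# The cleaned bytes now parse as ordinary JSON.
payload = json.loads(clean.decode("utf-8"))
print(payload)
```

When reading from a real file, opening it with `encoding="utf-8-sig"` has the same effect, since that codec skips a leading BOM if there is one.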

