Date: Tue, 10 Jan 2012 11:59:51 +0000
Subject: Re: i have a bulk insert problem about invalid json
From: Robert Newson
To: user@couchdb.apache.org

I think the only problem here was the BOM, though sending very large
_bulk_docs bodies is not advisable as it is memory-intensive on the server.

There is no limit to the size of an HTTP request body, but curl will
read the file into memory with -d or --data-binary (but not with the -T
option). CouchDB can restrict the size of a document PUT, but the
shipped default is so generous (4 GiB!) that you are more likely to
encounter other problems before you hit it (notably, running out of
RAM).

B.

On 10 January 2012 11:20, Zekeriya KOÇ wrote:
> Thanks for all replies.
>
> The problem was, first, the BOM character. After that i split my files into
> chunks that are smaller than 30 MB, and it started to work.
>
> There is a request size limit, isn't there?
>
> Again, thanks for all the replies.
>
> 2012/1/10 CGS
>
>> Oh, I forgot to write the solution, in case it's not obvious. Just divide
>> the number of docs among multiple instances of cURL and it will work. Don't
>> worry, you still use the power of the bulk operation (I had an insertion
>> rate of about 5-6 kdocs/s on a not-that-great server even though I had to
>> send more requests at the same time).
>>
>> CGS
>>
>> On 01/10/2012 11:45 AM, CGS wrote:
>>
>>> Hi,
>>>
>>> With 255000 documents in one session, you go over the number of
>>> characters allowed either for a shell command or for an HTTP request (if
>>> not for both). The session truncates the command, so your JSON is
>>> incomplete. That gave me that response in the past.
>>>
>>> CGS
>>>
>>> On 01/10/2012 10:11 AM, Zekeriya KOÇ wrote:
>>>
>>>> Sorry for subjectless message!!!
>>>>
>>>> Hello,
>>>>
>>>> my problem: i am trying to insert approximately 255000 documents to a
>>>> couchdb instance with bulk docs api. i always get invalid json error.
>>>>
>>>> so i am trying to test the problem with just one document, because
>>>> the error is raised whether with a large file or a file with just one
>>>> document.
>>>>
>>>> my system:
>>>> couchdb: on an ubuntu server 10.04
>>>> client: windows 7 with cygwin curl
>>>>
>>>> $ curl -X GET http://admin:ad...@10.81.2.100:5984
>>>> {"couchdb":"Welcome","version":"1.1.0","vendor":{"version":"1.2.0","name":"Couchbase","url":"http://www.couchbase.com/"}}
>>>>
>>>> $ curl -d @test.txt -H "Content-Type:application/json" -X POST http://admin:ad...@10.81.2.100:5984/dbmerkez/_bulk_docs
>>>> {"error":"bad_request","reason":"invalid UTF-8 JSON:<<\"\ufeff{\\\"docs\\\":[{\\\"adi\\\": \\\"zeko\\\"}]}\">>"}
>>>>
>>>> now i copy the content of test.txt and paste it to my command line:
>>>> $ curl -d '{"docs":[{"adi": "zeko"}]}' -H "Content-Type:application/json" -X POST http://admin:ad...@10.81.2.100:5984/dbmerkez/_bulk_docs
>>>> [{"id":"74a5d37e71215e2095d00f90a00007ac","rev":"1-111c10804ee9f2b8384ab95ef66268e0"}]
>>>>
>>>> as you can see same content gives an invalid json error within a file
>>>> but from direct command line it inserts fine.
>>>>
>>>> my text file is encoded in utf-8.
>>>>
>>>> i am so close to give up. i am fighting with this for hours. if i can
>>>> not insert initial data to my instance i can not test the replication
>>>> cases.
>>>>
>>>> please help!!
>
> --
> Zekeriya "zekUs" KOÇ - http://zekzekus.com/
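
For anyone finding this thread later, the two fixes discussed above (dropping the BOM and sending several smaller _bulk_docs requests) could be sketched roughly as follows. This is only an illustration, not something posted in the thread: it is written in Python rather than curl, the function names (load_docs, batches, post_batch, bulk_insert) are hypothetical, the batch size of 1000 is an arbitrary assumption, and the Basic authentication handling assumes the server accepts it. Note that load_docs still reads the whole file into memory on the client; only the requests get smaller.

```python
import base64
import json
import urllib.request


def load_docs(path):
    """Return the "docs" list from a _bulk_docs payload file."""
    # "utf-8-sig" decodes UTF-8 and silently drops a leading BOM if present.
    with open(path, encoding="utf-8-sig") as handle:
        return json.load(handle)["docs"]


def batches(docs, size):
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def post_batch(url, docs, auth=None):
    """POST one batch to a _bulk_docs URL and return the parsed response."""
    headers = {"Content-Type": "application/json"}
    if auth is not None:
        # Assumption: the server accepts HTTP Basic authentication.
        user, password = auth
        token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
        headers["Authorization"] = "Basic " + token.decode("ascii")
    body = json.dumps({"docs": docs}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers=headers, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def bulk_insert(path, url, size=1000, auth=None):
    """Send the file's documents in several smaller requests."""
    results = []
    for batch in batches(load_docs(path), size):
        results.extend(post_batch(url, batch, auth))
    return results
```

Calling bulk_insert with the path to the payload file, the database's _bulk_docs URL, and an optional (user, password) pair would then send the documents batch by batch. The batch size is worth tuning against the memory available on the server.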