Message-ID: <4F0C27D4.8020802@gmail.com>
Date: Tue, 10 Jan 2012 12:58:12 +0100
From: CGS
To: user@couchdb.apache.org
Subject: Re: i have a bulk insert problem about invalid json
References: <4F0C16B5.3080604@gmail.com> <4F0C19D9.2030300@gmail.com>

There is a limit for sure, but there are two factors you have to consider:

1. The HTTP request limit on the number of characters (for example, read
   this: http://stackoverflow.com/questions/2659952/maximum-length-of-http-get-request);
2. The shell command line under Linux/Cygwin has a maximum number of
   characters (it depends on the Linux flavor).

Under CentOS 6, I was able to send 800 documents per cURL instance (each
document being a few simple key-value pairs, including _id), but not 1000.
At 1000 documents I got a shell error. Nevertheless, this test is not
complete, because I used CentOS 6 for both the client and the CouchDB
server, and I don't know the exact length of the command.

CGS

On 01/10/2012 12:20 PM, Zekeriya KOÇ wrote:
> Thanks for all the replies.
>
> The problem was, first, the BOM character. After that I split my files into
> chunks smaller than 30 MB, and it started to work.
>
> There is a request size limit, isn't there?
>
> Again, thanks for all the replies.
>
> 2012/1/10 CGS
>
>> Oh, I forgot to write the solution, in case it's not obvious. Just divide
>> the docs among multiple instances of cURL and it will work. Don't
>> worry, you still use the power of the bulk operation (I had an insertion
>> rate of about 5-6 kdocs/s on a not-that-great server, even though I had to
>> send more requests at the same time).
>>
>> CGS
>>
>> On 01/10/2012 11:45 AM, CGS wrote:
>>
>>> Hi,
>>>
>>> With 255000 documents in one session, you go over the number of
>>> characters allowed either for a shell command or for an HTTP request (if
>>> not for both). The session truncates the command, so your JSON is
>>> incomplete. That gave me that response in the past.
>>>
>>> CGS
>>>
>>> On 01/10/2012 10:11 AM, Zekeriya KOÇ wrote:
>>>
>>>> Sorry for the subjectless message!
>>>>
>>>> Hello,
>>>>
>>>> My problem: I am trying to insert approximately 255000 documents into a
>>>> CouchDB instance with the bulk docs API. I always get an invalid JSON
>>>> error.
>>>>
>>>> So I am trying to reproduce the problem with just one document, because
>>>> the error occurs whether I use a large file or a file with just one
>>>> document.
>>>>
>>>> My system:
>>>> couchdb: on Ubuntu Server 10.04
>>>> client: Windows 7 with Cygwin curl
>>>>
>>>> $ curl -X GET http://admin:ad...@10.81.2.100:5984
>>>> {"couchdb":"Welcome","version":"1.1.0","vendor":{"version":"1.2.0","name":"Couchbase","url":"http://www.couchbase.com/"}}
>>>>
>>>> $ curl -d @test.txt -H "Content-Type:application/json" -X POST http://admin:ad...@10.81.2.100:5984/dbmerkez/_bulk_docs
>>>> {"error":"bad_request","reason":"invalid UTF-8 JSON:<<\"\ufeff{\\\"docs\\\":[{\\\"adi\\\": \\\"zeko\\\"}]}\">>"}
>>>>
>>>> Now I copy the content of test.txt and paste it into my command line:
>>>> $ curl -d '{"docs":[{"adi": "zeko"}]}' -H "Content-Type:application/json" -X POST http://admin:ad...@10.81.2.100:5984/dbmerkez/_bulk_docs
>>>> [{"id":"74a5d37e71215e2095d00f90a00007ac","rev":"1-111c10804ee9f2b8384ab95ef66268e0"}]
>>>>
>>>> As you can see, the same content gives an invalid JSON error when sent
>>>> from a file, but it inserts fine directly from the command line.
>>>>
>>>> My text file is encoded in UTF-8.
>>>>
>>>> I am so close to giving up. I have been fighting with this for hours. If
>>>> I cannot insert the initial data into my instance, I cannot test the
>>>> replication cases.
>>>>
>>>> Please help!
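
For reference, the BOM fix mentioned in this thread can be sketched in shell. This is only a minimal sketch under stated assumptions, not the commands anyone in the thread actually ran: the function names strip_bom and post_bulk and the COUCH_URL variable are hypothetical, the database name dbmerkez is taken from the original report, and a POSIX shell with sed and curl is assumed.

```shell
#!/bin/sh
# Hypothetical helper: print a file to stdout without a leading UTF-8 BOM.
strip_bom() {
  # The UTF-8 BOM is the byte sequence EF BB BF (octal 357 273 277).
  bom=$(printf '\357\273\277')
  # Remove the BOM only if it starts the first line; LC_ALL=C makes sed
  # treat the input as plain bytes.
  LC_ALL=C sed "1s/^${bom}//" "$1"
}

# Hypothetical helper: POST a BOM-free copy of a file to _bulk_docs.
# COUCH_URL is an assumed variable, e.g. http://user:password@host:5984
post_bulk() {
  strip_bom "$1" | curl -X POST -H "Content-Type: application/json" \
    --data-binary @- "${COUCH_URL}/dbmerkez/_bulk_docs"
}
```

With COUCH_URL set, calling post_bulk test.txt would send the cleaned file; curl reads the payload from standard input because of --data-binary @-. A very large input would still have to be split into several files first, as discussed above.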