Date: Wed, 25 Apr 2012 13:59:22 +0100
Subject: Re: CouchDB Invalid JSON UTF-8
From: Robert Newson
To: user@couchdb.apache.org

It sounds like SQLToNoSQLImporter is not converting your data correctly.
As it's Java, I would take a wild guess and assume the character-to-byte
translation is being done with the platform default encoding rather than
"UTF-8". Since UTF-8 is the default encoding for JSON strings, that would
be a pretty big oversight.

B.
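To illustrate the guess above, here is a minimal Java sketch. It is not taken from SQLToNoSQLImporter's source: the class name `EncodingSketch` and the helpers `toUtf8` and `isValidUtf8` are hypothetical, and ISO-8859-1 stands in for whatever the Windows platform default happens to be. It shows why the same string is rejected when encoded with a single-byte charset and accepted when encoded explicitly as UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EncodingSketch {

    // Encode the JSON body explicitly as UTF-8 instead of calling the
    // no-argument String.getBytes(), which uses the platform default.
    public static byte[] toUtf8(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Strictly decode the bytes as UTF-8. Malformed sequences are reported
    // as an exception rather than silently replaced.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \u00e8 is the "è" from the failing document in the CouchDB log.
        String json = "{\"docs\":[{\"_id\":\"1\",\"label\":\"Tr\u00e8s peu nombreuses\"}]}";

        // Assumption: ISO-8859-1 stands in for a Windows default charset.
        byte[] latin1 = json.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = toUtf8(json);

        System.out.println("platform default charset: " + Charset.defaultCharset());
        System.out.println("ISO-8859-1 bytes valid UTF-8? " + isValidUtf8(latin1));
        System.out.println("UTF-8 bytes valid UTF-8?      " + isValidUtf8(utf8));
    }
}
```

In ISO-8859-1 the "è" becomes the single byte 0xE8, which a UTF-8 parser reads as the start of a three-byte sequence. The following "s" is not a continuation byte, so the first check reports false and the second reports true. If the importer does encode with the platform default, passing an explicit charset wherever the request body is turned into bytes would be the place to look.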

On 25 April 2012 11:59, Paulo Carvalho wrote:
> Hello,
>
> I am trying SQLToNoSQLImporter to import data to a couchDB database
> from a Postgresql database.
>
> I configured correctly the import.properties and db-data-config files.
>
> When I execute run.bat command (I am using windows), I get the
> following result:
>
> 07:50:14,568  INFO DataImporter:134 - Data Configuration loaded successfully
> 07:50:18,477 ERROR DataImporter:178 - *****  Data import failed. **********
>  Reason is :
> org.apache.http.HttpException: HTTP/1.1 400 Bad Request
>        at net.sathis.export.sql.couch.CouchWriter.post(CouchWriter.java:68)
>        at net.sathis.export.sql.couch.CouchWriter.writeToNoSQL(CouchWriter.java:52)
>        at net.sathis.export.sql.DocBuilder.execute(DocBuilder.java:142)
>        at net.sathis.export.sql.DataImporter.doFullImport(DataImporter.java:174)
>        at net.sathis.export.sql.DataImporter.doDataImport(DataImporter.java:93)
>        at net.sathis.export.sql.SQLToNoSQLImporter.main(SQLToNoSQLImporter.java:19)
>
> As you can see, the configuration file is loaded correctly. In the
> couchDB database log file, I get the following error:
>
> [debug] [<0.147.0>] Invalid JSON: {{error,
>                        {126,
>                         "lexical error: invalid bytes in UTF8 string.\n"}},
>                       <<"{\"docs\":[{\"_id\":\"0\",\"label
> \":\"Pas de taches\"},{\"_id\":\"1\",\"description\":\"Le pourcentage
> de recouvrement est < 2 %\",\"label\":\"Très peu nombreuses\"},{\"_id
> \":\"2\",\"description\":\"Le p.......
>
> I think the problem happens because the text contained in the table
> has special characters ("è", etc.).
>
> The postgresql database is coded in UTF-8.
>
>
> Trying to solve the problem, I have written a little JSON file and I tried
> to insert it on my database. My JSON file content was the following:
> {"docs":[{"_id":"0","label ":"Pas de taches"}]}
>
> The result of inserting it on my database was:
> {"ok":true,"id":"doc_id","rev":"1- ffaec7bc2aa548ca8e5a9c697ea3eb64"}
>
> Next, I changed just a little my JSON file: I've put a special character
> (â):
> {"docs":[{"_id":"0","label ":"Pas de tâches"}]}
>
> The result of inserting this JSON file on the database was:
> {"error":"bad_request","reason":"invalid_json"}
>
>
>
> Anyone can help me with this issue?
>
> Thank you
>
> Best regards.