From: Paolo Crosato
Date: Wed, 2 Oct 2013 16:44:37 +0200
Subject: Issue with source command and utf8 file

Hi,

I'm trying to load some data into Cassandra with the source command in cqlsh. The file is UTF-8 encoded, but Cassandra seems unable to detect the UTF-8 encoded characters.

Here is a sample:

insert into positions8(iddevice,timestampevent,idunit,idevent,status,value) values(401000035,'2013-06-06T10:08:02',13524915,0,'G','{"sp":"0","A1":"FRANCE","lat":"45216954","iDD":"401000035","A2":"RHÔNE-ALPES","tEv":"2013-06-06T10:08:02","iE":"0","iTE":"0","lng":"6462520","iD":"13318089","mi":0,"st":"ÉCHANGEUR DE ST-MICHEL-DE-MAURIENNE","A4":"SAINT-MARTIN-D'ARC","iU":"13524915","A3":"SAVOIE","tRx":"2013-06-06T10:12:56"}');

Here is the hex dump of the file:

6e69 6573 7472 6920 746e 206f 6f70 6973
6974 6e6f 3873 6928 6464 7665 6369 2c65
6974 656d 7473 6d61 6570 6576 746e 692c
7564 696e 2c74 6469 7665 6e65 2c74 7473
7461 7375 762c 6c61 6575 2029 6176 756c
7365 3428 3130 3030 3030 3533 272c 3032
3331 302d 2d36 3630 3154 3a30 3830 303a
2732 312c 3533 3432 3139 2c35 2c30 4727
2c27 7b27 7322 2270 223a 2230 222c 3141
3a22 4622 4152 434e 2245 222c 616c 2274
223a 3534 3132 3936 3435 2c22 6922 4444
3a22 3422 3130 3030 3030 3533 2c22 4122
2232 223a 4852 94c3 454e 412d 504c 5345
2c22 7422 7645 3a22 3222 3130 2d33 3630
302d 5436 3031 303a 3a38 3230 2c22 6922
2245 223a 2230 222c 5469 2245 223a 2230
222c 6e6c 2267 223a 3436 3236 3235 2230
222c 4469 3a22 3122 3333 3831 3830 2239
222c 696d 3a22 2c30 7322 2274 223a 89c3
4843 4e41 4547 5255 4420 2045 5453 4d2d
4349 4548 2d4c 4544 4d2d 5541 4952 4e45
454e 2c22 4122 2234 223a 4153 4e49 2d54
414d 5452 4e49 442d 4127 4352 2c22 6922
2255 223a 3331 3235 3934 3531 2c22 4122
2233 223a 4153 4f56 4549 2c22 7422 7852
3a22 3222 3130 2d33 3630 302d 5436 3031
313a 3a32 3635 7d22 2927 0a3b 000a

As an example, Ô is encoded as C394.

When I try to load the file I get this error:

cqlsh:demodb> source 'rhone.cql';
rhone.cql:3:Incomplete statement at end of file

The error disappears only when I remove all the non-ASCII characters. If I copy and paste the insert into the cqlsh shell, it works.
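
To double-check that the non-ASCII bytes in the dump really are valid UTF-8, a few words of it can be decoded by hand. This is only a minimal Python sketch. It assumes the dump came from plain hexdump, which prints 16-bit words in little-endian order (so the bytes C3 94 appear as 94c3); the ASCII parts of the dump are consistent with that. The helper name hexdump_words_to_bytes is made up for illustration.

```python
def hexdump_words_to_bytes(words: str) -> bytes:
    """Turn whitespace-separated 16-bit little-endian hex words back into bytes.

    Assumption: the words come from default `hexdump` output, where each
    word shows the second byte of the pair first (e.g. "94c3" is C3 94).
    """
    out = bytearray()
    for word in words.split():
        value = int(word, 16)
        out.append(value & 0xFF)         # low byte comes first in the file
        out.append((value >> 8) & 0xFF)  # high byte comes second
    return bytes(out)


# Two short excerpts copied from the dump above.
rhone = hexdump_words_to_bytes("4852 94c3 454e")
echangeur = hexdump_words_to_bytes("89c3 4843")

# decode() raises UnicodeDecodeError if the bytes are not valid UTF-8.
print(rhone.decode("utf-8"))      # RHÔNE
print(echangeur.decode("utf-8"))  # ÉCH
```

Both excerpts decode cleanly, which suggests the file itself is well-formed UTF-8 and the problem lies in how it is read rather than in how it was written.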

Cassandra is installed on a CentOS 6.3 server, and LANG is .UTF8. I tried connecting remotely with both GNOME Terminal and PuTTY on Windows, using a UTF-8 shell, with no success in either case.

Has anybody got any clue?

Regards,

Paolo

-- 
Paolo Crosato
Software engineer/Custom Solutions