From: Jack Krupansky
To: user@cassandra.apache.org
Subject: Re: CSV Import is taking huge time
Date: Wed, 23 Jul 2014 09:09:03 -0400

Is it compute bound or I/O bound?

What does your cluster look like?

-- Jack Krupansky

From: Akshay Ballarpure
Sent: Wednesday, July 23, 2014 5:00 AM
To: user@cassandra.apache.org
Subject: CSV Import is taking huge time

Hello,

I am trying to use the COPY command in Cassandra to import a CSV file into the database. The import is taking a very long time. Any suggestions to improve it?

id,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z
100,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26
101,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26
----
--
--

There are ~50K lines in this file; its size is ~5 MB.

I created the table as follows:

create table csldata4 ( id int PRIMARY KEY, a int, b int, c int, d int, e int, f int,
    g int, h int, i int, j int, k int, l int, m int, n int, o int, p int, q int,
    r int, s int, t int, u int, v int, w int, x int, y int, z int );

COPY command:

COPY csldata4 (id, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z) FROM 'csldata1.csv' WITH HEADER=TRUE;

The issue is that the import takes a very long time:

cqlsh:mykeyspace> COPY csldata (id, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z) FROM 'csldata1.csv' WITH HEADER=TRUE;
66215 rows imported in 1 minute and 31.044 seconds.

Thanks & Regards
Akshay Ghanshyam Ballarpure
Tata Consultancy Services
Cell:- 9985084075
Mailto: akshay.ballarpure@tcs.com
Website: http://www.tcs.com
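For a rough baseline, the figures quoted in the message (66215 rows in 1 minute and 31.044 seconds) work out to roughly 727 rows per second. The Python sketch below only redoes that arithmetic; the function name `rows_per_second` is hypothetical, and the result by itself does not say whether the import is compute bound or I/O bound.

```python
def rows_per_second(rows: int, minutes: float, seconds: float) -> float:
    """Average import rate for a run that took `minutes` plus `seconds`."""
    elapsed = minutes * 60 + seconds  # total elapsed time in seconds
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return rows / elapsed


# Figures taken from the cqlsh output quoted in the message.
rate = rows_per_second(rows=66215, minutes=1, seconds=31.044)
print(f"{rate:.1f} rows/s")
```

Re-running the same calculation after any change to the cluster or the import method gives a like-for-like comparison against this baseline.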