Date: Tue, 5 Feb 2013 13:51:10 -0500
Subject: Re: Pycassa vs YCSB results.
From: Edward Capriolo
To: user@cassandra.apache.org

Without stating the obvious: if you are interested in scale, then why pick
Python? I also want to point out that YCSB is not even the gold standard for
benchmarks. Using Cassandra's stress tool you can get more ops per second
than with YCSB.

On Tue, Feb 5, 2013 at 1:13 PM, Pradeep Kumar Mantha wrote:
> Thanks, I will use the multiprocessing package, since I need to scale it to
> multiple nodes.
>
> I will also try to optimize the function calls and use global variables.
>
> Thank you very much for your help.
>
>
> On Tue, Feb 5, 2013 at 9:12 AM, aaron morton
> wrote:
>>
>> The simple thing to do would be to use the multiprocessing package and
>> eliminate all shared state.
>>
>> On a multicore box python threads can run on different cores and battle
>> over obtaining the GIL.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 5/02/2013, at 11:34 PM, Tim Wintle wrote:
>>
>> On Tue, 2013-02-05 at 21:38 +1300, aaron morton wrote:
>>
>> The first thing I noticed is your script uses the python threading library,
>> which is hampered by the Global Interpreter Lock
>> http://docs.python.org/2/library/threading.html
>>
>> You don't really have multiple threads running in parallel, try using the
>> multiprocessing library.
>>
>>
>> Python _should_ release the GIL around IO-bound work, so this is a
>> situation where the GIL shouldn't be an issue. (It's actually a very good
>> use for python's threads, as there's no serialization overhead for
>> message passing between processes as there would be in most
>> multi-process examples.)
>>
>>
>> A constant factor-of-2 slowdown really doesn't seem that significant for
>> two different implementations, and I would not worry about this unless
>> you're talking about thousands of machines.
>>
>> If you are talking about enough machines that this is real $$$, then I
>> do think the python code can be optimised a lot.
>>
>> I'm talking about language/VM specific optimisations - so I'm assuming
>> cpython (the standard /usr/bin/python as in the shebang).
>>
>> I don't know how much of a difference this will make, but I'd be
>> interested in hearing your results:
>>
>>
>> I would start by trying to rewrite this:
>>
>> def start_cassandra_client(Threadname):
>>     f=open(Threadname,"w")
>>     for key in lines:
>>         key=key.strip()
>>         st=time.time()
>>         f.write(str(cf.get(key))+"\n")
>>         et=time.time()
>>         f.write("Time taken for a single query is " +
>>                 str(round(1000*(et-st),2))+" milli secs\n")
>>     f.close()
>>
>> As something like this:
>>
>> def start_cassandra_client(Threadname):
>>     # Bind names from outside this scope to locals (faster lookups)
>>     time_fn = time.time
>>     colfam = cf
>>     f=open(Threadname,"w")
>>     for key in lines:
>>         key=key.strip()
>>         st=time_fn()
>>         f.write(str(colfam.get(key))+"\n")
>>         et=time_fn()
>>         f.write("Time taken for a single query is " +
>>                 str(round(1000*(et-st),2))+" milli secs\n")
>>     f.close()
>>
>>
>> If you don't consider it cheating compared to the java version, I would
>> also move the "key.strip()" call to the module initiation instead of
>> doing it once per thread, as there's a lot of function dispatch overhead
>> in python.
>>
>>
>> I'd also closely compare the IO going on in both versions (the .write
>> calls).
>> For example this may be significantly faster:
>>
>>         result=colfam.get(key)
>>         et=time_fn()
>>         f.write(str(result)+"\nTime taken for a single query is "
>>                 + str(round(1000*(et-st),2))+" milli secs\n")
>>
>>
>> .. I haven't read your java code and I don't know Java IO semantics well
>> enough to compare the behaviour of both.
>>
>> Tim
>>
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 5/02/2013, at 7:15 AM, Pradeep Kumar Mantha
>> wrote:
>>
>> Hi,
>>
>> Could someone please give me any hints as to why the pycassa
>> client (attached) is much slower than YCSB?
>> Is it attributable to the performance difference between python and
>> Java, or does the pycassa API have some performance limitations?
>>
>> I don't see any client statements affecting the pycassa performance.
>> Please have a look at the simple python script attached and let me know
>> your suggestions.
>>
>> thanks
>> pradeep
>>
>> On Thu, Jan 31, 2013 at 4:53 PM, Pradeep Kumar Mantha
>> wrote:
>>
>>
>> On Thu, Jan 31, 2013 at 4:49 PM, Pradeep Kumar Mantha
>> wrote:
>> Thanks. Please find the script attached.
>>
>> Just re-iterating:
>> It's just a simple python script which submits 4 threads.
>> This script has been scheduled on 8 cores using the taskset unix command,
>> thus running 32 threads/node,
>> and then scaled to 16 nodes.
>>
>> thanks
>> pradeep
>>
>>
>> On Thu, Jan 31, 2013 at 4:38 PM, Tyler Hobbs wrote:
>> Can you provide the python script that you're using?
>>
>> (I'm moving this thread to the pycassa mailing list
>> (pycassa-discuss@googlegroups.com), which is a better place for this
>> discussion.)
>>
>>
>> On Thu, Jan 31, 2013 at 6:25 PM, Pradeep Kumar Mantha
>> wrote:
>> Hi,
>>
>> I am trying to benchmark cassandra on a 12 Data Node cluster using 16
>> clients (each client uses 32 threads), with a custom pycassa client and
>> with YCSB.
>>
>> I found the maximum number of operations/second achieved using the pycassa
>> client is about 70k reads/second,
>> whereas with YCSB it is ~ 120k reads/second.
>>
>> Any thoughts on why I see this huge difference in performance?
>>
>>
>> Here is the description of the setup.
>>
>> Pycassa client (a simple python script).
>> 1. Each pycassa client starts 4 threads, where each thread issues 76896
>> queries.
>> 2. A shell script is used to submit 4 threads per core using the taskset
>> unix command on a single 8-core node ( 8 * 4 * 76896 queries ).
>> 3. Another shell script is used to scale the single node shell script to
>> 16 nodes ( total queries now - 16 * 8 * 4 * 76896 queries ).
>>
>> I tried to keep the YCSB configuration as similar as possible to my custom
>> pycassa benchmarking setup.
>>
>> YCSB -
>>
>> Launched 16 YCSB clients on 16 nodes, where each client uses 32 threads for
>> execution and needs to query ( 32 * 76896 keys ), i.e. 100% reads.
>>
>> The dataset is different in each case, but has
>>
>> 1. the same number of total records.
>> 2. the same number of fields.
>> 3. almost the same field length.
>>
>> Could you please let me know why I see this huge performance difference,
>> and whether there is any way I can improve the operations/second using the
>> pycassa client?
>>
>> thanks
>> pradeep
>>
>>
>> --
>> Tyler Hobbs
>> DataStax
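
Tim Wintle's suggestions in the quoted thread (bind hot names to locals,
strip the keys once up front, and issue one write per query) can be sketched
roughly as follows. This is only an illustration under stated assumptions:
`run_queries`, its `get` and `out` parameters, and the in-memory `fake_store`
are hypothetical stand-ins so the sketch runs without a cluster. They are not
part of pycassa or of the original script, and no speedup is claimed here.

```python
import io
import time


def run_queries(keys, get, out):
    """Time each lookup and write one combined record per key to `out`.

    `get` stands in for a bound method such as `cf.get` (hypothetical
    parameter); `out` is any object with a `write` method.
    """
    # Bind frequently used callables to local names once, outside the loop.
    clock = time.time
    write = out.write
    latencies_ms = []
    for key in keys:
        start = clock()
        value = get(key)
        elapsed_ms = round(1000 * (clock() - start), 2)
        latencies_ms.append(elapsed_ms)
        # One write call per query instead of two.
        write("%s\nTime taken for a single query is %s milli secs\n"
              % (value, elapsed_ms))
    return latencies_ms


# Strip the keys once, up front, rather than once per thread.
raw_lines = ["alpha\n", "beta\n", "gamma\n"]
keys = [line.strip() for line in raw_lines]

# An in-memory dict and buffer replace Cassandra and the per-thread file.
fake_store = {"alpha": 1, "beta": 2, "gamma": 3}
buffer = io.StringIO()
latencies = run_queries(keys, fake_store.get, buffer)
```

In a real run, `get` would be the column family's lookup method and `out` the
per-thread output file. Whether this closes any of the gap to YCSB would have
to be measured.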