Mailing-List: contact cassandra-user-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: cassandra-user@incubator.apache.org
MIME-Version: 1.0
In-Reply-To: <e06563880912160645i6a16f98av60b6111f14cf86e6@mail.gmail.com>
References: <468b21170912160606n64e1f780id3500758862eaddb@mail.gmail.com>
	 <e06563880912160645i6a16f98av60b6111f14cf86e6@mail.gmail.com>
Date: Wed, 16 Dec 2009 17:30:23 +0200
Message-ID: <468b21170912160730t13f0b551rb46c6db08721aff9@mail.gmail.com>
Subject: Re: Question about Insert Time with multiple node
From: Richard Grossman <richiesgr@gmail.com>
To: cassandra-user@incubator.apache.org
Content-Type: multipart/alternative; boundary=001485f271e060459d047ada2fba

--001485f271e060459d047ada2fba
Content-Type: text/plain; charset=ISO-8859-1

I'm not using 50 threads; I'm running with 4.
I assign 2 threads per server IP, so I insert with 2 threads against one
machine and with the other 2 against the second machine.

I would need to add a lot of threads to be able to insert this data quickly
enough.

But does this behavior seem logical to you?


On Wed, Dec 16, 2009 at 4:45 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> Sounds like you are using a single thread, so the increased latency is
> artificially reducing your numbers.  Add more threads (stress.py uses
> 50 by default) to get more throughput. (Also true even for a single
> node, but more noticeable when you add network overhead to the
> cluster.)
>
> -Jonathan
>
> On Wed, Dec 16, 2009 at 8:06 AM, Richard Grossman <richiesgr@gmail.com> wrote:
> > Hi,
> >
> > I think someone already asked something similar, but I can't find where.
> >
> > On 1 standalone machine I insert data and get ~850 rows / second.
> > On another machine I run exactly the same operation and get ~900-1000
> > rows / second.
> >
> > Now I remove all the data from the 2 machines and take exactly the same
> > storage-conf.xml, but just add a seed in both files, nothing else.
> > When I run the insert I get ~90 rows / second.
> >
> > Does anyone have an idea why the performance could fall so sharply? Or
> > could you simply give a hint about what or where to check to see why it
> > happens? I've already checked for network problems, and the 2 machines
> > are identical.
> >
> > Thanks.
> >
> >
> >
>
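
The point quoted above (blocking clients are limited by per-insert latency, so
more threads give more throughput) can be illustrated with a rough sketch. This
is not Cassandra client code: fake_insert is a hypothetical stand-in that only
sleeps, and the 11 ms latency is an assumed, illustrative figure rather than a
measurement from this thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def throughput_bound(threads, latency_s):
    """Upper bound on rows/second for blocking clients: each thread can
    finish at most 1 / latency_s inserts per second."""
    return threads / latency_s


def fake_insert(latency_s):
    # Hypothetical stand-in for a blocking insert; it only sleeps to mimic
    # one request/response round trip.
    time.sleep(latency_s)


def measure(threads, inserts, latency_s):
    """Run `inserts` simulated inserts over `threads` workers and return
    the observed rows/second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        list(pool.map(fake_insert, [latency_s] * inserts))
    return inserts / (time.perf_counter() - start)


if __name__ == "__main__":
    latency_s = 0.011  # assumed per-insert latency, for illustration only
    for n in (1, 4, 50):
        bound = throughput_bound(n, latency_s)
        observed = measure(n, 200, latency_s)
        print(f"{n:>2} threads: bound {bound:.0f} rows/s, observed {observed:.0f} rows/s")
```

Under these assumptions the observed rate grows roughly in proportion to the
thread count, which is why a small number of threads can look slow once each
insert includes a network round trip.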

--001485f271e060459d047ada2fba--