incubator-cassandra-user mailing list archives

From Tyler Hobbs <ty...@datastax.com>
Subject Re: anyone have any performance numbers? and here are some perf numbers of my own...
Date Sat, 11 Aug 2012 20:32:13 GMT
One node can typically handle 30k+ inserts per second, so you should be
able to insert the 9 million rows in about 5 minutes with a single-node
cluster.  My guess is that you're inserting with a single thread, which
means you're bound by network latency.  Try using 100 threads, or better,
just use the stress tool that comes with Cassandra:
http://www.datastax.com/docs/1.0/references/stress_java
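
To illustrate the arithmetic and the latency point above, here is a minimal
sketch in Python. It does not talk to Cassandra: `fake_insert` is a
hypothetical stand-in that just sleeps for an assumed 2 ms round trip, and
the thread counts and latency are illustrative assumptions, not measurements.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def estimated_load_seconds(rows, inserts_per_second):
    """Seconds needed to load `rows` at a sustained insert rate."""
    return rows / inserts_per_second


def fake_insert(latency_s):
    # Hypothetical stand-in for one blocking insert call; a real client
    # call would go here. The sleep models the network round trip.
    time.sleep(latency_s)


def measured_inserts_per_second(n_inserts, n_threads, latency_s=0.002):
    """Run n_inserts blocking calls on n_threads workers and return the rate."""
    start = time.perf_counter()
    # Leaving the with-block waits for every submitted call to finish.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for _ in range(n_inserts):
            pool.submit(fake_insert, latency_s)
    elapsed = time.perf_counter() - start
    return n_inserts / elapsed


if __name__ == "__main__":
    # 9 million rows at 30k inserts/s, expressed in minutes.
    print(estimated_load_seconds(9_000_000, 30_000) / 60)
    # One thread is capped near 1 / latency; many threads overlap the waits.
    print(measured_inserts_per_second(200, 1))
    print(measured_inserts_per_second(200, 100))
```

With a single thread, each insert waits for the previous round trip to
finish, so the rate cannot exceed roughly 1 / latency no matter how fast the
node is. Running many inserts concurrently overlaps those waits.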

On Fri, Aug 10, 2012 at 5:02 PM, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:

> Ignore the third one, my math was bad... it worked out to 733 bytes / row,
> and it ended up being 6.6 GB, as compaction shrank it somewhat after the
> load was done and things were quiet (I noticed that a bit later).
>
> But what about the other two?  Is that roughly the time you would expect?
>
> Thanks,
> Dean
>
> On 8/10/12 3:50 PM, "Hiller, Dean" <Dean.Hiller@nrel.gov> wrote:
>
> >****** 3. In my test below, I see there is now 8 GB of data and 9,000,000
> >rows.  Does that sound right?  Nearly 1MB of space is used per row for a
> >50 column row????  That sounds like a huge amount of overhead. (My values
> >are longs in every column, but that is still not much.)  I was expecting
> >KB / row maybe, but MB / row?  My column names are "col"+I as well, so
> >they are very short too.
> >
> >A common configuration is 1 TB drives per node, so I was wondering if
> >anyone has run any tests with map/reduce on reading in all those rows (not
> >doing anything with them, just reading them in).
> >
> >****** 1. How long does it take to go through the 500MB that would be on
> >that node?
> >
> >I ran some tests on just writing a fake table 50 columns wide and am
> >seeing it will take about 31 hours to write 500MB of information (a node
> >is about full at 500MB since you need to reserve 30-50% of the space for
> >compaction and such).  I.e., if I need to rerun any kind of indexing, it
> >will take 31 hours... does this sound about normal/ballpark?  Obviously
> >many nodes will be below that, so this would be the worst case with 1 TB
> >drives.
> >
> >****** 2. Anyone have any other data?
> >
> >Thanks,
> >Dean
>
>


-- 
Tyler Hobbs
DataStax <http://datastax.com/>
