incubator-cassandra-user mailing list archives

From "Hiller, Dean" <Dean.Hil...@nrel.gov>
Subject Re: anyone have any performance numbers? and here are some perf numbers of my own...
Date Fri, 10 Aug 2012 22:02:24 GMT
Ignore the third one, my math was bad… it worked out to 733 bytes / row, and
it ended up being 6.6 GB because Cassandra compacted it somewhat after the
write finished, when the load was light (I noticed that a bit later).
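
(Checking that arithmetic: 6.6 GB over 9,000,000 rows is 6.6e9 / 9.0e6 ≈ 733
bytes per row, and even the pre-compaction 8GB figure works out to
8.0e9 / 9.0e6 ≈ 889 bytes per row, nowhere near the 1MB/row feared below.)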

But what about the other two?  Are those times approximately what's expected?

Thanks,
Dean

On 8/10/12 3:50 PM, "Hiller, Dean" <Dean.Hiller@nrel.gov> wrote:

>****** 3. In my test below, I see there is now 8GB of data and 9,000,000
>rows.  Does that sound right?  Nearly 1MB of space used per row for a
>50-column row?  That sounds like a huge amount of overhead. (My values
>are longs in every column, but that is still not much.)  I was expecting
>KB per row, maybe, but MB per row?  My column names are "col"+i as well, so
>they are very short too.
>
>A common configuration is 1TB drives per node, so I was wondering if
>anyone has run any tests with map/reduce reading in all of those rows (not
>doing anything with the data, just reading it in); a rough sketch of such
>a scan-only job is below, after this quoted message.
>
>****** 1. How long does it take to go through the 500GB that would be on
>that node?
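>
>(For scale: assuming roughly 75 MB/s of sequential disk read, a typical
>2012 SATA figure rather than anything measured here, 500GB is
>5.0e11 / 7.5e7 ≈ 6,700 seconds, or a bit under two hours of raw I/O;
>anything much slower than that is row deserialization overhead rather
>than the disk itself.)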
>
>I ran some tests just writing a fake table 50 columns wide, and I am
>seeing it will take about 31 hours to write 500GB of information (a node
>is about full at 500GB since you need to reserve 30-50% of the space for
>compaction and such).  I.e., if I need to rerun any kind of indexing, it
>will take 31 hours… does this sound about normal/ballpark?  Obviously many
>nodes will be below that, so this would be the worst case with 1TB drives.
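>
>(Back-of-the-envelope on the write side: 500GB in 31 hours is
>5.0e11 bytes / 111,600 s ≈ 4.5 MB/s of sustained write throughput per
>node.)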
>
>****** 2. Anyone have any other data?
>
>Thanks,
>Dean
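
For anyone who wants to actually run the read test in #1, here is a rough
sketch of a scan-only map/reduce job. This is not from the thread; it is
patterned on the word_count example that ships with Cassandra 1.x, so the
ColumnFamilyInputFormat/ConfigHelper calls should be double-checked against
your version, and the keyspace/column family names are placeholders:

    import java.nio.ByteBuffer;
    import java.util.SortedMap;

    import org.apache.cassandra.db.IColumn;
    import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
    import org.apache.cassandra.hadoop.ConfigHelper;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class ScanOnlyTest {
        // Mapper that deserializes every row but emits nothing, so the
        // job's wall-clock time measures pure read throughput.
        public static class ScanMapper extends
                Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>,
                       NullWritable, NullWritable> {
            @Override
            public void map(ByteBuffer key,
                            SortedMap<ByteBuffer, IColumn> columns,
                            Context context) {
                context.getCounter("scan", "rows").increment(1);
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "scan-only-test");
            job.setJarByClass(ScanOnlyTest.class);
            job.setMapperClass(ScanMapper.class);
            job.setNumReduceTasks(0);
            job.setInputFormatClass(ColumnFamilyInputFormat.class);
            job.setOutputFormatClass(NullOutputFormat.class);

            Configuration conf = job.getConfiguration();
            // Placeholder keyspace/column family; point these at the
            // real test data.
            ConfigHelper.setInputColumnFamily(conf, "TestKS", "TestCF");
            ConfigHelper.setInputInitialAddress(conf, "localhost");
            ConfigHelper.setInputRpcPort(conf, "9160");
            ConfigHelper.setInputPartitioner(conf, "RandomPartitioner");
            // Slice that pulls all 50 columns of each row.
            SlicePredicate predicate = new SlicePredicate().setSlice_range(
                new SliceRange(ByteBuffer.wrap(new byte[0]),
                               ByteBuffer.wrap(new byte[0]), false, 50));
            ConfigHelper.setInputSlicePredicate(conf, predicate);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

The interesting number is the job's wall-clock time divided by the bytes on
each node; with the 50-column rows above, that bounds how fast any reindex
pass over the same data could possibly go.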

