Have you put your commit log on a disk by itself, not on a logical partition shared with Oracle or with the Cassandra "data" directory? This will make a difference, as you don't want the Cassandra commit log competing with other OS and Oracle I/O. Look in storage-conf.xml and see if you can move it.
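
A quick way to check this from Java is to ask the JVM which file store each directory lives on. This is only a rough sketch: the class name `SameStoreCheck` is made up for illustration, and the two default paths are assumptions, so substitute whatever your own storage-conf.xml says. Note that it can only tell you when the two directories are on the same volume. Two partitions on one physical disk will still show up as different file stores, so a "different" answer is necessary but not sufficient.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SameStoreCheck {

    /** Returns true if both existing paths live on the same file store (volume/partition). */
    public static boolean sharesFileStore(Path a, Path b) {
        try {
            return Files.getFileStore(a).equals(Files.getFileStore(b));
        } catch (IOException e) {
            // Rethrown unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed locations; pass your real directories as arguments instead.
        Path commitLog = Paths.get(args.length > 0 ? args[0] : "/var/lib/cassandra/commitlog");
        Path data = Paths.get(args.length > 1 ? args[1] : "/var/lib/cassandra/data");

        if (!Files.exists(commitLog) || !Files.exists(data)) {
            System.out.println("One of the directories does not exist; pass the real paths as arguments.");
            return;
        }
        if (sharesFileStore(commitLog, data)) {
            System.out.println("Commit log and data share a file store; they compete for the same I/O.");
        } else {
            System.out.println("Commit log and data are on different file stores.");
        }
    }
}
```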

Also check "MemtableThroughputInMB". If you are doing a _lot_ of writes, you probably want to jack this up a bunch to get through the migration, then put it back down for normal operation. The default out of the box is too low, I believe.

On 05/10/2010 02:05 AM, David Boxenhorn wrote:
I read something like 80,000 rows from Oracle and write them to Cassandra in chunks of 1000 rows - so I'm supposedly working to Cassandra's strength and Oracle's weakness.

Reading 1000 rows from Oracle is "instantaneous"; writing them to Cassandra takes maybe 30 seconds. There is not much data per row, maybe 1K.
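
For reference, 1000 rows in 30 seconds works out to roughly 33 rows per second. To measure this per chunk rather than by eye, something like the following could be wrapped around the write call. It is a minimal sketch, not Hector code: `ChunkTimer` and the no-op writer in `main` are hypothetical stand-ins, and the real Cassandra write would go where the lambda is.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ChunkTimer {

    /** Splits rows into consecutive chunks of at most chunkSize elements. */
    public static <T> List<List<T>> chunks(List<T> rows, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += chunkSize) {
            out.add(rows.subList(i, Math.min(i + chunkSize, rows.size())));
        }
        return out;
    }

    /** Rows written per second for a chunk that took elapsedNanos to write. */
    public static double rowsPerSecond(int rowCount, long elapsedNanos) {
        return rowCount / (elapsedNanos / 1e9);
    }

    /** Passes every chunk to the writer and returns the measured rate for each chunk. */
    public static <T> List<Double> timeChunks(List<T> rows, int chunkSize, Consumer<List<T>> writer) {
        List<Double> rates = new ArrayList<>();
        for (List<T> chunk : chunks(rows, chunkSize)) {
            long start = System.nanoTime();
            writer.accept(chunk);
            // Guard against a zero interval on very fast writers.
            long elapsed = Math.max(1, System.nanoTime() - start);
            rates.add(rowsPerSecond(chunk.size(), elapsed));
        }
        return rates;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 80_000; i++) {
            rows.add(i);
        }
        // Stand-in writer that does nothing; replace with the real Cassandra write.
        List<Double> rates = timeChunks(rows, 1000, chunk -> { });
        System.out.println("chunks written: " + rates.size());
        // The figure quoted above: 1000 rows in 30 seconds.
        System.out.println("quoted rate (rows/s): " + rowsPerSecond(1000, 30_000_000_000L));
    }
}
```

Logging the per-chunk rate should show whether every chunk is equally slow or whether the time is concentrated in a few of them.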



On Mon, May 10, 2010 at 11:48 AM, Ran Tavory <rantav@gmail.com> wrote:
Hector uses TSocket. Not sure what you mean by "buffered" - is that framed? By default, Hector does not use the framed transport.

However, I find it hard to believe that the actual connection is the slowing factor.

Roughly speaking, Cassandra is fast on writes and slow on reads. Exact numbers are per-scenario so it's hard to say, but if you only write and the objects are small, then from my experience you should expect a few thousand writes per second on a single host. How many do you see?

There are many configuration factors and they all depend on expected usage and available hardware.


On Mon, May 10, 2010 at 11:27 AM, vd <vineetdaniel@gmail.com> wrote:
What is the complete code you are using to connect to Cassandra from Java?




On Mon, May 10, 2010 at 1:49 PM, David Boxenhorn <david@lookin2.com> wrote:
I don't know what "TSocket or the buffered one" means. Maybe I should know?

I'm using Hector. Does that explain anything?

On Mon, May 10, 2010 at 11:15 AM, vd <vineetdaniel@gmail.com> wrote:

Hi

What are you using to connect to Cassandra: TSocket or the buffered one?






On Mon, May 10, 2010 at 1:29 PM, David Boxenhorn <david@lookin2.com> wrote:
I'm running Java on the client, with JDBC queries against Oracle and Hector against Cassandra.

The Cassandra and Oracle database designs are radically different, as you might guess.

I have no doubt that Cassandra can be tuned, in a multiple-server cluster, to have superior throughput (that's why I'm doing it!). But for now, it's really frustrating my development effort that Cassandra is so slow. Can't I get it up to twice as slow as Oracle in my configuration?

On Mon, May 10, 2010 at 10:47 AM, vd <vineetdaniel@gmail.com> wrote:
Hi David

If I may ask, how do you plan to import the data from Oracle to Cassandra?
As for an answer: AFAIK Cassandra's real strength comes into play when it runs on more than one machine. Also, please share how you are making the comparison: on writes to Cassandra or on reads from it?








On Mon, May 10, 2010 at 1:04 PM, David Boxenhorn <david@lookin2.com> wrote:
I'm running Oracle and Cassandra on my machine, trying to import my data to Cassandra from Oracle.

In my configuration Oracle is about ten times faster than Cassandra. Cassandra has out-of-the-box tuning.

I am new to Cassandra. How do I begin trying to tune it?

Thanks.