I read something like 80,000 rows from Oracle and write them to Cassandra in chunks of 1000 rows - so I'm supposedly working to Cassandra's strength and Oracle's weakness.
Reading 1000 rows from Oracle is "instantaneous", writing them takes maybe 30 seconds. Not too much data per row, maybe 1K.
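For scale, the figures above work out as follows. This is only a back-of-the-envelope sketch in Java: the class and method names are made up for illustration, and the inputs (80,000 rows, 1000 rows per chunk, roughly 30 seconds per chunk) are the approximate numbers quoted in this message, not measurements.

```java
// Hypothetical helper for illustration only; not part of Hector or JDBC.
final class ThroughputEstimate {

    // Rows written per second, given one chunk's size and its write time.
    public static double rowsPerSecond(long rows, double seconds) {
        return rows / seconds;
    }

    // Seconds needed to write totalRows at a steady rate.
    public static double totalSeconds(long totalRows, double rowsPerSecond) {
        return totalRows / rowsPerSecond;
    }

    public static void main(String[] args) {
        // Approximate figures from the message: 1000 rows take about 30 s to write.
        double rate = rowsPerSecond(1000, 30.0);
        // The whole import is about 80,000 rows.
        double total = totalSeconds(80_000, rate);
        System.out.printf("%.1f rows/s, %.0f s total (%.0f min)%n",
                rate, total, total / 60.0);
    }
}
```

At roughly 33 rows per second, the full import would take about 40 minutes, which is far below the few thousand writes per second mentioned in the quoted reply below.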
On Mon, May 10, 2010 at 11:48 AM, Ran Tavory <firstname.lastname@example.org> wrote:
Hector uses TSocket. I'm not sure what you mean by "buffered"; is that framed? By default, Hector does not use framed transport. The code is here if you'd like to have a look: http://github.com/rantav/hector/blob/master/src/main/java/me/prettyprint/cassandra/service/CassandraClientFactory.java#L77
However, I find it hard to believe that the actual connection is the slowing factor.
Roughly speaking, Cassandra is fast on writes and slow on reads. Exact numbers vary per scenario, so it's hard to say, but if you only write and the objects are small, then in my experience you should expect a few thousand writes per second on a single host. How many do you see?
There are many configuration factors, and they all depend on expected usage and available hardware.
On Mon, May 10, 2010 at 11:27 AM, vd <email@example.com> wrote:
What is the complete code string you are using to connect to Cassandra from your Java code?
On Mon, May 10, 2010 at 1:49 PM, David Boxenhorn <firstname.lastname@example.org> wrote:
I don't know what "TSocket or the buffered one" means. Maybe I should know?
I'm using Hector. Does that explain anything?
On Mon, May 10, 2010 at 11:15 AM, vd <email@example.com> wrote:
What are you using to connect to Cassandra: TSocket or the buffered one?
On Mon, May 10, 2010 at 1:29 PM, David Boxenhorn <firstname.lastname@example.org> wrote:
I'm running Java on the client, jdbc queries on Oracle, Hector on Cassandra.
The Cassandra and Oracle database designs are radically different, as you might guess.
I have no doubt that Cassandra can be tuned, in a multiple-server cluster, to have superior throughput (that's why I'm doing it!). But for now, it's really frustrating my development effort that Cassandra is so slow. Can't I get it up to twice as slow as Oracle in my configuration?
On Mon, May 10, 2010 at 10:47 AM, vd <email@example.com> wrote:
If I may ask, how do you plan to import data from Oracle to Cassandra?
As for an answer: AFAIK, Cassandra's true ability comes into play when running on more than one machine. Please also share how you are making the comparison, for example whether you are measuring writes to Cassandra or reads from it.
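One way to make that comparison concrete is to time the read half and the write half of each chunk separately. The sketch below is only an illustration under stated assumptions: `PhaseTimer` is a hypothetical helper, rows are modeled as plain strings, and the source and sink are in-memory stand-ins where a real run would wrap the JDBC query and the Cassandra insert.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper, not part of Hector or JDBC: times the read and write
// halves of a chunked copy separately.
final class PhaseTimer {
    long readNanos;   // total time spent fetching chunks
    long writeNanos;  // total time spent storing chunks
    long rows;        // rows copied so far

    // Copies chunks from source to sink until source returns an empty chunk.
    void copy(Supplier<List<String>> source, Consumer<List<String>> sink) {
        while (true) {
            long t0 = System.nanoTime();
            List<String> chunk = source.get();
            long t1 = System.nanoTime();
            readNanos += t1 - t0;
            if (chunk.isEmpty()) {
                return;
            }
            sink.accept(chunk);
            writeNanos += System.nanoTime() - t1;
            rows += chunk.size();
        }
    }

    public static void main(String[] args) {
        // In-memory stand-ins (assumptions): replace with the real reader and writer.
        Iterator<List<String>> chunks = List.of(
                List.of("row1", "row2"), List.of("row3"), List.<String>of()).iterator();
        List<String> stored = new ArrayList<>();

        PhaseTimer timer = new PhaseTimer();
        timer.copy(chunks::next, stored::addAll);
        System.out.printf("rows=%d read=%dns write=%dns%n",
                timer.rows, timer.readNanos, timer.writeNanos);
    }
}
```

Reporting the two totals side by side shows whether the time goes to the source database or to the destination, which is the number the rest of the thread is asking for.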
On Mon, May 10, 2010 at 1:04 PM, David Boxenhorn <firstname.lastname@example.org> wrote:
I'm running Oracle and Cassandra on my machine, trying to import my data to Cassandra from Oracle.
In my configuration Oracle is about ten times faster than Cassandra. Cassandra has out-of-the-box tuning.
I am new to Cassandra. How do I begin trying to tune it?