cassandra-user mailing list archives

From Patrik Modesto <patrik.mode...@gmail.com>
Subject Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse
Date Wed, 26 Jan 2011 11:13:02 GMT
On Wed, Jan 26, 2011 at 08:58, Mck <mck@apache.org> wrote:
>> You are correct that microseconds would be better but for the test it
>> doesn't matter that much.
>
> Have you tried? I'm very new to cassandra as well, and always uncertain
> as to what to expect...

IMHO it's a matter of use case. In my use case there is no possibility
of two (or more) processes writing/updating the same key, so
milliseconds are fine for me.

BTW, how do I get the current time in microseconds in Java?
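
A minimal sketch of two possible approaches, using only the JDK. The
class and method names (MicrosClock, microsFromMillis, microsFromInstant)
are made up for illustration. The first variant only scales the
millisecond clock, so the value is in microsecond units but the
resolution is still 1 ms. The second assumes Java 8 or later
(java.time), where the clock may offer finer resolution depending on the
JVM and OS.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Cassandra or Hadoop.
class MicrosClock {

    // Microsecond *units*, millisecond *resolution*:
    // the last three digits are always zero.
    static long microsFromMillis() {
        return System.currentTimeMillis() * 1000L;
    }

    // Assumes Java 8+. Microseconds elapsed since the Unix epoch;
    // the actual resolution depends on the platform clock.
    static long microsFromInstant() {
        return ChronoUnit.MICROS.between(Instant.EPOCH, Instant.now());
    }

    public static void main(String[] args) {
        System.out.println("scaled millis : " + microsFromMillis());
        System.out.println("from Instant  : " + microsFromInstant());
    }
}
```

Neither variant guarantees unique timestamps across processes, so they
only help when concurrent writers to the same key are a real concern.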

> As for moving the clone(..) into ColumnFamilyRecordWriter.write(..),
> won't this hurt performance? Normally I would _always_ agree that a
> defensive copy of an array/collection argument should be stored, but has
> this intentionally not been done (or should it be) because of large reduce
> jobs (millions of records) and the performance impact there?

The size of the queue is computed at runtime:
ColumnFamilyOutputFormat.QUEUE_SIZE, 32 *
Runtime.getRuntime().availableProcessors()
The queue is therefore not very large, so I'd say performance shouldn't suffer.
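
To illustrate the idea, here is a rough sketch of a bounded queue that
stores a defensive copy of each key. It is not the actual
ColumnFamilyRecordWriter code: the class BoundedCopyQueue and its methods
are hypothetical, and only the 32 * availableProcessors() sizing is taken
from above. The point is that the caller can reuse its buffer right after
write(..) without changing what is already queued, and the number of
copies held at once is capped by the queue capacity.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a record writer's internal queue.
class BoundedCopyQueue {

    // Same sizing expression as quoted above.
    static int defaultQueueSize() {
        return 32 * Runtime.getRuntime().availableProcessors();
    }

    // Copies the remaining bytes of src into a fresh buffer,
    // leaving src's position and limit untouched.
    static ByteBuffer copyOf(ByteBuffer src) {
        ByteBuffer view = src.duplicate();
        ByteBuffer copy = ByteBuffer.allocate(view.remaining());
        copy.put(view);
        copy.flip();
        return copy;
    }

    private final BlockingQueue<ByteBuffer> queue =
            new ArrayBlockingQueue<ByteBuffer>(defaultQueueSize());

    // Blocks when the queue is full, so at most
    // defaultQueueSize() copies exist at any time.
    void write(ByteBuffer key) {
        try {
            queue.put(copyOf(key));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    // Returns the next queued key, or null if the queue is empty.
    ByteBuffer poll() {
        return queue.poll();
    }

    public static void main(String[] args) {
        BoundedCopyQueue q = new BoundedCopyQueue();
        ByteBuffer reused = ByteBuffer.wrap(new byte[] {1, 2, 3});
        q.write(reused);
        reused.put(0, (byte) 9);            // caller reuses its buffer
        ByteBuffer queued = q.poll();
        System.out.println("queued first byte: " + queued.get(0));
        System.out.println("queue capacity: " + defaultQueueSize());
    }
}
```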

 --
Patrik
