cassandra-user mailing list archives

From Mick Semb Wever <...@apache.org>
Subject Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse
Date Tue, 25 Jan 2011 12:15:34 GMT
On Tue, 2011-01-25 at 09:37 +0100, Patrik Modesto wrote:
> While developing a really simple MR task, I've found that a
> combination of Hadoop optimization and the Cassandra
> ColumnFamilyRecordWriter queue creates wrong keys to send to
> batch_mutate().

I've seen similar behaviour (junk rows being written), although my keys
are always the result of
  LongSerializer.get().toByteBuffer(key)


I'm interested in looking into it, but can you provide a code example?
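In the meantime, here is the sort of pattern I'm guessing at. This is
purely a sketch of my own (the reducer, the "count" column and the
summing are invented), assuming the 0.7-era ColumnFamilyOutputFormat
API that takes a ByteBuffer key and a List<Mutation> value:

  import java.io.IOException;
  import java.nio.ByteBuffer;
  import java.util.Collections;
  import java.util.List;

  import org.apache.cassandra.thrift.Column;
  import org.apache.cassandra.thrift.ColumnOrSuperColumn;
  import org.apache.cassandra.thrift.Mutation;
  import org.apache.cassandra.utils.ByteBufferUtil;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Reducer;

  public class ExampleReducer
          extends Reducer<Text, LongWritable, ByteBuffer, List<Mutation>> {

      @Override
      protected void reduce(Text key, Iterable<LongWritable> values,
                            Context context)
              throws IOException, InterruptedException {
          long sum = 0;
          for (LongWritable v : values) {
              sum += v.get();
          }

          // DANGER: wraps the internal array of a Text instance that Hadoop
          // reuses on the next reduce() call. ColumnFamilyRecordWriter only
          // queues the mutation (batch_mutate happens later), so these bytes
          // can be overwritten before the batch is actually flushed.
          ByteBuffer rowKey =
                  ByteBuffer.wrap(key.getBytes(), 0, key.getLength());

          Column col = new Column();
          col.setName(ByteBufferUtil.bytes("count"));
          col.setValue(ByteBufferUtil.bytes(String.valueOf(sum)));
          col.setTimestamp(System.currentTimeMillis());

          ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
          cosc.setColumn(col);
          Mutation mutation = new Mutation();
          mutation.setColumn_or_supercolumn(cosc);

          context.write(rowKey, Collections.singletonList(mutation));
      }
  }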

  From what I can see, TextOutputFormat.LineRecordWriter.write(..)
doesn't clone anything, but it does write everything out immediately.
  While ColumnFamilyRecordWriter does batch the mutations up as you say,
it takes a ByteBuffer as the key. Why/how are you re-using this
client-side (aren't you creating a new ByteBuffer on each call to
write(..))?
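
If you are wrapping the Writable's backing array directly, a per-call
copy would at least rule out client-side aliasing. Something like this
hypothetical helper (my own invention, not part of cassandra or hadoop):

  import java.nio.ByteBuffer;
  import org.apache.hadoop.io.Text;

  public final class KeyCopy {
      private KeyCopy() {}

      // Copy the Text's current bytes into a fresh array so the ByteBuffer
      // handed to ColumnFamilyRecordWriter.write(..) no longer aliases the
      // Writable instance that Hadoop reuses between reduce() calls.
      public static ByteBuffer copyOf(Text key) {
          byte[] copy = new byte[key.getLength()];
          System.arraycopy(key.getBytes(), 0, copy, 0, key.getLength());
          return ByteBuffer.wrap(copy);
      }
  }

i.e. call context.write(KeyCopy.copyOf(key), mutations) rather than
wrapping key.getBytes() directly.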

~mck

-- 
"Never let your sense of morals get in the way of doing what's right."
Isaac Asimov 
| http://semb.wever.org | http://sesat.no
| http://finn.no       | Java XSS Filter

