cassandra-user mailing list archives

From Sonny Heer <sonnyh...@gmail.com>
Subject Binary memory table flush question
Date Fri, 26 Feb 2010 23:32:25 GMT
Hey,

I have an application which iterates over a directory of text files.
For each document it ingests each word as a key, with the docid as
the column name and an empty column value (no super columns).

Below is the code I'm using to construct a key and column:

// One column per (term, docid) pair: name = docid, empty value, timestamp 0
ColumnFamily docCF = ColumnFamily.create(keyspaceStr, cfStr);
docCF.addColumn(
   new Column( docid.getBytes("UTF-8"), "".getBytes("UTF-8"), 0 )
);
LinkedList<ColumnFamily> docCFs = new LinkedList<ColumnFamily>();
docCFs.add(docCF);

Message message = createMessage(keyspaceStr, key, cfStr, docCFs);

/* Send message to end point */
for (InetAddress endpoint :
        StorageService.instance().getNaturalEndpoints(key.toString()))
{
    MessagingService.instance().sendOneWay(message, endpoint);
}

The createMessage method is the same code as in
CassandraBulkLoader.java.  The snippet above runs once for each term
found in a document, so I expect to end up with unique terms, each
with a unique list of the docids in which that term appears.
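
To make the expected result concrete, here is a minimal sketch of the
term-to-docids layout described above, using plain Java collections
only.  It is not the ingest code and does not touch Cassandra.  The
class name InvertedIndexSketch, the build method, the sample documents,
and the tokenizing on non-word characters are all assumptions made for
illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class InvertedIndexSketch {

    /**
     * Maps each term (the intended key) to the sorted set of docids
     * (the intended column names) of the documents containing it.
     * Hypothetical helper: input is docid -> document text.
     */
    public static SortedMap<String, SortedSet<String>> build(Map<String, String> docs) {
        SortedMap<String, SortedSet<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            // Assumption: terms are lowercased runs of word characters.
            for (String term : doc.getValue().toLowerCase().split("\\W+")) {
                if (term.isEmpty()) {
                    continue; // split() yields "" when the text starts with a separator
                }
                // A set keeps each docid once per term, even if the term repeats.
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(doc.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("doc1", "the quick fox");
        docs.put("doc2", "the lazy dog");
        // "the" should list both docids; every other term lists one.
        System.out.println(build(docs));
    }
}
```

In this sketch the term "the" ends up with both doc1 and doc2, which is
the per-key accumulation of docids expected from the ingest.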

After running this program, I flush the keyspace.  I noticed that
I'm only getting the last message sent.  Each message appears to
overwrite my previous key/docCFs combination.

I tried flushing programmatically after each send like so:
StorageService.instance().forceTableFlush(keyspaceStr, cfStr);

This had no effect.  I'm new to Cassandra, so I hope someone with
more experience can chime in with some help :)

Thanks.
