cassandra-commits mailing list archives

From "Vijay (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-265) Large object support
Date Tue, 07 Jul 2009 05:08:14 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727921#action_12727921 ]

Vijay commented on CASSANDRA-265:
---------------------------------

I would vote for integrating with HDFS. We should have a plugin that lets HDFS store the
data on the same node that serves it, or at least in the same rack. That way we can serve
the data with very little latency, and we can leverage a DFS that has already been tested
by many people.

Thanks
VJ

> Large object support
> --------------------
>
>                 Key: CASSANDRA-265
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-265
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>
> The standard answer since forever has been "cassandra is a bad fit for large objects."
> But I think it doesn't have to be that way.  With a few simplifying assumptions we can
> make this doable.
> First, screw Thrift.  There is no way to specify a stream of bytes cross-platform.  You
> can't mix raw sockets into Thrift very easily (?) so screw it.  Make it an internal-only API
> to start with, like the much-vaunted and much-feared BinaryVerbHandler.
> Second, forget about writing multiple lobs at once.  You insert one lob at a time, to
> a specific column.
> With Thrift out of the equation we are not out of the woods.  MessagingService also assumes
> that Messages will be memory resident and not streamed.  One approach to fix this would be
> to have a StreamingMessage class that consists of a message id (that would be paired w/ origination
> endpoint to make it unique) and a size.  The VerbHandler would keep a Map of incomplete StreamingMessages
> around until the full size was read.  Then they could be disposed of.
> So a LargeObjectCommand would be basically just the command id and the payload, the streamed
> lob.  And we would handle it by streaming it directly to a file.  When the stream was complete,
> we would do a write to the standard commitlog/memtable with a pointer to that lob file.  That
> would then be flushed normally to the sstable.  (This would require adding another boolean
> to Column serialization, whether the value is really a lob pointer.  We could combine this
> with the existing bool into a single byte and have room for a couple more flags, without taking
> extra space.)
> So lobs would never appear directly in the commitlog, and we would never have to rewrite
> them multiple times during compaction; just the pointers would get merged, but the lob files
> themselves would not have to be touched.  (Except to remove them when a compaction shows that
> an older version is no longer needed.)
> Then of course we'd need a corresponding ReadLargeObject command.  So the basics are
> straightforward.
> Read Repair and Hinted Handoff would add a few more wrinkles but nothing fundamentally
> challenging.
> Thoughts?
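
As a rough illustration of two ideas in the description above (tracking incomplete
streamed messages by origination endpoint plus message id, and packing the lob-pointer
boolean into one flag byte alongside the existing one), here is a minimal Java sketch.
Every name in it (LobSketch, packFlags, begin, received, the bit positions) is
hypothetical and not taken from the Cassandra code base, and the meaning of the
"existing bool" is assumed to be a deletion marker purely for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch only; none of these names come from the Cassandra code base. */
final class LobSketch {
    // Flag bits packed into a single byte of column serialization.
    // The bit positions and the meaning of the first flag are assumptions.
    static final int DELETED_FLAG = 0x01;      // stands in for the existing boolean
    static final int LOB_POINTER_FLAG = 0x02;  // value is a pointer to a lob file

    static byte packFlags(boolean deleted, boolean lobPointer) {
        int flags = 0;
        if (deleted) flags |= DELETED_FLAG;
        if (lobPointer) flags |= LOB_POINTER_FLAG;
        return (byte) flags;
    }

    static boolean isLobPointer(byte flags) {
        return (flags & LOB_POINTER_FLAG) != 0;
    }

    // Incomplete streams, keyed by origination endpoint plus message id.
    // Each value holds {bytesReceivedSoFar, totalSize}.
    private final Map<String, long[]> incomplete = new HashMap<>();

    /** Registers a new stream with its announced total size. */
    void begin(String endpoint, long messageId, long totalSize) {
        incomplete.put(endpoint + "/" + messageId, new long[] {0L, totalSize});
    }

    /** Records received bytes; returns true and forgets the stream once it is complete. */
    boolean received(String endpoint, long messageId, long byteCount) {
        String key = endpoint + "/" + messageId;
        long[] progress = incomplete.get(key);
        if (progress == null) {
            throw new IllegalStateException("unknown stream " + key);
        }
        progress[0] += byteCount;
        if (progress[0] < progress[1]) {
            return false;
        }
        incomplete.remove(key);
        return true;
    }

    int pendingCount() {
        return incomplete.size();
    }

    public static void main(String[] args) {
        LobSketch tracker = new LobSketch();
        tracker.begin("10.0.0.1", 42L, 10L);
        System.out.println(tracker.received("10.0.0.1", 42L, 4L)); // partial stream
        System.out.println(tracker.received("10.0.0.1", 42L, 6L)); // stream complete
        System.out.println(isLobPointer(packFlags(false, true)));
    }
}
```

A real handler would also write the received bytes to the lob file and would need a
timeout for streams that never finish; both are left out of this sketch.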

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

