cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Large File Storage
Date Wed, 15 Sep 2010 15:54:49 GMT
The row-in-memory-during-compaction limitation was fixed some time ago for 0.7
(CASSANDRA-16).

On Wed, Sep 15, 2010 at 10:03 AM, Lucas Nodine <lucasnodine@gmail.com> wrote:
> Hello Users,
>
> I am planning a system that stores both file data and metadata.  Usually
> these will be small files, such as Word documents, along with some specific
> data about each file.  Sometimes there will be a large file, such as a video,
> of a few hundred MB up to a GB.  I have read a lot about suggested
> methods for large file storage within Cassandra, but I want to verify my
> thoughts on the method of implementation before I start working on it.
>
> On June 29, 2009 Jonathan listed the task on JIRA
> (https://issues.apache.org/jira/browse/CASSANDRA-265) - but closed it
> stating that it was not on anyone's roadmap
>
> On April 26, 2010, Shuge Lee posted to this group: "During
> compaction, as is well noted, Cassandra needs the entire row in memory,
> which will cause a FAIL once you have files more than a few gigs."
>
> Currently, the Wiki has an entry describing a workaround for handling
> large BLOBs
> (http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage).
>
> Seeing as native support for large files is not expected, and the Wiki
> states that files <= 64MB can easily be stored within the database and
> knowing that during compaction, the entire row will be loaded into memory...
>
> 1) Is the appropriate way to handle files that vary greatly in size (1KB to
> a few GB) to break the data into smaller "chunks" and then store each
> chunk in a separate row?
>     A) If so, how should it be done to get the best read/write
> performance?
>     B) Is there a row size that should be considered a "sweet spot", or
> should it be configurable on a per-cluster basis?
> 2) Does anyone foresee large blob support in the near future?
>
> Thanks,
>
> - Lucas
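
For question 1, a minimal sketch of the chunk-per-row layout might look like the following. It is not taken from this thread or from any Cassandra client: a plain Python dict stands in for the column family, and the key scheme (`chunk_key`), the metadata row, and the 1 MiB `CHUNK_SIZE` are all assumptions made for illustration, not recommended values.

```python
import io

# Assumed chunk size for illustration only; not a recommended "sweet spot".
CHUNK_SIZE = 1 * 1024 * 1024


def chunk_key(file_id, index):
    """Row key for one chunk (hypothetical naming scheme)."""
    return f"{file_id}:chunk:{index:08d}"


def store_file(store, file_id, stream, chunk_size=CHUNK_SIZE):
    """Read `stream` piece by piece and write each piece to its own row.

    `store` is any dict-like object; here it stands in for the database.
    """
    count = 0
    total = 0
    while True:
        piece = stream.read(chunk_size)
        if not piece:
            break
        store[chunk_key(file_id, count)] = piece
        count += 1
        total += len(piece)
    # Write the metadata row last, so a reader that finds it can expect
    # every chunk it lists to be present already.
    store[file_id] = {"chunks": count, "size": total, "chunk_size": chunk_size}
    return count


def read_file(store, file_id):
    """Yield the chunks of a stored file in order."""
    meta = store[file_id]
    for index in range(meta["chunks"]):
        yield store[chunk_key(file_id, index)]


# Usage: store a small payload in 1000-byte chunks and read it back.
store = {}
payload = bytes(range(256)) * 10
n = store_file(store, "doc-1", io.BytesIO(payload), chunk_size=1000)
restored = b"".join(read_file(store, "doc-1"))
```

Because only one chunk is held in memory at a time on both the write and read paths, the same code path covers a 1KB document and a multi-GB video; the chunk size is a parameter precisely so it can be tuned per cluster.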



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
