From: Avinash Lakshman
To: user@cassandra.apache.org
Date: Wed, 14 Apr 2010 19:25:11 -0700
Subject: Re: Is that possible to write a file system over Cassandra?

Exactly. You can split a file into blocks of any size, and you can actually distribute the metadata across a large set of machines. You wouldn't have the small-files issue with this approach. The issue may be eventual consistency - I'm not sure that is a paradigm that would be acceptable for a file system. But that is a discussion for another time/day.

Avinash
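A minimal sketch of the block-splitting layout described above, in Python. The ColumnFamilyStub class is an in-memory stand-in for a real Cassandra column family client, and the Blocks/Inodes split, the 64 KB block size, and the key scheme are illustrative assumptions rather than anything prescribed in this thread:

import hashlib

class ColumnFamilyStub(object):
    """In-memory stand-in for a Cassandra column family client.
    A real client would send these insert/get calls to the cluster."""
    def __init__(self):
        self.rows = {}

    def insert(self, key, columns):
        self.rows.setdefault(key, {}).update(columns)

    def get(self, key):
        return self.rows[key]

# Illustrative column families: one for raw blocks, one for per-file metadata.
blocks = ColumnFamilyStub()
inodes = ColumnFamilyStub()

BLOCK_SIZE = 64 * 1024  # tunable block size; 64 KB is just an example

def write_file(path, data):
    """Split a file into fixed-size blocks, store each block under a
    content-derived key, and record the ordered block list as metadata."""
    block_keys = []
    for offset in range(0, len(data), BLOCK_SIZE):
        chunk = data[offset:offset + BLOCK_SIZE]
        key = hashlib.sha1(chunk).hexdigest()  # block keys spread across the ring
        blocks.insert(key, {'data': chunk})
        block_keys.append(key)
    # Metadata row: the file path maps to its size and ordered block keys.
    inodes.insert(path, {'size': str(len(data)),
                         'blocks': ','.join(block_keys)})

def read_file(path):
    """Reassemble a file by fetching its metadata row, then each block."""
    meta = inodes.get(path)
    keys = meta['blocks'].split(',') if meta['blocks'] else []
    return b''.join(blocks.get(k)['data'] for k in keys)

if __name__ == '__main__':
    payload = b'x' * (3 * BLOCK_SIZE + 123)
    write_file('/tmp/example.bin', payload)
    assert read_file('/tmp/example.bin') == payload

Hashing each block spreads the block rows over many machines, while the small per-file metadata row keeps the ordered list needed to reassemble the file, which is the "distribute the metadata" point above.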
On Wed, Apr 14, 2010 at 7:15 PM, Ken Sandney wrote:

> Large files can be split into small blocks, and the block size can be
> tuned. It may increase the complexity of writing such a file system, but
> it could then serve general-purpose use (not only relatively small files).
>
> On Thu, Apr 15, 2010 at 10:08 AM, Tatu Saloranta wrote:
>
>> On Wed, Apr 14, 2010 at 6:42 PM, Zhuguo Shi wrote:
>>
>> > Hi,
>> > Cassandra has a good distributed model: decentralized, auto-partitioning,
>> > auto-recovery. I am evaluating writing a file system over Cassandra
>> > (like CassFS: http://github.com/jdarcy/CassFS ), but I don't know if
>> > Cassandra is a good fit for such a use case.
>>
>> It sort of depends on what you are looking for. For use cases where
>> something like S3 is a good fit, yes, with one difference: Cassandra is
>> more geared towards lots of small files, whereas S3 is geared towards a
>> moderate number of (possibly large) files.
>>
>> So I think it can definitely be a good use case, and I may use Cassandra
>> for this myself in the future. Having range queries allows implementing
>> directory/path structures (list keys using the path as a prefix). And you
>> can split storage so that metadata lives under an order-preserving
>> partitioner (OPP) and raw data under the random partitioner (RP).
>>
>> -+ Tatu +-
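A sketch of the prefix-listing idea from Tatu's reply: under an order-preserving partitioner, row keys sort lexicographically, so listing a directory becomes a key-range scan from the path prefix up to a sentinel just past it. The sorted in-memory stand-in below only illustrates the key layout (a real client's key-range query would take its place), and the sample paths are made up:

import bisect

class OrderedRowsStub(object):
    """Stand-in for row keys under an order-preserving partitioner:
    keys are kept sorted, so range scans by key are possible."""
    def __init__(self):
        self.keys = []

    def insert(self, key):
        if key not in self.keys:
            bisect.insort(self.keys, key)

    def key_range(self, start, finish):
        # Return keys in [start, finish), mimicking a key-range scan.
        lo = bisect.bisect_left(self.keys, start)
        hi = bisect.bisect_left(self.keys, finish)
        return self.keys[lo:hi]

def list_directory(rows, path):
    """List entries directly under `path` by scanning keys with that prefix.
    '\xff' serves as a sentinel sorting just past any key sharing the prefix."""
    prefix = path if path.endswith('/') else path + '/'
    entries = set()
    for key in rows.key_range(prefix, prefix + '\xff'):
        remainder = key[len(prefix):]
        entries.add(remainder.split('/', 1)[0])  # immediate child only
    return sorted(entries)

if __name__ == '__main__':
    metadata = OrderedRowsStub()
    for p in ['/home/a/notes.txt', '/home/a/photos/cat.jpg', '/home/b/todo']:
        metadata.insert(p)
    print(list_directory(metadata, '/home/a'))  # ['notes.txt', 'photos']

This is also why the metadata rows benefit from the order-preserving partitioner while the raw blocks, which are only ever fetched by exact key, can sit under the random partitioner.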