hadoop-common-user mailing list archives

From Erich Nachbar <er...@carrieriq.com>
Subject Re: Announcing release of Kosmos Filesystem (KFS)
Date Mon, 01 Oct 2007 23:12:57 GMT
Sriram, congrats to your release!

One thing I wasn't able to get from your website is whether KFS can
deal with metadata server failures
(the metadata server being a single point of failure in many DFS implementations).

Also, do you use KFS in production for the Kosmix website?
Thanks!
-Erich

On Sep 27, 2007, at 5:56 PM, Sriram Rao wrote:

> Greetings!
>
> We are happy to announce the release of Kosmos Filesystem (KFS) as an
> open source project.  KFS was designed and implemented at Kosmix Corp.
>
> The initial release of KFS is version 0.1 (alpha).  The source code, as
> well as pre-compiled binaries for the x86-64-Linux-FC5 platform, is
> available at the project page on Sourceforge
> (http://kosmosfs.sourceforge.net).
>
> KFS is an available, distributed filesystem targeted at applications
> that must handle large amounts of data (e.g., grid computing, web
> search, and data-mining applications).  KFS can be
> used to virtualize storage on a cluster of commodity PCs.  A full
> description of the project as well as set of features that are
> implemented can be found at the following link:
>
> http://kosmosfs.sourceforge.net
>
> KFS consists of three components:
>  - a metadata server, which implements a global namespace
>  - a set of chunkservers, which store data.  Blocks of a file, or
> chunks, are stored on individual nodes; the size of each chunk is
> fixed at 64 MB
>  - a client library that is linked with applications to access
> KFS.
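The three-part split above can be sketched in a few lines. This is a hypothetical illustration, not the real KFS API: the class and method names (`MetaServer`, `allocate_chunk`, `locate`) are invented for this example; only the roles (global namespace, fixed 64 MB chunks, chunk-to-chunkserver mapping) come from the announcement.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # chunk size is fixed at 64 MB

class MetaServer:
    """Toy model of a metadata server: maps file paths to chunk ids,
    and chunk ids to the chunkservers holding a replica."""

    def __init__(self):
        self.namespace = {}        # path -> list of chunk ids
        self.chunk_locations = {}  # chunk id -> set of chunkserver ids
        self.next_chunk_id = 0

    def create(self, path):
        self.namespace[path] = []

    def allocate_chunk(self, path, servers):
        # Assign a new chunk of `path` to a set of chunkservers.
        cid = self.next_chunk_id
        self.next_chunk_id += 1
        self.namespace[path].append(cid)
        self.chunk_locations[cid] = set(servers)
        return cid

    def locate(self, path, offset):
        # Return the chunkservers holding the chunk covering `offset`.
        cid = self.namespace[path][offset // CHUNK_SIZE]
        return self.chunk_locations[cid]

meta = MetaServer()
meta.create("/data/log")
meta.allocate_chunk("/data/log", ["cs1", "cs2", "cs3"])
assert meta.locate("/data/log", 0) == {"cs1", "cs2", "cs3"}
```

A client library would consult the metadata server this way to find which chunkservers to contact, then read or write chunk data directly from those nodes.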
>
> KFS is implemented in C++.  It also includes support for Java/Python
> applications.
>
> In a nutshell,
>  - KFS supports file replication, which is configurable on a
> per-file basis
>  - Chunks of a file are typically replicated 3-way.  This provides
> data availability during chunkserver outages.
>  - Re-replication is used to recover chunks that were lost due to
> extended chunkserver outages.
>  - For data integrity, KFS stores checksums on data blocks, which are
> verified on read; if corruption is detected, re-replication is used
> to recover the corrupted data.
>  - KFS supports incremental scalability; new storage nodes can be
> added to the system
>  - To enable better disk utilization, the KFS metaserver may periodically
> rebalance chunks by migrating them from "over-utilized" servers
> to "under-utilized" servers.
>  - KFS exports a standard filesystem API (create, read, write,
> etc.).  Files can be written to multiple times; KFS supports an
> append operation on files.
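The checksum-plus-re-replication behavior in the list above can be sketched as follows. This is an illustrative model, not KFS code: the functions `store` and `read` and the use of CRC32 are assumptions for the example; the announcement only states that block checksums are verified on read and that corrupt data is recovered by re-replication.

```python
import zlib

def store(replicas, server, data):
    # Each replica carries a checksum alongside its data.
    replicas[server] = (zlib.crc32(data), data)

def read(replicas, server):
    checksum, data = replicas[server]
    if zlib.crc32(data) != checksum:
        # Corruption detected on read: re-replicate from a healthy copy.
        for other, (cs, d) in replicas.items():
            if other != server and zlib.crc32(d) == cs:
                replicas[server] = (cs, d)
                return d
        raise IOError("no healthy replica available")
    return data

replicas = {}
for s in ("cs1", "cs2", "cs3"):                          # 3-way replication
    store(replicas, s, b"chunk-bytes")
replicas["cs2"] = (replicas["cs2"][0], b"corrupted!!")   # simulate bit rot
assert read(replicas, "cs2") == b"chunk-bytes"           # repaired from cs1/cs3
```

The same machinery covers the extended-outage case: a chunk whose replica count drops below the target is re-copied from the surviving replicas to another chunkserver.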
>
> To enable applications to use KFS, KFS has been integrated with Hadoop
> using Hadoop's filesystem interfaces (see Hadoop-Jira-1963).  This
> enables existing Hadoop applications to use KFS seamlessly.
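Once the KFS bindings from that JIRA are on the Hadoop classpath, pointing Hadoop at a KFS deployment would amount to configuration along these lines. This fragment is illustrative only: the property names, implementation class, and metaserver port are assumptions that may differ between Hadoop and KFS versions.

```xml
<!-- core-site.xml: illustrative settings; exact names/values may vary -->
<property>
  <name>fs.kfs.impl</name>
  <value>org.apache.hadoop.fs.kfs.KosmosFileSystem</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>kfs://metaserver-host:20000</value>
</property>
```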
>
> We are looking to build a user community for KFS.  If I can help in
> any way with your evaluation of KFS, please feel free to get in touch
> with me.
>
> I'd also be happy to share any level of detail about KFS.
>
> Thank you.
>
> Sriram

