incubator-cassandra-dev mailing list archives

From Michael Kjellman <mkjell...@barracuda.com>
Subject Re: reading sstables stored in hdfs
Date Sat, 23 Mar 2013 20:40:18 GMT
Just curious, why would you want to store sstables in HDFS?

On 3/23/13 12:43 PM, "Amit Kumar" <kumaramit01@gmail.com> wrote:

>I am starting some work on an input-format that would let us read
>sstables stored in HDFS, I wonder if anyone has worked on something
>similar before. I did come across
>
>http://techblog.netflix.com/2012/02/aegisthus-bulk-data-pipeline-out-of.html
>
>However it's not open sourced/available yet.
>
>I am writing for a sanity check before I go too deep into this.
>
>I have a few questions, and I am hoping someone here will be able to help.
>
>So far, I have been able to read sstables stored on the local file
>system using the SSTableScanner and the SSTableReader. I am wondering
>what would be a good way to proceed: a custom implementation of
>RandomAccessFile (like RandomAccessReader and
>CompressedRandomAccessReader) that would use Hadoop's FileSystem
>API?
>
>
>I did search but could have missed it: is there any documentation
>on the binary format of the data, index, and stats files? That might
>make it simpler for me to prototype without having to go through the
>Cassandra internals. I am currently working off our production
>deployment, which is 1.1.0.
>
>Any guidance you can give would be appreciated (I am new to Cassandra internals).
>
>Many thanks
>Amit


