I don't think Cassandra currently supports this, but if it does, can someone tell me how? If it doesn't, is it reasonable to request this feature, and where should I submit the request?
I am thinking that the SSTableReader class, and its underlying supporting classes, should be able to handle files stored in a DFS, like CFS (Cassandra File System), HDFS (Hadoop Distributed File System) or Amazon S3. Basically, if I pass a file URI, like "hdfs://xxxxxx/xxx.db_file" or "S3://xxxxx/xxxx.db_file", then as long as there is a library available at runtime that supports that DFS, SSTableReader should just work and read the Data, Index and other component files from there. After all, each of them can be read as an InputStream.
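To make the idea concrete, here is a rough sketch of the kind of abstraction I have in mind. StreamOpener and SchemeRegistry are names I made up for illustration; they are not existing Cassandra classes. The sketch only covers opening a sequential stream based on the URI scheme, and an HDFS or S3 opener would have to be registered by whatever client library is on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface (not part of Cassandra): opens a stream for a URI.
interface StreamOpener {
    InputStream open(URI uri) throws IOException;
}

// Hypothetical registry (not part of Cassandra): maps a URI scheme such as
// "file", "hdfs" or "s3" to the code that knows how to open it.
final class SchemeRegistry {
    private final Map<String, StreamOpener> openers = new ConcurrentHashMap<>();

    SchemeRegistry() {
        // Local files work out of the box through java.nio.
        register("file", uri -> Files.newInputStream(Paths.get(uri)));
    }

    // Schemes are stored in lower case so "S3://..." and "s3://..." match.
    void register(String scheme, StreamOpener opener) {
        openers.put(scheme.toLowerCase(Locale.ROOT), opener);
    }

    InputStream open(URI uri) throws IOException {
        String scheme = uri.getScheme();
        if (scheme == null) {
            throw new IOException("URI has no scheme: " + uri);
        }
        StreamOpener opener = openers.get(scheme.toLowerCase(Locale.ROOT));
        if (opener == null) {
            throw new IOException("No opener registered for scheme: " + scheme);
        }
        return opener.open(uri);
    }
}
```

With something like this, a reader would call registry.open(uri) instead of constructing a File directly, and support for a new file system would only require registering one more opener.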
The reason I am asking is that in my project I want to parse SSTable files in a MapReduce (MR) job. But reusing SSTableReader is hard because internally it only uses the File class.
Are there any technical reasons that would make it hard to support SSTable files stored in a DFS?