cassandra-commits mailing list archives

From "Pavel Yaskevich (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1898) json2sstable should support streaming
Date Mon, 24 Jan 2011 16:47:44 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985802#action_12985802 ]

Pavel Yaskevich commented on CASSANDRA-1898:
--------------------------------------------

But this task is not about sstable2json modifications. That change should go into a separate
ticket, or be added to the existing ticket about sstable2json, shouldn't it?

> json2sstable should support streaming
> -------------------------------------
>
>                 Key: CASSANDRA-1898
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1898
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Nick Bailey
>            Assignee: Pavel Yaskevich
>             Fix For: 0.7.1
>
>         Attachments: CASSANDRA-1898-v2.patch, CASSANDRA-1898-v3.patch, CASSANDRA-1898-v4.patch, CASSANDRA-1898.patch
>
>   Original Estimate: 8h
>          Time Spent: 8h
>  Remaining Estimate: 0h
>
> json2sstable loads the entire json file into memory. This is so it can sort the file
> before creating an sstable. If the file was created using sstable2json and the partitioner
> isn't changing, this isn't necessary. For very large files this means json2sstable requires
> a huge amount of memory.
> There should be an option to stream the file. A simple check for out-of-order keys will
> prevent writing bad sstables.
> This should be possible with the SAX-style parser available in our current json library.
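
The out-of-order key check described in the issue could look roughly like the sketch below. This is only an illustration, not the code from the attached patches: the names `checked_rows` and `OutOfOrderKeyError` are hypothetical, the rows are assumed to arrive one at a time from some streaming parser, and keys are compared as raw bytes, whereas the real ordering depends on the partitioner in use.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

Row = Tuple[bytes, Dict]  # (row key, columns); a simplified stand-in for an sstable row


class OutOfOrderKeyError(ValueError):
    """Raised when a row key does not sort strictly after the previous one."""


def checked_rows(rows: Iterable[Row]) -> Iterator[Row]:
    """Yield rows one at a time, failing fast on an out-of-order key.

    Only the previous key is kept in memory, so the input never has to be
    loaded and sorted as a whole. Byte-wise comparison is an assumption made
    for this sketch; a real tool would compare in partitioner order.
    """
    previous: Optional[bytes] = None
    for key, columns in rows:
        if previous is not None and key <= previous:
            raise OutOfOrderKeyError(
                f"key {key!r} is not greater than previous key {previous!r}"
            )
        previous = key
        yield key, columns
```

A writer would consume `checked_rows(parsed_rows)` row by row and abort on `OutOfOrderKeyError` instead of producing a bad sstable.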

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

