cassandra-commits mailing list archives

From "Joost Ouwerkerk (JIRA)" <j...@apache.org>
Subject [jira] Created: (CASSANDRA-1050) Too many splits for ColumnFamily with only a few rows
Date Tue, 04 May 2010 19:31:57 GMT
Too many splits for ColumnFamily with only a few rows
-----------------------------------------------------

                 Key: CASSANDRA-1050
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1050
             Project: Cassandra
          Issue Type: Bug
          Components: Hadoop
    Affects Versions: 0.6
            Reporter: Joost Ouwerkerk
             Fix For: 0.6.2


ColumnFamilyInputFormat creates splits for the entire keyspace.  If one ColumnFamily has 100
million rows and another has only 100 rows, the number of splits will be 1,526 (assuming
64k rows per split) for either one, since it is based on the total number of unique keys across
the whole keyspace and not on the number of rows in the ColumnFamily.
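
The arithmetic behind the 1,526 figure can be sketched as follows. This is only an
illustration of the numbers in the report, not the ColumnFamilyInputFormat code: the
split_count helper is hypothetical, "64k" is taken to mean 65,536 rows, and the two
ColumnFamilies are assumed to share no keys.

```python
import math

# Assumption: "64k rows per split" means 64 * 1024 rows.
ROWS_PER_SPLIT = 64 * 1024


def split_count(row_count, rows_per_split=ROWS_PER_SPLIT):
    """Hypothetical helper: number of splits needed to cover row_count rows."""
    return max(1, math.ceil(row_count / rows_per_split))


large_cf_rows = 100_000_000  # ColumnFamily with 100 million rows
small_cf_rows = 100          # ColumnFamily with 100 rows

# Assumption: the two ColumnFamilies have disjoint keys, so the number of
# unique keys in the keyspace is the sum of both row counts.
keyspace_keys = large_cf_rows + small_cf_rows

# Reported behaviour: the count is derived from keyspace-wide keys,
# so both ColumnFamilies get the same number of splits.
splits_from_keyspace = split_count(keyspace_keys)

# What a per-ColumnFamily estimate would give instead.
splits_large_cf = split_count(large_cf_rows)
splits_small_cf = split_count(small_cf_rows)

print(splits_from_keyspace, splits_large_cf, splits_small_cf)
```

Under these assumptions the keyspace-wide estimate matches the count for the large
ColumnFamily, while the 100-row ColumnFamily would need only a single split if its own
row count were used.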

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

