cassandra-commits mailing list archives

From "Pavel Yaskevich (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-2225) Cannot get columns from sstable generated by json2sstable
Date Wed, 23 Feb 2011 14:44:38 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998390#comment-12998390 ]

Pavel Yaskevich commented on CASSANDRA-2225:
--------------------------------------------

It looks like something went wrong with your sstables. Here is what I did (branch cassandra-0.7):

1). Cleaned all data directories and started Cassandra
2). `./bin/cassandra-cli --host localhost < create_table.cli`
3). `python cassandra_sample_insert.py` (which took a while to finish)
4). `./bin/nodetool -h localhost compact SampleKS SampleCF` to get all keys into one compacted sstable (there were 2 sstables)
5). Verified the key count in the compacted sstable: 50 keys
6). Stopped Cassandra
7). `./bin/sstable2json /var/lib/cassandra/data/SampleKS/SampleCF-f-3-Data.db > s.json` (the resulting file was 88M)
8). Cleaned the data directory for SampleKS: `rm /var/lib/cassandra/data/SampleKS/*`
9). Generated a new sstable using `./bin/json2sstable -s -K SampleKS -c SampleCF s.json /var/lib/cassandra/data/SampleKS/SampleCF-f-1-Data.db`
10). Started Cassandra
11). Ran `./bin/cassandra-cli --host localhost --keyspace SampleKS`
12). In the CLI, ran `get SampleCF['030yyyyyyyyyy'];`, which gave me all columns
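
A quick sanity check on the exported JSON before feeding it to json2sstable can be sketched as follows. This is only an illustration, not part of the procedure above: the function names are hypothetical, and the export layout is an assumption (a JSON object mapping each row key to either a list of `[name, value, timestamp, ...]` columns or, for super column families, an object mapping each super column name to `{"subColumns": [...]}`). It counts the rows and lists any columns with an empty name; it does not diagnose json2sstable itself.

```python
import json


def summarize_export(rows):
    """Return (row_count, empty_names) for a parsed sstable2json export.

    ASSUMPTION: `rows` maps each row key to either a list of
    [name, value, timestamp, ...] columns, or (super column families)
    a dict of super column name -> {"subColumns": [[name, ...], ...]}.
    """
    # Each entry is (row_key, super_column_name); the second item is None
    # when the empty name was found in a standard (non-super) column.
    empty_names = []
    for row_key, row in rows.items():
        if isinstance(row, dict):
            for sc_name, sc in row.items():
                for column in sc.get("subColumns", []):
                    if column[0] == "":
                        empty_names.append((row_key, sc_name))
        else:
            for column in row:
                if column[0] == "":
                    empty_names.append((row_key, None))
    return len(rows), empty_names


def summarize_file(path):
    """Load an export such as s.json from disk and summarize it."""
    with open(path) as handle:
        return summarize_export(json.load(handle))
```

For the run described above, one would call `summarize_file("s.json")` and expect the row count to match the 50 keys from step 5, provided the layout assumption holds.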


> Cannot get columns from sstable generated by json2sstable
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-2225
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2225
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.2
>         Environment: Fedora 11, Intel Core i5, JDK 1.6.0_20
>            Reporter: Muga Nishizawa
>            Assignee: Pavel Yaskevich
>             Fix For: 0.7.3
>
>         Attachments: cassandra_sample_insert.py, create_table.cli
>
>
> I cannot get columns from a Cassandra node whose sstable was generated by json2sstable; the query returns "null". The columns associated with the specified row keys had been stored in Cassandra in advance. Cassandra writes the exception below to system.log.
> To produce that sstable, I stored data in Cassandra, shut it down, created JSON data from its sstable with sstable2json, and then generated a new sstable from that JSON data with json2sstable. I checked that the columns are included in the JSON data file, but they cannot be read from the generated sstable. This problem occurs with or without Pavel's patch from CASSANDRA-2188.
> I attached programs so that you can see the details of the data stored in Cassandra. You should be able to reproduce the problem by executing the attached programs, sstable2json, and json2sstable. For example, I could not get the columns associated with row key "030yyyyyyyyyy" from the sstable generated by json2sstable: "null" is returned as the result and Cassandra writes an exception to system.log.
> - 1. Start the Cassandra daemon on localhost (Thrift port 9160)
> - 2. Create the keyspace and column family according to "create_table.cli"
> - 3. Execute "cassandra_sample_insert.py", which stores pairs of row keys and super columns
>  ("cassandra_sample_insert.py" requires pycassa)
> - 4. Shutdown Cassandra daemon
> - 5. Execute sstable2json and create JSON data
> - 6. Execute json2sstable and generate sstable from JSON data
> - 7. Start Cassandra daemon again
> - 8. Get the columns for row key "030yyyyyyyyyy" (this is the step that fails for me)
> {quote}
>  ERROR 15:45:18,228 Fatal exception in thread Thread[ReadStage:2,5,main]
>  java.io.IOError: org.apache.cassandra.db.ColumnSerializer$CorruptColumnException: invalid column name length 0
>   at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:246)
>   at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:262)
>   at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:1)
>   at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493)
>   at java.util.concurrent.ConcurrentSkipListMap.<init>(ConcurrentSkipListMap.java:1443)
>   at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:366)
>   at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:1)
>   at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:79)
>   at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:1)
>   at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>   at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>   at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108)
>   at org.apache.commons.collections.iterators.CollatingIterator.anyHasNext(CollatingIterator.java:364)
>   at org.apache.commons.collections.iterators.CollatingIterator.hasNext(CollatingIterator.java:217)
>   at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:55)
>   at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>   at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>   at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:118)
>   at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:142)
>   at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1290)
>   at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1167)
>   at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1095)
>   at org.apache.cassandra.db.Table.getRow(Table.java:384)
>   at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:63)
>   at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:473)
>   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:619)
>  Caused by: org.apache.cassandra.db.ColumnSerializer$CorruptColumnException: invalid column name length 0
>   at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:68)
>   at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:242)
>   ... 28 more
> {quote}

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
