cassandra-user mailing list archives

From Sunit Randhawa <sunit.randh...@gmail.com>
Subject Storing Counters in Hive
Date Tue, 20 Mar 2012 00:09:18 GMT
I am trying to import a Counters column family (CF) from Cassandra into
Hive. Below is the CREATE TABLE statement I am using in Hive:

DROP TABLE IF EXISTS Counters;
create external table Counters(row_key string, column_name string, value string)
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES ("cassandra.columns.mapping" = ":key,:column,:value",
  "cassandra.ks.name" = "BAMSchema",
  "cassandra.ks.repfactor" = "1",
  "cassandra.ks.strategy" = "org.apache.cassandra.locator.SimpleStrategy",
  "cassandra.cf.name" = "Counters",
  "cassandra.host" = "127.0.0.1",
  "cassandra.port" = "9160",
  "cassandra.partitioner" = "org.apache.cassandra.dht.RandomPartitioner")
TBLPROPERTIES (
  "cassandra.input.split.size" = "64000",
  "cassandra.range.size" = "1000",
  "cassandra.slice.predicate.size" = "1000");

The Counters CF is defined as:

create column family Counters
    with comparator = UTF8Type
    and default_validation_class = CounterColumnType
    and replicate_on_write = true;


I am not able to import the counter values into Hive. The row_key and
column_name columns come through correctly, but the value column does not.


Below is the output from Hive:

hive> select * from Counters;
OK
213_debit_1326690000    1-sess_count    d
213_debit_1326690000    1-total_db_time
213_debit_1326690000    1-total_exec_time
213_debit_1326690000    1-txn_count
213_debit_1326690000    2-sess_count
213_debit_1326690000    2-total_db_time
213_debit_1326690000    2-total_exec_time
213_debit_1326690000    2-txn_count
Time taken: 0.263 seconds


Below is the output from Cassandra:

[default@BAMSchema] list Counters;
Using default limit of 100
-------------------
RowKey: 213_debit_1326690000
=> (counter=1-sess_count, value=100)
=> (counter=1-total_db_time, value=20)
=> (counter=1-total_exec_time, value=30)
=> (counter=1-txn_count, value=1)
=> (counter=2-sess_count, value=30)
=> (counter=2-total_db_time, value=30)
=> (counter=2-total_exec_time, value=30)
=> (counter=2-txn_count, value=1)


As you can see, a junk letter "d" shows up in the value column for the
first row when the data is imported from Cassandra into Hive, and the value
is blank for every other row. What am I missing?
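
One possible reading of the output above, offered as an assumption rather
than a confirmed diagnosis: if the counter value reaches Hive as the raw
8-byte big-endian encoding of a 64-bit integer and is then decoded as a
string, the value 100 (0x64) would show up as the ASCII letter "d", while
20, 30 and 1 would map to non-printable control characters and look empty.
The Python sketch below only illustrates that byte-level effect. The helper
names are hypothetical and it does not use or model the Hive storage handler.

```python
import struct

def counter_bytes_as_text(value: int) -> str:
    """Pack value as an 8-byte big-endian signed long, then decode the bytes as text.

    Assumption: this mimics a 64-bit counter being read as a string column.
    """
    raw = struct.pack(">q", value)
    # latin-1 maps every byte to exactly one character, so nothing is lost here.
    return raw.decode("latin-1")

def printable_part(text: str) -> str:
    """Keep only the characters a terminal would visibly render."""
    return "".join(ch for ch in text if ch.isprintable())

# Counter values taken from the Cassandra CLI listing above.
counters = [
    ("1-sess_count", 100),
    ("1-total_db_time", 20),
    ("1-total_exec_time", 30),
    ("1-txn_count", 1),
]

for name, value in counters:
    shown = printable_part(counter_bytes_as_text(value))
    print(f"{name}\t{shown!r}")
```

Under that assumption, only 1-sess_count would print a visible character,
which is consistent with the Hive output above. If this is what is
happening, the value column probably needs to be read as a 64-bit integer
rather than as a string.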

Thanks for your help!
