hbase-user mailing list archives

From onmstester onmstester <onmstes...@zoho.com>
Subject Fwd: Migrating from Apache Cassandra to Hbase
Date Tue, 11 Sep 2018 03:57:52 GMT
Any idea?

Sent using Zoho Mail

============ Forwarded message ============
From: onmstester onmstester <onmstester@zoho.com>
To: "user" <user@hbase.apache.org>
Date: Sat, 08 Sep 2018 10:46:25 +0430
Subject: Migrating from Apache Cassandra to Hbase
============ Forwarded message ============

Hi,

Currently I'm using Apache Cassandra as the backend for my RESTful application. With a cluster of 30 nodes (each with 12 cores, 64 GB RAM and a 6 TB disk, of which 50% is used), write and read throughput is more than satisfactory for us.

The input is a fixed set of long and int columns, and we need to query it by every column. With 8 columns, that means 8 tables, following Cassandra's query-driven data modeling recommendation. The Cassandra keyspace schema would be something like this:

Table 1 (timebucket, col1, ..., col8, primary key (timebucket, col1)), to handle select * from input where timebucket = X and col1 = Y
....
Table 8 (timebucket, col1, ..., col8, primary key (timebucket, col8))

So for each input row there would be 8 inserts in Cassandra (not considering RF), and with a TTL of 12 months the production cluster should keep about 2 petabytes of data. With the recommended node density for a Cassandra cluster (2 TB per node), I need a cluster of more than 1000 nodes, which I cannot afford.

So, long story short: I'm looking for an alternative to Apache Cassandra for this application. How would HBase solve these problems?

1. The 8X data redundancy caused by the required queries.
2. Nodes with large data density (30 TB of data on each node if No. 1 cannot be solved in HBase): how would HBase handle compaction and node join/remove problems when only 5 * 6 TB 7200 RPM SATA disks are available on each node? How much free space does HBase need for temporary compaction files?
3. I have also read in some documents (including DataStax's) that HBase is more of an offline, data-lake backend that is better not used as a web application backend needing response times under a few seconds.

Thanks in advance
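
The sizing argument in the message can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: decimal units (1 PB = 1000 TB), the 2 PB figure taken as the total to be stored, and a hypothetical helper named nodes_needed that is not part of Cassandra or HBase.

```python
import math

TB_PER_PB = 1000  # assumption: decimal units, 1 PB = 1000 TB


def nodes_needed(total_tb, tb_per_node):
    """Smallest whole number of nodes that can hold total_tb at the given density."""
    return math.ceil(total_tb / tb_per_node)


# Figures taken from the message.
total_tb = 2 * TB_PER_PB        # about 2 PB kept under a 12-month TTL
cassandra_density_tb = 2        # recommended Cassandra density per node
dense_node_tb = 5 * 6           # five 6 TB SATA disks per node = 30 TB raw

cassandra_nodes = nodes_needed(total_tb, cassandra_density_tb)
dense_nodes = nodes_needed(total_tb, dense_node_tb)

# Hypothetical case: the 8X query-driven duplication is removed entirely.
deduplicated_tb = total_tb / 8
deduplicated_nodes = nodes_needed(deduplicated_tb, cassandra_density_tb)

print(cassandra_nodes, dense_nodes, deduplicated_tb, deduplicated_nodes)
```

The dense-node figure uses raw disk capacity with no headroom for compaction or replication, so it is a lower bound rather than a recommendation.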