hbase-user mailing list archives

From Anand Nalya <anand.na...@gmail.com>
Subject Data Deduplication in HBase
Date Tue, 27 Aug 2013 14:12:35 GMT

I have a use case in which I need to store segments of MP3 files in HBase.
A song may arrive at the application in different overlapping segments. For
example, a 5-minute song can have the following segments: 0-1, 0.5-2, 2-4, 3-5. As
seen, some of the data is duplicate (3-4 is present in the last 2

What would be the ideal way of removing this duplicate storage? Will Snappy
compression help here, or do I need to write some logic on top of HBase? Also,
what if I store a single segment multiple times? Will HBase do some sort of

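A minimal sketch of what application-side deduplication could look like for the example segments, assuming ranges are tracked as (start, end) offsets in minutes. This is plain Python, not an HBase feature; the helper name new_parts and the in-memory list of stored ranges are hypothetical and stand in for whatever range index the application would keep.

```python
def new_parts(stored, segment):
    """Return the sub-ranges of `segment` not covered by any range in `stored`.

    `stored` is a list of (start, end) tuples; `segment` is one (start, end) tuple.
    """
    start, end = segment
    parts = []
    cursor = start  # left edge of the part of `segment` not yet accounted for
    for s, e in sorted(stored):
        if e <= cursor:
            continue  # stored range lies entirely before the cursor
        if s >= end:
            break  # stored range lies entirely after the segment
        if s > cursor:
            parts.append((cursor, s))  # gap before this stored range is new data
        cursor = max(cursor, e)
        if cursor >= end:
            break
    if cursor < end:
        parts.append((cursor, end))  # tail of the segment is new data
    return parts


# Feed the segments from the example in arrival order and keep only new data.
stored = []
for seg in [(0, 1), (0.5, 2), (2, 4), (3, 5)]:
    fresh = new_parts(stored, seg)
    print(seg, "->", fresh)
    stored.extend(fresh)
```

With this approach only the non-overlapping pieces would be written, so the 3-4 portion of the last segment would not be stored a second time.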
