hbase-user mailing list archives

From "Hansi Klose" <hansi.kl...@web.de>
Subject double keys major_compaction
Date Thu, 01 Oct 2015 08:17:56 GMT

We have the problem that some keys in our cluster exist twice.
The two copies of each key have different timestamps.

I noticed these keys because we replicate the data to another cluster,
and in the target cluster we see only the keys with the newer timestamp.

We run major compactions on a regular basis in both clusters.

The table has VERSIONS => '1'

get 't1', "\x98\x04......", {COLUMN => 'd', VERSIONS => 5 }
timestamp=1442848394860, value=@\x83

get 't1', "\x98\x04......", {COLUMN => 'd', VERSIONS => 5, TIMESTAMP => 1442569821452 }
timestamp=1442569821452, value=@\x83

I thought that after running

flush 't1'
major_compact 't1'

the version with the old timestamp would be removed, because the table has VERSIONS => 1.

"In a major compaction, deleted key/values are removed, this new file 
doesn’t contain the tombstone markers and all the duplicate key/values
(replace value operations) are removed."
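
To make the expectation concrete, here is a minimal sketch in Python. It is not HBase code: the function name, the tuple layout and the placeholder row key "row" are made up for illustration, and it models only the version limit (no TTL, no delete markers).

```python
from collections import defaultdict


def expected_after_major_compaction(cells, max_versions=1):
    """Toy model: keep only the newest `max_versions` cells per (row, column).

    `cells` is a list of (row, column, timestamp, value) tuples. This is an
    illustration of the expected outcome, not how HBase implements it.
    """
    by_key = defaultdict(list)
    for row, column, timestamp, value in cells:
        by_key[(row, column)].append((timestamp, value))

    kept = []
    for (row, column), versions in sorted(by_key.items()):
        # Newest timestamp first, then cut off at the version limit.
        versions.sort(key=lambda tv: tv[0], reverse=True)
        for timestamp, value in versions[:max_versions]:
            kept.append((row, column, timestamp, value))
    return kept


# The two cells returned by the gets above. "row" is a placeholder,
# because the real row key is abbreviated in this mail.
cells = [
    ("row", "d", 1442569821452, b"@\x83"),
    ("row", "d", 1442848394860, b"@\x83"),
]

print(expected_after_major_compaction(cells))
```

With max_versions=1 only the cell with timestamp 1442848394860 is left, which is what we see in the target cluster.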

But this does not happen.

After the flush and the major compaction of the table, the keys with the old timestamp are still there.

We use hbase: 0.94.2-cdh4.2.0, rUnknown

Why are they still there? Do I have to delete them manually?

