hbase-user mailing list archives

From Emre Colak <cole...@gmail.com>
Subject Re: Cells do not get cleared after TTL is set in HBase
Date Wed, 14 Oct 2015 04:13:15 GMT
Yes, I'm trying to use the per-cell TTL feature. I've tried releases 1.0.2
and 1.1.2.

Here's some Scala code that I've written:
===============================

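import org.apache.hadoop.hbase.{CellScanner, CellUtil, TableName}
import org.apache.hadoop.hbase.client.{Get, Put, Table}
import org.apache.hadoop.hbase.util.Bytes
import scala.collection.JavaConversions._ // implicit Scala-to-Java list conversion for table.put(putList)
import scala.collection.mutable.MutableList

// Build a Put that writes a single cell.
def makePut(rowKey: Array[Byte], cf: Array[Byte], qual: Array[Byte],
            value: Array[Byte]): Put = {
  val put = new Put(rowKey)
  put.addColumn(cf, qual, value)
  put
}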

// Read one row and return (qualifier, value, timestamp) for each cell in cfName.
def getIndex(table: Table, indexName: Array[Byte],
             cfName: Array[Byte]): Seq[(String, Array[Byte], Long)] = {
  val result = MutableList[(String, Array[Byte], Long)]()

  val queryResult = table.get(new Get(indexName))
  val cellScanner: CellScanner = queryResult.cellScanner()
  while (cellScanner.advance()) {
    val cell = cellScanner.current()

    if (CellUtil.matchingFamily(cell, cfName)) {
      val tuple = (
        Bytes.toStringBinary(cell.getQualifierArray, cell.getQualifierOffset,
                             cell.getQualifierLength),
        Bytes.copy(cell.getValueArray, cell.getValueOffset, cell.getValueLength),
        cell.getTimestamp)
      result += tuple
    }
  }

  result
}

def printIndices(table: Table, indexName: Array[Byte],
                 cfName: Array[Byte]): Unit = {
  getIndex(table, indexName, cfName).foreach {
    case (q, v, ts) =>
      println("key: %s, value: %s, ts: %d".format(q, v.mkString(","), ts))
  }
}

// Establish connection

println("Inserting indices into the database")
val table = connection.getTable(TableName.valueOf(tableName))
table.put(makePut(rowKeyBytes, cfBytes, Bytes.toBytes("idx1"),
  Array[Byte](0,0,0,0,1)))
table.put(makePut(rowKeyBytes, cfBytes, Bytes.toBytes("idx2"),
  Array[Byte](0,0,0,1,0)))
table.put(makePut(rowKeyBytes, cfBytes, Bytes.toBytes("idx3"),
  Array[Byte](0,0,1,0,0)))

println("Indices in the database: ")
val putList = MutableList[Put]()
getIndex(table, rowKeyBytes, cfBytes).foreach {
  case (q, v, ts) =>
    println("key: %s, value: %s, ts: %d".format(q, v.mkString(","), ts))

    // Rewrite the existing cell with a per-cell TTL.
    val put = makePut(rowKeyBytes, cfBytes, Bytes.toBytes(q), v)
    put.setTTL(30000) // 30 second TTL
    putList += put

    // Merged cell, written without a TTL (queued once per existing cell).
    putList += makePut(rowKeyBytes, cfBytes, Bytes.toBytes("idxMerged"),
      Array[Byte](0,0,1,1,1))
}

println("Merging existing cells and setting TTLs")
table.put(putList)

println("Table contents right after the merge: ")
printIndices(table, rowKeyBytes, cfBytes)

Thread.sleep(10000)

println("Table contents 10 seconds after the merge: ")
printIndices(table, rowKeyBytes, cfBytes)

Thread.sleep(30000)

println("Table contents 40 seconds after the merge: ")
printIndices(table, rowKeyBytes, cfBytes)

// close table and connection

And here's what it prints out:
=========================

Inserting indices into the database
Indices in the database:
key: idx1, value: 0,0,0,0,1, ts: 1444791952201
key: idx2, value: 0,0,0,1,0, ts: 1444791952214
key: idx3, value: 0,0,1,0,0, ts: 1444791952218
Merging existing cells and setting TTLs
Table contents right after the merge:
key: idxMerged, value: 0,0,1,1,1, ts: 1444791952341
key: idx1, value: 0,0,0,0,1, ts: 1444791952341
key: idx2, value: 0,0,0,1,0, ts: 1444791952341
key: idx3, value: 0,0,1,0,0, ts: 1444791952341
Table contents 10 seconds after the merge:
key: idxMerged, value: 0,0,1,1,1, ts: 1444791952341
key: idx1, value: 0,0,0,0,1, ts: 1444791952341
key: idx2, value: 0,0,0,1,0, ts: 1444791952341
key: idx3, value: 0,0,1,0,0, ts: 1444791952341
Table contents 40 seconds after the merge:
key: idxMerged, value: 0,0,1,1,1, ts: 1444791952341
key: idx1, value: 0,0,0,0,1, ts: 1444791952201
key: idx2, value: 0,0,0,1,0, ts: 1444791952214
key: idx3, value: 0,0,1,0,0, ts: 1444791952218
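
To make the pattern in this output concrete, here is a toy model in plain
Scala. It does not use the HBase API, and the `Version` class and `visible`
function are hypothetical names made up for illustration. It assumes, without
confirmation, that the cell from the first put is still stored underneath the
TTL'd rewrite and becomes readable again once the rewrite expires. Under that
assumption the timestamp for idx1 reverts in the same way as printed above:

```scala
// Toy model only: plain Scala, no HBase classes involved.
// ttlMs = None means the cell version never expires.
case class Version(ts: Long, value: String, ttlMs: Option[Long])

// Return the newest version that has not expired at time `now`,
// mimicking a read that asks for a single version.
def visible(versions: Seq[Version], now: Long): Option[Version] =
  versions
    .filter(v => v.ttlMs.forall(ttl => now < v.ts + ttl))
    .sortBy(v => -v.ts)
    .headOption

// Timestamps taken from the idx1 lines in the output above.
val idx1 = Seq(
  Version(1444791952201L, "0,0,0,0,1", None),        // first put, no TTL
  Version(1444791952341L, "0,0,0,0,1", Some(30000L)) // second put, 30 s TTL
)

val mergeTs = 1444791952341L
println(visible(idx1, mergeTs + 10000L).map(_.ts)) // 10 s after the merge
println(visible(idx1, mergeTs + 40000L).map(_.ts)) // 40 s after the merge
```

In this model the read at 10 seconds returns the rewritten version and the
read at 40 seconds falls back to the original one. Whether HBase really keeps
the older version around in this situation is exactly the open question.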


On Tue, Oct 13, 2015 at 8:25 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Looks like you are using per cell TTL feature.
>
> Which hbase release are you using ?
>
> Can you formulate your description with either sequence of shell commands
> or a unit test ?
>
> Thanks
>
> On Tue, Oct 13, 2015 at 8:13 PM, Colak, Emre <emre.colak@bina.roche.com>
> wrote:
>
> > Hi,
> >
> > I have an HBase table with the following description:
> >
> > {NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE',
> > MIN_VERSIONS => '0' , TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE',
> > BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
> >
> > I put some values in it and then set TTL (30s) on those values with another
> > put operation. First thing I notice is that the timestamps of the cells get
> > updated after the 2nd put. And 30 seconds later, when I do a scan on the
> > table, I still see those cells in the table, however this time with their
> > timestamps updated to the original timestamps.
> >
> > I understand that these cells won't necessarily be deleted until a
> > compaction, but why do they still come up in my scan even though the TTL
> > that I set on them has expired?
> >
> > Best,
> >
> > Emre
> >
>
