cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-10314) Update index file format
Date Mon, 14 Sep 2015 15:26:45 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-10314:
---------------------------------------
    Reviewer: Ariel Weisberg

> Update index file format
> ------------------------
>
>                 Key: CASSANDRA-10314
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10314
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Robert Stupp
>            Assignee: Robert Stupp
>             Fix For: 3.0.0 rc1
>
>
> As CASSANDRA-9738 may not make it into 3.0rc1, but having an off-heap key-cache is still
a goal, we should change the index file format to meet off-heap requirements (so I've set
fixver to 3.0rc1).
> Off-heap (and mmap'd index files) need the offsets of the individual IndexInfo objects
and at least the offset field of the IndexInfo structures.
> The format I propose is as follows:
> {noformat}
>  (long) position (vint since 3.0, 64bit before)
>   (int) serialized size of data that follows (vint since 3.0, 32bit before)
>  -- following for indexed entries only (so serialized size > 0)
>  (long) header-length (vint since 3.0)
>   (int) DeletionTime.localDeletionTime (32 bit int)
>  (long) DeletionTime.markedForDeletionAt (64 bit long)
>   (int) number of IndexInfo objects (vint since 3.0, 32bit before)
>     (*) serialized IndexInfo objects, see below
>     (*) offsets of serialized IndexInfo objects, since version "ma" (3.0)
>         Each IndexInfo object's offset is relative to the first IndexInfo object.
> {noformat}
> {noformat}
>     (*) IndexInfo.firstName (ClusteringPrefix serializer, either Clustering.serializer.serialize or Slice.Bound.serializer.serialize)
>     (*) IndexInfo.lastName (ClusteringPrefix serializer, either Clustering.serializer.serialize or Slice.Bound.serializer.serialize)
>  (long) IndexInfo.offset (vint encoded since 3.0, 64bit int before)
>  (long) IndexInfo.width (vint encoded since 3.0, 64bit int before)
>  (bool) IndexInfo.endOpenMarker != null              (if 3.0)
>   (int) IndexInfo.endOpenMarker.localDeletionTime    (if 3.0 && IndexInfo.endOpenMarker != null)
>  (long) IndexInfo.endOpenMarker.markedForDeletionAt  (if 3.0 && IndexInfo.endOpenMarker != null)
> {noformat}
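
To illustrate why the trailing offsets table helps off-heap and mmap'd access, here is a minimal sketch, not the actual Cassandra serializer. It treats each IndexInfo as an opaque byte blob, assumes the offsets are stored as fixed 32-bit ints (the description above does not state their encoding), and passes the entry count in as a parameter instead of reading it from the header. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class IndexInfoOffsetsSketch {

    // Writes all entries back to back, followed by one 32-bit offset per entry.
    // Each offset is relative to the start of the first entry (position 0 here).
    // Assumption: fixed-width int offsets; the real encoding may differ.
    public static ByteBuffer serialize(byte[][] entries) {
        int payload = 0;
        for (byte[] e : entries) {
            payload += e.length;
        }
        ByteBuffer out = ByteBuffer.allocate(payload + 4 * entries.length);
        int[] offsets = new int[entries.length];
        int pos = 0;
        for (int i = 0; i < entries.length; i++) {
            offsets[i] = pos;
            out.put(entries[i]);
            pos += entries[i].length;
        }
        for (int off : offsets) {
            out.putInt(off);
        }
        out.flip();
        return out;
    }

    // Fetches entry i using the offsets table only; no other entry is read.
    public static byte[] entryAt(ByteBuffer buf, int count, int i) {
        int tableStart = buf.limit() - 4 * count;
        int start = buf.getInt(tableStart + 4 * i);
        // The last entry ends where the offsets table begins.
        int end = (i + 1 < count) ? buf.getInt(tableStart + 4 * (i + 1)) : tableStart;
        byte[] result = new byte[end - start];
        ByteBuffer view = buf.duplicate(); // leave the caller's position untouched
        view.position(start);
        view.get(result);
        return result;
    }

    public static void main(String[] args) {
        byte[][] entries = {
            "a".getBytes(StandardCharsets.UTF_8),
            "bcd".getBytes(StandardCharsets.UTF_8),
            "ef".getBytes(StandardCharsets.UTF_8)
        };
        ByteBuffer buf = serialize(entries);
        // prints "bcd"
        System.out.println(new String(entryAt(buf, 3, 1), StandardCharsets.UTF_8));
    }
}
```

With such a table, a reader could binary search the IndexInfo entries directly in the mapped buffer instead of deserializing all of them onto the heap first.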
> Regarding the {{IndexInfo.offset}} and {{.width}} fields there are two options:
> * Serialize both of them, or
> * Serialize only the offset field plus a _last byte offset_ to be able to recalculate
the width of the last IndexInfo
> The first option is probably the simpler one; the second saves a few bytes (those of
the vint encoded width).
> EDIT: update vint fields (as per CASSANDRA-10232)
> EDIT2: add header-length fields (as per CASSANDRA-10232)
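
For the second option, the widths could be recomputed as in the sketch below. It assumes that the blocks described by consecutive IndexInfo entries are contiguous, so that each width is the distance to the next offset, and that the _last byte offset_ is an exclusive end position; neither is spelled out in the description. The class and method names are hypothetical.

```java
import java.util.Arrays;

public class IndexInfoWidthSketch {

    // Recomputes every width from the offsets alone.
    // Assumption: block i ends exactly where block i + 1 starts, and
    // endOffset is the exclusive end of the last block.
    public static long[] widths(long[] offsets, long endOffset) {
        long[] result = new long[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            long next = (i + 1 < offsets.length) ? offsets[i + 1] : endOffset;
            result[i] = next - offsets[i];
        }
        return result;
    }

    public static void main(String[] args) {
        long[] offsets = {0L, 65536L, 131072L};
        // prints "[65536, 65536, 18928]"
        System.out.println(Arrays.toString(widths(offsets, 150000L)));
    }
}
```

The trade-off is that a reader needs the neighbouring offset, or the end offset for the last entry, to learn a width, whereas the first option keeps each IndexInfo self-describing.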



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
