pulsar-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [pulsar] candlerb opened a new issue #5394: Add ability to truncate raw partition after compaction
Date Wed, 16 Oct 2019 16:19:15 GMT
candlerb opened a new issue #5394: Add ability to truncate raw partition after compaction
URL: https://github.com/apache/pulsar/issues/5394
   **Is your feature request related to a problem? Please describe.**
   When using an event stream as a "table", Create/Update/Delete operations are modelled as posting
new key/value pairs.  The stream needs an infinite retention policy, because some keys may
never get new values.
   Pulsar already has a Compaction feature, which converts a raw topic into a compacted topic,
containing only the latest value for each key.  Clients can choose to read the compacted stream.
 However, the raw topic remains on disk indefinitely.
   This means that the raw topic can grow without bound, even for a table of limited size,
because the entire history of updates is kept indefinitely.
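   To illustrate the problem, here is a minimal in-memory sketch in Python. It is not Pulsar code: the `compact` function and the `(key, value)` tuples are hypothetical, and treating a `None` value as a delete is an assumption made for the example.

```python
# Hypothetical in-memory model of a keyed event stream; not Pulsar's API.
def compact(raw_log):
    """Return only the latest value for each key.

    Assumption for this sketch: a value of None marks a delete.
    """
    latest = {}
    for key, value in raw_log:
        # Drop any older entry first so the result is ordered by latest write.
        latest.pop(key, None)
        if value is not None:
            latest[key] = value
    return list(latest.items())


# A "table" with only three keys that is updated 1000 times.
raw_log = [(f"user-{i % 3}", f"v{i}") for i in range(1000)]
# Delete one key by posting an empty value for it.
raw_log.append(("user-1", None))

compacted = compact(raw_log)

# The raw log holds every update ever posted; the compacted view holds
# at most one entry per live key.
print("raw entries:", len(raw_log))
print("compacted entries:", len(compacted))
```

   The compacted view stays bounded by the number of live keys, while the raw log keeps growing with every update.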
   **Describe the solution you'd like**
   Add the ability to truncate the raw topic data after compaction.  This might mean rotating
the compacted ledger into the place of the original raw ledger.
   Since compaction is done on demand, this rotation can be done on demand too.  It could
be an attribute of the compaction request.
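   A rough sketch of the requested behaviour, again as a plain Python model rather than Pulsar code. The `Topic` class and the `truncate_raw` flag are hypothetical names used only for illustration; they do not correspond to any existing Pulsar option.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical model of a topic with a raw ledger and a compacted ledger."""
    raw: list = field(default_factory=list)
    compacted: list = field(default_factory=list)

    def publish(self, key, value):
        self.raw.append((key, value))

    def compact(self, truncate_raw=False):
        """Build the compacted ledger on demand.

        truncate_raw is the hypothetical attribute of the compaction request:
        when set, the compacted ledger is rotated into the place of the raw one.
        Assumption for this sketch: a value of None marks a delete.
        """
        latest = {}
        for key, value in self.raw:
            latest.pop(key, None)
            if value is not None:
                latest[key] = value
        self.compacted = list(latest.items())
        if truncate_raw:
            # The update history is discarded; only the latest values remain.
            self.raw = list(self.compacted)


topic = Topic()
for i in range(100):
    topic.publish(f"k{i % 2}", i)

topic.compact()                   # current behaviour: raw data is retained
size_without_truncation = len(topic.raw)

topic.compact(truncate_raw=True)  # requested behaviour: raw data is rotated out
size_with_truncation = len(topic.raw)

print(size_without_truncation, size_with_truncation)
```

   In this model, later updates are appended to the truncated raw ledger as usual, so a subsequent compaction still yields the latest value per key.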
   (Aside: I don't fully understand what happens when compaction is run on an already-compacted
topic; and I don't know what happens if a client asks to read the compacted version of a topic
which has not yet been compacted)
   **Describe alternatives you've considered**
   As far as I can see, the only other option is to spool all the data out of the compacted
topic into a fresh topic.
   **Additional context**
   This would bring feature parity with Kafka's [log compaction](https://kafka.apache.org/documentation/#compaction).
   Possibly related to #2736

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

With regards,
Apache Git Services
