asterixdb-dev mailing list archives

From Ildar Absalyamov <ildar.absalya...@gmail.com>
Subject LSM storage refactoring
Date Tue, 19 Sep 2017 05:16:26 GMT
Hi Devs,

In line with the earlier major structural refactorings of storage/index-related code [1], I would
like to propose the next step in this cleanup [2].
The main problem I tried to solve with this patch is that the code responsible for the LSM disk/memory
component lifecycle (creation, destruction, bulkloading, etc.) is smeared across factory methods
in the corresponding index implementations, and much of it is duplicated between the various types
of index components (bTrees, externalBTrees, externalBTreesWithBuddyBTree, rTrees, antimatterRTrees,
invertedIndexes, etc.). Moreover, all these different disk/memory component implementations
have a lot in common in how they manage the lifecycle of their parts (main indexes, bloom
filters, buddyBTrees/deletedKeysBTrees).

This change removes much of the boilerplate from the LSM component-handling code and relies on a more
object-oriented design to bring the logic for each element of a component into
one place.
It also introduces a composable way of assembling bulkload pipelines: you can create
a chain of operators, each responsible for bulkloading one piece of a component, and easily extend
this pipeline with additional operations (calculating stats, inferring schema, etc.).
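
To give a rough idea of what "a chain of operators" means here, below is a minimal Java sketch.
It is not the code from the patch: the names BulkLoadOperator, IndexLoadOperator,
TupleCountOperator and ChainedBulkLoader are hypothetical, and tuples are plain strings to
keep the example self-contained. It only shows the composition idea: every tuple is handed to
each stage in order, so an extra stage (e.g. stats collection) can be appended without touching
the others.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface (not from the patch): one stage of a bulkload pipeline.
interface BulkLoadOperator {
    void add(String tuple); // consume one tuple
    void end();             // finish loading
}

// Hypothetical stage standing in for bulkloading the main index of a component.
final class IndexLoadOperator implements BulkLoadOperator {
    final List<String> loaded = new ArrayList<>();
    boolean finished;

    @Override
    public void add(String tuple) {
        loaded.add(tuple);
    }

    @Override
    public void end() {
        finished = true;
    }
}

// Hypothetical extra stage, e.g. a very simple statistics collector.
final class TupleCountOperator implements BulkLoadOperator {
    long count;

    @Override
    public void add(String tuple) {
        count++;
    }

    @Override
    public void end() {
        // nothing to flush in this sketch
    }
}

// Chains stages together: each tuple is passed to every stage in insertion order.
final class ChainedBulkLoader implements BulkLoadOperator {
    private final List<BulkLoadOperator> chain = new ArrayList<>();

    ChainedBulkLoader with(BulkLoadOperator op) {
        chain.add(op);
        return this;
    }

    @Override
    public void add(String tuple) {
        for (BulkLoadOperator op : chain) {
            op.add(tuple);
        }
    }

    @Override
    public void end() {
        for (BulkLoadOperator op : chain) {
            op.end();
        }
    }
}

final class BulkLoadPipelineDemo {
    public static void main(String[] args) {
        IndexLoadOperator index = new IndexLoadOperator();
        TupleCountOperator counter = new TupleCountOperator();
        // Extending the pipeline is just one more with(...) call.
        ChainedBulkLoader loader = new ChainedBulkLoader().with(index).with(counter);
        loader.add("k1");
        loader.add("k2");
        loader.end();
        System.out.println(index.loaded.size() + " tuples loaded, " + counter.count + " counted");
    }
}
```

In this sketch the chain itself implements the same interface as its stages, so a whole pipeline
can be nested inside another one; whether the patch does the same is something to check in [2].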

If you are interested, or have an opinion on how this part of the codebase should be structured
(or if it will break all your code in a private branch ;)), please have a look at [2].

[1] https://asterix-gerrit.ics.uci.edu/#/c/1728/
[2] https://asterix-gerrit.ics.uci.edu/#/c/2014/
Best regards,
Ildar

