directmemory-dev mailing list archives

From "Johannes.Lichtenberger" <Johannes.Lichtenber...@uni-konstanz.de>
Subject Re: MapDB
Date Wed, 07 Nov 2012 12:00:59 GMT
On 11/07/2012 08:54 AM, Raffaele P. Guidi wrote:
> I think we have more than one reason to integrate and share code:
>
>     1. DirectMemory could make good use of mapdb to serialize least
>     frequently used items to disk and free memory
>     2. DirectMemory could implement a MapDB disk based store in addition to
>     the bytebuffer and unsafe ones
>     3. MapDB could take advantage of DM's componentization approach to
>     support multiple serializers (we believe each one has its advantages in
>     different scenarios)
>     4. MapDB could use DM to write items to an off-heap before writing to
>     disk (asynchronously) to improve speed
>     5. We could merge our serialization efforts (I believe lightning is very
>     fast and worth to be considered) and provide an even better solution or two
>     alternative implementations
>
> In both cases we would be open to contribution in different forms - just
> contributing patches or with you to join us and the ASF as module or
> subproject (the latter options have to undergo a formal vote by all project
> members, of course) as I strongly believe that merging efforts would bring
> to a better and more complete product.

Hi Raffaele,

I believe I'm in a similar position. "My" storage system[1] (I'd call it
a versioned XML DBS) was forked about 1.5 years ago from a university
project that I've worked on during HiWi jobs, my Bachelor's and Master's
projects/theses, and in my spare time. I assume it is bottlenecked on
I/O, although I've never found the time to really measure performance.
I've had some discussions with Christoph on a German Java message board,
and he suggested something similar (that Sirix could be an on-disk store
for DM), because I was thinking about writing a proposal for a new
Apache top-level project. For that, however, I would have to ask my
former mentor, a few students and the initiator of the project from
which it's forked. At the moment I feel seriously let down every once in
a while, because the project started in 2006 (I think, or even late
2005? hm) with some "gaps" in contribution, and except for students like
me no one has ever used it. It has been open source for about 2 years
now, and my fork for about half a year. At least I'm convinced that most
of the source code is really well documented and adheres (hopefully) to
Josh's items in Effective Java in most aspects ;-)

However, regarding DM -- I'm not sure when to use off-heap caches.
I'm currently using Google Guava caches for reading variable-length
pages from disk and a simple in-memory LRU cache (LinkedHashMap) with a
BerkeleyDB overflow mechanism for a transaction log. I'm thinking about
adding a memory-mapped file store (as an alternative to the usual simple
append-only RandomAccessFile), maybe via Chronicle (at least
memory-mapped files seem to be hyped everywhere ;-)).
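
The LinkedHashMap-based LRU cache with an overflow tier mentioned above
could look roughly like the sketch below. This is an illustration only,
not Sirix's actual code: the class name OverflowLruCache and its methods
are hypothetical, and a plain Map stands in for the BerkeleyDB store.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: access-ordered LRU cache that spills evicted entries to a second tier. */
final class OverflowLruCache<K, V> {
    // Assumption: any Map can act as the overflow tier; a real setup
    // would put a disk-backed store (e.g. BerkeleyDB) behind it.
    private final Map<K, V> overflow;
    private final LinkedHashMap<K, V> lru;

    OverflowLruCache(final int capacity) {
        this(capacity, new HashMap<K, V>());
    }

    OverflowLruCache(final int capacity, final Map<K, V> overflow) {
        this.overflow = overflow;
        // accessOrder = true keeps the least recently accessed entry first.
        this.lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > capacity) {
                    // Spill the eldest entry instead of dropping it.
                    OverflowLruCache.this.overflow.put(eldest.getKey(), eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    void put(K key, V value) {
        overflow.remove(key); // avoid keeping a stale copy in the overflow tier
        lru.put(key, value);
    }

    V get(K key) {
        V value = lru.get(key);
        if (value == null) {
            value = overflow.remove(key);
            if (value != null) {
                lru.put(key, value); // promote back into memory
            }
        }
        return value;
    }

    /** True if the key is currently held in the in-memory tier. */
    boolean inMemory(K key) {
        return lru.containsKey(key);
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a" and then putting
"c" moves "b" to the overflow tier; a later get("b") brings it back into
memory and spills the then-eldest entry instead.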

Well, my ultimate vision would be rather different from DM: a versioned,
secure XML/JSON (tree-based) DBS running in the cloud (maybe via Scala
(Akka)) to provide the basis for data mining tasks on temporal
tree structures and to provide encryption techniques for temporal data,
such that different user groups/roles can access different subtrees for
specified versions. Plus the usual XQuery/XQuery Update Facility stuff
(through brackit(.org)) and temporal XPath axes...

But well, I think it currently doesn't add anything for anyone, or at
least it seems so. That's understandable, as other XML database systems
have index structures, on which I'm currently working together with
sophisticated rewriting rules -- however, I think most of them do not
update the indexes automatically during modifications. Thus I'm
seriously thinking about moving on to other projects (which I have to do
anyhow, because I have to find a job ;-), even though I can currently
hardly think of other interesting stuff ;-)). But at least I would have
some spare time (hopefully) -- and it seems that in some companies one
can contribute one full day to open source projects, but this doesn't
seem to be the rule ;-)

Well, sorry, hopefully it doesn't sound too foolish, but every once in a
while it's a bit frustrating to have contributed so many hours to an
open source project no one uses.

kind regards,
Johannes

[1] https://github.com/JohannesLichtenberger/sirix
