incubator-directmemory-dev mailing list archives

From "Raffaele P. Guidi" <raffaelegu...@apache.org>
Subject Re: Initial roadmap discussion
Date Mon, 10 Oct 2011 13:30:56 GMT
> :-) If we can write the project in a democratic/well-known language it's
> IMHO better for adoption and for growing the community (at least the plugin
> mechanism can have an option to run plugins written in other languages).

Sure it is, and Java is of course the best candidate for this, but (with a
lower priority, of course) offering more idiomatic, language-specific ways to
use DM from Scala (and Clojure) would keep us on the edge for developers
working with emerging technologies.

Ciao,
    R

On Mon, Oct 10, 2011 at 11:21 AM, Olivier Lamy <olamy@apache.org> wrote:

> Hello,
>
> 2011/10/9 Raffaele P. Guidi <raffaeleguidi@apache.org>:
> > Gentlemen, welcome and thank you for joining in (and for the opportunity,
> > for me and the project, to join the ASF, which is great). I wrote some
> > notes about the current state of the project and some hypotheses on future
> > developments which I would like to discuss with you all. These are the
> > items I would like to discuss (and sorry for being a bit lengthy):
> >
> >   - *Design choices*
> >   - *New features*
> >   - *Integration with other products*
> >   - *Build, Test and Continuous integration strategy *
> >   - *Miscellanea*
> >
> > *Design choices*
> > *I recently rewrote DM entirely for simplification. It used to have three
> > layers (heap, off-heap, file/nosql) and to automatically push items
> > forward/backward in the chain according to their usage. It turned out
> > overly complicated and mostly inefficient at runtime (probably mostly
> > because of my poor implementation). The singleton facade is proving simple
> > and effective and reflects well the nature of direct memory - which cannot
> > really be freed. But this needs a strategy for feature and behaviour
> > composability.*
> >
> >   - *Singleton *(largely Play! inspired) *approach *- is it good?
> >   - *Feature and behaviour composability*  (DI and Feature injection? A
> >   plugin system? OSGi?)* - let's just keep things simple and developer
> >   friendly*
> A plugin mechanism with various extensions is IMHO what makes a
> project a success story (see Apache Maven or Jenkins).
> It's always a good idea to give users the possibility to enhance a
> project/tool easily (at least we must provide the necessary tooling for
> that to make life easier :-) ).
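
A minimal sketch of the singleton facade idea described above, assuming a
single direct buffer owned for the lifetime of the JVM and an in-heap index of
pointers into it. All names here (OffHeapCache, Pointer, put, get) are
hypothetical and chosen for illustration; this is not DirectMemory's actual
API, and it deliberately ignores eviction, compaction and serialization.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout; not DirectMemory's real classes.
final class OffHeapCache {
    // Direct memory cannot be reliably handed back to the OS on demand,
    // so a single facade owns one buffer for the whole JVM lifetime.
    private static final OffHeapCache INSTANCE = new OffHeapCache(1024 * 1024);

    // A pointer records where an entry lives inside the direct buffer.
    private static final class Pointer {
        final int offset;
        final int length;

        Pointer(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    private final ByteBuffer buffer;
    private final Map<String, Pointer> index = new ConcurrentHashMap<>();
    private int nextFree = 0;

    private OffHeapCache(int capacityBytes) {
        this.buffer = ByteBuffer.allocateDirect(capacityBytes);
    }

    public static OffHeapCache getInstance() {
        return INSTANCE;
    }

    // Appends the payload to the buffer; returns false when it does not fit.
    // Overwriting a key leaks the old slot: this sketch never reclaims space.
    public synchronized boolean put(String key, byte[] payload) {
        if (nextFree + payload.length > buffer.capacity()) {
            return false;
        }
        ByteBuffer view = buffer.duplicate();
        view.position(nextFree);
        view.put(payload);
        index.put(key, new Pointer(nextFree, payload.length));
        nextFree += payload.length;
        return true;
    }

    // Copies the entry back onto the heap, or returns null if the key is unknown.
    public synchronized byte[] get(String key) {
        Pointer p = index.get(key);
        if (p == null) {
            return null;
        }
        byte[] out = new byte[p.length];
        ByteBuffer view = buffer.duplicate();
        view.position(p.offset);
        view.get(out);
        return out;
    }
}
```

Callers would go through OffHeapCache.getInstance() everywhere, which keeps the
API small; the open question from the list above (how to compose features on
top of such a facade) is left untouched by this sketch.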
> >
> > *New features*
> > Adding simple heap cache features would spread usage among those who think
> > they would EVENTUALLY need a huge off-heap one (I believe that's the vast
> > majority of our potential "customers"). Same thing for file and distributed
> > ones. Having all three would qualify DM as an Enterprise Ready (please
> > notice the capitalization ;) cache.
> >
> >   - *Heap storage* - Guava already fits the requirement, of course. We
> >   could use the heap as a "queue" to speed up inserts and serialize later,
> >   and/or keep the most frequently used items in the heap for speed. It's
> >   more a design choice than a technical one
> >   - *File storage* - this would be easy to achieve with the same "index"
> >   strategy as the off-heap one (I believe JCS does the same)
> >   - *Lateral storage* (distributed or replicated) - A possible way to do
> >   this: *hazelcast* for map distribution and *Apache Thrift* for inter-node
> >   communication (node a needs an item stored in node b and then asks for
> >   it). I'm not sure hazelcast would perform as well as Guava with
> >   multi-million item maps; it has to be thoroughly tested for performance
> >   and memory consumption. Should hazelcast not fit the performance
> >   requirement, we should find an alternative way to distribute/replicate
> >   the map across nodes. *jgroups* with multicasting would be perfect but
> >   it's LGPL (well, JCS uses it) and, of course, a custom, maybe
> >   thrift-based, distribution mechanism could be written ad hoc
> >
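
One way to picture the "keep hot items on the heap" option from the list above
is a small heap tier in front of a slower tier. The sketch below is only an
illustration under stated assumptions: it uses least-recently-used ordering
(via java.util.LinkedHashMap in access order) as a stand-in for "most
frequently used", and a plain HashMap as a stand-in for the off-heap store.
TwoTierCache and its methods are hypothetical names, not part of DirectMemory.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical two-tier cache: a bounded heap tier backed by a slower tier.
final class TwoTierCache {
    // Stand-in for the off-heap store (assumption: a real one would serialize).
    private final Map<String, byte[]> slowTier = new HashMap<>();
    private final LinkedHashMap<String, byte[]> heapTier;

    TwoTierCache(final int heapCapacity) {
        // accessOrder = true makes iteration order least-recently-used first.
        this.heapTier = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                if (size() > heapCapacity) {
                    // Demote the coldest entry instead of dropping it.
                    slowTier.put(eldest.getKey(), eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    public synchronized void put(String key, byte[] value) {
        slowTier.remove(key);      // drop any stale demoted copy
        heapTier.put(key, value);  // may demote the coldest heap entry
    }

    public synchronized byte[] get(String key) {
        byte[] value = heapTier.get(key); // a hit refreshes its recency
        if (value == null) {
            value = slowTier.remove(key);
            if (value != null) {
                heapTier.put(key, value); // promote back to the heap tier
            }
        }
        return value;
    }

    // True when the key currently lives in the heap tier.
    public synchronized boolean inHeap(String key) {
        return heapTier.containsKey(key);
    }
}
```

With a heap capacity of 2, inserting a, b and c demotes a; reading a promotes
it again and demotes b. Whether promotion on read is desirable is exactly the
kind of design choice the message refers to.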
> > *Integration with other products*
> > Providing plugins, integration or just support with/for other
> > technologies/products would of course spread adoption. These are the first
> > few that come to mind at the moment
> >
> >   - *Apache Cayenne integration* - do I need to tell why? ;)
> >   - *Play! Framework integration* - because I simply love play! and use it
> >   in other side projects whenever I need a web/mobile front-end
> >   - *Memcached* (like) *integration* - DirectMemory can be seen as an
> >   embedded memcached, and adopting its protocol would make it a good fit
> >   for replacing memcached in distributed scenarios, most of all when it's
> >   used by java applications
> >   - *Scala, Clojure and other jvm languages* integration - emerging
> >   technologies that deserve attention. If I had 48-hour days I would use
> >   the other 24 to improve my scala skills and rewrite DirectMemory with it
> :-) If we can write the project in a democratic/well-known language
> it's IMHO better for adoption and for growing the community (at least the
> plugin mechanism can have an option to run plugins written in other
> languages).
> >
> >
> >
> > *Miscellanea*
> > There are of course a lot of things that are not essential but could be
> > investigated
> >
> >   - *HugeArrayList, FastMap*, etc... DirectMemory currently uses Guava for
> >   the Map and ArrayList (I know it's not thread safe, but thread safety
> >   may not really be required) for the Pointer index. Evaluating other fast,
> >   low-memory-impact Map and List implementations could possibly bring
> >   performance improvements
> >   - *Reliability improvements* - DirectMemory is fast also because it
> >   sacrifices reliability - is it always a good trade-off? Could we provide
> >   configuration or pluggable implementations for different usage scenarios,
> >   maybe at least for the MemoryManager? Or even transactionality?
> >   - *Would hadoop need off-heap* caching? (this is a good one)
> >
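
The "pluggable implementation for different usage scenarios" question above
could be sketched as an interface with a fast default and an optional
integrity-checking wrapper. This is only an assumption about what such a seam
might look like: the MemoryManager interface below is hypothetical and is not
the signature of DirectMemory's own MemoryManager, and CRC32 checksums are just
one example of a reliability feature that trades speed for safety.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical seam; DirectMemory's real MemoryManager may look different.
interface MemoryManager {
    void store(String key, byte[] payload);

    byte[] retrieve(String key);
}

// Fast path: no copies, no checks.
final class FastMemoryManager implements MemoryManager {
    private final Map<String, byte[]> entries = new HashMap<>();

    @Override
    public void store(String key, byte[] payload) {
        entries.put(key, payload);
    }

    @Override
    public byte[] retrieve(String key) {
        return entries.get(key);
    }
}

// Reliable path: wraps any manager and verifies a checksum on every read.
final class ChecksummedMemoryManager implements MemoryManager {
    private final MemoryManager delegate;
    private final Map<String, Long> checksums = new HashMap<>();

    ChecksummedMemoryManager(MemoryManager delegate) {
        this.delegate = delegate;
    }

    private static long crc(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    @Override
    public void store(String key, byte[] payload) {
        checksums.put(key, crc(payload));
        delegate.store(key, payload);
    }

    @Override
    public byte[] retrieve(String key) {
        byte[] payload = delegate.retrieve(key);
        if (payload == null) {
            return null;
        }
        Long expected = checksums.get(key);
        if (expected == null || expected.longValue() != crc(payload)) {
            throw new IllegalStateException("corrupted entry: " + key);
        }
        return payload;
    }
}
```

A deployment that values speed would use FastMemoryManager directly, while one
that values reliability would wrap it; the choice could come from
configuration rather than code.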
> > *Build, Test and Continuous integration strategy*
> > *The overall point for DM is testing for performance with large quantities
> > of memory - where the minimum should be more than the average 2GB used by
> > web applications - the more the better.*
> >
> >   - *Testing infrastructure* - I currently use an amazon machine with 16+GB
> >   RAM (which costs ~$1 per hour); it is a bit tedious and time consuming to
> >   start up and deploy on (it would require some scripting) and of course
> >   continuous performance testing is too expensive - alternatives?
> >   - *Branching strategy* - I don't like feature branches - I believe
> >   feature composability should not be done at the SCM level (and SVN is
> >   probably a bit too slow for them) - and I don't believe in using just
> >   release branches. I don't know whether there's an apache standard, but I
> >   usually work with *spike* branches (where a spike is more than a single
> >   feature and less than a whole release) and then publish on release
> >   branches, tagging for events (production, distribution, etc). Does it
> >   sound good to you?
> >   - *Binary packaging and demo applications* - I used to provide a binary
> >   distribution and a simple web application to test against, but it simply
> >   was too much effort for me alone
> >   - *OSGi bundling* - it costs very little and can be quite useful
> >   - *Maven repository* - I've applied for a sonatype repo registration but
> >   simply didn't have enough time to complete it, and I'm using a github
> >   folder as a repository. I guess that artifacts would naturally go in
> >   apache repos from now on, right?
> Yes repository.apache.org is synched to central.
> >   - *Testing and certification* over different JVMs and OS (sun, openjdk,
> >   ibm, windows, linux, AIX? Solaris?)
> >
> > *Roadmap*
> > I would say that intensive performance testing and certification would make
> > a solid 0.7 GA release; including heap and file storage would make a pretty
> > good 1.0 (the distributed storage would make it incredible!)
> >
> > Looking forward to your feedback.
> >
> > Cheers,
> >     Raffaele
> >
>
> Sorry, but for some points I haven't yet had the time to look at
> the code :-(
>
> --
> Olivier Lamy
> Talend : http://talend.com
> http://twitter.com/olamy | http://linkedin.com/in/olamy
>
