directmemory-dev mailing list archives

From Christoph Engelbert <noctar...@apache.org>
Subject Re: MapDB
Date Wed, 07 Nov 2012 21:15:12 GMT
On 07.11.2012 22:02, Jan Kotek wrote:
> > As long as all get down to base types (no matter in what
> > hierarchy layer) it'll work out of the box.
>
> I think the basic problem is what a 'base type' is. It is very
> space-inefficient to treat HashMap or Date as a POJO.
> For MapDB serialization, HashMap, Date (and many other common data
> types) are base types.
>

PS: That means you can define your own "base types" by providing
special Marshaller implementations for custom types.
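
A minimal sketch of that idea, assuming a simplified marshaller contract.
The SketchMarshaller interface, DateMarshaller and MarshallerDemo below are
hypothetical names made up for illustration and are not Lightning's actual
API. The sketch only shows how a custom type such as Date could be written
as a single long instead of being walked field by field like a POJO.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Date;

// Hypothetical interface for illustration only; NOT Lightning's Marshaller API.
interface SketchMarshaller<T> {
    void marshall(T value, DataOutput out) throws IOException;

    T unmarshall(DataInput in) throws IOException;
}

// Treats java.util.Date as a "base type": one long (8 bytes) on the wire.
final class DateMarshaller implements SketchMarshaller<Date> {
    @Override
    public void marshall(Date value, DataOutput out) throws IOException {
        out.writeLong(value.getTime());
    }

    @Override
    public Date unmarshall(DataInput in) throws IOException {
        return new Date(in.readLong());
    }
}

// Small helper that runs the marshaller against in-memory streams.
final class MarshallerDemo {
    static byte[] toBytes(Date date) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            new DateMarshaller().marshall(date, new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Date fromBytes(byte[] bytes) {
        try {
            DataInput in = new DataInputStream(new ByteArrayInputStream(bytes));
            return new DateMarshaller().unmarshall(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

How such a marshaller would be registered with the serializer is left out
here, since that depends on the real API.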

> Jan
>
> On 07/11/12 20:40, Christoph Engelbert wrote:
>> On 07.11.2012 21:30, Jan Kotek wrote:
>>> Hi Roman,
>>>
>>> your patch saved me a lot of headaches and was very welcome.
>>>
>>> I am not using Quickser because serialization in MapDB is still
>>> evolving rapidly. For example, I did some refactoring to make
>>> serialization less dependent on other JDBM classes. I also plan
>>> to use some stuff from Kryo and Lightning (unsafe ops, bytecode
>>> generators). And Quickser has not had many updates since it was
>>> forked.
>>>
>>> Obviously it is more comfortable for me if the serialization
>>> framework stays inside MapDB. It is a critical part, since in a
>>> database we care about long-term persistence. On the other hand, I
>>> would love it if somebody took over this part. I have no problem
>>> with extracting serialization into a separate project, but I need
>>> to see that this fork is active and can evolve on its own.
>>>
>>> I hoped I could use Lightning, Kryo or another framework developed
>>> as part of DirectMemory. But there seems to be a conceptual
>>> difference. Kryo and Lightning seem to be more like a
>>> 'serialization framework': each has a bunch of serializers (for
>>> numbers, dates...) and you choose the one which suits you best.
>>>
>>> But MapDB should 'just work' without additional configuration. So
>>> I need universal serialization; it should turn any object into
>>> bytes (similar to Java Serialization or XStream). I also want it
>>> to mimic standard Java Serialization (Serializable marker
>>> interface, Externalizable, writeExternal methods, etc.).
>>>
>> OK, now I have time to get into this discussion :-) First I need to
>> say, it's nice to see that Lightning got some attention. It's always
>> nice to see one of your babies grow up.
>>
>> Lightning is, in general, a nearly complete serializer. You can
>> serialize a lot of classes by just telling it to take all
>> "attributes" in a class and serialize them. As long as all get down
>> to base types (no matter in what hierarchy layer) it'll work out of
>> the box.
>> When the serializer is initialized, all dependent classes are
>> analysed and the bytecode marshallers are generated (or at least one
>> of the base marshallers is used).
>>
>> There is no need for Externalizable or Serializable (but both can be
>> serialized), and there's another Lightning-internal interface,
>> Streamed, which serves the same purpose as Externalizable.
>>
>>> So for now I will investigate whether I can patch Lightning to
>>> support my needs. If not, I will take the parts I like and
>>> integrate them into MapDB.
>>>
>> I'd love to see some help and will give backup in the investigation.
>>
>>> Jan
>>>
>>> On 07/11/12 09:30, Roman Levenstein wrote:
>>>> Hi,
>>>>
>>>> I'm one of the contributors to the JDBM3 serialization
>>>> implementation. Earlier this year I made it much faster than
>>>> before (2 orders of magnitude). And BTW, I'm also a contributor
>>>> to Kryo and protostuff-runtime.
>>>>
>>>> I find this discussion very interesting, so let me provide my two
>>>> cents as well.
>>>>
>>>> First of all, I just want to mention that while working on
>>>> improving JDBM's serialization, I extracted the serialization
>>>> part of JDBM into a dedicated serialization library, which I
>>>> called Quickser. You can find it on GitHub:
>>>> https://github.com/romix/quickser
>>>> It is really very fast, often faster than Kryo and protostuff.
>>>> Since Quickser contains only the serialization-related stuff from
>>>> JDBM/MapDB, it is easier to use if you just want to add yet
>>>> another serialization method to DM without any DB-related
>>>> functionality.
>>>>
>>>> It could even make sense for MapDB to use Quickser for
>>>> serialization instead of having both DB and serialization related
>>>> functionality in one pot.
>>>>
>>>> @Jan: What do you think about it? I understand that you don't like
>>>> external dependencies. But Quickser is not really external. It is
>>>> more or less a copy of JDBM's serialization-related classes.
>>>>
>>>> On Wed, Nov 7, 2012 at 9:49 AM, Jan Kotek <kjan80@gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>>
>>>>>>       1. DirectMemory could make good use of mapdb to serialize
>>>>>>       least frequently used items to disk and free memory
>>>>>>       2. DirectMemory could implement a MapDB disk based store
>>>>>>       in addition to the bytebuffer and unsafe ones
>>>>> The only problem may be that MapDB currently does not support
>>>>> concurrent transactions (it has only one single global
>>>>> transaction). Not sure if it could be a problem.
>>>>>
>>>>> However, it implements ConcurrentMap, so it is possible to swap
>>>>> items atomically.
>>>>>
>>>>>
>>>>>>       3. MapDB could take advantage of DM's componentization
>>>>>>       approach to support multiple serializers (we believe each
>>>>>>       one has its advantages in different scenarios)
>>>>> MapDB already supports alternative serializers. Users can supply
>>>>> their own on Map (similar to table) creation.
>>>>> I would love to integrate stuff from the Lightning serializer.
>>>>>
>>>>>
>>>>>>       4. MapDB could use DM to write items to an off-heap store
>>>>>>       before writing to disk (asynchronously) to improve speed
>>>>> Not sure it would be practical. MapDB already uses memory-mapped
>>>>> files, so the effect would be very similar. My tests show that
>>>>> there is only a 50% performance difference between the inMemory
>>>>> store and the onDisk store.
>>>>>
>>>>> Currently MapDB has only a heap-based inMemory store. But
>>>>> implementing an off-heap memory store is trivial and I will do it
>>>>> soon.
>>>> This is very nice to know. Looking forward to seeing this feature.
>>>> Maybe you should use DM for it?
>>>>
>>>>>>       5. We could merge our serialization efforts (I believe
>>>>>>       lightning is very fast and worth considering) and provide
>>>>>>       an even better solution or two alternative implementations
>>>>> 100% agree. I will check the Lightning sources and see if I could
>>>>> contribute my stuff. MapDB serialization is very space-efficiency
>>>>> oriented and it can contribute a lot.
>>>> Well, having worked with JDBM's/MapDB's serialization, Kryo and
>>>> protostuff, I would say that MapDB's serialization is
>>>> space-efficient, but roughly at the same level as Kryo or a bit
>>>> worse than the latest versions of Kryo.
>>>>
>>>> IMHO, the biggest advantage of MapDB's serialization is its
>>>> speed. It usually wins against highly optimized versions of Kryo
>>>> and protostuff, even though they use Unsafe tricks and the like.
>>>> To some extent this speed improvement can probably be attributed
>>>> to the simplicity of MapDB's serialization implementation. It is
>>>> not very feature rich, but very small and simple (just a few
>>>> classes), and call stacks during serialization are usually also
>>>> very short. The JIT is probably able to optimize and inline much
>>>> better than in other more complex and universal frameworks.
>>>>
>>>>> My only condition is that Lightning is distributed in a separate
>>>>> JAR. I like minimal dependencies.
>>>>>
>>>>>
>>>>>> In both cases we would be open to contribution in different
>>>>>> forms - either just contributing patches, or with you joining us
>>>>>> and the ASF as a module or subproject (the latter options have
>>>>>> to undergo a formal vote by all project members, of course), as
>>>>>> I strongly believe that merging efforts would lead to a better
>>>>>> and more complete product.
>>>>> I would prefer MapDB to stay on GitHub. I find it more
>>>>> comfortable to use. JDBM3 (the previous version) nearly became an
>>>>> ApacheDS subproject, but at the last moment I decided otherwise.
>>>> I strongly agree with Jan here. JDBM/MapDB is used by most people
>>>> as a DB or persistent map.
>>>> Its serialization functionality is nice to have, but not its most
>>>> important feature.
>>>> At the same time, for DM things like off-heap management and
>>>> serialization are the most important ones, while persistence is
>>>> optional.
>>>> Therefore, IMHO both projects should remain independent and
>>>> cooperate or make use of each other. But they should not be
>>>> integrated into one "megaproject" which can do everything.
>>>>
>>>> -Roman
>>>>
>>
>
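
For reference, Jan's remark further down the thread that MapDB implements
ConcurrentMap, so items can be swapped atomically, can be sketched with the
standard java.util.concurrent API alone. The sketch uses ConcurrentHashMap
as a stand-in for a MapDB-backed map (no MapDB API is used or assumed), and
AtomicSwapDemo is a hypothetical helper name.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class AtomicSwapDemo {
    // Replaces the value for `key` only if it still equals `expected`.
    // Returns false when another writer changed the entry in the meantime,
    // so the caller can re-read and retry.
    static <K, V> boolean swap(ConcurrentMap<K, V> map, K key, V expected, V replacement) {
        return map.replace(key, expected, replacement);
    }

    public static void main(String[] args) {
        // ConcurrentHashMap stands in for any ConcurrentMap implementation.
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        cache.put("item", "in-memory");

        boolean first = swap(cache, "item", "in-memory", "on-disk");
        boolean second = swap(cache, "item", "in-memory", "on-disk");
        System.out.println(first + " " + second + " " + cache.get("item"));
    }
}
```

The compare-and-replace runs as one atomic step inside the map, which is
why no global transaction is needed for this kind of swap.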

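The standard Java Serialization contract mentioned in the thread
(Externalizable with writeExternal/readExternal) looks roughly like this.
It is a sketch using only java.io; SketchPoint and ExternalizableDemo are
hypothetical names, and nothing here is MapDB or Lightning code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical value class used only for this illustration.
final class SketchPoint implements Externalizable {
    int x;
    int y;

    // Externalizable requires a public no-arg constructor for deserialization.
    public SketchPoint() {
    }

    SketchPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // The class decides its own wire format: two ints.
        out.writeInt(x);
        out.writeInt(y);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Fields must be read in the same order they were written.
        x = in.readInt();
        y = in.readInt();
    }
}

final class ExternalizableDemo {
    // Serializes the point to a byte array and reads it back.
    static SketchPoint roundTrip(SketchPoint point) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(point);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (SketchPoint) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```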
