river-dev mailing list archives

From Gregg Wonderly <ge...@cox.net>
Subject Re: Migrating data in a JavaSpace
Date Tue, 05 Feb 2013 14:08:31 GMT
In the end, it's possible to make changes to the class which break serial compatibility, see
the "incompatibility exception" (an InvalidClassException), paste the old serialVersionUID
value into the new class, add a new variable whose only purpose is to let you detect the
"change" in compatibility, and still make this work.


import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.Serializable;

// The class must be Serializable for readObject() to be called.
public class MyClass implements Serializable {

	// …add the values you need…

	// A value that the default constructor always initializes, but which will be null
	// on the first deserialization of old data, because the old version of the class
	// never wrote it to the stream.
	private volatile Object __flagValue;

	// Initialize to whatever the old version's value is. It must be static,
	// otherwise serialization ignores it.
	private static final long serialVersionUID = -949847128947012L;

	// Default constructor to init everything.  If you have required parameters or other
	// things to deal with, change the signature, or add similar constructors.
	public MyClass() {
		// new initialization
		__flagValue = new Serializable(){};
		// … init of new values would be in here …
		// … other initializations here …
	}

	private void readObject( ObjectInputStream is ) throws IOException, ClassNotFoundException {
		is.defaultReadObject();
		if( __flagValue == null ) {
			// new version of class, first deserialization (old data), migrate values here
		} else {
			// new version of class with updated member use, second and subsequent deserialization.
		}
	}
}
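
To find the old serialVersionUID value to paste in, one option (besides the JDK's serialver
tool) is to ask the runtime which UID it computes for the old class. This is only a minimal
sketch, assuming the old class file is still available on the classpath. OldMyClass,
NewMyClass and uidOf() are hypothetical stand-ins for illustration, not part of any River API.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialUidLookup {

    // Hypothetical stand-in for the old class: no explicit UID, so the
    // runtime computes one from the class structure.
    public static class OldMyClass implements Serializable {
        int value;
    }

    // Hypothetical stand-in for the new class with an explicit UID pasted in.
    public static class NewMyClass implements Serializable {
        private static final long serialVersionUID = -949847128947012L;
        int value;
        Object flag;
    }

    // Returns the UID that Java serialization will use for the given class.
    public static long uidOf(Class<?> cls) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        if (desc == null) {
            // lookup() returns null for classes that are not serializable
            throw new IllegalArgumentException(cls.getName() + " is not serializable");
        }
        return desc.getSerialVersionUID();
    }

    public static void main(String[] args) {
        // Prints a declaration that could be pasted into the new class.
        System.out.println("private static final long serialVersionUID = "
                + uidOf(OldMyClass.class) + "L;");
        System.out.println("explicit UID in use: " + uidOf(NewMyClass.class));
    }
}
```

The computed UID depends on the class structure, so it has to be read from the old class,
not from the modified one.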

I've used this flag-field technique a few times to go "one direction" in moving old values to
a new version of a class that I didn't want to rename.  It's possible to do some other things
as well, but ultimately, you really should use a more custom readObject/writeObject if you're
going to get fancy, or do this regularly.  Then, you can use another member value as a
"serialization stream" version control mechanism.
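
A rough sketch of what such a version-stamped serial form could look like follows. It is an
illustration under assumptions, not code from River: the class VersionedRecord, its fields,
the DEFAULT_RETRIES value and the legacyV1() helper (there only so the demo can emit an
old-layout stream) are all made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Objects;

public class VersionedRecord implements Serializable {
    private static final long serialVersionUID = 1L;

    static final int CURRENT_VERSION = 2;
    static final int DEFAULT_RETRIES = 3; // hypothetical default for version 1 data

    // All state is transient: the serial form is written by hand below.
    private transient String name;
    private transient int retries;                       // added in stream version 2
    private transient int writeVersion = CURRENT_VERSION;

    public VersionedRecord(String name, int retries) {
        this.name = Objects.requireNonNull(name);
        this.retries = retries;
    }

    // Demo only: an instance that serializes in the version 1 layout,
    // which has no retries value.
    public static VersionedRecord legacyV1(String name) {
        VersionedRecord r = new VersionedRecord(name, 0);
        r.writeVersion = 1;
        return r;
    }

    public String getName() { return name; }
    public int getRetries() { return retries; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(writeVersion);                      // version stamp comes first
        out.writeUTF(name);
        if (writeVersion >= 2) {
            out.writeInt(retries);
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int version = in.readInt();
        if (version < 1 || version > CURRENT_VERSION) {
            throw new InvalidObjectException("unsupported stream version " + version);
        }
        name = in.readUTF();
        // Old streams carry no retries value, so migrate by supplying a default.
        retries = (version >= 2) ? in.readInt() : DEFAULT_RETRIES;
        // Field initializers do not run on deserialization, so set this explicitly.
        writeVersion = CURRENT_VERSION;
    }

    public static byte[] toBytes(Object o) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static VersionedRecord fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (VersionedRecord) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        VersionedRecord old = fromBytes(toBytes(legacyV1("old-entry")));
        System.out.println(old.getName() + " retries=" + old.getRetries());
    }
}
```

Because the version number is read before anything else, readObject can branch on it for
every layout it still needs to understand, and the class can keep a single serialVersionUID.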

Gregg Wonderly


On Feb 5, 2013, at 7:19 AM, Peter <jini@zeus.net.au> wrote:

> Just thought I'd throw in my 2 cents:
> 
> Don't forget that if you do not call defaultReadObject() on the ObjectInputStream during
> deserialisation, any additional fields added later will break serial compatibility.
> 
> The Serialization builder pattern allows you to substitute and migrate classes, and multiple
> serial forms can coexist.  It does this by separating the serial form from the implementation.
> It solves the serialization problem for long-lived objects in a distributed system.
> 
> For an extreme example, see the reference collections library included with River: many
> collections share the same serial form, which can coexist with a new serial form, and class
> implementations can be replaced during deserialisation in future releases.  It's also
> available as a project on SourceForge called custardapple.
> 
> See the river wiki page on serialization for more info.
> 
> Cheers,
> 
> Peter.
> 
> 
> 
> ----- Original message -----
>> Can't tell from discussion so far but if serialVersionUID wasn't
>> hardwired, I'm thinking likelihood of compatibility becomes quite
>> limited?
>> 
>> On 4 February 2013 22:10, Dennis Reedy <dennis.reedy@gmail.com> wrote:
>>> I'm not sure if Dawid actually has to do anything here. The entries
>>> have been written into the space and have as their annotation a URL that has
>>> been provided by the entry's defining classloader. In this case the entry is
>>> annotated using Rio's artifact URL scheme.
>>> 
>>> If the change(s) are compatible changes (compatible serialization-wise), then
>>> a client can come along at a later time, take the matched entry (dynamically
>>> loading classes from the annotated codebase as needed), create a new entry and
>>> write that back into the space. The new entry will have as its annotation the
>>> new artifact.
>>> 
>>> If the changes are not compatible, then IMO the change should be implemented
>>> using a new class (either a new name or a new package). Take the old entry,
>>> create the new one, write it back to the space.
>>> 
>>> HTH
>>> 
>>> Dennis
>>> 
>>> On Feb 4, 2013, at 12:49 PM, Dan Creswell wrote:
>>> 
>>>> On 4 February 2013 17:32, Dawid Loubser <dawid@travellinck.com> wrote:
>>>>> Thanks Gerard,
>>>>> 
>>>>> That does sound reasonable, but wouldn't I effectively lose the unique
>>>>> individual codebase annotations of each entry? I have various unrelated
>>>>> services that interact in often-complex ways. Consider the following:
>>>>> 
>>>>> * In foo-api, I have an entry called FooEvent
>>>>> * In my space-based timer api, I have an entry called PublishLater, and
>>>>> a particular instance of PublishLater contains an instance of FooEvent,
>>>>> and a timestamp that says when to publish the nested entry.
>>>>> 
>>>>> The timer service (and the timer-api) has no knowledge of foo-api. There
>>>>> would be no generic way to write that PublishLater entry to XML, and
>>>>> parse it again, making sure that the nested FooEvent has the correct
>>>>> codebase (which will be distinct from the codebase of the higher-level
>>>>> Entry). I have many such occurrences of entries generically containing
>>>>> other entries, and the codebase has to remain intact for each.
>>>>> 
>>>>> I think I will (as Dan suggested) have to write a Java-based migration
>>>>> tool, that (using reflection) reconstructs each Entry, taking care to,
>>>>> at each level, retain the proper codebase, with only the changes
>>>>> required for the migration. Because I'm using Rio's maven-based class
>>>>> loading, I know that where a codebase URL was "artifact:foo:bar-api:1.0"
>>>>> I can now reconstruct it, replacing it with "artifact:foo:bar-api:1.1".
>>>>> 
>>>>> This will be very interesting indeed, and I need to do it ASAP :-( A
>>>>> production deployment depends on this. After reading the Entry spec, it
>>>>> seems that only at each top-level field of an Entry can each object have
>>>>> a different codebase, right? (and not at lower levels within those
>>>>> objects). If so, that'll make things a lot easier.
>>>> 
>>>> I think it would be possible for something below a top-level field to
>>>> have its own codebase but that would be extremely rare (too ugly to
>>>> work with).
>>>> 
>>>> More importantly I don't think you need to be that generic as I
>>>> suspect that your codebase probably does obey the "top-level field"
>>>> rule you mention. You could check that somewhat by doing a JavaSpace05
>>>> contents and dumping out class and associated classloader plus
>>>> codebase if present for each entry top to bottom (in fact you could
>>>> store it all up in a couple of hashtables and then dump it out which'd
>>>> save you reading through piles of duplicates).
>>>> 
>>>>> 
>>>>> Anybody have any experience doing this the "hard" (with Java
>>>>> classloading) way?
>>>> 
>>>> Anyone who's implemented a JavaSpace at least ;)
>>>> 
>>>> Seriously, if you need some advice or whatever, punt a request up here....
>>>> 
>>>>> 
>>>>> Dawid
>>>>> 
>>>>> 
>>>>> On Mon, 2013-02-04 at 06:56 -0800, Gerard Fulton wrote:
>>>>>> One easy option may be to write a simple client using your old code to
>>>>>> serialize the entries in the space to XML on disk. Then launch your new
>>>>>> application and put entries into the space instance.
>>>>>> 
>>>>>> 
>>>>>> On Mon, Feb 4, 2013 at 3:34 AM, Dawid Loubser <dawid@travellinck.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Thanks for the quick response, Dan!
>>>>>>> 
>>>>>>> I want to understand the classloading a bit better. Let me explain to
>>>>>>> you how I *think* it works. Also, for reference, I'm using the Rio
>>>>>>> project, which has a special classloader that understands URLs in the
>>>>>>> form "artifact:foo:bar:1.0" and which loads classes from Maven
>>>>>>> artifacts, but I think it's conceptually the same as any other URL
>>>>>>> scheme etc.
>>>>>>> 
>>>>>>> * When an Entry is written to the space, it's turned into a
>>>>>>> MarshalledInstance. This is annotated with the codebase (a collection
>>>>>>> of URLs). Immediate question: Is there only one codebase at the
>>>>>>> top-level of the entry, or does every object in the graph have (or can
>>>>>>> have) its own codebase?
>>>>>>> 
>>>>>>> * When a worker takes/reads an entry (which might contain both things
>>>>>>> that are on the worker's classpath and perhaps lower-level content
>>>>>>> that is not, i.e. specialisations that it does not have to
>>>>>>> understand), how does the space proxy know what to do? I imagine it
>>>>>>> uses the thread context class loader, but then how does it deserialise
>>>>>>> the objects that are not on that classpath (using the codebase
>>>>>>> annotation of the MarshalledInstance, I imagine) whilst not colliding
>>>>>>> with the classes already available to the worker? Using some sort of
>>>>>>> parent/child delegation?
>>>>>>> 
>>>>>>> I've got a very tricky ClassCastException problem I'm trying to debug,
>>>>>>> where it's clearly the same class loaded by two classloaders, and thus
>>>>>>> the field cannot be assigned. I don't know how to get "in there" and
>>>>>>> solve the problem; it seems I can only respond to the
>>>>>>> UnusableEntryException, get the partial entry, and lose the rest?
>>>>>>> 
>>>>>>> thanks so much,
>>>>>>> Dawid
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, 2013-02-04 at 11:17 +0000, Dan Creswell wrote:
>>>>>>>> On 4 February 2013 11:10, Dawid Loubser <dawid@travellinck.com>
>>>>>>>> wrote:
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> I have a bunch of entries in a JavaSpace (representing long-running
>>>>>>>>> process state, i.e. they exist for days or weeks), and these
>>>>>>>>> contain some objects that were generated from XML (using JAXB).
>>>>>>>>> That vocabulary has evolved (additions only) but now, of course,
>>>>>>>>> the computed serialVersionUIDs will be different. When I redeploy
>>>>>>>>> my workers that have been built against the new API, they will
>>>>>>>>> surely fail when reading the old entries.
>>>>>>>>> 
>>>>>>>>> Any strategies as to how I can migrate the data in the space? I'm
>>>>>>>>> running a persistent outrigger (snaplogstore). I was thinking of,
>>>>>>>>> in a worker with an 'old' classpath, draining the space, and
>>>>>>>>> storing those entries in some non-java representation on disk, and
>>>>>>>>> then in a worker with the 'new' classpath, reading those entries
>>>>>>>>> and re-populating the space.
>>>>>>>> 
>>>>>>>> Slightly more complicated, but it's possible to have one worker do all
>>>>>>>> this with some classloader magic. You basically load old and new
>>>>>>>> definitions into separate classloaders with the old version being
>>>>>>>> directly on the classpath, the other dynamically loaded from
>>>>>>>> something not on the classpath.
>>>>>>>> 
>>>>>>>> Then you can take the old easily and use reflection magic to populate
>>>>>>>> a new and write it.
>>>>>>>> 
>>>>>>>> One other challenge is that most JavaSpace implementations don't like
>>>>>>>> mixed schemas, so you're probably better off creating a second space,
>>>>>>>> writing the migrated ones into that and then turning off the old one
>>>>>>>> (or copying back to the old once you've cleared it down/re-built it).
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Migrating data in a space is surely something that must have caused
>>>>>>>>> problems for somebody before, and I'd love to tackle this problem
>>>>>>>>> drawing on some experience of others.
>>>>>>>>> 
>>>>>>>>> regards,
>>>>>>>>> Dawid
>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>> 
> 
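
For the "take the old, use reflection to populate a new" step discussed in the quoted thread,
a by-name field copy might look roughly like the sketch below. It assumes entries follow the
usual Entry conventions (public, non-static, non-final fields of object types and a public
no-arg constructor). OldFooEvent, NewFooEvent and migrate() are hypothetical names, and both
classes live in one classloader here purely to keep the example self-contained. In a real
migration the old and new definitions would come from separate classloaders.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class EntryFieldCopier {

    // Hypothetical old entry shape.
    public static class OldFooEvent {
        public String id;
        public Integer count;
        public OldFooEvent() {}
    }

    // Hypothetical new entry shape: same fields plus one addition.
    public static class NewFooEvent {
        public String id;
        public Integer count;
        public String note;   // added in the new version
        public NewFooEvent() {}
    }

    // Creates a newType instance and copies every public instance field
    // that exists in both classes under the same name.
    public static <T> T migrate(Object oldEntry, Class<T> newType) {
        try {
            T fresh = newType.getDeclaredConstructor().newInstance();
            for (Field src : oldEntry.getClass().getFields()) {
                if (!isCopyable(src)) {
                    continue;
                }
                Field dst;
                try {
                    dst = newType.getField(src.getName());
                } catch (NoSuchFieldException e) {
                    continue; // field was dropped in the new version
                }
                Object value = src.get(oldEntry);
                // Copy only values the target field can actually hold. A value whose
                // class came from a different loader fails this check and would need
                // its own recursive migration.
                if (isCopyable(dst) && value != null && dst.getType().isInstance(value)) {
                    dst.set(fresh, value);
                }
            }
            return fresh;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot migrate to " + newType.getName(), e);
        }
    }

    private static boolean isCopyable(Field f) {
        int mods = f.getModifiers();
        return !Modifier.isStatic(mods) && !Modifier.isFinal(mods);
    }

    public static void main(String[] args) {
        OldFooEvent old = new OldFooEvent();
        old.id = "e-1";
        old.count = 5;
        NewFooEvent migrated = migrate(old, NewFooEvent.class);
        System.out.println(migrated.id + " " + migrated.count + " " + migrated.note);
    }
}
```

Writing the migrated entry back to the space, and preserving the codebase annotation of
nested objects, is left out of this sketch.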

