river-dev mailing list archives

From Peter Firmstone <j...@zeus.net.au>
Subject Re: Lookup Service Discovery using DNS?
Date Sat, 16 Jan 2010 11:49:53 GMT
Thanks Gregg,

You're spot on.

The memory explosion caused by multiple class loaders, one for each
remote location, is largely eliminated by your solution of preferred
local classes.  This works because all marshalled instances are loaded
in the one classloader and utilise the same bytecode.  It also improves
compatibility, since compatible objects are no longer isolated from one
another by Type differences across the classloader tree.

I think utilising the OSGi framework combined with Codebase services,
eliminating the coupling between codebase URLs and classloaders, can
achieve a similar reduction in downloaded code, although not quite as
small.  All classes from a package can use the latest compatible bundle
and share the same classloader, bytecode and any other packages that
are depended upon, significantly reducing RMIClassLoader explosion and
duplicate bytecode.  Once a compatible bundle has been downloaded, it
can be utilised for all instances of that class.  I believe we should
prefer the latest compatible bundles.

Package version metadata (specified in each bundle) can be stored in
MarshalledObject instance metadata; the OSGi versioning scheme
specifies compatibility across bundle upgrades.
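
To illustrate "prefer the latest compatible bundle", here is a minimal
sketch.  It assumes a simplified compatibility rule (same major version,
and not older than the version recorded in the metadata) and ignores
OSGi version qualifiers.  The class and method names are hypothetical
and are not OSGi or River APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class BundleVersions {
    /** Parses "major.minor.micro" into three ints; missing parts default to 0. */
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] v = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            v[i] = Integer.parseInt(parts[i]);
        }
        return v;
    }

    /** Orders versions by major, then minor, then micro. */
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    /** Assumed rule: same major version, and not older than the required version. */
    static boolean isCompatible(String required, String candidate) {
        int[] r = parse(required);
        int[] c = parse(candidate);
        return r[0] == c[0] && compare(c, r) >= 0;
    }

    /** Picks the newest available version that satisfies the assumed rule. */
    static Optional<String> latestCompatible(String required, List<String> available) {
        return available.stream()
                .filter(v -> isCompatible(required, v))
                .max((x, y) -> compare(parse(x), parse(y)));
    }

    public static void main(String[] args) {
        List<String> available = Arrays.asList("1.1.9", "1.4.2", "2.0.0", "1.3.0");
        System.out.println(latestCompatible("1.2.0", available).orElse("none"));
    }
}
```

With a rule like this, every marshalled instance that recorded package
version 1.2.0 could share the one downloaded 1.4.2 bundle.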

In reality, MarshalledObjects are a compromise to reduce downloading
remote code: they duplicate information held within the binary
marshalled form of an object (such as implemented interfaces).  Keeping
duplicated information to a minimum is important; whenever we duplicate
data, we risk duplication errors or increased size.  At some point it
becomes more efficient to just unmarshal the object rather than
increase stored metadata.  This is best done at the client: internet
servers just won't have the resources to perform queries, and the
security implications of executing foreign code are significant.

I think the full load issues in Reggie can be fixed, so that it levels
out under load instead of failing.  Although I think you're right,
Reggie isn't suited to the internet.  Perhaps we need some type of
global indexing service that crawls the list of available services
through DNS-SD and stores their marshalled proxy instances, to perform
the functions that Reggie currently performs.  Perhaps an interface
inserted into the hierarchy, implementing a subset of Reggie's current
methods, could provide some compatibility with Reggie?  Another
interface might provide other methods that assist in filtering the
results, or return a bytestream, where proxies are unmarshalled one at
a time at the client and inspected, then dropped until a suitable match
is found; the remaining bytestream can be discarded.  Garbage
collection would clean up unwanted proxies during bytestream
inspection, keeping memory usage to a minimum.
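
The bytestream idea could be sketched roughly as follows.  This uses
plain java.io serialization with length-prefixed blobs as a stand-in
for marshalled proxies; the framing format and the names
ProxyStreamScan, writeAll and firstMatch are assumptions made for
illustration, not an existing River or Reggie API.  It also leaves out
the sandboxing that unmarshalling foreign objects would need.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Optional;

class ProxyStreamScan {
    /** Server side (assumed framing): each candidate is a length-prefixed serialized blob. */
    static byte[] writeAll(Serializable... candidates) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (Serializable candidate : candidates) {
                ByteArrayOutputStream one = new ByteArrayOutputStream();
                try (ObjectOutputStream oos = new ObjectOutputStream(one)) {
                    oos.writeObject(candidate);
                }
                out.writeInt(one.size());
                one.writeTo(out);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Client side: unmarshal one candidate at a time and stop at the first match. */
    static <T> Optional<T> firstMatch(InputStream in, Class<T> wanted) {
        DataInputStream data = new DataInputStream(in);
        try {
            while (true) {
                int length;
                try {
                    length = data.readInt();
                } catch (EOFException endOfStream) {
                    return Optional.empty();
                }
                byte[] blob = new byte[length];
                data.readFully(blob);
                try (ObjectInputStream ois =
                        new ObjectInputStream(new ByteArrayInputStream(blob))) {
                    Object candidate = ois.readObject();
                    if (wanted.isInstance(candidate)) {
                        return Optional.of(wanted.cast(candidate));
                    }
                    // Not a match: drop the reference so it can be garbage collected.
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] stream = writeAll("not wanted", 42, "never unmarshalled");
        System.out.println(firstMatch(new ByteArrayInputStream(stream), Integer.class));
    }
}
```

Only one candidate is held at a time, and anything after the first
match is never unmarshalled.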

Looking at this service type definition, Daniel is using DNS-SD to 
locate a Jini service directly:

Multiple identical matches are returned with an index integer appended.

DNS SRV (RFC 2782) Service Types at 
http://www.dns-sd.org/ServiceTypes.html contains this service entry:

  jini            Jini Service Discovery
                  Daniel Steinberg <daniel at oreilly.com>
                  Protocol description: Convention giving a deterministic
                  programmatic mapping between Jini service interface
                  names and subtypes of the DNS-SD service meta-type
                  "_jini._tcp". For example, a client wishing to discover
                  objects that implement the "com.oreilly.ExampleService"
                  interface would browse for the DNS-SD service subtype
                  (Note: Using Apple's Bonjour programming API, service
                  like this are expressed as a comma-separated list
                  main type, e.g. "_jini._tcp,ExampleService.oreilly.com".
                  This allows an object that implements several
                  interfaces to specify all of those interfaces in a list
                  when it registers its service.
                  When browsing for services, at most a single subtype
                  is allowed.)
                  Defined TXT keys: None
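
As a rough illustration of that mapping, the sketch below builds the
Bonjour-style registration string from Jini interface names.  The rule
used (reverse the dotted components of the interface name) is inferred
from the single example in the entry above, so treat it as an
assumption; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class JiniDnsSdNames {
    static final String META_TYPE = "_jini._tcp";

    /** Assumed mapping: "com.oreilly.ExampleService" becomes "ExampleService.oreilly.com". */
    static String subtypeLabel(String interfaceName) {
        List<String> parts = new ArrayList<>(Arrays.asList(interfaceName.split("\\.")));
        Collections.reverse(parts);
        return String.join(".", parts);
    }

    /** Comma-separated form: the main type followed by one subtype per interface. */
    static String bonjourRegistrationType(String... interfaceNames) {
        StringBuilder type = new StringBuilder(META_TYPE);
        for (String name : interfaceNames) {
            type.append(',').append(subtypeLabel(name));
        }
        return type.toString();
    }

    public static void main(String[] args) {
        System.out.println(bonjourRegistrationType("com.oreilly.ExampleService"));
    }
}
```

A service implementing several interfaces would pass all of them when
registering; a browsing client would use at most one.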

Some observations:

   1. Re-discovering the correct identical service would be almost
      impossible with DNS-SD
   2. Reggie is not designed for a world wide network.
   3. Interfaces utilised by services must not change over time, they
      can be extended when change is required.
   4. Filtering is limited to qualified names.
   5. Additional Types can be registered for one service instance.
   6. We can crawl the DNS-SD list, downloading marshalled proxies and
      forming the basis of a new lookup service implementation.

Some thoughts:

   1. You would be aware of which services you would want to make
      global, perhaps a configuration option?
   2. We probably want to restrict Service proxies to simple reflective
      proxies with Secure Jeri for now; security is much simpler that way.
   3. Smart proxies, hmm, proxy verification, hmm needs more thought.
   4. Interfaces for services should be in separate bundles from
      implementations, to prevent interface duplication in local JVM
      ClassLoaders when implementations change in an incompatible manner
      (versions over time), allowing a service to remain an
      abstraction.  The OSGi framework could locally upgrade an
      interface bundle when a new service utilises a new interface or an
      interface extending existing interfaces, allowing old and new
      service implementations to be utilised as the same Type within a
      local JVM.

DNS-SD might be used when we don't care about identity or matching 

What about when we do care about identity?
Similar ground has been covered before in Project Neuromancer with
XuidDirectory; perhaps this could be a crawler instead of utilising
registration?  Don't worry about leasing, just repeat the crawl,
discarding the older data on a cyclic basis?  Would it be safe to
unmarshal a downloaded proxy instance to query for its Xuid, provided it
was sandboxed with no permissions and utilised integrity checking?  What
do you think Jim?

interface XuidDirectory {
    Lease register(Xuid id, Object o, long leaseLen)
        throws UnknownXuidException, RemoteException;
    Lease[] register(Xuid[] ids, Object[] o, long[] leaseLens)
        throws UnknownXuidException, RemoteException;
    Object lookup(Xuid id, XuidDirectory[] visited)
        throws UnknownXuidException, RemoteException;
}
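
The crawl-instead-of-lease idea might look something like the sketch
below: entries are stamped with the crawl that last saw them, and
anything not seen in the latest crawl is discarded.  java.util.UUID
stands in for Xuid, raw bytes stand in for a marshalled proxy, and
CrawlDirectory with its methods is a hypothetical name, not part of
Project Neuromancer or River.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

class CrawlDirectory {
    private static final class Entry {
        final byte[] marshalledProxy;
        final long seenInCrawl;

        Entry(byte[] marshalledProxy, long seenInCrawl) {
            this.marshalledProxy = marshalledProxy;
            this.seenInCrawl = seenInCrawl;
        }
    }

    private final Map<UUID, Entry> entries = new HashMap<>();
    private long currentCrawl = 0;

    /** Starts a new crawl cycle; everything recorded from now on is "fresh". */
    void beginCrawl() {
        currentCrawl++;
    }

    /** Records (or refreshes) a proxy found during the current crawl. */
    void record(UUID id, byte[] marshalledProxy) {
        entries.put(id, new Entry(marshalledProxy.clone(), currentCrawl));
    }

    /** Takes the place of lease expiry: drop whatever this crawl did not see. */
    void endCrawl() {
        entries.values().removeIf(e -> e.seenInCrawl < currentCrawl);
    }

    /** Returns the stored bytes without unmarshalling them. */
    Optional<byte[]> lookup(UUID id) {
        Entry e = entries.get(id);
        if (e == null) {
            return Optional.empty();
        }
        return Optional.of(e.marshalledProxy.clone());
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        CrawlDirectory directory = new CrawlDirectory();
        UUID stays = UUID.randomUUID();
        UUID vanishes = UUID.randomUUID();
        directory.beginCrawl();
        directory.record(stays, new byte[] {1});
        directory.record(vanishes, new byte[] {2});
        directory.endCrawl();
        directory.beginCrawl();
        directory.record(stays, new byte[] {3});
        directory.endCrawl();
        System.out.println(directory.size());
    }
}
```

How the Xuid would be obtained from a crawled proxy without
unmarshalling it is left open here, as it is in the question above.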

Not only would we need worldwide available Codebase services, but 
caching Codebase services also.
e.g. interfaces:

org.apache.river.global.CodebaseService  //code I make available which I
org.apache.river.global.CachingCodebaseService //code I've dynamically
downloaded, not signed by me (extends CodebaseService)

A caching code base service is important to ensure bytecode remains 
available over time when other service locations go down.
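
A minimal sketch of the caching behaviour, assuming a codebase service
boils down to fetching bytes by resource name.  The fetch method and
the InMemoryCachingCodebase class are hypothetical, the interface here
is a local stand-in for the proposed one, and jar signature checking is
left out entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Local stand-in for the proposed interface; the single method is an assumption. */
interface CodebaseService {
    Optional<byte[]> fetch(String resourceName);
}

class InMemoryCachingCodebase implements CodebaseService {
    private final CodebaseService origin;
    private final Map<String, byte[]> cache = new HashMap<>();

    InMemoryCachingCodebase(CodebaseService origin) {
        this.origin = origin;
    }

    @Override
    public Optional<byte[]> fetch(String resourceName) {
        Optional<byte[]> fresh;
        try {
            fresh = origin.fetch(resourceName);
        } catch (RuntimeException originUnavailable) {
            fresh = Optional.empty();
        }
        if (fresh.isPresent()) {
            // Remember the bytecode so it survives the origin going away.
            cache.put(resourceName, fresh.get().clone());
            return fresh;
        }
        byte[] cached = cache.get(resourceName);
        if (cached == null) {
            return Optional.empty();
        }
        return Optional.of(cached.clone());
    }

    public static void main(String[] args) {
        boolean[] originUp = {true};
        CodebaseService origin = name ->
                originUp[0] ? Optional.of(new byte[] {1, 2}) : Optional.<byte[]>empty();
        InMemoryCachingCodebase caching = new InMemoryCachingCodebase(origin);
        caching.fetch("service-dl.jar");   // populates the cache
        originUp[0] = false;               // the origin location goes down
        System.out.println(caching.fetch("service-dl.jar").isPresent());
    }
}
```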

Unsigned jar files would not be allowed in a Codebase Service.

For Objects passed between services where identity is important, local 
JVM immutability would be desired.  The same object is duplicated across 
nodes and doesn't change so we don't need to coordinate transactions etc.

All platform services would rely on local code.  Platform code could be
upgraded periodically, worldwide, using codebase services and the OSGi
platform.  I wonder how Project Jigsaw will pan out; perhaps the JVM
will become dynamically upgradeable also?

Your Thoughts?


Gregg Wonderly wrote:
> One of the primary issues with the current lookup server design and 
> the ServiceRegistrar interface in particular is the fact that one can 
> only receive unmarshalled services.  My work on providing marshalled 
> results, visible in the http://reef.dev.java.net project, allows the 
> opportunity to find stuff without getting a JVM memory explosion.  
> However, there is a further issue, and that is in order to "see into" 
> the marshalled object you need to either resolve it or dive into the 
> stream of bytes.  My further work on the PreferredClassLoader 
> mechanism for establishing "never preferred" classes helps to make it 
> possible to do resolution of remote objects using locally defined 
> class instances, so that you can, for example, look at Entry objects.
> Also, in my reef work, I investigated adding the names of all classes 
> that are visible in the type hierarchy of the objects so that you 
> could ask "instanceof" kinds of questions without unmarshalling.
> There are just all kinds of issues related to this that come into 
> play. Performing a Jini lookup, on the internet today, would be like 
> asking your web browser to open a tab for every page on the net, and 
> then waiting for that to finish so that you could click through the 
> tabs to find what you are looking for.
> Clearly, lookup needs to be a completely different concept to exist in 
> a large world such as is visible "on the internet."
> Gregg Wonderly
> Peter Firmstone wrote:
>> Anyone got any opinions about Lookup Service Discovery?
>> How could lookup service discovery be extended to encompass the 
>> internet?   Could we utilise DNS to return locations of Lookup Services?
>> For world wide lookup services, our current lookup service might 
>> return a massive array with too many service matches. Queries present 
>> the opportunity to reduce the size of returned results, however 
>> security issues from code execution on the lookup service present 
>> problems.
>> If we did allow queries on a Lookup Service, could we do so with a 
>> restricted set of available Types utilising only trusted signed 
>> bytecodes?  If bytecode becomes divorced from the origin of a 
>> Marshalled Object, and instead obtained from a trusted codebase 
>> service, then perhaps we could have a system of vetting source code 
>> submitted for the purpose of becoming trusted authorised query 
>> types?  Any query utilising untrusted bytecode might return an 
>> UntrustedByteCodeException?
>> Perhaps we could make service match results available as a 
>> bytestream, clients that couldn't handle large amounts of data could 
>> inspect the bytestream, continually discarding what isn't required?
>> Check out this link on DNS service discovery:
>> http://files.dns-sd.org/draft-cheshire-dnsext-dns-sd.txt
>> Cheers,
>> Peter.
