river-dev mailing list archives

From Tom Hobbs <tvho...@googlemail.com>
Subject Re: MarshalledServiceItem
Date Wed, 02 Feb 2011 11:29:51 GMT
I think Michal made a very good suggestion with the invention of a
*new* service type called something like "ServiceBrowserService".

(Good call, Michal!).

Reading the to-ing and fro-ing discussing this use of
MarshalledServiceItem and the use of the lookup service, it does
appear that we're trying to squeeze the LUS into doing something that
it possibly isn't designed to do.  As Gregg has demonstrated, it can
be done and made to work well, but what is the cost of the complexity
in the infra/code cleanliness/design/etc?

A client can use a LUS to get *any* SBS, which can return non-domain-
specific details describing the available services (e.g. name :
String, type : Class[], service Id : String, groups : String[], etc.).
Clients can potentially be deployed with all the JARs needed to
unmarshal an SBS (so no download is needed there). They can then see
what's available and view its UI, so a download only becomes necessary
when the user uses a LUS and details from the SBS to choose their
service.
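
A minimal sketch of what such an SBS might hand back, assuming the field list above. ServiceDescriptor, ServiceBrowserService and InMemoryBrowser are hypothetical names, not part of River or the Jini API, and the sketch carries types as class names rather than Class[] so that browsing never forces class loading.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, non-domain-specific description of one registered service.
// Types are carried as class names (not Class objects) so that reading a
// descriptor never triggers class loading or a codebase download.
final class ServiceDescriptor {
    final String name;
    final String[] typeNames;
    final String serviceId;
    final String[] groups;

    ServiceDescriptor(String name, String[] typeNames, String serviceId, String[] groups) {
        this.name = name;
        this.typeNames = typeNames.clone();
        this.serviceId = serviceId;
        this.groups = groups.clone();
    }

    // True if the described service advertises the given type name.
    boolean implementsType(String typeName) {
        return Arrays.asList(typeNames).contains(typeName);
    }
}

// Hypothetical browsing interface; a real SBS would be a remote service.
interface ServiceBrowserService {
    List<ServiceDescriptor> browse(String typeName);
}

// In-memory stand-in used only to illustrate the call pattern.
final class InMemoryBrowser implements ServiceBrowserService {
    private final List<ServiceDescriptor> registered = new ArrayList<>();

    void register(ServiceDescriptor descriptor) {
        registered.add(descriptor);
    }

    @Override
    public List<ServiceDescriptor> browse(String typeName) {
        List<ServiceDescriptor> matches = new ArrayList<>();
        for (ServiceDescriptor descriptor : registered) {
            if (descriptor.implementsType(typeName)) {
                matches.add(descriptor);
            }
        }
        return matches;
    }
}
```

The client would browse descriptors first and only go back to the LUS, with the chosen service Id, once the user has picked a service; that is the point at which a proxy download becomes unavoidable.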

Maybe this SBS topic needs to be split off into its own thread.

On Wed, Feb 2, 2011 at 11:00 AM, Dan Creswell <dan.creswell@gmail.com> wrote:
> On 2 February 2011 10:33, Peter Firmstone <jini@zeus.net.au> wrote:
>
>> Dan Creswell wrote:
>>
>>>
>>>>
>>>>> Clarification:  Separate the location and identity of a jar file or
>>>>>
>>>>>
>>>> archive containing class files:
>>>>
>>>> We have integrity constraints, which rely on message digests to confirm
>>>> files have not been tampered with.
>>>>
>>>> The message digest (although now considered a weak form of encryption)
>>>> is the identity of a jar file; that is, we use this type of information
>>>> to confirm it's the jar we expect.
>>>>
>>>> For httpmd, we use a URL string annotation (which could be an IP address
>>>> and port, or a DNS hostname and port), a path and file name, followed by
>>>> the message digest.
>>>>
>>>> The IP address or DNS hostname, port and path represent the location of
>>>> the jar file, while the file name and message digest represent its
>>>> identity.
>>>>
>>>> If we separate the location + port, it can be discovered using DNS-SRV
>>>> records, allowing redundant codebase servers, while the identity is
>>>> limited to the file name and message digest.
>>>>
>>>> Then the RMI codebase property only needs to be a domain in which a
>>>> suitable codebase can be discovered and queried.
>>>>
>>>>
>>>>
>>>>
>>> Mmm, but the process of resolution on codebases slows things down for
>>> Gregg.
>>>
>>>
>>
>> Correct, Gregg avoids the process of resolution until it's absolutely
>> necessary, while Chris caches his jar files after downloading, so he only
>> has to do so once.  DNS resolution might take 10 seconds, so this is an
>> important point you've made.  A local caching DNS would be advantageous.
>>
>>
>>  I think we're gonna have to decide which profiles of machines we're going
>>> to
>>> target with what solutions. I'm not suggesting we try and build for all
>>> the
>>> different profiles more that we'll have to choose what we do and do not
>>> support.
>>>  But, if service proxy's are sharing jar files, what does that mean?
>>> Someone
>>>
>>>
>>>> somewhere chose to package them all together like that for some reason.
>>>>> What
>>>>> reason, what problem are they solving?
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Deployment reasons.
>>>>
>>>>
>>>>
>>>>
>>> :) Yes, deployment but why are we choosing to deploy with this setup. What
>>> does it imply about services when they share .jar files?
>>>
>>>
>>
>> Because it becomes difficult to manage with existing tools when a codebase
>> contains multiple jar files for multiple services.  I believe that the
>> Codebase service implementors managed to make deployment easier by
>> automating codebase annotations.
>
>
> Or we could build some new tools....
>
>
>>
>>
>>>
>>>>  Again, I'm sitting here thinking as a deployer of services, if I want to
>>>>
>>>>
>>>>> widget around and consolidate .jars I can do that ahead of time and then
>>>>> tweak service codebases via config just prior to deploy.
>>>>>
>>>>>
>>>>>
>>>>>
>>>> But it would be faster for the client to receive unconsolidated jar's,
>>>> since other services might use some of these jars also.
>>>>
>>>>
>>>>
>>>>
>>> Not for Gregg who hates roundtrips and multiple discoveries of codebases
>>> and
>>> such.
>>>
>>>
>>
>> If we have the Entry jar files installed at the client, or first download
>> the jar files for the Entry's we need using the getEntryClasses method, then
>> we can unmarshal only those Entry's we already have codebases for during
>> lookup.  Of course at some point we're going to have to download something,
>> but we can avoid downloading the codebases for services we don't need.
>>
>>
>>
>>>
>>>
>>>> What matters is the proxy gets the correct class files, in its own
>>>> private
>>>> namespace that are not shared with other proxy's (except for service api,
>>>> which may include Entry's), but we can save duplicate downloads.
>>>>
>>>>
>>>>
>>>>
>>> Can you explain your reasoning for saving duplicate downloads? I think you
>>> mean because in some cases services could share a codebase, which they can
>>> do under the current scheme of course. Note that having "correct class
>>> files" isn't IMHO a sufficient constraint, it has to be a "particular
>>> collection of specific implementations of classes and versions".
>>>
>>>
>>
>> Correct, but it might reduce the size of the download when we've got to
>> bite the bullet and download the service jar files, if we've already got
>> some of the necessary jar's cached locally.
>
>
> Size of download is only one thing we need to worry about - latency....
>
>
>>
>>
>>>
>>>
>>>
>>>> Maven provisioning is interesting.
>>>>
>>>> This is why I'm interested to investigate separating jar file identity
>>>> from
>>>> location, to simplify deployment and redundancy.  I'm putting my thoughts
>>>> out on the list, to gather responses, to see if there's a better way.
>>>>
>>>>
>>>
>>>
>>> Okay, so I think this also impacts on the stuff being discussed with
>>> Gregg. Whilst there are some complementary aspects there are some costly
>>> steps in there as well.
>>>
>>>
>>
>> The trick is to delay the costly step until we've decided which service
>> instance we want.
>>
>>
>>
>>>
>>>
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Jini's lookup service lack of AND / OR querying capability is due to
>>>>>> security, the avoidance of instantiating foreign objects.
>>>>>>
>>>>>> Delayed unmarshalling of the service proxy allows service entry's to be
>>>>>> compared as objects, without requiring a codebase download for the proxy
>>>>>> if it's not the service we want, so it's not quite just returning a
>>>>>> MarshalledInstance.  This should be done without compromising the good
>>>>>> security features of the existing lookup service.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> I don't follow - I could tweak ServiceItem to hold the proxy as a
>>>>> MarshalledInstance and still expose all other "service identifying
>>>>> information". That MarshalledInstance mightn't even immediately carry
>>>>> the
>>>>> proxy code, could still be on the server and pulled down at point the
>>>>> consumer actually wants the proxy. Feels like some simple
>>>>> interface/sub-classing.
>>>>>
>>>>>
>>>>>
>>>>>
>>>> This is true for cases where the service types don't matter to the
>>>> client.  I think Gregg wanted to eliminate all codebase downloads until he
>>>> was sure he had the correct service.  Use of a MarshalledInstance could be
>>>> an acceptable compromise, if Entry's have their own codebase annotations.
>>>> How would this affect the lookup semantics if we're looking for particular
>>>> service types though?
>>>>
>>>>
>>>
>>>
>>> Entry's have to have their own codebase for all cases of clients that
>>> don't know about those Entry's. Any client that has built-in knowledge of
>>> those Entry's will have them available on the classpath by virtue of its
>>> need to specify them in its lookup search.
>>>
>>>
>>
>> Except when we call getEntryClasses for a ServiceTemplate and someone's
>> created a new Entry, then we need reflection to get the fields, but that's
>> probably a rare case anyway.
>>
>>
>>  Can you explain more about the service types question?
>>>
>>>
>>
>> Reggie stores the class types of the service (not class files), so that
>> Reggie can use a ServiceTemplate to retrieve marshalled service proxy's that
>> match the service type defined in the template.
>>
>>
>>
>>  If I'm looking for a particular type, I specify a particular interface.
>>>
>>
>> Exactly, so if we use a MarshalledInstance when we register the proxy, the
>> only interface (or class in this case) a client can specify is Object or
>> MarshalledInstance.
>>
>>
> Ah, no. MarshalledInstance is the general contract we're talking not the
> implementation. There is nothing to stop us building a subclass or some
> other container that compacts the proxy away but also retains/generates
> sufficient information to allow matching on types.
>
> And in fact, we only need that contract for the client. Registration needn't
> follow that form. Clients will also have the Entry types they wish to specify
> for searches on their classpath.
>
> So the codebase trick is asymmetric at least potentially - we do it for
> clients, not services that are registering....
>
>
>>   I'm
>>> guessing you mean a service with specific Entry's? A client after some
>>> specific set of Entry's will already know those via classpath. We could
>>> simply allow a client to say "I'm only interested in Services with these
>>> Entry's, return me all matches but only give me the following Entry types
>>> as
>>> part of the ServiceItem". This feels much closer to the original intent
>>> "stop stuff getting to a client that it's not interested in" than trying
>>> to
>>> "control in detail all aspects of download and intimately dig around in
>>> service implementations, including .jars, to do it".
>>>
>>>
>>
>> I don't think we should dig around in jars etc to do it, just package
>> Entry's separately so we don't have to.  By providing the class files for
>> the Entry's we want unmarshalled via the lookup call, we wouldn't need to
>> download their jar files.
>>
>
> Package them separately and put them on client classpaths and then filter.
> In this fashion the client has limited knowledge and can protect itself from
> additional knowledge via download side-effects by expressing its desire to
> only see types it explicitly searches for/considers. Quite simply, if the
> client doesn't have an Entry on its classpath it has next to no interest in
> seeing that stuff.
>
> I'm looking for the simplest thing that will work, then we can look at
> additions/extensions. Why tackle complicated downloading/classloading if we
> can just solve the problem with a simple API extension that gives enough
> flexibility for the common case?
>
>
>>
>> Hence the method:
>>
>>
>> ResultStream lookup(ServiceTemplate tmpl, Class[] unmarshalledEntries, int
>> maxBatchSize) throws IOException;
>>
>> But once we have the service we want, we'll need to download the jar files.
>>
>>
> Yep, no chance of avoiding that....
>
>
>>
>>
>>>
>>>
>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Over the internet, we could potentially have very large lookup
>>>>>> services,
>>>>>> by
>>>>>> allowing clients to remove unwanted services from their results before
>>>>>> unmarshalling, we can reduce the resources required of the client:
>>>>>>
>>>>>>  * Network Bandwidth, clients don't need to download unwanted
>>>>>> codebases.
>>>>>>  * Memory (ClassLoader and unwanted classes are not loaded into
>>>>>> memory).
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> If we were to go across internet have we squared away use of e.g.
>>>>> DNS-SD?
>>>>> i.e. Is it a given we'll expose a classic JINI LUS?
>>>>>
>>>>>
>>>>>
>>>>>
>>>> The semantics of DNS-SD make it well suited to discovery of a lookup
>>>> service.  I think Sim highlighted earlier that it's difficult to get
>>>> domain administrators to do Dynamic Updated DNS-SD; they're comfortable
>>>> with DNS-SRV records, so it would appear easier to rely on DNS-SD as a
>>>> lookup locator / domain browser, not as a lookup service replacement.
>>>>
>>>>
>>>>
>>>
>>> Agreed. Solves my concerns of exposing unicast locators of the traditional
>>> type and multicast across the net.
>>>
>>>
>>>
>>>
>> ;)
>>
>> Cheers,
>>
>> Peter.
>>
>
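
On the httpmd point in the quoted thread, the location/identity split can be sketched roughly as follows. This assumes the usual httpmd form of host, port, path and file name followed by ";algorithm=digest"; HttpmdParts and the example URL are hypothetical and not part of River.

```java
import java.net.URI;

// Hypothetical helper that splits an httpmd codebase URL into the part that
// says where the jar lives and the part that says which jar it is.
final class HttpmdParts {
    final String location;  // host, port and directory path
    final String identity;  // file name plus ";algorithm=digest"

    private HttpmdParts(String location, String identity) {
        this.location = location;
        this.identity = identity;
    }

    static HttpmdParts parse(String url) {
        URI uri = URI.create(url);
        // The digest parameter stays attached to the last path segment.
        String path = uri.getPath();
        int lastSlash = path.lastIndexOf('/');
        String directory = path.substring(0, lastSlash + 1);
        String identity = path.substring(lastSlash + 1);
        String location = uri.getHost() + ":" + uri.getPort() + directory;
        return new HttpmdParts(location, identity);
    }
}
```

For "httpmd://codebase.example.org:8080/lib/service-dl.jar;sha=0a1b2c" the location is "codebase.example.org:8080/lib/" and the identity is "service-dl.jar;sha=0a1b2c". Only the location would need to be discovered (for example via DNS-SRV); the identity is what the client checks after download.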
