incubator-clerezza-dev mailing list archives

From Henry Story <henry.st...@bblfish.net>
Subject Re: How to name things with URIs
Date Mon, 16 May 2011 09:20:48 GMT


Where we propose an initial criterion for answering the question this thread is asking:
"How to name things with URIs".


>> [snip]
>> On 15 May 2011, at 12:11, Reto Bachmann-Gmuer wrote:
>>> Henry Story Wrote:
>>>> [snip]
>>>> But here is a quick question: why not simply make the ProxyTcProvider a higher
>>>> priority than the pure local one?
>>> We wouldn't know when to try to dereference a graph and when not.
>> 
>> Any Resource that is not local, that can be dereferenced (http, https, ftp, ftps,...)
>> (that is not on some blacklist) can be dereferenced.
>> 
>> Any resources that are local (localhost domain, or the name of the machine the service
>> is currently running on) should be passed to the next TcProvider.
> We don't know the names of the current machine, remember that we are
> operating on the rdf access layer (SCB) here. Unlike the WebId service
> this is not on the platform service.

Well, perhaps it is at the wrong layer then. Perhaps a WebProxy, being at the web layer, needs
to know the name of the service it is running as. Please read through the following reasoning
carefully before replying.

Consider that the API you are developing has the following method

     def getGraph(name: UriRef)

If the WebProxy service is called like this:

wp.getGraph("https://bblfish.net/user/admin/".uri)

Then it certainly would be very useful for the proxy service to know that bblfish.net is the
local machine, and that it does not need to do an HTTPS GET to discover the graph.
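The rule under discussion can be sketched roughly as follows. This is only an illustration, not Clerezza code: `GraphRouter`, `shouldDereference`, `localHosts` and `blacklist` are hypothetical names, and the set of local host names is assumed to be supplied by whatever layer knows them.

```scala
import java.net.URI
import scala.util.Try

// Hypothetical helper, not part of the Clerezza API.
object GraphRouter {
  // Schemes the proxy is assumed to be able to dereference.
  private val dereferenceable = Set("http", "https", "ftp", "ftps")

  // True only for a well-formed, remote, non-blacklisted, dereferenceable name.
  // Everything else (local hosts, urn: names, malformed names) is left to the
  // next TcProvider.
  def shouldDereference(name: String,
                        localHosts: Set[String],
                        blacklist: Set[String] = Set.empty): Boolean = {
    val local = localHosts.map(_.toLowerCase)
    val blocked = blacklist.map(_.toLowerCase)
    Try(new URI(name)).toOption.exists { uri =>
      val scheme = Option(uri.getScheme).map(_.toLowerCase)
      val host = Option(uri.getHost).map(_.toLowerCase)
      (scheme, host) match {
        case (Some(s), Some(h)) =>
          dereferenceable(s) && !local(h) && !blocked(h)
        case _ => false // no scheme or no host: nothing to fetch
      }
    }
  }
}
```

With `localHosts = Set("bblfish.net", "localhost")`, the call for https://bblfish.net/user/admin/ would stay local, while a name on any other host would be fetched.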

A use case can help illustrate this. Imagine a zz service looks at a foaf file that contains
a reference to <https://bblfish.net/user/admin/#me>. It sends a request to the cache
proxy to fetch the file so that it can find out more about that resource. What is it? A fish,
a mineral, a human, or an ontology? So it goes to the web proxy and calls it as above. If your
proxy looks in its
  
    urn:x-localinstance:/cache/

namespace it won't find a reference to our URL above. So it will do a GET on the web, find it,
and place it in

    urn:x-localinstance:/cache/https://bblfish.net/user/admin/


Now imagine I then update my bblfish.net profile. The next time my zz instance goes and
looks in the proxy, it will find the above urn and look up the information there. Not only will
we now have the information twice in the database, we will also end up getting out-of-date
information for our own data!
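The staleness problem can be reproduced with a toy model. Everything here is an assumption made for illustration: the store is a plain in-memory map, `naiveGet` stands in for a proxy that knows nothing about local host names, and `fetch` stands in for the HTTP GET.

```scala
import scala.collection.mutable

// Toy model only; none of these names exist in Clerezza.
object StaleCacheDemo {
  val cachePrefix = "urn:x-localinstance:/cache/"

  // Stand-in for the triple store: graph name -> graph content.
  val store: mutable.Map[String, String] = mutable.Map.empty

  // A proxy that cannot tell local from remote: it always caches under
  // the cache namespace and never looks at the live local graph again.
  def naiveGet(url: String, fetch: String => String): String =
    store.getOrElseUpdate(cachePrefix + url, fetch(url))
}
```

If the local profile changes after the first `naiveGet`, a second `naiveGet` for the same URL still returns the first copy, because the cached graph and the live local graph are two separate entries.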

   So interestingly this suggests that local data be placed in the store in a graph whose
name is the URI from which people across the web would fetch it. Then one won't need to do
any special conversion when searching for data that is local.
(i.e., public data about me should be placed in a graph named https://bblfish.net/user/admin/)

   OR BETTER: all local data should be stored in the local store with *relative* URIs - of
which your urn:x-localinstance is a hack to work with triple stores that don't come with relative
URI support - and these *relative* URIs should, when transformed to global URIs, be the URIs
of what an external user would GET when making a request on that graph. I.e., my local public
profile should be in the graph
    
   /user/admin 

which is internally written out as

   urn:x-localinstance:/user/admin
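A minimal sketch of that mapping, assuming the public base URI of the instance (here `baseUri`) is available as configuration. `LocalNames` and its methods are hypothetical names, not existing Clerezza API.

```scala
// Hypothetical helper for converting between relative graph names,
// their internal urn form, and the URI an external client would GET.
object LocalNames {
  val localPrefix = "urn:x-localinstance:"

  // "/user/admin" -> "urn:x-localinstance:/user/admin"
  def toInternal(relativePath: String): String = localPrefix + relativePath

  // Internal name -> public URI, or None if the name is not a local one.
  def toPublic(internalName: String, baseUri: String): Option[String] =
    if (internalName.startsWith(localPrefix))
      Some(baseUri.stripSuffix("/") + internalName.stripPrefix(localPrefix))
    else None

  // Public URI -> internal name, or None if the URI is not under baseUri.
  def fromPublic(publicUri: String, baseUri: String): Option[String] = {
    val base = baseUri.stripSuffix("/")
    if (publicUri.startsWith(base + "/"))
      Some(localPrefix + publicUri.stripPrefix(base))
    else None
  }
}
```

With such a mapping, a request for a public URI under the instance's own base can be answered from the local graph without any HTTP round trip.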

   This gives us a simple but very powerful criterion to test URIs - which is what this thread
is about: "How to name things with URIs".
  
   - local graphs should be named by how they are accessible remotely (even if they only use
relative URIs)
   - remote cache graphs - ie graphs that pretend to only store what is said remotely - should
(when public) store the data at the graph of the remote URL.

   Now: this is where your view and mine can converge. If the local Clerezza triple store
republishes remote URLs, then those will have to be placed in the store at a local, relative
URL, eg urn:x-localinstance:/cache, because at this point people will be able to point to
the publishing zz instance to make the claims made by the remote graph.

   Though instead of doing that I would rather suggest using the HTTP Proxy mechanisms, and
develop a Proxy layer for each zz instance, as I think that avoids taking responsibility
- other than the responsibility of being a good proxy - for what others say. (see slide 37
of my presentation "Philosophy and the Social Web":
http://www.slideshare.net/bblfish/philosophy-and-the-social-web-5583083 )

   NEVERTHELESS it is then still true that the WebProxy needs to know the URIs of the local
machine if it is going to function correctly, since it needs to know when it does not need
to make an HTTP connection but can either look at the graph directly or make a request to
the correct jax-rs method.


