incubator-clerezza-dev mailing list archives

From Henry Story <>
Subject Re: How to name things with URIs
Date Sat, 14 May 2011 17:54:55 GMT
Btw, I suppose I should say that I am not massively against the suggestion
you started this thread with. It is more that I am trying to explore this
more carefully, because it is an important discussion that deserves careful

On 14 May 2011, at 17:09, Reto Bachmann-Gmuer wrote:

> On Fri, May 13, 2011 at 5:46 PM, Henry Story <> wrote:
>> Reto wrote:
>>> Clerezza-489 and you also quote my statement of 463. Okay, you might say
>>> that I'm stating rather than arguing.
>> :-)
>>> The argument: they are different things, both intensionally (cache and
>>> source) and, in many cases, extensionally (triples may differ).
>> in that sense I agree.
>> But then the other point I made is also true, and that is that different
>> users may get different
>> graphs back for the same remote resource. In fact those users may be the
>> same user at different times.  Since those are all different graphs by your
>> definition above one should also give them different names.
> We do not have support for this yet and I think it's a feature
> increasing complexity massively.

You are dealing with an architectural problem which cannot just be dealt
with in stages. You need to look at the problem as a whole, or you will
just end up with the problem we are having right now. It is better to get this
issue cleared up now, than have a mess of graph names in one year, when a lot of
applications depend on this.

In any case it's not increasing anything massively; it is the logical
continuation of your point above.

Your argument was:

"they [the remote and the locally fetched graph] are different things, both
intensionally (cache and source) and, in many cases, extensionally (triples may differ)."

And so it follows that graphs sent at different times may also differ
extensionally and should have different names too.

You can't have it both ways: argue from intensionality for different names and then
refuse to see that temporally different graphs would then also need different names.

(Btw. there are good arguments that intensionally the local graph, if it is a cache,
does not differ from the remote one. In any case, if you pursue this too far you will
find that you can never name any remote thing.)

> I don't think that clerezza-490 needs to be resolved urgently, but anyway we
> should proceed issue by issue, and the best resolution of an issue is a minimal
> resolution, not one that tries to foresee any future issues.

I tend to see the logical consequences of an argument as being contained in the argument,
not as distinct future issues that can be looked at later.

Clerezza-490, which deals with the different ways the server can present itself to other
servers, is of course not something that needs to be implemented immediately. But it
would be good if the naming solution we come up with could be extended to that case
and to the temporal case.

So I am invoking Clerezza-490 as something to help test the naming ideas being put 
forward here. This is a logical test if you will.

>> So local graph naming schemes should take that into account, which is why I
>> suggest that we have an API that can allow for extensibility here.
> We currently have things and we are naming them badly.
> Prior to your webproxy we had:
> <webid-profile-url>.cache as name for the cache of the webprofile
> and
> <webid-profile-url> as uri for triples the user generated locally,
> this can be seen as extensions to the remote profile with information
> (like preferred language) that happen not to be in the remote profile
> which was consistent with local users who only had
> <webid-profile-url> for the triples they control which include both
> the regular profile as well

Yes, and neither of those was a good solution.
The .cache solution is bound to create a problem if someone remotely has 
a URI named http://some.example/resource.cache

It is bound to lead to nasty name clashes, with the same URI naming two different things.

Remote resources are named by remote URIs, so it makes more sense to use the URI of the
remote resource to name the graph of the remote resource. The remote resource was named
by the owner of the resource. We should respect that.

If there are local additions to a remote graph, they should be given a local
URI. There is nothing simpler than this solution it seems to me.

> Now <webid-profile-url> is the cache,

You can look at it that way, or you can think of it as the name of the remote
graph, with the contents being the cache of the remote graph.

If you were to make the local graph available publicly, it would then of
course need to have a local url tied into your namespace. Perhaps this is a good
way to think of the distinction.

> not sure where additional
> triples added locally get stored, i.e. where triples added to
> webIdGraphsService.getWebIDInfo(webId).publicUserGraph are stored.

They should clearly be stored in graphs named with a local URL, since they are being stored
by a local agent. And I think the names of those graphs will be application specific.

So currently as an initial proposal I put them in 

{local service name}/user/{remoteWebID}


(though that needs to be URLEncoded.)
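
A minimal sketch of that naming pattern, just to make the encoding step concrete. Python is used purely for illustration; the function name, the service URL and the WebID below are made up for the example and are not Clerezza API.

```python
from urllib.parse import quote


def local_user_graph_name(local_service: str, remote_webid: str) -> str:
    """Build {local service name}/user/{remoteWebID}, percent-encoding the WebID."""
    # safe="" makes quote() encode "/" and ":" as well, so the whole
    # remote WebID ends up as one path segment of the local URI.
    return local_service.rstrip("/") + "/user/" + quote(remote_webid, safe="")


# Hypothetical local service and remote WebID, for illustration only.
name = local_user_graph_name(
    "http://localhost:8080", "http://some.example/people/henry#me"
)
print(name)
```

Encoding the whole WebID into a single segment keeps the local name unambiguous: the "#" and "/" of the remote URI can no longer be confused with the structure of the local one.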

This makes sense. If you have local info for the user admin you would put it in

Now imagine there are 2 or 3 applications on a Clerezza instance that a remote user uses
with his WebID. There is no reason these applications should put all the information
they generate for that user in the same local graph.

A banking application should put banking info in its graph and a blogging application in its graph.
The way to do this is to give applications - like users - access to namespaces. Perhaps the
bank application that was given control of the /bank namespace could coin graphs for remote
users in that space, e.g. /bank/id/{remoteWebID}, and the blogging one in /blog/id/{remoteWebID}.

By giving apps access to namespaces you can also make sure that there won't be any clashes.
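
To illustrate the idea, here is a rough sketch of a registry that hands out namespaces to applications and only lets the owner coin graph names inside them. The class and method names are hypothetical, and the /bank and /blog namespaces are taken from the example above; nothing here is existing Clerezza code.

```python
from urllib.parse import quote


class NamespaceRegistry:
    """Hypothetical registry: each namespace prefix belongs to one application."""

    def __init__(self):
        self._owners = {}  # normalized namespace, e.g. "/bank/" -> application name

    @staticmethod
    def _normalize(namespace: str) -> str:
        # The trailing slash stops "/bank/" from matching "/bankers/".
        return "/" + namespace.strip("/") + "/"

    def claim(self, app: str, namespace: str) -> None:
        ns = self._normalize(namespace)
        for taken, owner in self._owners.items():
            overlaps = ns.startswith(taken) or taken.startswith(ns)
            if overlaps and owner != app:
                raise ValueError(f"{ns} clashes with {taken} owned by {owner}")
        self._owners[ns] = app

    def graph_name(self, app: str, namespace: str, remote_webid: str) -> str:
        ns = self._normalize(namespace)
        if self._owners.get(ns) != app:
            raise PermissionError(f"{app} does not control {ns}")
        # The remote WebID is percent-encoded into a single path segment.
        return ns + "id/" + quote(remote_webid, safe="")


registry = NamespaceRegistry()
registry.claim("bank", "/bank")
registry.claim("blog", "/blog")
print(registry.graph_name("bank", "/bank", "http://some.example/henry#me"))
```

Because overlapping claims by different applications are refused, two applications on the same instance cannot end up coining the same graph name.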

now, that could be a reason for having URIs like


But then you see that applications on different servers will have name clashes too if they
ever merge their databases.

The advantage of using the locally published name is that it would then allow simple dumps
of databases and their merging into remote databases without clashes.

> I'm not saying the old naming was perfect but it worked in a somehow
> consistent fashion for local and remote users.

It was very confusing to me at least, as I point out in CLEREZZA-489. 

And it is furthermore inconsistent with your point above that remote graphs are
intensionally different from the local version.

> Now my application that used this feature is no longer working.

Well that is the problem of having an initial system that is broken.
It will be easy to fix this, and we should fix it well, not do a half job of it, 
because this is a distributed naming problem. 

>> in Clerezza-489 I wrote that one could describe each graph like this in a
>> special Cache graph perhaps.
>> :g202323 a :Graph;
>>     = { ... };
>>     :fetchedFrom <;;
>>     :fetchedBy <;;
>>     :representation <file:/tmp/repr/202323>;
>>     :httpMeta [ etag "sdfsdfsddfs";
>>                      validTo "2012...."^^xsd:dateTime;
>>                     ... redirected info?
>>                     ] .
>> :g202324 a :Graph;
>>     = { ... };
>>     :fetchedFrom <;;
>>     :fetchedBy <;;
>>     :representation <file:/tmp/repr/202324>;
>>     :httpMeta [ etag "ddfsdfsddfd";
>>                      validTo "2012...."^^xsd:dateTime;
>>                     ... redirected info?
>>                     ] .
> If we had bracketing in RDF and our tooling would support it then the
> above might be somehow topical, answer to the question "how to name
> this?" "don't name it".

The above is just a way of writing the contents of the graph and the metadata
in the same file.  That is what the 

 :g202323 = { ... } 

is about. You don't need any special tools for that. If you use Jena to get the graph
named above you would get the content of the brackets. The point is that the content

  :fetchedFrom ..
  :fetchedBy ...

is not in the g202323 graph, but in a separate graph of metadata about graphs.
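
The same separation can be sketched outside of N3. The following is only an illustration under stated assumptions: the class, the graph name and all URLs are invented for the example (they do not reconstruct the URLs elided in the snippet above), and triples are modelled as plain tuples rather than with any RDF library.

```python
from dataclasses import dataclass, field


@dataclass
class GraphStore:
    """Hypothetical store that keeps graph contents and fetch metadata apart."""

    graphs: dict = field(default_factory=dict)  # graph name -> set of triples
    metadata: set = field(default_factory=set)  # triples describing the graphs

    def add_fetched(self, name, triples, fetched_from, fetched_by):
        # The fetched triples go into the named graph itself ...
        self.graphs[name] = set(triples)
        # ... while statements about the fetch go into the metadata graph only.
        self.metadata.add((name, ":fetchedFrom", fetched_from))
        self.metadata.add((name, ":fetchedBy", fetched_by))


store = GraphStore()
store.add_fetched(
    ":g202323",
    [("http://some.example/henry#me", "foaf:name", "Henry")],
    fetched_from="http://some.example/henry",   # made-up source URL
    fetched_by="http://localhost:8080/#server",  # made-up agent URL
)
print(store.graphs[":g202323"])
print(sorted(store.metadata))
```

Asking for the graph by name returns only what was inside the braces; the :fetchedFrom and :fetchedBy statements are found in the metadata set.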

> Please let's proceed issue by issue and make
> sure every brick we place is really solid and separate this from
> visionary long term stuff.

Ok, I hope you see that I introduced nothing new there. It's just an
N3 notation that makes it easy to write things out in an e-mail.

So please consider that point again in that light.

>> Then this API could use information from this graph and information from
>> the user's request
>> to find the correct local graph he wants.
> Still the local graph would have a name, probably - but as I said it's
> irrelevant. Let's deal with the issues at hand, you changed the names
> of graphs (which I agree didn't have the best possible names) with
> names that I think are worse, let's find something we can agree upon.
> (otherwise, please roll back to the version with the original names
> till we find a consensus).

Well I don't think rolling back would improve anything. I think clearly
this was an improvement. But I do think we can do better.

So my thinking is that to reach consensus we can do this with an API, without
deciding what precisely the names should be. The best is just to lay out the following:

 1. mapping from a remote URI to the URI understood by the local triple store
    and back. There should be no name clashes. It should be possible to easily
    extend it to have agent views and temporal views.
 2. method for applications to take hold of legitimate namespaces in such a way that
    a clash of names is not possible.


> Reto
>> Henry
>> PS. Having said that one then may just wonder why local graphs should ever
>> have anything other than
>> local URLs, since every time someone made a copy of a local graph it would
>> be different.

Social Web Architect
