Subject: Re: How to name things with URIs
From: Henry Story
Date: Sun, 15 May 2011 11:28:22 +0200
Cc: clerezza-dev@incubator.apache.org
Message-Id: <5766C5BA-859F-4B2F-8031-40A1132E00B0@bblfish.net>
To: Reto Bachmann-Gmuer

On 15 May 2011, at 03:41, Reto Bachmann-Gmuer wrote:

> I apologize for having wasted (not just my) time in engaging in
> this argument in such a non-constructive way.

Well, we did cover a lot of important issues in using URIs, many of
which this discussion has helped bring back to the fore of my mind.

And we did boil things down to the question of why we need

  urn:x-localinstance:/cache/

rather than just

We both agree that either is an improvement over .cache

>
> Let's talk code. Please have a look at parent/rdf.storage.web I just committed.
>
> What this thread is about is implemented in line 158 in
> http://svn.apache.org/viewvc/incubator/clerezza/trunk/parent/rdf.storage.web/src/main/scala/WebProxy.scala?revision=1103264&view=markup
>
> i.e.: val cacheGraphName = new UriRef("urn:x-localinstance:/cache/" +
> name.getUnicodeString)
>
> The code wouldn't work with cacheGraphName = name because in this
> case, once created the cache would always be provided by a higher
> priority WeightedTcProvider so that the caching Provider (WebProxy)
> never gets considered again and thus cannot update when the cache copy
> is considered outdated.

Ok, so this issue is occurring because you are refactoring WebProxy to be a
TcProvider, which it was not originally. One can of course see the pull to
making it a TcProvider, though perhaps the delete methods are not so useful
there.

> So you're welcome to make suggestions on how it
> should be different,

I have not really had time to study all your local TcProviders, nor
work out how a number can help anything distinguish between one TcProvider
and another. I have to go right now - my sister is calling...
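To make the shadowing argument quoted above concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the `Provider` class, the weights, the labels and the rule "the highest-weight provider holding the name wins" are invented for illustration and are not Clerezza's actual TcProvider API.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Toy stand-in for a weighted provider: a weight plus the graphs it holds."""
    label: str
    weight: int
    graphs: dict = field(default_factory=dict)


def resolve(providers, name):
    """Return the label of the highest-weight provider holding `name`, or None."""
    for provider in sorted(providers, key=lambda p: p.weight, reverse=True):
        if name in provider.graphs:
            return provider.label
    return None


CACHE_PREFIX = "urn:x-localinstance:/cache/"
remote = "http://some.example/profile"  # hypothetical remote graph name

# Case 1: the cached copy is stored under the remote name itself,
# so the higher-weight local store always answers first.
proxy = Provider("web-proxy", weight=0, graphs={remote: "fetch on demand"})
store = Provider("local-store", weight=100, graphs={remote: "cached copy"})
shadowed = resolve([proxy, store], remote)

# Case 2: the cached copy gets its own local name, so only the proxy
# answers for the remote name and can decide when to refresh.
store = Provider("local-store", weight=100,
                 graphs={CACHE_PREFIX + remote: "cached copy"})
via_proxy = resolve([proxy, store], remote)

print(shadowed, via_proxy)
```

In this model the first lookup is answered by the local store and the second by the proxy, which is the behaviour the distinct cache name is meant to guarantee.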
But here is a quick question: why not simply make the ProxyTcProvider a
higher priority than the pure local one?

> but without other proposal and without you
> withdrawing the -1 I have to change it to
> name.getUnicodeString+".cache" which was the last (silently) accepted
> name. I think we both agree that localinstance is better than the
> .cache proposal, so I urge you to revoke your vote.
>
> Cheers,
> Reto
>
> On Sun, May 15, 2011 at 12:17 AM, Henry Story wrote:
>> I would like you first to read through the extensive mail I wrote, which took
>> me some time to write, and think things through.
>>
>>
>> Henry
>>
>> On 14 May 2011, at 22:37, Reto Bachmann-Gmuer wrote:
>>
>>> On Sat, May 14, 2011 at 7:54 PM, Henry Story wrote:
>>>> Btw, I suppose I should say that I am not massively against the suggestion
>>>> you started this thread with. It is more that I am trying to explore this
>>>> more carefully, because it is an important discussion that deserves careful
>>>> thought.
>>> The careful procedure is to have tiny little issues which when
>>> resolved bring a tiny but undisputed improvement. Now with your
>>> resolution of CLEREZZA-463 I'm having massive problems and even if you
>>> think the status quo ante was fundamentally wrong I believe the
>>> graph-renaming you did makes things worse.
>>>
>>> I know that CLEREZZA-463 contains many real improvements. But it also
>>> introduces problems. And not just what you might consider a
>>> philosophical problem that names denote extensionally different things
>>> but also very practical ones.
>>>
>>> One major problem is the permission. We introduced
>>> WebIdBasedPermissionProvider and one implementation
>>> (UserGraphAcessPermissionProvider) used to provide readwrite access to
>>> the profile graph. Now this no longer works because you changed the
>>> names of graphs.
>>> Because of this, and not because of a fundamentally
>>> broken architecture before your patch, applications that used to work
>>> no longer do.
>>>
>>> Your -1 was against urn:x-localinstance:/cache/
>>>
>>> The status quo ante was
>>>
>>> cache graph: .cache
>>> profile-graph:
>>>
>>> with the resolution of CLEREZZA-463 we have
>>>
>>> cache graph
>>> profile graphs for local users:
>>> profile graphs for remote users: /
>>>
>>> you did change some names, probably just because of inconsistent
>>> changes things broke (UserGraphAcessPermissionProvider seems pointless
>>> right now). I don't want to
>>>
>>> and such that because of the renaming of graphs the
>>> UserGraphAcessPermissionProvider
>>>
>>> - The user no longer has the right to write to its own graph
>>> - Because the user graph is now (with your resolution of
>>> CLEREZZA-463) named like
>>>
>>>
>>> In my opinion you exchanged a suboptimal solution for quite a mess;
>>> now you argue against my solution to tidy things up because you are
>>> afraid of having a mess in one year.
>>>
>>> So please either accept my proposal which started this thread as
>>> something that is better than the status quo (i.e. retract your -1 so
>>> I can finally go back to coding) or make a concrete proposal on how to
>>> name the different entities I've been suggesting names for or else
>>> revert the changes for CLEREZZA-463 (so that applications that used to
>>> work work again and we can start a proper development with little
>>> issues and patches that represent undisputed improvements).
>>>
>>>
>>> ==== what I consider important and relevant to current development
>>> ends here ====
>>>
>>>>
>>>> On 14 May 2011, at 17:09, Reto Bachmann-Gmuer wrote:
>>>>
>>>>> On Fri, May 13, 2011 at 5:46 PM, Henry Story wrote:
>>>>>> Reto wrote:
>>>>>>> Clerezza-489 and you also quote my statement of 463. Okay, you might say
>>>>>>> that I'm stating rather than arguing.
>>>>>> :-)
>>>>>>> The argument: they are different things, both intensionally (cache and
>>>>>>> source) and in many cases extensionally (triples may differ).
>>>>>>
>>>>>> in that sense I agree.
>>>>>> But then the other point I made is also true, and that is that different
>>>>>> users may get different
>>>>>> graphs back for the same remote resource. In fact those users may be the
>>>>>> same user at different times. Since those are all different graphs by your
>>>>>> definition above one should also give them different names.
>>>>> We do not have support for this yet and I think it's a feature
>>>>> increasing complexity massively.
>>>>
>>>> You are dealing with an architectural problem which cannot just be dealt
>>>> with in stages. You need to look at the problem as a whole, or you will
>>>> just end up with the problem we are having right now. It is better to get this
>>>> issue cleared up now, than have a mess of graph names in one year, when a lot of
>>>> applications depend on this.
>>> This is kind of against agile mantras and it seems to contrast very
>>> strongly with what you just did: you changed the names and now want a
>>> scientific study to change them again to solve the problems your
>>> name change introduced.
>>>
>>>>
>>>> In any case it's not increasing anything massively, it is the logical
>>>> continuation of your point above.
>>> If you propose a patch which changes names and deliver good arguments
>>> why the new names are massively better and support future use cases
>>> without any disadvantage for addressing the current use cases then I'm
>>> sure this gets accepted; what you did is mix this name change into a
>>> whole bunch of patches.
>>>
>>>>
>>>> Your argument was:
>>>>
>>>> "they [the remote and the locally fetched graph] are different things, both
>>>> intensionally (cache and source) and in many cases extensionally (triples may differ)."
>>>>
>>>> And so it follows that graphs sent at different times may also differ
>>>> extensionally and should have different names too.
>>>
>>> No, we are talking about MGraphs here. I know transtemporal identity
>>> is a hard problem philosophically yet in practice we have quite strong
>>> intuitions on what we consider to be the same thing over time. The
>>> Google website remains the Google website even if they change the
>>> design; the same goes for the Wikipedia page about Google: it remains the
>>> Wikipedia page about Google (with the same URI) even after it was
>>> changed. One never becomes the other.
>>>
>>>>
>>>> You can't have it both ways, argue on intentionality for different names and then
>>>> refuse to see that temporally different graphs would also then need different names.
>>> I was talking about intensionality. Two terms have the same intension
>>> only if in the same universe of evaluation and at the same point in
>>> time they have the same extension.
>>>
>>>>
>>>> ( Btw. there are good arguments that intentionally the local graph if it is a cache
>>>> does not differ from the remote one. In any case if you pursue this too far you will
>>>> find that you can never name any remote thing. )
>>>>
>>>>> I don't think that clerezza-490 needs to be resolved urgently, but anyway we
>>>>> should proceed issue by issue, and the best resolution of an issue is a minimal
>>>>> resolution, not one that tries to foresee any future issues.
>>>>
>>>> I tend to see logical consequences of an argument as being contained in the argument,
>>>> and not being future issues that can be looked at later as somehow being distinct.
>>> yes, but:
>>> 1. analysing till the very bottom inevitably leads to paralysis.
>>> 2. this is inconsistent with your intuition-based name change without discussion
>>> 3. We have problems needing a fix (only to be as good as before your
>>> patches) and you're not making a concrete proposal
>>>
>>>>
>>>> Clerezza-490, which deals with different ways the server can present itself to other
>>>> servers, is not of course something that needs to be implemented immediately. But it
>>>> would be good if the naming solution we come up with can be extended to that case
>>>> and to the temporal case.
>>>>
>>>> So I am invoking Clerezza-490 as something to help test the naming ideas being put
>>>> forward here. This is a logical test if you will.
>>> see above
>>>
>>>>
>>>>>> So local graph naming schemes should take that into account, which is why I
>>>>>> suggest that we have an API that can allow for extensibility here.
>>>>> We currently have things and we are naming them badly.
>>>>>
>>>>> Prior to your webproxy we had:
>>>>> .cache as name for the cache of the webprofile
>>>>> and
>>>>> as uri for triples the user generated locally,
>>>>> this can be seen as extensions to the remote profile with information
>>>>> (like preferred language) that happen not to be in the remote profile
>>>>>
>>>>> which was consistent with local users who only had
>>>>> for the triples they control which include both
>>>>> the regular profile as well
>>>>
>>>> yes, and both of those were not good solutions.
>>>> The .cache solution is bound to create a problem if someone remotely has
>>>> a URI named http://some.example/resource.cache
>>>>
>>>> It is bound to lead to nasty name clashes, with the same URI naming two different things.
>>> right, I'm admitting it wasn't ideal - but I prefer the seldom
>>> clashes to the ambiguity by design.
>>>
>>>>
>>>> Remote URIs are named by remote resources, so it makes more sense to use the URI of the
>>>> remote resource to name the graph of the remote resource. The remote resource was named
>>>> by the owner of the resource. We should respect that.
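The `.cache` clash described above, and the way a local URN prefix avoids it, can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the prefix is the one from the `cacheGraphName` line quoted earlier in the thread, and the example URIs are the `some.example` ones used above.

```python
CACHE_PREFIX = "urn:x-localinstance:/cache/"


def suffix_name(remote_uri):
    """Old scheme: the cache name is the remote URI plus '.cache'."""
    return remote_uri + ".cache"


def prefix_name(remote_uri):
    """Proposed scheme: the cache name is a local URN wrapping the remote URI."""
    return CACHE_PREFIX + remote_uri


def remote_of(cache_name):
    """Invert prefix_name; reject names outside the cache namespace."""
    if not cache_name.startswith(CACHE_PREFIX):
        raise ValueError("not a cache graph name: " + cache_name)
    return cache_name[len(CACHE_PREFIX):]


resource = "http://some.example/resource"
other = "http://some.example/resource.cache"  # a legitimate remote URI

# Suffix scheme: the cache of `resource` and the remote graph `other`
# end up with the same name.
suffix_clash = suffix_name(resource) == other

# Prefix scheme: a urn: name can never equal an http: URI.
prefix_clash = prefix_name(resource) == other

# The prefix mapping is also reversible, even for URIs ending in ".cache".
round_trip = remote_of(prefix_name(other))
```

The reversible mapping matters for the "and back" part of the requirement discussed later in the thread: given a cache graph name, the remote URI can be recovered without a lookup table.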
>>> so we should not do caching, as the uri prefix http implies
>>> a preferred method for retrieving the resource which is definitely
>>> different from getting it out of a local tdb store
>>>
>>>>
>>>> If there are local additions to a remote graph, they should be given a local
>>>> URI. There is nothing simpler than this solution it seems to me.
>>>>
>>>>>
>>>>> Now is the cache,
>>>>
>>>> You can look at it that way, or you can think of it as the name of the remote
>>>> graph, with the contents being the cache of the remote graph.
>>>>
>>>> If you were to make the local graph available publicly, it would then of
>>>> course need to have a local url tied into your namespace. Perhaps this is a good
>>>> way to think of the distinction.
>>>
>>> I'm not saying your proposal is absurd, but you introduced it in a way
>>> that breaks things and without discussion. Now that I want to clean up the
>>> mess you start writing socio-philosophical essays
>>>
>>>>
>>>>
>>>>> not sure where additional
>>>>> triples added locally get stored, i.e. where triples added to
>>>>> webIdGraphsService.getWebIDInfo(webId).publicUserGraph are stored.
>>>>
>>>>
>>>> They should be stored in graph names with a local URL clearly since these are
>>>> being stored by a local agent. And I think it will be application specific what
>>>> the names of those graphs should be.
>>>>
>>>> So currently as an initial proposal I put them in
>>> as a proposal ok, but you changed something that was working without
>>> discussing the consequences of this, e.g. for permissions.
>>>
>>>
>>>> Now imagine there are 2 or 3 applications on a clerezza instance, that a remote
>>>> user with his WebID uses. There is no reason these applications should be putting
>>>> all the information they generate for that user in the same local graph.
>>>>
>>>> A banking graph should put banking info in its graph and a blogging graph into its graph.
>>>> The way to do this is to give applications - like users - access to namespaces.
>>>> Perhaps the bank application that was given control of the /bank namespace could
>>>> coin graphs for remote users in that space, e.g. /bank/id/{remoteWebID} and the
>>>> blogging one in /blog/id?{remoteWebID} .
>>>>
>>>> By giving apps access to name spaces you can also make sure that there won't be any clashes.
>>> there is nothing that prevents applications from making their own graphs
>>> for user information.
>>>
>>>>
>>>> now, that could be a reason for having URIs like
>>>>
>>>> mvn:/dev.net/application1/?user=webid...
>>>>
>>>> But then you see that applications on different servers will have name clashes too if they
>>>> ever merge their databases.
>>>>
>>>> The advantage of using the local published name is that this then would allow simple
>>>> dumps of databases and their merging in remote databases without clashes.
>>>>
>>>>> I'm not saying the old naming was perfect but it worked in a somehow
>>>>> consistent fashion for local and remote users.
>>>>
>>>> It was very confusing to me at least, as I point out in CLEREZZA-489.
>>>>
>>>> And it furthermore is inconsistent with your point above that remote graphs are
>>>> intentionally different from the local version.
>>>>
>>>>> Now my application that used this feature is no longer working.
>>>>
>>>> Well that is the problem of having an initial system that is broken.
>>>> It will be easy to fix this, and we should fix it well, not do a half job of it,
>>>> because this is a distributed naming problem.
>>> I'm tired. I've nothing against a concrete counter proposal against
>>> the one that started the thread, e.g. saying: "we must give every
>>> instance a unique-id and this should be part of the
>>> x-localinstance-uri"
>>>
>>>
>>>>
>>>>>
>>>>>> in Clerezza-489 I wrote that one could describe each graph like this in a
>>>>>> special Cache graph perhaps.
>>>>>> :g202323 a :Graph;
>>>>>>   = { ... };
>>>>>>   :fetchedFrom
>>>>>>   :fetchedBy
>>>>>>   :representation ;
>>>>>>   :httpMeta [ etag "sdfsdfsddfs";
>>>>>>     validTo "2012...."^^xsd:dateTime;
>>>>>>     ... redirected info?
>>>>>>   ] .
>>>>>>
>>>>>> :g202324 a :Graph;
>>>>>>   = { ... };
>>>>>>   :fetchedFrom
>>>>>>   :fetchedBy
>>>>>>   :representation ;
>>>>>>   :httpMeta [ etag "ddfsdfsddfd";
>>>>>>     validTo "2012...."^^xsd:dateTime;
>>>>>>     ... redirected info?
>>>>>>   ] .
>>>>>
>>>>> If we had bracketing in RDF and our tooling supported it then the
>>>>> above might be somehow topical, answer to the question "how to name
>>>>> this?" "don't name it".
>>>>
>>>> The above is just a way of writing the contents of the graph and the metadata
>>>> in the same file. That is what the
>>>>
>>>> :g202323 = { ... }
>>>>
>>>> is about. You don't need any special tools for that. If you use Jena to get the graph
>>>> named above you would get the content of the brackets. The point is that the content
>>>> from
>>> Also in jena the graphs have a name, the very profane sequence of
>>> characters this discussion was about. So in clerezza or in jena in the
>>> metadata graph you have a name instead of {...} and for this name you
>>> will get {...} from the named graph store.
>>>
>>>>
>>>> :fetchedFrom ..
>>>> :fetchedBy ...
>>>>
>>>> is not in the g202323 graph, but in a graph metadata graph.
>>> obviously
>>>>
>>>>> Please let's proceed issue by issue and make
>>>>> sure every brick we place is really solid and separate this from
>>>>> visionary long term stuff.
>>>>
>>>> Ok, I hope you see that I introduced nothing new there. It's just an
>>>> n3 notation that makes it easy to write things out in an e-mail.
>>> an n3 notation that omits exactly what this discussion is about,
>>> namely my naming proposal and your -1 against it.
>>>
>>>>
>>>> So please consider that point again in that light.
>>>>
>>>>>>
>>>>>> Then this API could use information from this graph and information from
>>>>>> the user's request
>>>>>> to find the correct local graph he wants.
>>>>> Still the local graph would have a name, probably - but as I said it's
>>>>> irrelevant. Let's deal with the issues at hand: you replaced the names
>>>>> of graphs (which I agree didn't have the best possible names) with
>>>>> names that I think are worse, let's find something we can agree upon.
>>>>> (otherwise, please roll back to the version with the original names
>>>>> till we find a consensus).
>>>>
>>>> Well I don't think rolling back would improve anything. I think clearly
>>>> this was an improvement. But I do think we can do better.
>>> It's a mixture of improvements and deterioration. Following the
>>> right process avoids the deteriorations
>>>
>>>
>>>>
>>>> So my thinking is that to reach consensus we can do this with an API, without
>>>> deciding what precisely the names should be.
>>> Stop: I disagree with your new names and we have problems because of
>>> your name changes and now you don't want to decide about names?!
>>>
>>>> The best is just to lay out the
>>>> requirements:
>>>>
>>>> 1. mapping from a remote URI to the URI understood by the local triple store
>>>> and back. There should be no name clashes. It should be possible to easily extend
>>>> to have agent views and temporal views.
>>>>
>>>> 2. method for applications to take hold of legitimate namespaces in such a way that
>>>> a clash of names is not possible.
>>>
>>> If any proposal for changing names satisfies one of your criteria less
>>> than the status before the proposal, your applying the argument to the
>>> concrete proposal is welcome.
>>>
>>> Reto
>>>
>>>
>>>>
>>>>
>>>> Henry
>>>>
>>>>
>>>>>
>>>>> Reto
>>>>>
>>>>>> Henry
>>>>>> PS. Having said that one then may just wonder why local graphs should ever
>>>>>> have anything other than
>>>>>> local URLs, since every time someone made a copy of a local graph it would
>>>>>> be different.
>>>>
>>>> Social Web Architect
>>>> http://bblfish.net/
>>>>
>>>>
>>
>> Social Web Architect
>> http://bblfish.net/
>>
>>

Social Web Architect
http://bblfish.net/