incubator-clerezza-dev mailing list archives

From Reto Bachmann-Gmuer <reto.bachm...@trialox.org>
Subject Re: sketch of a compromise solution -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540
Date Fri, 03 Jun 2011 11:29:58 GMT
On Wed, Jun 1, 2011 at 7:05 PM, Henry Story <henry.story@bblfish.net> wrote:
[...]

> > Yes, TcManager is the main entry point to the rdf data. I don't see any
> > code smell here. In a classical RDBMS java application a component will
> > typically use jdbc and rely on other components that use jdbc, here it's
> > TcManager instead of jdbc,
>
> The thing to do would be to look at the number of  connections that are
> generated to the
> DB. My feeling is that one could do the same and have only 1 or 2
> connections to the DB.
> Here we could quickly end up with 10 times more....
>
I don't know what you mean by DB connections and where you see this factor
10.

[...]

> >
> >>  On my fresh install of ZZ that is 20 times more information than the
> >> initial graph.
> >>
> > - What is 20 times more?
> > - What do you mean by "get"?
> >
> > A graphnode points to a resource and is designed for browsing from
> > resource to resource. It is not a graph but a node in a graph. The object
> > is associated to a base-graph which is used to identify the properties and
> > instantiate another graphnode when hopping to a property value. The
> > underlying graph could for instance be Timbl's GGG (giant global graph
> > aka the web).
>
> yes  ((but clearly you don't want to dereference the whole web when
> working...))
>
no, and nobody is doing this. If I include the uri pattern
<http(s)://.*/(.*/)*> I'm not blasting this mail to the size of the web.


>
> Also there will be contradictions in the information on the web. Some
> people may trust some graphs, other trust others.

Right, that's why the GraphNodeProvider trusts only the content-graph (which
is trusted qua being a platform service) and the graph resulting from
dereferencing the resource (trusted by conventional web-trust).


> Graphs can be merged easily in RDF - IF they  are believed both to be true.
> But what is believed to be true will depend on what possible world you
> believe yourself to be in. I argued this in "Beatnik: change your mind"
> in more detail, if that helps for people following this discussion
>
>  http://blogs.oracle.com/bblfish/entry/beatnik_change_your_mind
>
> From the point of view of WebID and security I want to be able to tell WHO
> said what. In many applications being able to be very clear about where
> something was said is going to  be essential to giving good feedback. Some
> example coming from the field I am working on below.
>

> So for a foaf-browser, I want to know when TimBl declares someone to be a
> friend, and differentiate that from when someone declares himself to be a
> friend of TimBL, which is a very different thing.

With the current service you have what TimBL says plus the platform-wide
truths of the content-graph. This may contain things like a link back to you
(the owner of the platform instance) or a statement like "TimBL rdf:type
ex:Spammer", which might not be published in TimBL's profile.


> When I get Dan Brickley's graph I may want to know all the people he
> mentions in his foaf profile - even if he does not mention them as
> foaf:knows related to him.

does this provide a new point?


> If the GraphNodeProvider returns a union graph of the documentation graph,

Again no, we're not returning a union graph, we're returning a GraphNode; the
underlying graph is an implementation detail (I was wondering whether the
getGraph-method could be made less visible (protected or private) to avoid
this confusion).


> content graph,... and his foaf profile then when searching for all the
> foaf:Person

You don't search a GraphNode for all foaf:Person but the GraphNode
represents the foaf:Person you asked for.


> I will get the documentation writers too, the writers of content in the
> content graph, and who knows what else...

you will have properties pointing from that person to all the comments he
left on the local instance, which can be quite handy (and which are from the
underlying content graph as they are probably not also contained in the
remote foaf:profile)


> many people will have no direct relation to Dan at all. People can say true
> things about Dan but those may not be things Dan himself would say.
>
Yes, we only consider as true what we say ourselves (i.e. the content graph)
and in particular circumstances also what Dan says.


>
> I believe these use cases are not limited to the foaf browser but to a very
> large category of semantic web applications. Give me some linked data
> application, and I will easily come up with use cases of the same kind.
>
That's why GraphNodeProvider is a generic service, and it's not true that it
was designed with a particular and very specific application of mine in
mind.
[...]

> >
> > I thought I had heard people mention issues with speed on this list.
> >>
> > You may check archives of this list at:
> > http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/
>
> Do you have a precise thread?
>
No, it's you who thought he heard people mention issues with speed. Check the
archive and construct an argument if those speed issues relate to the issue
at hand, otherwise your remark "I thought I had heard people mention issues
with speed on this list" is purely demagogic and a hindrance to an effective
and fruitful discussion. (like saying "I've heard people finding the clerezza
code hard to read" when justifying a -1 against some code)

> [...]
> >>
> >> What I am wondering is in what cases is this needed? It seems like this
> may
> >> indeed what a particular application may require, but does it have to be
> >> a general service? The name certainly suggests a very general service,
> not
> >> one required for a particular application.
> >>
> > This is about ContentGraphProvider then, not about issue 540. It's the
> > ContentGraphProvider which provides the graph of instance-wide and public
> > information for the platform
>
> 540 the GraphNodeProvider delegates decisions to the ContentGraphProvider
> 544 uses the GraphNodeProvider that delegates to the ContentGraphProvider
> and ties it into the core,
>    so that when a JSR311 class requests a named graph it then gets whatever
> the ContentGraphProvider decides
>    is trusted content.
>
I don't see any link to JSR311 but yes, the ContentGraphProvider provides
content that is  public and platform-wide trusted by default (the system
graph has higher trust).


>
> SKETCH OF A COMPROMISE SOLUTION
> ===============================
>
> Perhaps there is a way to make things more transparent, by for
> example having JSR311 classes that really
> want the full union returned by the  current ContentGraphProvider to have
> that, and other applications to get something more limited.
>
Again I see no link to JSR311. But I think this might be the mentioned
possible future enhancement to allow clients to specify the trust
boundaries.


> I suggest that we think of naming the content graph or at least build
> something so that those who need the content graph can ask for it clearly,
> and those who don't can make sure they don't get it.
>
The content graph is named, the virtual content graph isn't. We could
give the virtual content graph a name, thus making it accessible in sparql
queries (using a TcProvider), but this seems completely unrelated to the
issue at hand.


>
>  - It may be useful then to name the content graph - it would be a union
> graph that could be specified by a SPARQL UNION
>    query for example or the equivalent.
>
Yes, if we give the virtual content graph a name it can be used by the sparql
endpoint


>
>  - have a JSR311 class return a NamedGraphNode (or something like that)
> which can then call the CallbackRenderer.
>
What should a NamedGraphNode be? GraphNode can wrap a named or an anonymous
resource, so I see no need for a subclass for UriRef-GraphNodes. But we have
been discussing this in another thread, and I don't see a relation to the
issue at hand.

I don't know what you mean by the NamedGraphNode calling CallbackRenderer,
the CallbackRenderer is called by the renderlet.

>    The NamedGraphNode could name the union graph, or could be a union graph
> - by reference. So all the code would do is
>
>    return new UnionNamedGraphNode(new
> UriRef("urn:x-localhost/contentGraph"),new UriRef("
> http://remote.example/resource/"))
>

I don't get which one is the resource and which one the name of the
underlying graph, or what the difference to current GraphNodes is.


>
>    or some nice syntactic sugar for that. For example for apps requiring
> the union of Content + other  that you need,  something like the following
> would be neat
>
>    return new ContentGraphNodePlus(UriRef("http://remote.example/resource/
> "))
>

If you're talking about GraphNodes (but I'm not sure whether you in fact mean
graphs) then I don't see what you're introducing that is new:

return new GraphNode(new UriRef("http://remote.example/resource/"), new
UnionMGraph(contentGraph, fooGraph, barGraph))
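
To make the point concrete, here is a minimal self-contained sketch of what "a resource plus a union graph" amounts to. It does not use the Clerezza classes: `UnionGraphSketch`, `triple`, `union` and `objects` are hypothetical stand-ins, triples are modelled as plain string lists, and the URIs are made up. Unlike a real union graph, which would typically be a live view, this union is a one-off copy.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration; these are NOT the Clerezza classes.
final class UnionGraphSketch {

    // A triple is modelled as a [subject, predicate, object] list.
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    // Union of graphs: a triple is in the union if it is in any member graph.
    // This copies the triples; it is not a live view over the members.
    @SafeVarargs
    static Set<List<String>> union(Set<List<String>>... graphs) {
        Set<List<String>> result = new LinkedHashSet<>();
        for (Set<List<String>> g : graphs) {
            result.addAll(g);
        }
        return result;
    }

    // Browsing from a resource: the objects reachable via a property,
    // evaluated against whichever base graph the caller chose.
    static Set<String> objects(String resource, String property,
                               Set<List<String>> baseGraph) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> t : baseGraph) {
            if (t.get(0).equals(resource) && t.get(1).equals(property)) {
                result.add(t.get(2));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String danbri = "http://remote.example/resource/danbri"; // made-up URI

        Set<List<String>> contentGraph = new LinkedHashSet<>();
        contentGraph.add(triple(danbri, "foaf:knows", "urn:x-local:joe"));

        Set<List<String>> profileGraph = new LinkedHashSet<>();
        profileGraph.add(triple(danbri, "foaf:knows", "http://remote.example/timbl"));

        // Same resource, two different base graphs chosen by the caller.
        System.out.println(objects(danbri, "foaf:knows",
                union(contentGraph, profileGraph)));
        System.out.println(objects(danbri, "foaf:knows", profileGraph));
    }
}
```

The resource stays the same in both calls; only the base graph chosen by the application code differs, which is the choice the one-line return statement above makes explicit.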


>
>    These objects would just be holders of the graph name(s), which the
> TcManagers can then hook up into the underlying triple store. Something
> along those lines would be very nice. One could easily write applications
> that get union of contents as you wish, and I could easily get very
> precisely defined graphs for security based application, or more flexible
> linked data graphs too.

You can do this (see example return statement above)


> It could also avoid the iterative way the GraphNodeProvider currently
> works.
>
Is this a reference to the if-then statements you criticized but never told me
what you mean despite me repeatedly asking? Or what "iterative way" are you
referring to?


>
> Having something like that would mean  that perhaps the  addition of the
> new method in CallbackRenderer
>
>   public void render(UriRef resource, GraphNode context, String mode,
>                        OutputStream os) throws IOException;
>
GraphNode is resource+context, where the context is a graph. Now the
renderlet gets a GraphNode to render; it shouldn't get any context from
anywhere else. The new render method is exactly for rendering a resource with
a different context not available directly to the renderlet. If the outer
renderlet already has (or can generate) the context for the nested rendering
then this can be done with existing (pre 540) infrastructure.


>
>
> would no longer be needed, or would be adapted somewhat.

what?


> It could also mean that the GraphNodeProvider could be a lot more general,
> as its name indicates it should be. The information about graphs hard coded
> into the provider could then be moved to a Graph (or GraphNode or NamedGraph
> or NamedGraphNode object). It would then be a lot clearer when looking at
> JSR311 code what was being returned.
>
Again this is not specific to jsr311 code. I think my proposed
GraphNodeProvider is quite generic but that additional features could be
added.


>
> [[ ps: a thought
> One could perhaps write implementations of such a NamedGraph that would
> perhaps allow links to be followed outward (from accepted named graphs to
> other graphs it links to, up to a certain number of hops).
> ]]
>
Which seems to be exactly the context-switch allowed by ZZ-544


>
> >> Perhaps changing the name from GraphNodeProvider to
> >> ContentGraphPlusOtherProvider would make more sense.
> > It's a platform service that provides GraphNodes. Being a platform
> > service implies it uses the platform means of getting trusted content. If
> > it would just dereference URIs then it would probably be placed in a
> > subpackage of clerezza.rdf.
>
> perhaps. But why not make things nice and general as explained above?
>
Where do you make something nice and more general? You're describing how
clients can do stuff without GraphNodeProvider, which they of course can do.
And you're proposing new classes for what it seems they can do as easily (but
more consistently and thus more elegantly) with the existing classes.



> Currently with changes to 544 and in particular the render method
>
>  public void render(UriRef resource, GraphNode context, String mode,
>                        OutputStream os) throws IOException;
>
>
> when a JSR311 class returns a URI,

A JSR311 resource method returns a GraphNode; if it returns a URI then
type-rendering is not used (but another MessageBodyWriter, if available)


> the renderer does not get the graph named by that
> URI

No renderlets get invoked. But if the renderer gets a GraphNode, the renderlet
gets that GraphNode, which allows exploring the resource with whatever graph
the jax-rs resource method chose to use. Choosing this graph is the business
of the application logic and certainly does not belong into the renderlet.


> but that graph and something else, defined in some unrelated package. For
> me this
> does not make it easy to understand the code.
>
Obviously you don't. It would be good if we found a way to improve
understanding of the clerezza architecture without blocking the evolution by
casting -1


>
>
> >>> This might not match an intuitive understanding of "authoritative" and
> >> I'm
> >>> happy to redefine the issue so that no confusion arises.
> >>
> >> One thing I am not quite clear about yet, is who writes to the content
> >> graph? I see a lot of modules use it.
> >>
> > Modules can write to the content graph or add temporary additions to it.
> > Actually writing to the content graph should happen when public and
> trusted
> > information is added. An information is considered trusted when added by
> a
> > user with respective permission or verified by privileged code (e.g. that
> > allows the public to add see-also references).
>
> Good so say a trusted user of mine :joe truthfully says
>
What do you mean by "trusted user"? Trust with no limits? (admin rights?)



>
>  b:danbri foaf:knows :joe .
>
> then currently when I ask for http://danbri.org/foaf.rdf#danbri
> I will get a graph that contains the above triple even if danbri does not
> make that
> claim. Sometimes that is good, and sometimes not.

Sometimes it's good to use clerezza, sometimes a hammer is more appropriate
;)


> In many cases as I have argued it will be
> important for me to know what danbri claims. Perhaps so I can ping him to
> tell him about
> my desire for him to claim friendship with me.
>
There's nothing to prevent you from writing such an application or that would
make it hard; it's just not what the GraphNodeProvider is for and it
definitely doesn't belong into the renderlet


>
> In the current API changes it won't be clear at all why when I ask for
>
>    <http://danbri.org/foaf.rdf#danbri>
>
> I get <http://danbri.org/foaf.rdf#danbri>  + 5 other graphs.

graph/resource distinction, what does the addition of a person and a graph
result in?


> Or it will require the developer
> to know the internals of clerezza to work this out, as I have just had to
> do myself.
>
It may well be that clerezza will support a sophisticated provenance
mechanism in future. Not sure however if the blocking of patches for the
existing base architecture fosters this development.

[...]
>

> >
> >> 4. But instead of just having a GraphNodeProvider that just returns the
> >> graph, you have added some twists to
> >>  it and return more than just the named graph. There is nothing to say
> that
> >> a named graph cannot be the union
> >>  of many other graphs, but it seems really arbitrary for me to get the
> >> documentation of clerezza along with the
> >>  triples of Tim Berners Lee's graph.
> >>
> >
> >>  Somehow things have gone a bit haywire at the end here.
> >
> > If you call getGraph on a GraphNode you're leaving the scope of the
> > GraphNode. Probably all this discussion would not be necessary if you had
> > been using getNodeContext instead of getGraph. The NodeContext is what is
> > related to the node. Using getGraph is a bit like doing the following:
> >
> > File file = SomeService.getFileDescribing("Tim Berners Lee")
> > file.getParent().getParent().getParent().listChildrenRecursively()
>
> I don't think that is a good way of looking at what graphs are useful for.
> Graphs are more
> like bubbles in a comic strip.
>
Yes, but here it's not about graphs but resources (interpreted in a huge
universe of beliefs)
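
The getGraph versus getNodeContext distinction quoted above can be sketched roughly as follows. This is not the Clerezza implementation: `NodeContextSketch` and its methods are hypothetical, and the node context is simplified here to the triples that directly mention the node as subject or object (any expansion over blank nodes is left out).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; NOT the Clerezza GraphNode class.
final class NodeContextSketch {

    // A triple is modelled as a [subject, predicate, object] list.
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    // Simplified node context: only the triples that mention the node
    // as subject or as object. Blank-node expansion is omitted.
    static Set<List<String>> nodeContext(String node, Set<List<String>> baseGraph) {
        Set<List<String>> context = new LinkedHashSet<>();
        for (List<String> t : baseGraph) {
            if (t.get(0).equals(node) || t.get(2).equals(node)) {
                context.add(t);
            }
        }
        return context;
    }

    public static void main(String[] args) {
        Set<List<String>> baseGraph = new LinkedHashSet<>();
        baseGraph.add(triple("ex:timbl", "foaf:name", "Tim Berners-Lee"));
        baseGraph.add(triple("ex:joe", "foaf:knows", "ex:timbl"));
        baseGraph.add(triple("ex:doc", "dc:title", "Platform documentation"));

        // The node context stays close to the resource ...
        System.out.println(nodeContext("ex:timbl", baseGraph).size()); // prints 2
        // ... while the whole base graph also holds unrelated triples.
        System.out.println(baseGraph.size()); // prints 3
    }
}
```

Reading the whole base graph is the analogue of listing the file's great-grandparent directory recursively: it includes the documentation triple, which has nothing to do with the resource.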


>
> I argue this very carefully in "Are OO languages Autistic?"
>
>  http://blogs.oracle.com/bblfish/entry/are_oo_languages_autistic
>
> This is a fundamental new programming element provided in the semantic web.
>
> So the context as you are defining it is not what I am looking for. I am
> really looking for the named graph - the entire claim made by a resource.

We don't have the notion of claims made by a resource. But it would be easy
to add a method to GraphNodeProvider returning only what the web offers as
context of a resource
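
A rough sketch of what such an additional method could look like, under stated assumptions: `ProviderSketch`, `get` and `getWebOnly` are hypothetical names, not the API proposed in CLEREZZA-540, and dereferencing is faked with an in-memory map keyed by resource URI instead of real HTTP requests.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; NOT the GraphNodeProvider from CLEREZZA-540.
final class ProviderSketch {

    private final Set<List<String>> contentGraph;
    // Stand-in for dereferencing: resource URI -> triples found on the web.
    private final Map<String, Set<List<String>>> web;

    ProviderSketch(Set<List<String>> contentGraph,
                   Map<String, Set<List<String>>> web) {
        this.contentGraph = contentGraph;
        this.web = web;
    }

    // Platform-trusted view: content graph plus the dereferenced description.
    Set<List<String>> get(String uri) {
        Set<List<String>> result = new LinkedHashSet<>(contentGraph);
        result.addAll(getWebOnly(uri));
        return result;
    }

    // Assumed extra method: only what the web offers for this resource.
    Set<List<String>> getWebOnly(String uri) {
        Set<List<String>> remote = web.get(uri);
        if (remote == null) {
            return Collections.emptySet();
        }
        return new LinkedHashSet<>(remote);
    }

    public static void main(String[] args) {
        String danbri = "http://remote.example/resource/danbri"; // made-up URI

        Set<List<String>> content = new LinkedHashSet<>();
        content.add(Arrays.asList(danbri, "foaf:knows", "urn:x-local:joe"));

        Set<List<String>> profile = new LinkedHashSet<>();
        profile.add(Arrays.asList(danbri, "foaf:name", "Dan"));

        Map<String, Set<List<String>>> web = new HashMap<>();
        web.put(danbri, profile);

        ProviderSketch provider = new ProviderSketch(content, web);
        System.out.println(provider.get(danbri));        // local and remote triples
        System.out.println(provider.getWebOnly(danbri)); // remote triples only
    }
}
```

With both methods side by side, a client that needs to know what the remote source itself says picks the second one, while everything else keeps using the trusted view.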


> This can be seen by considering the example I gave above where someone adds
> to the content graph information about Dan Brickley
>
>    b:danbri foaf:knows :joe .
>
> If I only get Dan Brickley's graph back that triple will not be there. If I
> get Dan Brickley's  + the content graph, then that information will appear
> even if I just ask for dan's node context. Also there may be information
> about
> people appearing in Dan Brickley's profile that are not directly linked by
> him, that I will
> also be interested in retrieving.
>
Use the render(uriRef) method to have those people rendered in their context.


> So the context is not the tool I need - and I don't think my use cases are
> special.
>
In the usecase of telling Dan that a true statement is missing, indeed what
is provided by ZZ-540 is probably not what you need. But I think this
usecase is more special than seeing all the comments a person posted and
other facts which are assumed to be true (by platform trust boundaries)
about a person. But this discussion is pointless as one feature doesn't
prevent the other from being implemented.


>
> >
> > The listed files can contain things that are completely unrelated to Tim
> > Berners Lee
> >
> >
> >
> >> And I think this is due to a bit of confusion of the needs
> >>  of your application with trying to keep the general architecture clean.
> >>
> > As I said, I did not make this particularly for an application, my wall
> > application is merely a demo. When we want to do something like a
> > foaf-browser we want to be able to display the resource in their context,
> > just a usecase.
>
> Ok, so that is where our disagreement lies. The node context is in many
> case not
> at all what we want. It both adds too much information and not enough.
>
Who is "we"? You have one usecase where one should have less information
accessible via the graphnode; there are other usecases (and imho more) where
we want all the information we trust.

When do we not have enough information?


>
> It may be that in the wall demo that is not visible. But in security
> matters and
> trust matters it will make a big difference.
>
Sorry, this seems like a demagogic null-sentence. Yes, we do care about
speed, we do care about trust and we do care about security. And the
proposed resolution of ZZ-540 and 544 brings an improvement, as it prevents
data from other trust boundaries having to be part of the base graph for the
graphnode returned by a root-resource method.

>
> >>
> >>  Now on the whole I have learnt a lot about Clerezza by following this,
> >> but I just can't say that this looks like
> >> a good long term solution.  We are constantly moving around and around
> >> something.
> >>
> > This is your impression. I hope my explanations to the concrete points
> you
> > mention could help changing this impression.
>
> I think it should now be clear how we can come to a solution that satisfies
> both
> our needs.
>

Yes: you revoke your -1 and you raise an issue for getting a resource
description only from the web for your particular usecase.
[...]


>
> > Would the rename be okay for you to accept the proposed patch? (I really
> > would like to go back to productive work, so I rather have a horrible
> > name than seeing the project stalled by your veto).
>
> Well then the issue would be why this class should appear in the
> CallbackRenderer.
> No I think there should be a way from JSR311 code to ask precisely
> for the
> type of GraphNode it wants with very little coding. So that for the use
> cases
> where walking the content graph is the right thing to do it is one line of
> code,
> and for cases where something more precise is needed it is also just one
> line of
> code. In any case it should be easy when reading the code to understand
> what is going
> to be displayed.
>
I don't think this is particularly hard to do, and with the issue I proposed
above that you raise, even easier.


>
> I hope this helps,

Maybe this thread helps understanding the clerezza architecture better. Yet
blocking development with a -1 seems quite a high price for this.

Reto


>
>        Henry
>
> >
> > Reto
> >
> > PS: You seem to be extensively using your right to veto while ignoring
> > other's veto on your code, looking at
> > https://issues.apache.org/jira/browse/CLEREZZA-515 I see that the
> commits
> > have not been reverted even more than one week after my veto and request
> to
> > revert.
>
> Hmm, I did revert that using git. But I am not sure why that does not
> appear in the
> commits for that issue.... I see you brought that up in another thread.
>
>
> Social Web Architect
> http://bblfish.net/
>
>
