commons-dev mailing list archives

From Peter Ansell <ansell.pe...@gmail.com>
Subject Re: [COMMONSRDF] GroupID for incubation releases
Date Wed, 15 Apr 2015 00:20:11 GMT
On 14 April 2015 at 01:16, Reto Gmür <reto@apache.org> wrote:
> On Sat, Apr 11, 2015 at 12:39 PM, Peter Ansell <ansell.peter@gmail.com>
> wrote:
>
>> On 11 April 2015 at 22:11, Reto Gmür <reto@apache.org> wrote:
>> > On Mon, Mar 30, 2015 at 2:07 PM, Benedikt Ritter <britter@apache.org>
>> wrote:
>> >
>> >> Hello Reto,
>> >>
>> >> 2015-03-30 14:45 GMT+02:00 Reto Gmür <reto@apache.org>:
>> >>
>> >> > Hi all,
>> >> >
>> >> > The clerezza commons RDF proposal that was in the sandbox and is now
>> >> > in the clerezza-rdf-core repository has been changed to use
>> >> > org.apache.clerezza.commons-rdf.
>> >> >
>> >> > As you know, if all goes well clerezza will be based on the result of
>> >> > the incubating project. If however this project should unfortunately
>> >> > not lead to something generic enough to be used for interfacing
>> >> > arbitrary data as RDF and thus be usable for clerezza, then clerezza
>> >> > might reactivate its commons-rdf proposal. It would then be up to
>> >> > commons to decide which proposal to adopt and under which name.
>> >> >
>> >>
>> >> We (the Apache Commons community) have already stated that we don't
>> >> have the necessary knowledge about RDF to make such a decision. I would
>> >> prefer more people from clerezza joining this ML and building consensus
>> >> with the incubating commons rdf community about what an implementation
>> >> of the RDF specification should look like.
>> >>
>> >
>> > It might be hard to reach an acceptable solution if the results of one
>> > year on GitHub are taken as unmodifiable except when there is 100%
>> > agreement on a change.
>>
>> You were privy to all of the public discussions on GitHub, and many of
>> the private discussions just before setting up the project publicly.
>> Please do not imply that this was your first opportunity to bring up
>> these issues, or that we had not responded at all to the issues you did
>> bring up in the past.
>>
>
> The first discussions we had were at the end of 2012 on public apache
> mailing lists.
>
>
>
>>
>> Apart from the structural constraints of having an entire interface
>> driven API, and there being no clear reason why the interface names
>> themselves should be changed to suit you,
>
>
> I've changed clerezza to match the names in rdf-commons classes. I would
> have preferred a decision on a casing convention rather than just a
> statement that these types are named like that and will not change.
>
>
>> do you have other issues
>> where getting ~100 percent agreement has failed and you would like
>> some further discussion?
>>
>
> I've created COMMONSRDF-13 to address one of the main issues.
>
>
>>
>> > Even without having the commons community diving deep into RDF it might
>> > become clear that one API is better suited for triple stores and their
>> > requirements while the other is more suitable for exposing arbitrary
>> > data sources using the RDF model. So if the result is not something that
>> > can be used in all use cases, having two commons projects relating to RDF
>> > but with distinct goals might also be an option.
>>
>> If you could articulate why the current interface based method makes
>> it unsuitable for your use cases it would definitely help. In
>> particular, if you could give some examples of where Clerezza could map
>> a data source to RDF somehow, but it would be impossible with
>> commons-rdf-api, then we can start to discuss it further. Right now you
>> have only implied that the difficulty exists without articulating it.
>>
>
> I've written code that exposes a SPARQL endpoint using the clerezza version
> of RDF Commons, and I've argued that it would be quite hard to do the same
> with the incubating Commons RDF proposal [1]. The reply on the list was
> that we will worry about that later, but in my opinion this shows a
> fundamental limitation of the current approach.

BlankNodes cannot be compared across different SPARQL queries. That is
a well known RDF issue, not just with SPARQL, and is not going to be
solved by anything except bulk execution of a single query to get all
of the BlankNodes back in a single query.
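
To make the scoping point concrete, here is a minimal sketch in plain Java.
It does not use any Clerezza or Commons RDF types: ScopedBlankNode and the
"scope" objects are hypothetical names invented for this illustration. It
models a blank node label as meaningful only within the query execution
that returned it, so the same label coming back from two separate queries
does not compare as equal.

```java
public class BlankNodeScope {

    /** Hypothetical value type: a blank node label plus the query execution that issued it. */
    static final class ScopedBlankNode {
        private final Object scope;  // stands in for one query execution (assumption)
        private final String label;  // e.g. "b0", only meaningful within that scope

        ScopedBlankNode(Object scope, String label) {
            this.scope = scope;
            this.label = label;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ScopedBlankNode)) {
                return false;
            }
            ScopedBlankNode other = (ScopedBlankNode) o;
            // Equal only when both the issuing scope and the label agree.
            return scope == other.scope && label.equals(other.label);
        }

        @Override
        public int hashCode() {
            return 31 * System.identityHashCode(scope) + label.hashCode();
        }
    }

    public static void main(String[] args) {
        Object firstQuery = new Object();
        Object secondQuery = new Object();

        ScopedBlankNode a = new ScopedBlankNode(firstQuery, "b0");
        ScopedBlankNode b = new ScopedBlankNode(firstQuery, "b0");
        ScopedBlankNode c = new ScopedBlankNode(secondQuery, "b0");

        System.out.println("same query, same label: " + a.equals(b));
        System.out.println("different query, same label: " + a.equals(c));
    }
}
```

Under this model, the only way to relate two blank nodes is to fetch both
in the same execution, which is the bulk-query point made above.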

> Besides the SPARQL use case, here's a simple use case for wrapping data as
> RDF:
>
> import java.util.Iterator;
>
> interface Person {
>    String getFirstName();
>    String getLastName();
>    String getDiary();
> }
>
> interface DataBase {
>   Iterator<Person> list();
>   Iterator<Person> filterByLastName(String lastName);
>   Iterator<Person> filterByFirstName(String firstName);
> }
>
> Now the task is to expose this dynamically as RDF (i.e. without duplicating
> the data).
>
> Wrapping this with clerezza, one would wrap the Person instance in a
> BlankNode; the identity of the BlankNode would depend on the identity of
> the Person instance. Doing the same with the incubating commons rdf
> proposal would require keeping a bidirectional weak hash map from
> BlankNode identifiers to Person objects. I don't think many programmers
> would like to do the latter, so I don't think it is currently a suitable
> API for exposing arbitrary data using the RDF datamodel.

If there is no single or aggregate primary key for a person, then it
will not work reliably with any database,
RDF/Graph/Document/NoSQL/Relational/etc. It works internally in
Clerezza, in-memory, with the added BlankNode object wrapper, and
could be mapped using BlankNode.internalIdentifier. It isn't helpful
to use arguments about not liking a method to imply it is impossible.
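
For readers following the thread, the two approaches being compared can be
sketched roughly as below. This is plain Java only: PersonBlankNode and
IdentifierRegistry are hypothetical names made up for the illustration, not
types from Clerezza or commons-rdf-api, and the identifier is just a random
UUID string standing in for whatever an implementation would expose as its
internal identifier. The sketch also assumes Person does not override
equals/hashCode, so the WeakHashMap keys behave as identity keys.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.WeakHashMap;

public class BlankNodeWrapping {

    /** Domain object from the example above, reduced to one accessor. */
    interface Person {
        String getFirstName();
    }

    /** Approach 1 (hypothetical): node identity follows the wrapped instance. */
    static final class PersonBlankNode {
        private final Person person;

        PersonBlankNode(Person person) {
            this.person = person;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof PersonBlankNode && ((PersonBlankNode) o).person == person;
        }

        @Override
        public int hashCode() {
            return System.identityHashCode(person);
        }
    }

    /** Approach 2 (hypothetical): identifiers kept in a two-way, weakly referenced registry. */
    static final class IdentifierRegistry {
        private final Map<Person, String> idsByPerson = new WeakHashMap<>();
        // Stale entries here are not purged in this sketch.
        private final Map<String, WeakReference<Person>> personsById = new HashMap<>();

        synchronized String identifierFor(Person person) {
            String id = idsByPerson.get(person);
            if (id == null) {
                id = UUID.randomUUID().toString();
                idsByPerson.put(person, id);
                personsById.put(id, new WeakReference<>(person));
            }
            return id;
        }

        /** Returns the Person for an identifier, or null if unknown or already collected. */
        synchronized Person resolve(String id) {
            WeakReference<Person> ref = personsById.get(id);
            return ref == null ? null : ref.get();
        }
    }

    public static void main(String[] args) {
        Person alice = () -> "Alice";

        // Approach 1: two wrappers around the same instance are equal.
        System.out.println(new PersonBlankNode(alice).equals(new PersonBlankNode(alice)));

        // Approach 2: the same instance always maps to the same identifier and back.
        IdentifierRegistry registry = new IdentifierRegistry();
        String id = registry.identifierFor(alice);
        System.out.println(id.equals(registry.identifierFor(alice)));
        System.out.println(registry.resolve(id) == alice);
    }
}
```

Both variants only hold up in-memory for the lifetime of the Person
instances; neither gives a stable identity across restarts, which is where
a primary key would be needed.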

Cheers,

Peter
