incubator-clerezza-dev mailing list archives

From Andy Seaborne <andy.seabo...@epimorphics.com>
Subject Re: Clerezza, Stanbol, Jena, Semantic Commons, WDYT?
Date Fri, 12 Nov 2010 15:46:51 GMT
There are various ways to approach the problem: given there is already 
existing code and data, my take is

    https://github.com/afs/JenaSesame

showing the Jena API over Sesame, so you can plug an existing Jena 
application into Sesame storage with very few changes (factory calls to 
connect to the repository).  But the real standard for efficient 
processing is SPARQL because the granularity works better.

JenaSesame provides efficiency by, for example, passing query processing 
down and not executing over a narrow interface (if you want SPARQL 1.1 
over Sesame now, the narrow interface version will work as well).


Even in closed environments, SPARQL 1.1 helps. It's possible to write 
a 3-tier application using a SPARQL-DB as the database layer and stick 
to standards between the business logic and database layers, giving you 
a choice of systems.
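
To make the standards boundary concrete, here is a minimal Java sketch of
the business-logic side, assuming the database layer exposes a SPARQL
Protocol endpoint over HTTP (query sent as the "query" parameter of a GET).
The class name, method name and endpoint URLs are made up for illustration;
they are not taken from any of the systems discussed.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SparqlProtocolSketch {

    /**
     * Builds the GET URL for a SPARQL Protocol query request.
     * The endpoint is the only store-specific part (hypothetical values below).
     */
    static String queryUrl(String endpoint, String sparql) {
        // URLEncoder does form encoding (space -> '+'); switch to %20 so the
        // result is plain percent-encoding. A literal '+' is already %2B here.
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8)
                                   .replace("+", "%20");
        return endpoint + "?query=" + encoded;
    }

    public static void main(String[] args) {
        String query = "ASK { ?s ?p ?o }";
        // Two hypothetical stores: same query text, different endpoint string.
        System.out.println(queryUrl("http://example.org/store-a/sparql", query));
        System.out.println(queryUrl("http://example.org/store-b/sparql", query));
    }
}
```

Under these assumptions, swapping the store only changes the endpoint
string; the query text and the calling code stay the same.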

[[ Steve Harris, from [1]:
Five (boring) reasons why SW technology is good for companies

Strong Standards    Interoperability comparatively good
                    Less vendor lock-in
SPARQL Protocol     HTTP based
                    Fits well in SoA
Schemaless Data     MI / BI
                    Flexibility
Scalability         Billions of triples with open source
                    software, on basic hardware
I18N                UTF-8
                    Language tags
]]

	Andy


[1]
http://axel.deri.ie/presentations/20101111LightningTalksISWC2010.pdf

On 12/11/10 12:43, Henry Story wrote:
>
> On 11 Nov 2010, at 19:01, Andy Seaborne wrote:
>
>> Bertrand suggested that this conversation happen on the clerezza-dev list: main extracts
>> of messages in chronological order:
>
> Hi Andy,
>
> 	Quite a few years ago I suggested that the RDF APIs should be put through
> the Java Community Process in order to standardise the interfaces. At the time
> people told me it was too early to do so. It may be time to revisit this now. Reto's
> work in Clerezza may be a good place to start from. I have not really looked into
> the details of his RDF abstraction, but it seems to work. Of course if Sesame/Jena/
> Mulgara all agreed on one namespace set of interfaces then one could remove one
> layer of abstraction, making things more efficient presumably.
>
>     All I can say is that it is nice to be able to switch between Jena, Sesame and
> Mulgara to see what advantage each has. Doing a comparison between these different
> APIs is a lot of work. I suppose I'll have time to go into the details as I start
> using Clerezza more.
>
> 	Currently I am working on implementing WebID ( http://webid.info/spec ) in
> Clerezza. It is getting to be more and more timely to do so, with tools such as
> FireSheep hitting the headlines, and movements such as SSL everywhere catching on [1].
> DNSSEC is also changing the whole space here [2], as it will make it very easy
> to deploy SSL based servers.
>
>    I really like the way Clerezza is a fully RDF-based CMS (I know it's more than a CMS,
> but that is an easy way to explain what it does), and it is going to make developing
> what people term a Personal Data store - others a Social Web CMS - very easy.
>
>     By the way we want to start a WebID XG at the W3C. If anyone is interested please
> let me know. We already have 4 members who put their name down.
>
>      http://esw.w3.org/Foaf%2Bssl/WebIdWorkingGroup
>
>     All the best,
>
> 	Henry
>
>
> [1] http://esw.w3.org/Foaf%2Bssl/FAQ#Is_SSL_not_really_expensive_server_side_to_Process.3F_To_expensive_for__Google_.3F
> [2] http://www.freedom-to-tinker.com/blog/sjs/major-internet-milestone-dnssec-and-ssl
>
>
>>
>> 	Andy
>>
>>
>> On 08/11/10 21:39, Reto Bachmann-Gmuer wrote:
>>> Hi Jeremy
>>>
>>> One of Clerezza's aims was to use an RDF API that is maximally close to RDF
>>> abstract syntax and semantics; on this RDF core API we have different
>>> façades and utilities as well as a frontend adapter implementing the Jena
>>> API. Related standards like SPARQL and the various serialization formats are
>>> supported as well; respective engines can be added at runtime (when running
>>> in an OSGi container). We decided to design our own API as we found the
>>> various APIs available (Jena, OpenRDF, RDF2Go) would neither be as modular
>>> nor as close to the spec as we wanted them to be. The API comes with the
>>> typical utilities like a command line tool and a maven plugin for the
>>> transformation of vocabularies into classes
>>>
>>> Apart from the core part tightly coupled to RDF and related specs, Clerezza also
>>> provides a framework for implementing REST applications (JAX-RS). The
>>> encouraged design pattern is that requests are answered in terms of RDF
>>> (i.e. a graph and typically a selected resource within this graph); Clerezza
>>> takes care of content negotiation, and for RDF formats the serializer
>>> registered for that media type is used. For non-RDF formats a template
>>> (typically a Scala Server Page) is selected and takes care of the
>>> rendering.
>>>
>>> I described these parts of Clerezza because they seem to be quite close to
>>> what you suggest for commons. As it is hard to share utilities without
>>> having shared APIs for the core stuff our code deals with, I think some
>>> efforts in this area could have the greatest benefit.
>>>
>>> If you have some time, I would like to encourage feedback on the respective
>>> APIs as currently used in Clerezza
>>>
>>> - The core API for (mutable) graphs in:
>>> http://incubator.apache.org/clerezza/mvn-site/org.apache.clerezza.rdf.core/apidocs/index.html
>>> - Utilities (including resource-centric API):
>>> http://incubator.apache.org/clerezza/mvn-site/org.apache.clerezza.rdf.utils/apidocs/index.html
>>>
>>> These two layers are similar to the Graph/Model separation in Jena.
>>>
>>> Cheers,
>>> Reto
>>>
>>
>> On 10/11/10 23:28, Jeremy Carroll wrote:
>>>> - The core API for (mutable) graphs in:
>>>> http://incubator.apache.org/clerezza/mvn-site/org.apache.clerezza.rdf.core/apidocs/index.html
>>>>
>>> http://incubator.apache.org/clerezza/mvn-site/org.apache.clerezza.rdf.core/apidocs/org/apache/clerezza/rdf/core/TripleCollection.html#filter(org.apache.clerezza.rdf.core.NonLiteral,%20org.apache.clerezza.rdf.core.UriRef,%20org.apache.clerezza.rdf.core.Resource)
>>>
>>>
>>> Iterator<Triple>  filter(NonLiteral subject, UriRef predicate, Resource
>>> object)
>>>
>>> vs
>>>
>>> http://jena.sourceforge.net/javadoc/com/hp/hpl/jena/graph/Graph.html#find(com.hp.hpl.jena.graph.Node,%20com.hp.hpl.jena.graph.Node,%20com.hp.hpl.jena.graph.Node)
>>>
>>>
>>> ExtendedIterator<Triple>  find(Node s, Node p, Node o)
>>>
>>> seems to be the fundamental choice.
>>>
>>> The latter was the choice Chris Dollin and I made in 2002/2003 and I
>>> still find it preferable, for program uniformity, to the closer to the
>>> spec choice in Clerezza.
>>> We were writing the spec at the same time, and I always saw it as a
>>> description of a Web exchange format, and not of a programming interface
>>> (for instance implementing RDF Semantics Rec is hard with the Clerezza
>>> interface).
>>>
>>> I am not quite sure what that means in terms of this discussion which is
>>> more procedural than technical.
>>> Like in all things people make different choices and have different
>>> preferences, and a decision to all use the same libraries would be a
>>> restriction in design freedom, on such issues, which might be good, or
>>> might be bad.
>>>
>>> ===
>>>
>>> On
>>> http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/201011.mbox/%3CAANLkTinwbvRUOeMFHh8ohdVvESGM09Z0aFFGSerbWiFZ@mail.gmail.com%3E
>>>
>>> [[
>>>
>>> - graph isomorphism code
>>> ]]
>>>
>>> what are the goals of the Clerezza isomorphism code? The Jena code is
>>> essentially scoped to testing, so that I checked that small pathological
>>> cases were OK, and larger non-pathological cases, but it is not meant to
>>> have production level performance, particularly on graphs for which
>>> something like nauty would be more appropriate.
>>
>> On 11/11/10 10:21, Andy Seaborne wrote:
>> ...
>>> Isn't the model interface operation a more appropriate comparison
>>> because that is what the application sees?
>>>
>>> StmtIterator listStatements(Resource s, Property p, RDFNode o)
>>>
>>> Graph.find is the SPI interface to storage. The Graph level has named
>>> variables, not just RDF terms.  SPARQL uses this, heavily.
>>>
>>> In SPARQL, literals can occur in any position during query processing.
>>> Patterns involving literals as subjects, or as predicates, just simply
>>> don't match the data (section 12.3.1).
>>>
>>> Once upon a time, when we were going Jena1->Jena2, the idea was that the
>>> application API was just one presentation.  There could be other RDF
>>> APIs over the SPI.  There's not been a second RDF presentation API but
>>> the design concept was there and still is.  All the interfaces in the
>>> API are mainly implemented only once, and I'm not aware of any users
>>> which use the extensibility within the Resource API anymore
>>> (Parliament/BBN used to - I think they now use an associated
>>> datastructure to map to internal information for any API
>>> resources/literals from their storage).  The Resource-level API
>>> implementation could be simplified if there is only one implementation
>>> of that presentation.  There is generality in Jena that we thought was a
>>> good idea at the time but looking at the way the world has gone since, not
>>> all of it is used or useful nowadays.  Better use of factory/interface
>>> at the SPI would be more helpful. The experimental Jena3 core also has
>>> extension nodes and graph nodes with an eye to future possible needs
>>> from the standards world.
>
> Social Web Architect
> http://bblfish.net/
>
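
The filter(NonLiteral, UriRef, Resource) versus find(Node, Node, Node)
contrast quoted above can be sketched in a few lines of Java. This is a toy
model, not code from Clerezza or Jena: the type names only echo the thread,
the top-level term type is called Node here, and using null as the wildcard
is a convention of this sketch. It shows that the uniform signature accepts
a literal in subject position and simply finds no match, while the typed
signature rejects the same call at compile time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class TermPositionSketch {
    // Toy term model (hypothetical): equality is by concrete class and label.
    static abstract class Node {
        final String label;
        Node(String label) { this.label = label; }
        @Override public boolean equals(Object o) {
            return o != null && o.getClass() == getClass()
                    && ((Node) o).label.equals(label);
        }
        @Override public int hashCode() { return Objects.hash(getClass(), label); }
    }
    static abstract class NonLiteral extends Node { NonLiteral(String l) { super(l); } }
    static final class UriRef extends NonLiteral { UriRef(String l) { super(l); } }
    static final class BNode extends NonLiteral { BNode(String l) { super(l); } }
    static final class Literal extends Node { Literal(String l) { super(l); } }

    // A stored triple always respects the RDF abstract syntax positions.
    static final class Triple {
        final NonLiteral s; final UriRef p; final Node o;
        Triple(NonLiteral s, UriRef p, Node o) { this.s = s; this.p = p; this.o = o; }
    }

    final List<Triple> triples = new ArrayList<>();

    // Uniform style: any term, or null as wildcard, in any position.
    List<Triple> find(Node s, Node p, Node o) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : triples) {
            boolean match = (s == null || s.equals(t.s))
                    && (p == null || p.equals(t.p))
                    && (o == null || o.equals(t.o));
            if (match) out.add(t);
        }
        return out;
    }

    // Typed style: argument types restrict each position as the spec does.
    List<Triple> filter(NonLiteral s, UriRef p, Node o) {
        return find(s, p, o);
    }

    public static void main(String[] args) {
        TermPositionSketch g = new TermPositionSketch();
        g.triples.add(new Triple(new UriRef("http://example.org/a"),
                new UriRef("http://example.org/name"), new Literal("Alice")));

        // Literal as subject: legal call, no data can match it.
        System.out.println(g.find(new Literal("Alice"), null, null).size());
        // g.filter(new Literal("Alice"), null, null);  // would not compile
        System.out.println(g.filter(null, null, new Literal("Alice")).size());
    }
}
```

Which behaviour is preferable is exactly the design choice debated in the
quoted messages; the sketch takes no side.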