incubator-clerezza-dev mailing list archives

From Reto Bachmann-Gmür <r...@apache.org>
Subject Re: Future of Clerezza and Stanbol
Date Sun, 11 Nov 2012 01:09:02 GMT
Hi Rupert and all,

(1) I agree with what you say regarding the RDF API. To keep this effort
sustainable without running the risk of polluting the API with
implementation-specific requirements, I think we should graduate Clerezza
as Apache commons.rdf.

(2) Type-based rendering is not something that can be implemented just by
adding MessageBodyWriters, as different RDF resources do not result in
different Java classes. For a framework providing resources as RDF,
type-based rendering seems the straightforward approach to allow these
resources to be rendered in non-RDF formats as well. For this we can still
use Freemarker (with LDPath templates), but our legacy templates, which
require the class with the application logic to provide special hooks to
the templates, go against the concept of having a pluggable UI that can be
left out for instances only to be used by machines. Keep in mind that an
infrastructure for providing templates in a better way has been there
since the introduction of LDViewable. Type-based rendering goes one step
further, as the JAX-RS root resource would no longer have to provide the
abstract template path.
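
To illustrate the point in (2), here is a minimal sketch, assuming RDF types
are plain IRI strings. All names in it (TypeRenderingSketch, register,
selectTemplate, the .ftl file names) are hypothetical and are not Clerezza or
Stanbol API. Both resources below are instances of the same Java class, so a
MessageBodyWriter chosen by class could not tell them apart, while a lookup
on rdf:type can:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names are Clerezza or Stanbol API.
// RDF types are plain IRI strings so the example has no dependencies.
class TypeRenderingSketch {

    // rdf:type IRI -> template name; insertion order doubles as priority.
    private final Map<String, String> templateByType = new LinkedHashMap<>();
    private final String fallbackTemplate;

    TypeRenderingSketch(String fallbackTemplate) {
        this.fallbackTemplate = fallbackTemplate;
    }

    // A UI bundle would call this; the JAX-RS root resource never sees it.
    void register(String rdfType, String template) {
        templateByType.put(rdfType, template);
    }

    // Picks the first registered template whose type the resource carries,
    // or the fallback template if none of its types is registered.
    String selectTemplate(Set<String> rdfTypesOfResource) {
        for (Map.Entry<String, String> entry : templateByType.entrySet()) {
            if (rdfTypesOfResource.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return fallbackTemplate;
    }

    public static void main(String[] args) {
        TypeRenderingSketch renderer = new TypeRenderingSketch("generic-rdf.ftl");
        renderer.register("http://xmlns.com/foaf/0.1/Person", "person.ftl");
        renderer.register("http://xmlns.com/foaf/0.1/Document", "document.ftl");

        // Same Java class (Set<String>), different RDF types.
        Set<String> alice = new HashSet<String>(Arrays.asList(
                "http://xmlns.com/foaf/0.1/Agent",
                "http://xmlns.com/foaf/0.1/Person"));
        Set<String> untyped = new HashSet<String>();

        System.out.println(renderer.selectTemplate(alice));
        System.out.println(renderer.selectTemplate(untyped));
    }
}
```

In such a setup the root resource would only return the RDF resource; which
template applies is decided by whatever UI bundle registered it, so an
instance without the UI bundle would still serve the RDF formats.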

(3)
JSR-223 support: I suggested to drop this.

Scala support: I'm wondering myself why there is such a big PermGen space
need. I've just updated Clerezza trunk to use Scala 2.9.2; this might have
improved things a bit. As the compiler classloading mechanism is changed in
2.10, I guess a bigger improvement might come with that version. Do you know
of users having a concrete issue with the additional RAM requirement, or
is it more the fact that it's not nice to have memory used without a clear
reason that's bothering you?

Shell: The Felix webconsole is there to install bundles, configure services
and so on. What you can do with the shell is actually invoke these
services' methods and explore exported package structures. Especially when
exploring APIs I'm not yet familiar with, the shell has been of great
benefit to me. Of course it's a module one can turn off.

Bundle-Dev-Tools: (These aren't yet in Stanbol.) Basically, Maven skeletons
can also be used as prototypes for the bundle-dev-tool (just some Maven
magic needed). Of course it's a question of style and of the size of the
module whether one wants the dynamic update and things working independently
of the pom dependencies or prefers to compile and redeploy. In the trunk
version of dev-tools there's also instant update for static files, which
makes it particularly convenient when editing CSS and JavaScript. As long as
no duplication of archetype/skeleton is needed, I don't see why we shouldn't
offer both Maven archetypes and skeletons.

Security:
You're suggesting one should configure the users, their passwords and
permissions in some config files rather than storing them in RDF and having
a UI to edit them (OK, I'm embarrassed that the UI isn't there yet)? I think
when we're talking about some launchers being stateless we mean that usage
of the (main) functionality they offer doesn't alter the state of the
system. If you interpret "stateless" very strictly, then you would have to
drop most parts of the Felix webconsole, as HTTP requests to install bundles
or configure services aren't stateless. For the user configuration a simple
file-based TcProvider would of course be enough, so no TDB is needed for
that.

I think we should see where we want to go as a community. For me the
important thing is that Stanbol remains very modular. I think statements
like "Stanbol is no semantic CMS" do not bring us further. It's important
that the Stanbol services can be used as services and that many services
are stateless. But the contenthub is a component to manage content (the
entityhub to some degree as well). Do we want to mandate a horrible user
interface just to comply with some catchphrase about what Stanbol is not?
Or do we want to reduce Stanbol to just the Enhancer and leave the
other stuff to other projects?

I'd rather go for the vision of an ecosystem of modular semantic and
RESTful OSGi components, but if the community wants to focus on the
Enhancer I think a clear statement should be made to avoid unnecessary
arguments about memory consumption.

Cheers,
Reto


On Fri, Nov 9, 2012 at 10:56 AM, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

> Hi all,
>
> let me share my thoughts. Because this mail is rather long I tried to
> split it up into three separate sections: (1) RDF, (2) RESTful/Web
> Interface and (3) other related topics
>
>
> RDF libs:
> ====
>
> From the viewpoint of Apache Stanbol one needs to ask the question
> whether it makes sense to manage our own RDF API. I expect the Semantic
> Web standards to evolve quite a bit in the coming years, and I am
> concerned about whether the Clerezza RDF modules will be updated/extended
> to provide implementations of those. One example of such a situation is
> SPARQL 1.1, which has been around for quite some time and is still not
> supported by Clerezza. While I do like the small API, the flexibility
> to use different TripleStores and that Clerezza comes with OSGi
> support, I think given the current situation we would need to discuss
> all options, and those do also include a switch to Apache Jena or
> Sesame. Especially Sesame would be an attractive option, as their RDF
> Graph API [1] is very similar to what Clerezza uses. Apache Jena's
> counterparts (Model [2] and Graph [3]) are considerably different and
> more complex interfaces. In addition, Jena will only change to
> org.apache packages with the next major release, so a switch before
> that release would mean two incompatible API changes.
>
> My personal opinion is that we should keep using Clerezza for now,
> invest some effort to improve the Clerezza RDF modules and then see
> how it further develops. Such an effort should include:
>
> * implement the SPARQL fast lane (as already discussed with Reto
> during ApacheCon). Fast lane would allow Clerezza to use the native
> SPARQL engine of the used TripleStore, meaning that Clerezza only
> parses those parts of the SPARQL query needed to understand which RDF
> graph to execute the query on. This information is then used to pass the
> query to the native SPARQL engine via an extended interface of the
> TcProvider. The Clerezza SPARQL implementation would only be used in
> case the TcProvider does not provide a native SPARQL implementation or
> if the query spans RDF graphs managed by different TcProvider
> instances. By that Clerezza users would be able to use any SPARQL
> feature provided by the used TripleStore.
> * update to the newest Jena versions (see also STANBOL-621; Peter
> Ansell's Clerezza fork on GitHub [4] as well as Sebastian Schaffert's
> Jena bundle used for the Stanbol/LMF integration [5])
> * finish and release the SingleTdbDatasetTcProvider.java
> (CLEREZZA-691), as this is important for the Stanbol Ontology Manager
> component
> * move the Indexed in-memory graph (CLEREZZA-683) from the Stanbol
> code base to Clerezza and release it so that we can use it from there
> in Stanbol
> * provide a Clerezza JSON-LD parser/serializer. This is critical for
> Stanbol, as several CMSs use this as their preferred RDF serialization.
>
> [1]
> http://www.openrdf.org/doc/sesame2/api/org/openrdf/model/package-summary.html
> [2]
> http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/rdf/model/Model.html
> [3]
> http://jena.apache.org/documentation/javadoc/jena/com/hp/hpl/jena/graph/Graph.html
> [4]
> https://github.com/ansell/clerezza/commit/37747324d980fad6a33caa3da00491da66900c37
> [5]
> https://bitbucket.org/srfgkmt/stanbol-lmf/src/f41c6c93f08872469dc2e2d64fc06ad75f76f003/lmf-jena/pom.xml
>
>
> RESTful API / Web Interface:
> =====================
>
> There are several shortcomings of the current implementation of the
> Stanbol RESTful services / Web UI modules ( o.a.stanbol.commons.web,
> o.a.stanbol.*.web, o.a.stanbol.*.jersey modules)
>
> * Jersey's use of java.util.ServiceLoader forces manual
> configuration of the JAX-RS components. A switch to an OSGi-compatible
> implementation such as Apache Wink would be very welcome
> * The RESTful API documentation is currently written as HTML into
> Freemarker templates. This makes it really hard to maintain this
> documentation. I would really appreciate the possibility to use
> Markdown (as used on the webpage) for that
> * For deployments of Stanbol it should be possible to exclude
> the Web UI so that only the RESTful services are available
>
> regarding :
>
> > Stanbol drops its interpretation of "REST" as "not for humans" and wants
> > to allow integrating (wherever possible as modular and optional
> > components) media types designed for human consumption and support REST
> > approaches there as well (thinking of the current back-button unfriendly
> > UI).
>
> Adding support for a simple table-based representation of RDF data
> would indeed be an important feature. However, having Resource (Entity)
> type-specific rendering is out of the scope of Apache Stanbol (at
> least in my opinion). AFAIK, as soon as we switch to an OSGi-compatible
> JAX-RS implementation, users could add those easily by
> providing the corresponding JAX-RS MessageBodyWriter.
>
> If there are people who would like to work on this it would be really
> great. If we could (re)use some stuff from Clerezza - even better. But
> things would need to be kept simple, as Stanbol is no semantic CMS.
>
> I would suggest starting development in its own branch and then having a
> discussion/vote based on an early prototype/demonstration.
>
>
> Other Topics
> =========
>
> ### Scala and jsr 223 (scripting in the JVM)
>
> I do have an issue with Scala as it adds >150MByte to the PermGen as
> soon as it is loaded. But as long as it is an optional dependency and
> users are aware of that when adding the dependency I am fine with it.
>
> ###  Shell
>
> Personally I do not find the shell very useful. For installing
> Bundles/Service configurations I prefer to use the Apache Sling
> FileInstaller. For deployment during development I like to use the
> Sling Maven Installer plugin. For creating new Stanbol modules I
> would rather suggest creating an extensive list of Maven Archetypes (e.g.
> for Stanbol EnhancementEngines).
>
> As the Shell also depends on Scala the "+150MByte to the PermGen"
> issue also applies to the Shell.
>
> ### Security
>
> Having a security model in Apache Stanbol might be important for some
> use cases. Because of this I consider this an important topic, albeit
> one I have very little experience with.
>
> I would like to get rid of the dependencies on
> org.apache.clerezza:platform (AFAIK this is only needed for the
> configuration, and this could easily be provided by the
> sling.properties file at runtime; defaults can be provided in the
> commons.properties file already included in all Stanbol Launchers). I
> would also suggest moving the PermissionParser utility over to the
> Apache Stanbol Security modules.
> These two changes would make it possible to activate the security module
> also for the Stable (Stateless) launcher.
>
>
> best
> Rupert
>
>
> On Thu, Nov 8, 2012 at 2:39 PM, Hasan Hasan <hasan@trialox.org> wrote:
> > Comments inline...
> >
> > On Thu, Nov 8, 2012 at 1:00 PM, Reto Bachmann-Gmür <reto@apache.org> wrote:
> >
> >> Ok, sorry for jumping into this discussion so late. I've been having
> >> quite some discussion on the matter here at ApacheCon EU. Also I had
> >> positive feedback from my presentation of Clerezza yesterday.
> >>
> >> I think two things:
> >> - For high-level platform components it is often not clear whether they
> >> fit better into Stanbol or into Clerezza
> >> - The RDF API should actually be independent both from the triple store
> >> provider and from the consumer
> >>
> >> So I think a good solution would be to have the RDF libraries comprising:
> >> - A modular and very spec-oriented API for RDF and related standards
> >> - A set of serializing and parsing providers
> >> - Adapters to triple stores (where the API isn't provided by the triple
> >> store)
> >> basically that's what is in the org.apache.clerezza.rdf.* packages
> >>
> >> That's the stuff that would fit well into Stanbol. Provided that Stanbol
> >> drops its interpretation of "REST" as "not for humans" and wants to
> >> allow integrating (wherever possible as modular and optional components)
> >> media types designed for human consumption and support REST approaches
> >> there as well (thinking of the current back-button unfriendly UI).
> >>
> >
> > IMO, Clerezza is just too big for the existing committers. If we could
> > reduce it to the essential components dealing with RDF, leaving out
> > templating and rendering, it may be easier to graduate.
> >
> >> - Scala Server Pages
> >> - TypeRendering (selection of templates based on the RDF type of the
> >> returned response)
> >> - Security (already integrated to some degree; code-based security to
> >> run bundles in a sandboxed manner is not)
> >> - Shell (already ships in the Stanbol launcher, so here it's about
> >> 'adopting' the sources)
> >> - Dev tools: rapid development support (create sample projects, have
> >> source files as bundles)
> >>
> >> To the attic:
> >> - Triaxrs: The Clerezza JAX-RS implementation is no longer needed, as the
> >> same support (JAX-RS components as OSGi services) is now provided by
> >> Apache Wink
> >> - JSR 223 support
> >>
> >> In my opinion there is no urgent need for action. It is true that there
> >> hasn't been a lot of action in Clerezza, but IMHO the project is going on
> >> even at a low pace (as are other projects, e.g. the recently graduated
> >> Wink).
> >>
> >
> > Not sure about no urgent need for action. Maybe we should list the
> > requirements to fulfil in order to be able to graduate. Wonder if we are
> > able to meet them.
> >
> > Cheers
> > Hasan
> >
> >
> >>
> >> Cheers,
> >> Reto
> >>
> >> On Thu, Nov 8, 2012 at 12:02 PM, Bertrand Delacretaz
> >> <bdelacretaz@apache.org> wrote:
> >>
> >> > On Thu, Nov 8, 2012 at 11:33 AM, Andy Seaborne <andy@apache.org> wrote:
> >> > > ...It's good to have the existing released artifacts remain - what
> >> > > about after the donation?
> >> > >
> >> > > Presumably the moved modules will be released by the new host - will
> >> > > they use group id org.apache.clerezza? or move to the new host
> >> > > project group id? I'd suggest renaming the group to the new project
> >> > > but realise it is a bit more disruptive...
> >> >
> >> > I think that's really up to whatever project adopts that code. In
> >> > theory package names should change but that's probably not convenient.
> >> >
> >> > Or maybe it's time to create a semantic module or two at
> >> > http://commons.apache.org/ ? If existing committers are willing to
> >> > support that with their work it should be easy to make it happen.
> >> >
> >> > -Bertrand
> >> >
> >>
>
>
>
> --
> | Rupert Westenthaler             rupert.westenthaler@gmail.com
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>
