stanbol-dev mailing list archives

From Suat Gonul <>
Subject Re: User story: Don't want to lose the semantic information I already have inside my CMS
Date Thu, 08 Nov 2012 12:22:07 GMT
Hi Gabriel,

Currently, the component that gets documents from a CMIS repository and
sends them to the Contenthub is the CMISContenthubFeeder[1], which is a
CMIS-based implementation of the ContenthubFeeder[2] interface. This
interface and its implementations are open to improvement based on the
requirements of actual end users like you.

Currently, CMISContenthubFeeder assumes that documents in the CMS have the
cmis:Document type. This component traverses all of the properties of a
document except the ones defined in the "excludedProperties" field
of the CMISContenthubFeeder class. This can be made configurable.
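
As a rough illustration of the exclusion logic described above, here is a
minimal sketch. It is not the actual CMISContenthubFeeder code: the class
name, the method name and the "my:department" property are made up for
illustration, and the document properties are modelled as a plain Map
instead of real CMIS objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class PropertyFilterSketch {

    // Hypothetical helper, not part of Stanbol: copies every property whose
    // id is not in the exclusion set, preserving the original order.
    static Map<String, Object> filterProperties(Map<String, Object> properties,
                                                Set<String> excludedProperties) {
        Map<String, Object> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            if (!excludedProperties.contains(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-in for the properties read from a CMIS document.
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("cmis:name", "report.pdf");
        properties.put("cmis:objectId", "1234");
        properties.put("my:department", "research"); // hypothetical custom property

        // Illustrative exclusion list; a configurable version would read
        // this from configuration instead of hardcoding it.
        Set<String> excluded = Set.of("cmis:objectId");

        System.out.println(filterProperties(properties, excluded));
    }
}
```

With the exclusion set passed in as a parameter, making it configurable
only means changing where the set comes from.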

Another point: in the current version in SVN, the properties
of a cmis:Document are sent to the Contenthub as explicit constraints
along with the ContentItem. We have changed this so that the
properties of the document are added as a ContentItem part, which
fits the structure of a ContentItem[3] better.

I have not worked on this feature for a long time, and currently
I have a problem getting a CMIS session, so I cannot test it.
Once that issue is resolved, I can add a configuration option that lets you
specify the properties to be excluded through the OSGi console.



On 11/8/2012 12:15 PM, Gabriel Vince wrote:
> Hello,
> I have a related question and suggestion.
> We are heavily using metadata in our CMS and are considering using
> Stanbol as a semantic extension and enrichment for the already stored
> metadata. It seems the CMIS-RDF mapping stores only a hardcoded set of CMIS
> object properties (see the CMSAdapterVocabulary).
> Wouldn't it be more flexible to have an extensible configuration of which
> properties to store? Or simply store all found CMIS properties as RDF
> triples. I'm currently working on this extension, if anybody is
> interested (along with Alfresco aspect support).
> Best regards
>           Gabriel
> On Thu, Nov 8, 2012 at 10:59 AM, Fabian Christ
> <> wrote:
>> Hi Rüdiger,
>> thank you for bringing this up. Yes, we should reshape our view of
>> traditional CMS and not only focus on plain text without any metadata. We
>> have to think about the right ways to ensure that we "never lose any
>> semantics", as in "never lose content".
>> Best,
>>  - Fabian
>> 2012/11/8 Rüdiger Kurz <>
>>> Hi Stanbolers,
>>> during ApacheCon in Sinsheim I had some interesting conversations with
>>> Fabian, Rupert and Anil. As a result, I want to summarize one of the
>>> discussions as a user story describing a typical requirement for us as a CMS
>>> provider.
>>> Talking about traditional Content Management Systems and assuming that
>>> they don't store semantic information is not correct. For example, CMS
>>> systems already deliver RDFa-annotated HTML, and nearly all systems
>>> provide some tagging/categorizing mechanism. OpenCms in particular provides a
>>> generic approach to defining structured content, and therefore we have the
>>> information that a specific field/item of a content has a specified type
>>> and a defined label. E.g. "A technology event named ApacheCon takes place in
>>> Sinsheim from 05. Nov until 08. Nov 2012" is information that is already
>>> stored in OpenCms. Moreover, OpenCms is able to connect that event with all
>>> speakers/persons that will give a presentation at that event, ...
>>> What we would like to achieve is not only plain-text enhancement;
>>> we are also interested in telling Stanbol all the information and associations
>>> we already know. In other words, we absolutely don't want to lose the
>>> semantic information that already exists in OpenCms.
>>> A good starting point would be a REST endpoint that can
>>> retrieve an RDFa-annotated HTML document and then extract the RDFa in order
>>> to store it inside the semantic-index/entity-hub/..., as I previously
>>> suggested on the list under the subject "Extend stanbol content hub for
>>> RDFa support". Maybe the content hub is not the right component, but the
>>> requirement for RDFa extraction still exists.
>>> --
>>> Kind Regards,
>>> Rüdiger.
>>> -------------------
>>> Rüdiger Kurz
>>> Alkacon Software GmbH  - The OpenCms Experts
>> --
>> Fabian
