stanbol-dev mailing list archives

From Antonio David Perez Morales <ape...@zaizi.com>
Subject Re: Camel integration (was : Re: Community bonding period started)
Date Fri, 25 Jul 2014 10:23:51 GMT
Hi people

Following up on my last mail and continuing the work on the semantic search
use case (for which a Stanbol Solr component was already implemented to
provide functionality similar to the old ContentHub component), I have
decided to try implementing a Siren [1] component for the Stanbol workflow
component. Siren is an extension of Solr that can store semi-structured
data, which fits well with the idea of storing documents along with their
related entities to allow subsequent semantic searches.

The problem with the old ContentHub component (and also with the new
Stanbol Solr component) is that all the semantic information for a document
is stored in flat form in the same Solr document. That is useful for some
kinds of searches, but it makes it impossible to relate the extracted
attributes (properties) to their respective entities, so the "parent-child"
(document-entities) structure is lost.
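
To illustrate the point, here is a minimal sketch in plain Python. It uses
neither the Solr nor the Siren API; the field names, the sample entities and
the helper functions are all made up for illustration. It only shows why a
flat document can wrongly match a query that combines a property of one
entity with a property of another, while a nested document cannot.

```python
# Hypothetical sample data: two entities extracted from one document.
entities = [
    {"label": "Paris", "type": "City"},
    {"label": "Bob", "type": "Person"},
]


def to_flat_doc(doc_id, text, entities):
    """Flatten all entity properties into multi-valued fields of one document."""
    doc = {"id": doc_id, "content": text}
    for entity in entities:
        for prop, value in entity.items():
            # The "entity_" prefix is an arbitrary naming choice for this sketch.
            doc.setdefault("entity_" + prop, []).append(value)
    return doc


def to_nested_doc(doc_id, text, entities):
    """Keep every entity as a child object of the document."""
    return {"id": doc_id, "content": text, "entities": [dict(e) for e in entities]}


def flat_matches(doc, label, entity_type):
    """Each field is checked on its own, so values of different entities mix."""
    return label in doc["entity_label"] and entity_type in doc["entity_type"]


def nested_matches(doc, label, entity_type):
    """Both conditions must hold for the same child entity."""
    return any(
        e["label"] == label and e["type"] == entity_type for e in doc["entities"]
    )


flat = to_flat_doc("doc-1", "Bob lives in Paris.", entities)
nested = to_nested_doc("doc-1", "Bob lives in Paris.", entities)

# "A person labelled Paris" does not exist in the document.
print(flat_matches(flat, "Paris", "Person"))      # flat form: false positive
print(nested_matches(nested, "Paris", "Person"))  # nested form: no match
```

The nested form is roughly the kind of structure I would expect a Siren-backed
component to preserve, although how it is actually indexed depends on Siren
itself.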

I think it can be a great component for leveraging all the information
extracted by Stanbol in searches.

Please feel free to comment or add any information you think is useful
for this.

Regards

[1] http://sirendb.com/


On Mon, Jul 21, 2014 at 3:41 PM, Antonio David Perez Morales <
aperez@zaizi.com> wrote:

> Hi all
>
> As anticipated in the previous mail, I have developed a first version of
> the Stanbol Solr component. This component (which by default handles the
> stanbol-solr Camel protocol) extends the Camel Solr component, so all the
> properties used to configure that component can be used in this one as well.
>
> The component is responsible for extracting fields and values from the
> entities in the ContentItem and creating a Solr document with the content
> and metadata to be indexed in Solr. In this first version, no filtering is
> applied to the entities (for example, taking the field values only from
> the entity with the highest confidence value).
>
> The first version of the component accepts three configuration parameters
> in a route:
>  - ldpath: LDPath program used to extract the values of the fields. As
> mentioned in the previous mail, if the dereference engine uses a different
> LDPath program, the properties to be extracted may not exist.
>  - fields: A comma-separated list of the fields to be extracted from the
> entities and indexed in Solr.
>  - useDereferenceLdpath: If no LDPath program is defined, this boolean flag
> makes the component reuse the LDPath program used by the dereference engine
> (obtained from the information contained in the content-item, which is
> either passed in the HTTP request to the enhancer or configured in the
> chain/engine component). Default value is true.
>
> A sample route using this component could be the following:
> <routes xmlns="http://camel.apache.org/schema/spring">
>     <route id="stanbolsolr">
>         <from uri="direct://stanbolsolr" />
>         <to uri="chain://default?enhancer.engines.dereference.ldpath=%40prefix%20test%20%3A%20%3Chttp%3A%2F%2Ftest.org%2F%3E%3B%20test%3Aname%3Drdfs%3Alabel%20%3A%3A%20xsd%3Astring%3B" />
>         <to uri="stanbol-solr://localhost:8983/solr" />
>     </route>
> </routes>
>
> As a future extension of this component, I plan to add a new property
> specifying a configured dereference engine to use for the LDPath, and to
> filter the entities so that only the one with the highest confidence value
> is kept.
> With this component, we can have some features similar to the old Stanbol
> ContentHub. So I think that by improving this component we could bring the
> ContentHub back to Stanbol (but using an external Solr instance, which I
> think is good because it avoids overloading the Stanbol application).
>
> Moreover, as part of the "use cases" part of the project and as discussed
> in the Stanbol IRC channel, I'm also evaluating Siren [1], an extension of
> Solr that brings new and improved capabilities to it. It's very useful for
> structured document search.
> So my idea is to try to create a Siren component for Camel, integrated in
> Stanbol, that makes it easy to store the content along with the extracted
> metadata in a structured way, instead of simply creating new fields for a
> document.
>
> Stay tuned for new advances.
> As always, comments are more than welcome.
>
> Regards
>
> [1] http://sirendb.com/
>
>
> On Wed, Jul 16, 2014 at 12:53 PM, Antonio David Perez Morales <
> aperez@zaizi.com> wrote:
>
>> Hi people
>>
>> Continuing with the project work, I have implemented some improvements
>> to the chain and engine components to allow defining enhancer properties
>> (like enhancer.engines.dereference.ldpath) in the route component
>> definition.
>> Example:
>> from(direct://test).to(engine://dereference-engine?enhancer.engines.dereference.ldpath=EXPRESSION).
>> As said in previous mails, the engines and chains have to be configured
>> through the Felix console.
>>
>> Regarding the last discussion about bringing a new kind of ContentHub
>> back to Stanbol as a use case for the workflow integration, I have
>> successfully created a custom Camel processor that builds the document
>> with the content and enhancement metadata to be sent to Solr. It takes the
>> LDPath expression (configured in the dereference engine component via the
>> enhancer.engines.dereference.ldpath query parameter or Camel component
>> parameter) to extract the metadata to be indexed. So using a route like
>> from().to(chain://Default).process(ContentItemProcessor).to(solr://localhost:8983/solr),
>> we can have new documents indexed in Solr containing the text and the
>> extracted enhancement metadata, to be used in semantic searches in the
>> external Solr. Of course, the Solr schema needs to be created in the
>> remote Solr beforehand. This is only a brief proof of concept of such
>> functionality.
>>
>> My idea is to use an external Solr to store the content and semantic
>> metadata for semantic search purposes, as opposed to the old ContentHub,
>> which used an internal SolrYard and created the schema from the
>> configured LDPath expression.
>>
>> The next step in this task will be to create a custom StanbolSolr
>> component that performs the work of the previous processor and the Solr
>> component, while allowing the LDPath, fields and properties to be
>> extracted and put as metadata in the new Solr document to be configured.
>> These properties will be applied to the ContentItem metadata, so if an
>> entity dereference engine is configured with a different LDPath expression
>> or fields, the properties to be extracted may not exist.
>> As a future improvement of this component, we could add a new
>> configuration parameter specifying a configured dereference engine to be
>> used before applying the configuration.
>>
>> Stay tuned for further advances.
>>
>> As always if you have any questions or comments, please drop some lines
>> here.
>>
>> Regards
>>
>> PS: The example routes used are very simple and linear, but for some
>> scenarios, parallel execution of engines, multicast, aggregators, etc.
>> (supported by Camel) could be used to speed up the enhancement process.
>>
>>
>>
>>
>>
>>
>> On Tue, Jul 8, 2014 at 9:46 AM, Antonio David Perez Morales <
>> aperez@zaizi.com> wrote:
>>
>>> Hi Rafa and all
>>>
>>> In my opinion, bringing the Content Hub back to Stanbol for Semantic
>>> Search capabilities is a great use case to implement.
>>> While waiting for Florent's opinion, I could start with Solr only (its
>>> component already exists in Camel, but it needs to be adapted like the
>>> ActiveMQ component) and create a custom transformer bean for Camel to
>>> reproduce the original Content Hub. After that, we could think about
>>> creating the SIREn component and a new transformer for it, giving users
>>> the possibility of using either of them.
>>>
>>> What do you think? Is it an interesting use case for the Camel
>>> integration application?
>>>
>>> Regards
>>>
>>>
>>> On Mon, Jul 7, 2014 at 4:27 PM, Rafa Haro <rharo@apache.org> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> On 01/07/14 10:20, Antonio David Perez Morales wrote:
>>>>
>>>>> Hi all
>>>>>
>>>>> Continuing with the project, I have successfully integrated the
>>>>> ActiveMQ Camel component (and also JMS) into the Stanbol Camel
>>>>> integration. This has been a hard task due to the dependencies needed
>>>>> by the component and also because we had to provide an ActiveMQ
>>>>> component configurable through the Felix web console.
>>>>>
>>>>> With this addition, we are in a position to integrate business logic
>>>>> into Stanbol routes through a message service provided by ActiveMQ
>>>>> (JMS).
>>>>>
>>>> Nice Antonio, let's see if someone has an interesting use case to
>>>> implement in this context.
>>>>
>>>>
>>>>> As a first test, I have deployed a route which consumes messages
>>>>> (content) from an ActiveMQ queue, enhances them using the default
>>>>> chain and then writes the result into a file. It's a simple test but
>>>>> it works quite well. In this case, Stanbol is working in standalone
>>>>> mode, that is to say, we don't have to explicitly call Stanbol to
>>>>> enhance content; instead Stanbol is triggered by external events (a
>>>>> new queue message).
>>>>>
>>>>> As indicated in the previous mail, I still have some pending things
>>>>> to do (because I couldn't do them last week), but in order to move
>>>>> forward with the project I ask you for some interesting use cases
>>>>> where the new workflow component could be applied, both to add value
>>>>> to it and to develop and provide more workflow (Camel) components
>>>>> useful for those and other use cases.
>>>>>
>>>> While awaiting the community's feedback and also Florent's opinion
>>>> regarding the rest of the project: as I have expressed in recent emails,
>>>> I'm eager to see the Content Hub back in Stanbol. This is because, from
>>>> the point of view of the use of Stanbol in the enterprise, Semantic
>>>> Search is one of the most common use cases. So having an enterprise
>>>> search backend as the last component of a processing route in any
>>>> architecture where Stanbol could be plugged in sounds key to me. In
>>>> recent discussions on the Stanbol IRC channel, we have been analysing
>>>> Siren (https://github.com/rdelbru/SIREn), a Lucene/Solr extension whose
>>>> major advantage is the possibility of indexing tree structures, which
>>>> allows indexing structured data without losing full-text search
>>>> capabilities. Refactoring the old ContentHub component to use Siren is
>>>> out of the scope of this project but, in my opinion, an interesting use
>>>> case could be to develop a Siren Camel component and a transformer from
>>>> ContentItem to Siren object or whatever, and integrate both in Stanbol.
>>>>
>>>> What do you guys think?
>>>>
>>>> Cheers,
>>>> Rafa
>>>>
>>>>
>>>>
>>>>
>>>>> Regards
>>>>>
>>>>>
>>>>> On Mon, Jun 23, 2014 at 6:16 PM, Antonio David Perez Morales <
>>>>> aperez@zaizi.com> wrote:
>>>>>
>>>>>> Hi Stanbolers
>>>>>>
>>>>>> The GSoC 2014 midterm is here and I want to give you a summary of the
>>>>>> work done so far:
>>>>>>
>>>>>> - Adapted the previous Camel integration PoC done by Florent to
>>>>>> Stanbol 1.0.
>>>>>> - Improved the EngineComponent used by Camel to execute Enhancement
>>>>>> Engines (configured through the Stanbol web console as usual) using
>>>>>> the engine:// URI scheme in routes.
>>>>>> - Created the ChainComponent used by Camel to execute Enhancement
>>>>>> Chains using the chain:// URI scheme in routes (both Camel components
>>>>>> are provided as OSGI components, so the URI scheme can be changed
>>>>>> through the Stanbol web console).
>>>>>> - Created a custom artifact for Apache Felix Fileinstall in order to
>>>>>> be able to install routes defined in Camel Spring XML DSL by placing a
>>>>>> route file (with the 'route' extension) in the stanbol/fileinstall
>>>>>> directory.
>>>>>> - Created a custom archetype to ease the development of bundles
>>>>>> containing route definitions in Java DSL. The archetype generates a
>>>>>> class extending 'RouteBuilder' which creates a default Camel direct
>>>>>> endpoint used by other Stanbol Workflow components to execute the
>>>>>> route.
>>>>>> - Created a first version of the Workflow API, which contains
>>>>>> different OSGI components that allow registering Camel
>>>>>> components/routes, starting/stopping/executing routes, adding/removing
>>>>>> components used in routes, etc.
>>>>>> - Provided a REST endpoint to test the execution of routes using REST
>>>>>> requests (/flow/{routeId}).
>>>>>> - Modified the PoC full launcher to use all the new bundles that
>>>>>> support the workflow feature.
>>>>>> - Installed JBoss Developer Studio, which comes with Camel support,
>>>>>> in order to create routes visually with the possibility of exporting
>>>>>> them in Spring XML DSL format.
>>>>>>
>>>>>> Some pending things I will try to do during this week:
>>>>>> - Improve the web package to create the endpoints needed to query the
>>>>>> registered routes, registered Camel components, etc.
>>>>>> - Improve the web package to remove the classes copied from the
>>>>>> Stanbol Jersey module used for testing
>>>>>> - Update the README.md files in the repository with all the new
>>>>>> information
>>>>>> - Document the installation and configuration of JBoss Developer
>>>>>> Studio for creating Camel routes
>>>>>> - Create all the JIRA issues related to the work already done
>>>>>>
>>>>>>
>>>>>> For the second part of the project, I would like to read some comments
>>>>>> about interesting use cases in order to develop the Stanbol and Camel
>>>>>> components needed to support them.
>>>>>>
>>>>>> If you have any comments, please drop some lines in order to discuss
>>>>>> the new things to be done.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Jun 14, 2014 at 3:39 PM, Antonio David Perez Morales <
>>>>>> aperez@zaizi.com> wrote:
>>>>>>
>>>>>>> Hi guys
>>>>>>>
>>>>>>> Continuing with the project, and as part of the refactoring/new
>>>>>>> architecture, I have started to modify some workflow components in
>>>>>>> order to create a better API and architecture based on OSGI
>>>>>>> components. As a first step, and in order to have the same behavior
>>>>>>> as the current one (regarding the enhancement process), a chain
>>>>>>> component has been created to simulate the chain behaviour. This new
>>>>>>> component internally uses the ChainManager and EnhancementJobManager
>>>>>>> components to perform the business logic. This way, a new protocol
>>>>>>> 'chain' can be used in the routes deployed in Stanbol. The chains are
>>>>>>> configured in the same way as before, using the Stanbol admin console.
>>>>>>>
>>>>>>> Now we can combine single engine executions with chain executions in
>>>>>>> routes deployed in Stanbol using the alternatives described in
>>>>>>> previous mails and in the issue [1]. Both engines and chains are
>>>>>>> configured through the Stanbol admin console. You can see the
>>>>>>> refactoring progress in [2] (a branch used for refactoring the
>>>>>>> current PoC of Workflow in Stanbol 1.0). Of course, the Camel EIPs
>>>>>>> and other Camel components can be used in the deployed routes as well.
>>>>>>>
>>>>>>> With the new Camel routes support, we can have Stanbol running and
>>>>>>> enhancing content without receiving any HTTP request to start the
>>>>>>> enhancement process, because the routes can be triggered by external
>>>>>>> events occurring in a queue, database, etc. Moreover, the semantic
>>>>>>> lifting process can be split and merged with application steps, so
>>>>>>> the issue [3] requesting asynchronous call support for enhancement
>>>>>>> could be solved.
>>>>>>>
>>>>>>> Anyway, if any of you have suggestions for new components to be
>>>>>>> deployed in the second part of the project, or any other kind of
>>>>>>> suggestion, please drop some lines here to continue the discussion.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/tree/refactoring
>>>>>>> [3] https://issues.apache.org/jira/browse/STANBOL-263
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 11, 2014 at 10:01 AM, Antonio David Perez Morales <
>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>
>>>>>>>> Hi people
>>>>>>>>
>>>>>>>> As part of the GSoC project for the midterm and according to the
>>>>>>>> issue [1], a custom Apache Felix Fileinstall artifact has been
>>>>>>>> created in order to deploy Camel routes defined in XML (Spring DSL)
>>>>>>>> by placing a file with the .route extension in a configured
>>>>>>>> directory (like the stanbol/fileinstall directory). Moreover, since
>>>>>>>> this artifact depends on the Fileinstall bundle, the created
>>>>>>>> launcher has been modified to include that bundle in the OSGI
>>>>>>>> context by default.
>>>>>>>>
>>>>>>>> So, now that the current Camel integration PoC has been integrated
>>>>>>>> in Stanbol 1.0 and extended to support the deployment of routes
>>>>>>>> defined in Java DSL (through bundles) and XML (route files), the
>>>>>>>> next step will be to rethink and redesign the current architecture,
>>>>>>>> trying to avoid duplicated code and to provide a more extensible
>>>>>>>> and easy-to-use Workflow API. With the current integration, only
>>>>>>>> direct routes can be triggered using the REST API, which means that
>>>>>>>> the defined routes must be configured properly with a direct
>>>>>>>> endpoint consumer. Routes starting in some other way, like timers,
>>>>>>>> are triggered directly on deployment, so this has to be taken into
>>>>>>>> account for the new API (and REST API).
>>>>>>>>
>>>>>>>> In parallel, and for the second part, new Stanbol Camel components
>>>>>>>> will be developed in order to be used in new routes. So if any of
>>>>>>>> you have use cases for this involving Stanbol components, please
>>>>>>>> drop some lines here in order to prioritize the Stanbol Camel
>>>>>>>> components to be developed.
>>>>>>>>
>>>>>>>> Comments and suggestions are more than welcome
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jun 2, 2014 at 7:00 PM, Antonio David Perez Morales <
>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>
>>>>>>>>> Hi stanbolers
>>>>>>>>>
>>>>>>>>> As part of the issue [1], I have created a Maven archetype for
>>>>>>>>> generating Camel routes in Java DSL.
>>>>>>>>> The archetype generates a Java project with all the dependencies
>>>>>>>>> and one Java class with a method which has to be filled in. In
>>>>>>>>> this method, Camel Java DSL syntax is used to create the route.
>>>>>>>>> By default, and as a first approach, the class will use the route
>>>>>>>>> name given during project creation to enable a Camel direct
>>>>>>>>> endpoint with that name.
>>>>>>>>> The code of the first archetype version can be found at [2].
>>>>>>>>>
>>>>>>>>> The next task will be to provide a Felix custom artifact to be
>>>>>>>>> able to deploy XML-based routes in Stanbol by placing a custom
>>>>>>>>> file in the Stanbol datafiles directory.
>>>>>>>>> After that, it will be time to rethink and redesign the
>>>>>>>>> architecture to integrate Camel workflows inside Stanbol in a
>>>>>>>>> better, more configurable and more extensible way.
>>>>>>>>>
>>>>>>>>> Comments and suggestions are more than welcome
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, May 30, 2014 at 8:03 PM, Antonio David Perez Morales <
>>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all
>>>>>>>>>>
>>>>>>>>>> After a hard fight this week, I managed to get Florent's proof
>>>>>>>>>> of concept code working in the Stanbol 1.0 branch [1].
>>>>>>>>>> The code is uploaded in my GitHub account [3]. As I said in a
>>>>>>>>>> previous mail, I prefer to work on it separately and upload the
>>>>>>>>>> developed code into a Stanbol branch after the project.
>>>>>>>>>>
>>>>>>>>>> The 1.0.0 version has some changes in how the Jersey endpoints
>>>>>>>>>> are registered and also new classes and packages, so it was not
>>>>>>>>>> a trivial task to make the current proof of concept work.
>>>>>>>>>> Moreover, I don't like to simply copy and paste code and make
>>>>>>>>>> the needed changes. I always want to understand how things work
>>>>>>>>>> and how they are developed in order to be able to change/modify
>>>>>>>>>> them or develop new code around them.
>>>>>>>>>>
>>>>>>>>>> The steps done to achieve it have been the following:
>>>>>>>>>> - Updated pom files to the Stanbol 1.0.0-SNAPSHOT version
>>>>>>>>>> - Updated bundle levels in the bundlelist package to fit the
>>>>>>>>>> Stanbol 1.0 version levels
>>>>>>>>>> - Adapted the cameljobmanager package code to the Stanbol
>>>>>>>>>> 1.0.0-SNAPSHOT classes, using Java OSGI annotations instead of
>>>>>>>>>> SCR annotations in Javadoc
>>>>>>>>>> - Updated the flow web package to the Stanbol 1.0.0-SNAPSHOT
>>>>>>>>>> classes and modified the needed resources
>>>>>>>>>> - Added Java OSGI annotations to the route (WeightedChain)
>>>>>>>>>> instead of SCR annotations in Javadoc
>>>>>>>>>> - Updated the launcher to use the 1.0.0-SNAPSHOT packages and
>>>>>>>>>> needed bundles
>>>>>>>>>>
>>>>>>>>>> So now, the http://localhost:8080/flow endpoint will use the
>>>>>>>>>> only Camel route (defined by WeightedChain) to call all the
>>>>>>>>>> registered Enhancement Engines (ordered by the EnhancementEngine
>>>>>>>>>> order property).
>>>>>>>>>> For testing purposes, the /flow/{flowName} endpoint has been
>>>>>>>>>> removed, because all this code needs to be redesigned and
>>>>>>>>>> reimplemented, so I only wanted to make it work to have a first
>>>>>>>>>> (simple) integration in Stanbol 1.0. This functionality will be
>>>>>>>>>> added again to trigger custom routes once the next step
>>>>>>>>>> (described below) is developed.
>>>>>>>>>>
>>>>>>>>>> The next step [2] will be to support writing and configuring
>>>>>>>>>> routes in XML format, putting the file in datafiles in order to
>>>>>>>>>> be loaded by a Felix custom artifact (as Rupert pointed out in a
>>>>>>>>>> previous mail), and to create a Maven archetype for bundles
>>>>>>>>>> defining routes, which will be loaded using the Felix bundle
>>>>>>>>>> tab. If necessary, as we discussed in previous messages, a REST
>>>>>>>>>> endpoint receiving routes in XML can be developed as an
>>>>>>>>>> alternative to the first approach. This is my objective for the
>>>>>>>>>> midterm.
>>>>>>>>>>
>>>>>>>>>> After the midterm, the new Stanbol components for Apache Camel
>>>>>>>>>> will be developed, and also the new architecture for Camel in
>>>>>>>>>> Stanbol.
>>>>>>>>>>
>>>>>>>>>> Comments on this and on use cases for Stanbol Camel components
>>>>>>>>>> are more than welcome.
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>>>> [2] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>>>> [3] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, May 27, 2014 at 6:18 PM, Antonio David Perez Morales <
>>>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi people
>>>>>>>>>>>
>>>>>>>>>>> I have already started to work on [1] to integrate Florent's
>>>>>>>>>>> current code into Stanbol 1.0.
>>>>>>>>>>> As a first approach, only changing the dependency versions to
>>>>>>>>>>> the new Stanbol 1.0, many issues have arisen:
>>>>>>>>>>>   - Deprecated use of classes
>>>>>>>>>>>   - Classes which have moved to a different package
>>>>>>>>>>>   - Some classes that are no longer necessary
>>>>>>>>>>>   - Unused classes which were causing conflicts
>>>>>>>>>>>   - ...
>>>>>>>>>>>
>>>>>>>>>>> So now I'm trying to resolve all these problems to replicate
>>>>>>>>>>> the same behavior from 0.9 in 1.0. I will upload the code to a
>>>>>>>>>>> GitHub repository in my account (which will be pushed later
>>>>>>>>>>> into a Stanbol branch after the project) in order to track
>>>>>>>>>>> progress.
>>>>>>>>>>> Once I have resolved all these problems, I will take a look at
>>>>>>>>>>> the Felix Custom Artifacts pointed out by Rupert in a previous
>>>>>>>>>>> message to find out the best way to deploy (and manage) route
>>>>>>>>>>> configurations (Felix artifacts, Java WatchService, REST
>>>>>>>>>>> endpoint to receive XML routes, etc.).
>>>>>>>>>>>
>>>>>>>>>>> Comments on this and future tasks are more than welcome.
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>>
>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 27, 2014 at 9:53 AM, Rafa Haro <rharo@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Rupert, Florent and Antonio
>>>>>>>>>>>>
>>>>>>>>>>>> On 27/05/14 08:51, Rupert Westenthaler wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> As the result of Enhancement Routes is content + metadata I
>>>>>>>>>>>>> can not see what you want to "store" in the Entityhub that is
>>>>>>>>>>>>> about managing Entities.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>  - entityhub: To query/update the entityhub component
>>>>>>>>>>>>> Maybe. If you can come up with a good use case ^^
>>>>>>>>>>>>>
>>>>>>>>>>>>>>  - contenthub: To develop a new content-hub using
>>>>>>>>>>>>>> chain/engine components and solr/elasticsearch/whatever
>>>>>>>>>>>>>> component (solr and elasticsearch component already exist
>>>>>>>>>>>>>> in Camel)
>>>>>>>>>>>>>>
>>>>>>>>>>>>> IMO implementing a new Contenthub like component is outside
>>>>>>>>>>>>> the scope of this GSoC project. However If there is already
>>>>>>>>>>>>> Solr/Elasticsearch component it would be a really useful
>>>>>>>>>>>>> thing
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> Regarding this, in my opinion, the use case of an eventual
>>>>>>>>>>>> integration with a Content Hub is probably one of the clearest
>>>>>>>>>>>> for this project. I'm not sure if that is what Antonio was
>>>>>>>>>>>> trying to explain but, with a single route using Solr or any
>>>>>>>>>>>> other backend system as the last endpoint, we would be almost
>>>>>>>>>>>> cloning the functionality of the previous ContentHub
>>>>>>>>>>>> implementation (Stanbol 0.12). Entities could be dereferenced
>>>>>>>>>>>> using the EntityHub before storing the content along with the
>>>>>>>>>>>> metadata, which is the point of integration of the EntityHub
>>>>>>>>>>>> in such a use case. And even more interesting, now with the
>>>>>>>>>>>> integration of Marmotta contributed by Rupert, it would be
>>>>>>>>>>>> possible to use a whole graph for dereferencing, so "simply"
>>>>>>>>>>>> routing components like Enhancer->Marmotta->Solr sounds to me
>>>>>>>>>>>> like an interesting use case.
>>>>>>>>>>>>
>>>>>>>>>>>> wdyt?
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Rafa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>
>>>
>>
>
