manifoldcf-dev mailing list archives

From Rafa Haro <rh...@apache.org>
Subject Re: ManifoldCF transformation connector for Apache Stanbol
Date Fri, 11 Dec 2015 15:12:41 GMT
Hi Dileepa,

The problem is not in that part of the code; it is rather in this part:

    if (entity != null) {
        Collection<String> properties = entity.getProperties();
        for (String property : properties) {
            String targetFieldName = derefFields.get(property);
            Set<String> propValues = entityPropertyMap.get(targetFieldName);
            if (propValues == null) {
                propValues = new HashSet<String>();
            }
            Collection<String> entityPropValues = entity.getPropertyValues(property);
            propValues.addAll(entityPropValues);
            entityPropertyMap.put(targetFieldName, propValues);
        }
    }

From the EnhancementStructure response you are collecting only the
configured dereferenced fields; the LDPath fields are ignored. There is
also a potential bug in that code when no dereferencing field is
configured for a given entity property, here:

String targetFieldName = derefFields.get(property);

targetFieldName would be null in that case. Instead of trying to index
every property, you should collect only the ones configured by the user
(or, if the user wants all of them, at least provide a configuration
option for that).
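
A minimal sketch of such a null-safe collection, under stated assumptions:
the Entity interface below is a hypothetical stand-in that models only the
two methods used in the snippet above (getProperties and getPropertyValues),
not the real SDK class, and DerefFieldCollector.collect is an illustrative
name. Properties without a configured target field are skipped instead of
being stored under a null key:

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the SDK entity type; only the two methods
// used in the snippet above are modelled here.
interface Entity {
    Collection<String> getProperties();
    Collection<String> getPropertyValues(String property);
}

class DerefFieldCollector {

    // Collects values only for properties that have a configured target field.
    // derefFields maps an entity property URI to its target field name.
    static Map<String, Set<String>> collect(Entity entity, Map<String, String> derefFields) {
        Map<String, Set<String>> entityPropertyMap = new HashMap<>();
        if (entity == null) {
            return entityPropertyMap;
        }
        for (String property : entity.getProperties()) {
            String targetFieldName = derefFields.get(property);
            if (targetFieldName == null) {
                continue; // not configured by the user: skip rather than use a null key
            }
            entityPropertyMap
                .computeIfAbsent(targetFieldName, k -> new LinkedHashSet<>())
                .addAll(entity.getPropertyValues(property));
        }
        return entityPropertyMap;
    }
}
```

A "collect everything" mode, if wanted, could be a separate boolean option
that falls back to the property URI as the field name.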

Anyway, going back to the LDPath issue: please take into account that when
you define a field you must use a custom namespace and prefix so that you
can later retrieve that property from the entity. If you don't, Stanbol
will assign a random namespace to that property. Check this example from
the RedLink SDK:

https://github.com/redlink-gmbh/redlink-java-sdk/blob/master/src/test/java/io/redlink/sdk/AnalysisTest.java#L423-443
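
For illustration only, a small sketch of declaring a field under a custom
namespace. The prefix "mcf", the namespace http://example.org/mcf/fields/
and the LdPathFields helper are all hypothetical, and the program text
follows LDPath syntax as I understand it ("@prefix p : <uri> ;" followed by
"field = path :: type ;"). It also assumes the entity exposes the result
under the field's full URI:

```java
class LdPathFields {
    // Hypothetical prefix and namespace chosen by the connector, not defined by Stanbol.
    static final String FIELD_PREFIX = "mcf";
    static final String FIELD_NAMESPACE = "http://example.org/mcf/fields/";

    // Builds an LDPath program that declares one field under the custom namespace.
    static String program(String fieldName, String path, String type) {
        return "@prefix " + FIELD_PREFIX + " : <" + FIELD_NAMESPACE + "> ;\n"
             + "@prefix rdfs : <http://www.w3.org/2000/01/rdf-schema#> ;\n"
             + "@prefix xsd : <http://www.w3.org/2001/XMLSchema#> ;\n"
             + FIELD_PREFIX + ":" + fieldName + " = " + path + " :: " + type + " ;\n";
    }

    // Full property URI under which the field value would later be looked up.
    static String fieldUri(String fieldName) {
        return FIELD_NAMESPACE + fieldName;
    }
}
```

With this, LdPathFields.program("label", "rdfs:label[@en]", "xsd:string")
would be sent as the LDPath program, and LdPathFields.fieldUri("label")
would be the key passed to entity.getPropertyValues(...) afterwards.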

Hope that helps

On Fri, Dec 11, 2015 at 3:57 PM Karl Wright <daddywri@gmail.com> wrote:

> The next step would be to pull this code into an svn branch.  This is
> something I can tackle after the 2.3 release candidate is put together.
>
> Thanks,
> Karl
>
>
> On Fri, Dec 11, 2015 at 9:07 AM, Dileepa Jayakody <djayakody@zaizi.com>
> wrote:
>
> > Hi Rafa,
> >
> > Thanks for reviewing my code and for your feedback. Please see my comments
> > inline below.
> >
> >
> > On Fri, Dec 11, 2015 at 6:51 PM, Rafa Haro <rharo@apache.org> wrote:
> >
> > > Hi Dileepa,
> > >
> > > This seems to be going clearly in the right direction now, in my opinion.
> > > Quick comments after a first review:
> > >
> > >
> > >    - Rejecting a document because it can't be enhanced is kind of tough.
> > >    You are preventing a document from finally being indexed because the
> > >    enhancement didn't perform correctly; it is probably better just to
> > >    let it continue the workflow within the system
> > >
> >
> > Got your point. Will remove that part from the code
> >
> >
> > >    - As far as I can deduce from the code, you are correctly extracting
> > >    the configured dereferenced fields, but you are not processing the
> > >    LDPath results at all
> > >
> > I'm passing the LDPath program as an enhancer parameter to Stanbol to
> > retrieve the enhancement result according to the LDPath program (which is
> > given as a text string in the connector UI).
> > If the user has not defined an LDPath program and has added dereference
> > fields in the UI instead, then the enhancement request will be built using
> > the dereference fields as enhancer parameters.
> >
> >
> > If neither an LDPath program nor dereference fields are given in the
> > transformation UI, then I just call the given enhancement chain without any
> > other enhancer parameters.
> >
> > Please refer to the code segment below, where I do this, and let me know
> > if it needs more improvements.
> >
> >             // ldpath program is given priority if it's set
> >             if (ldPath != null)
> >             {
> >                 parameters = EnhancerParameters.builder()
> >                         .setChain(chain)
> >                         .setContent(content)
> >                         .setLDpathProgram(ldPath)
> >                         .build();
> >             }
> >             else if (!derefFields.isEmpty())
> >             {
> >                 parameters = EnhancerParameters.builder()
> >                         .setChain(chain)
> >                         .setContent(content)
> >                         .setDereferencingFields(derefFields.keySet())
> >                         .build();
> >             }
> >             else
> >             {
> >                 parameters = EnhancerParameters.builder()
> >                         .setChain(chain)
> >                         .setContent(content)
> >                         .build();
> >             }
> >             eRes = enhancerClient.enhance(parameters);
> >
> >
> > Thanks,
> > Dileepa
> >
> >
> > >
> > > Cheers,
> > > Rafa
> > >
> > >
> > >
> > >
> > > On Fri, Dec 11, 2015 at 1:05 PM Dileepa Jayakody <djayakody@zaizi.com>
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > As per our discussion I have modified the Stanbol Connector so that it
> > > > adds all extracted entity URIs and entity attributes to the repository
> > > > document as fields.
> > > >
> > > > I have committed this code on a separate branch of our github project,
> > > > sensefy-connectors.
> > > > You can find the source code here:
> > > >
> > > > https://github.com/zaizi/sensefy-connectors/tree/feature/SENSEFY-1453-modify-stanbol-connector/transformation/mcf-stanbol-connector
> > > >
> > > > Let me know your feedback.
> > > >
> > > > I will write a blog post on how to add it to a connection and get
> > > > enhancement results, and share it with you.
> > > >
> > > > Thanks,
> > > > Dileepa
> > > >
> > > >
> > > >
> > > > On Mon, Dec 7, 2015 at 6:29 PM, Karl Wright <daddywri@gmail.com>
> > wrote:
> > > >
> > > > > Hi Dileepa,
> > > > >
> > > > > You cannot create sub-documents in a transformation connector.  And
> > > > > adding that capability to the framework is not possible; we would be
> > > > > missing key bookkeeping logic if that was allowed.
> > > > >
> > > > > Karl
> > > > >
> > > > >
> > > > > On Mon, Dec 7, 2015 at 6:59 AM, Dileepa Jayakody <
> > djayakody@zaizi.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Karl,
> > > > > >
> > > > > > Thanks a lot for the pointer.
> > > > > >
> > > > > > Stanbol doesn't update an existing document; it generates a new
> > > > > > response with the requested enhancement details for the content
> > > > > > enhancement request.
> > > > > > For example, for a request like "Paris is a city in France", the
> > > > > > following RDF response [1] is given by Stanbol.
> > > > > >
> > > > > > In the Stanbol connector, enhancement artifacts such as
> > > > > > TextAnnotations and EntityAnnotations are extracted from the RDF
> > > > > > response to generate the entity abstractions and add them to the mcf
> > > > > > repository document. Currently in the Stanbol connector we have added
> > > > > > these entity abstractions as JSON strings to a multi-valued
> > > > > > 'entities' field in the repository document, and we parse that JSON
> > > > > > in the SolrWrapper output connector to index in separate Solr cores
> > > > > > (primary documents, linked entities and entity types with their
> > > > > > attributes).
> > > > > >
> > > > > > Can we have a primary repository document and create sub-documents
> > > > > > for the extracted entities? Is it possible to generate sub-documents
> > > > > > for a repo-document in a transformation connector?
> > > > > >
> > > > > > Thanks.
> > > > > > Dileepa
> > > > > >
> > > > > > [1] Sample Stanbol response
> > > > > >
> > > > > > {
> > > > > >   "@context": {
> > > > > >     "dbp-ont": "http://dbpedia.org/ontology/",
> > > > > >     "dc": "http://purl.org/dc/terms/",
> > > > > >     "dc:created": {
> > > > > >       "@type": "xsd:dateTime"
> > > > > >     },
> > > > > >     "enhancer": "http://fise.iks-project.eu/ontology/",
> > > > > >     "enhancer:confidence": {
> > > > > >       "@type": "xsd:double"
> > > > > >     },
> > > > > >     "enhancer:end": {
> > > > > >       "@type": "xsd:int"
> > > > > >     },
> > > > > >     "enhancer:entity-reference": {
> > > > > >       "@type": "@id"
> > > > > >     },
> > > > > >     "enhancer:entity-type": {
> > > > > >       "@type": "@id"
> > > > > >     },
> > > > > >     "enhancer:extracted-from": {
> > > > > >       "@type": "@id"
> > > > > >     },
> > > > > >     "enhancer:start": {
> > > > > >       "@type": "xsd:int"
> > > > > >     },
> > > > > >     "entityhub": "http://stanbol.apache.org/ontology/entityhub/entityhub#",
> > > > > >     "foaf": "http://xmlns.com/foaf/0.1/",
> > > > > >     "foaf:depiction": {
> > > > > >       "@type": "@id"
> > > > > >     },
> > > > > >     "owl": "http://www.w3.org/2002/07/owl#",
> > > > > >     "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
> > > > > >     "schema": "http://schema.org/",
> > > > > >     "xsd": "http://www.w3.org/2001/XMLSchema#"
> > > > > >   },
> > > > > >   "@graph": [
> > > > > >     {
> > > > > >       "@id": "http://dbpedia.org/resource/France",
> > > > > >       "@type": [
> > > > > >         "dbp-ont:Country",
> > > > > >         "dbp-ont:Place",
> > > > > >         "dbp-ont:PopulatedPlace",
> > > > > >         "http://www.opengis.net/gml/_Feature",
> > > > > >         "owl:Thing",
> > > > > >         "schema:Country",
> > > > > >         "schema:Place"
> > > > > >       ],
> > > > > >       "foaf:depiction": [
> > > > > >         "http://upload.wikimedia.org/wikipedia/commons/c/c3/Flag_of_France.svg",
> > > > > >         "http://upload.wikimedia.org/wikipedia/commons/thumb/c/c3/Flag_of_France.svg/200px-Flag_of_France.svg.png"
> > > > > >       ],
> > > > > >       "rdfs:comment": {
> > > > > >         "@language": "en",
> > > > > >         "@value": "France, officially the French Republic, is a unitary semi-presidential republic in Western Europe with several overseas territories and islands located on other continents and in the Indian, Pacific, and Atlantic oceans. Metropolitan France extends from the Mediterranean Sea to the English Channel and the North Sea, and from the Rhine to the Atlantic Ocean. It is often referred to as l’Hexagone because of the geometric shape of its territory."
> > > > > >       },
> > > > > >       "rdfs:label": [
> > > > > >         {
> > > > > >           "@language": "en",
> > > > > >           "@value": "France"
> > > > > >         },
> > > > > >         {
> > > > > >           "@language": "fr",
> > > > > >           "@value": "France"
> > > > > >         }
> > > > > >       ]
> > > > > >     },
> > > > > >
> > > > > >     {
> > > > > >       "@id": "http://dbpedia.org/resource/Paris",
> > > > > >       "@type": [
> > > > > >         "dbp-ont:Place",
> > > > > >         "dbp-ont:PopulatedPlace",
> > > > > >         "dbp-ont:Settlement",
> > > > > >         "http://www.opengis.net/gml/_Feature",
> > > > > >         "owl:Thing",
> > > > > >         "schema:Place"
> > > > > >       ],
> > > > > >       "foaf:depiction": [
> > > > > >         "http://upload.wikimedia.org/wikipedia/commons/6/6e/Paris_-_Eiffelturm_und_Marsfeld2.jpg",
> > > > > >         "http://upload.wikimedia.org/wikipedia/commons/thumb/6/6e/Paris_-_Eiffelturm_und_Marsfeld2.jpg/200px-Paris_-_Eiffelturm_und_Marsfeld2.jpg"
> > > > > >       ],
> > > > > >       "geo:lat": 48.8567,
> > > > > >       "geo:long": 2.3508,
> > > > > >       "rdfs:comment": {
> > > > > >         "@language": "en",
> > > > > >         "@value": "Paris is the capital and largest city of France. It is situated on the river Seine, in northern France, at the heart of the Île-de-France region (or Paris Region, French: Région parisienne). As of January 2008 the city of Paris, within its administrative limits largely unchanged since 1860, has an estimated population of 2,211,297 and a metropolitan population of 12,089,098, and is one of the most populated metropolitan areas in Europe."
> > > > > >       },
> > > > > >       "rdfs:label": [
> > > > > >
> > > > > >         {
> > > > > >           "@language": "en",
> > > > > >           "@value": "Paris"
> > > > > >         },
> > > > > >         {
> > > > > >           "@language": "fr",
> > > > > >           "@value": "Paris"
> > > > > >         }
> > > > > >       ]
> > > > > >     },
> > > > > >     {
> > > > > >       "@id": "urn:enhancement-8db13707-1ecd-b4df-90ad-52447c8f2c84",
> > > > > >       "@type": [
> > > > > >         "enhancer:Enhancement",
> > > > > >         "enhancer:TextAnnotation"
> > > > > >       ],
> > > > > >       "dc:created": "2015-12-07T11:22:07.740Z",
> > > > > >       "dc:creator": "org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine",
> > > > > >       "dc:type": "dbp-ont:Place",
> > > > > >       "enhancer:confidence": 0.6017613,
> > > > > >       "enhancer:end": 5,
> > > > > >       "enhancer:extracted-from":
> > > > > > "urn:content-item-sha1-c8ae372ed26679df14da13050dd432fd32c527e3",
> > > > > >       "enhancer:selected-text": {
> > > > > >         "@language": "en",
> > > > > >         "@value": "Paris"
> > > > > >       },
> > > > > >       "enhancer:selection-context": {
> > > > > >         "@language": "en",
> > > > > >         "@value": "Paris is in France"
> > > > > >       },
> > > > > >       "enhancer:start": 0
> > > > > >     },
> > > > > >     {
> > > > > >       "@id": "urn:enhancement-b2855552-0e46-62f5-cd33-9f84ab32e547",
> > > > > >       "@type": [
> > > > > >         "enhancer:Enhancement",
> > > > > >         "enhancer:EntityAnnotation"
> > > > > >       ],
> > > > > >       "dc:created": "2015-12-07T11:22:07.748Z",
> > > > > >       "dc:creator": "org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine",
> > > > > >       "dc:relation":
> > > > > > "urn:enhancement-e9c9c187-2d69-2c1f-6552-e76111430d4a",
> > > > > >       "enhancer:confidence": 1.0,
> > > > > >       "enhancer:entity-label": {
> > > > > >         "@language": "en",
> > > > > >         "@value": "France"
> > > > > >       },
> > > > > >       "enhancer:entity-reference": "http://dbpedia.org/resource/France",
> > > > > >       "enhancer:entity-type": [
> > > > > >         "dbp-ont:Country",
> > > > > >         "dbp-ont:Place",
> > > > > >         "dbp-ont:PopulatedPlace",
> > > > > >         "schema:Country",
> > > > > >         "schema:Place",
> > > > > >         "http://www.opengis.net/gml/_Feature",
> > > > > >         "owl:Thing"
> > > > > >       ],
> > > > > >       "enhancer:extracted-from":
> > > > > > "urn:content-item-sha1-c8ae372ed26679df14da13050dd432fd32c527e3",
> > > > > >       "entityhub:site": "dbpedia"
> > > > > >     },
> > > > > >     {
> > > > > >       "@id": "urn:enhancement-c50474e4-ea0e-03ff-5db5-a25f4c8dae45",
> > > > > >       "@type": [
> > > > > >         "enhancer:Enhancement",
> > > > > >         "enhancer:EntityAnnotation"
> > > > > >       ],
> > > > > >       "dc:created": "2015-12-07T11:22:07.748Z",
> > > > > >       "dc:creator": "org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine",
> > > > > >       "dc:relation":
> > > > > > "urn:enhancement-e9c9c187-2d69-2c1f-6552-e76111430d4a",
> > > > > >       "enhancer:confidence": 0.25715446,
> > > > > >       "enhancer:entity-label": {
> > > > > >         "@language": "en",
> > > > > >         "@value": "Vichy France"
> > > > > >       },
> > > > > >       "enhancer:entity-reference": "http://dbpedia.org/resource/Vichy_France",
> > > > > >       "enhancer:entity-type": [
> > > > > >         "dbp-ont:Country",
> > > > > >         "dbp-ont:Place",
> > > > > >         "dbp-ont:PopulatedPlace",
> > > > > >         "schema:Country",
> > > > > >         "schema:Place",
> > > > > >         "http://www.opengis.net/gml/_Feature",
> > > > > >         "owl:Thing"
> > > > > >       ],
> > > > > >       "enhancer:extracted-from":
> > > > > > "urn:content-item-sha1-c8ae372ed26679df14da13050dd432fd32c527e3",
> > > > > >       "entityhub:site": "dbpedia"
> > > > > >     },
> > > > > >     {
> > > > > >       "@id": "urn:enhancement-de07bc41-e4a1-f510-3f93-99ebfd8c39f4",
> > > > > >       "@type": [
> > > > > >         "enhancer:Enhancement",
> > > > > >         "enhancer:EntityAnnotation"
> > > > > >       ],
> > > > > >       "dc:created": "2015-12-07T11:22:07.748Z",
> > > > > >       "dc:creator": "org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine",
> > > > > >       "dc:relation":
> > > > > > "urn:enhancement-8db13707-1ecd-b4df-90ad-52447c8f2c84",
> > > > > >       "enhancer:confidence": 0.1493264,
> > > > > >       "enhancer:entity-label": {
> > > > > >         "@language": "en",
> > > > > >         "@value": "Paris Commune"
> > > > > >       },
> > > > > >       "enhancer:entity-reference": "http://dbpedia.org/resource/Paris_Commune",
> > > > > >       "enhancer:entity-type": [
> > > > > >         "dbp-ont:Country",
> > > > > >         "dbp-ont:Place",
> > > > > >         "dbp-ont:PopulatedPlace",
> > > > > >         "schema:Country",
> > > > > >         "schema:Place",
> > > > > >         "owl:Thing"
> > > > > >       ],
> > > > > >       "enhancer:extracted-from":
> > > > > > "urn:content-item-sha1-c8ae372ed26679df14da13050dd432fd32c527e3",
> > > > > >       "entityhub:site": "dbpedia"
> > > > > >     },
> > > > > >     {
> > > > > >       "@id": "urn:enhancement-e9c9c187-2d69-2c1f-6552-e76111430d4a",
> > > > > >       "@type": [
> > > > > >         "enhancer:Enhancement",
> > > > > >         "enhancer:TextAnnotation"
> > > > > >       ],
> > > > > >       "dc:created": "2015-12-07T11:22:07.740Z",
> > > > > >       "dc:creator": "org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine",
> > > > > >       "dc:type": "dbp-ont:Place",
> > > > > >       "enhancer:confidence": 0.99354976,
> > > > > >       "enhancer:end": 18,
> > > > > >       "enhancer:extracted-from":
> > > > > > "urn:content-item-sha1-c8ae372ed26679df14da13050dd432fd32c527e3",
> > > > > >       "enhancer:selected-text": {
> > > > > >         "@language": "en",
> > > > > >         "@value": "France"
> > > > > >       },
> > > > > >       "enhancer:selection-context": {
> > > > > >         "@language": "en",
> > > > > >         "@value": "Paris is in France"
> > > > > >       },
> > > > > >       "enhancer:start": 12
> > > > > >     }
> > > > > >   ]
> > > > > > }
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Dec 7, 2015 at 4:23 PM, Karl Wright <daddywri@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > > Hi Dileepa,
> > > > > > >
> > > > > > > Repository connectors have an abstraction that allows them to
> > > > > > > generate compound documents (where a document has a primary
> > > > > > > identifier, and there are subdocuments that share that primary
> > > > > > > identifier and have a secondary identifier).  This sounds a bit
> > > > > > > like what you are describing.  Does Stanbol work by decorating an
> > > > > > > existing document, or does it work by generating all content for a
> > > > > > > document?
> > > > > > >
> > > > > > > Karl
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Dec 7, 2015 at 5:12 AM, Dileepa Jayakody <
> > > > djayakody@zaizi.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > >
> > > > > > > > While thanking you all for your input on the Stanbol connector
> > > > > > > > requirement, I would like to continue with modifying the Stanbol
> > > > > > > > connector to be compatible with any output connector. If you guys
> > > > > > > > can give some guidance on how the entity metadata should be added
> > > > > > > > to the repository document, I can modify the Stanbol connector
> > > > > > > > accordingly.
> > > > > > > >
> > > > > > > > From Rafa's comments, I gathered we can add the entity metadata
> > > > > > > > to the repo.doc as key-value pairs.
> > > > > > > > However, this idea is not yet clear to me. There could be N
> > > > > > > > entities in a document, and each of them will have some common
> > > > > > > > attributes such as name, id and type, plus specific attributes
> > > > > > > > for the particular entity type.
> > > > > > > > I'm not clear on how to maintain that structure of N entities
> > > > > > > > with their attributes in a repo.document as key-value pairs and
> > > > > > > > make them LDPath compatible for retrieval in an output connector.
> > > > > > > >
> > > > > > > > @Rafa
> > > > > > > > If you could please elaborate on your suggestion, it would be
> > > > > > > > greatly helpful to me.
> > > > > > > > All other suggestions are also welcome.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Dileepa
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Nov 13, 2015 at 7:00 PM, Karl Wright <
> > daddywri@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > I, too, agree.  Somebody will need to turn this connector into
> > > > > > > > > one that plays by the rules.  It may be possible for someone on
> > > > > > > > > the team here to do that, but it won't be me; I'm seriously
> > > > > > > > > overextended at the moment.  It would be best if someone who
> > > > > > > > > knew the connector well could do the necessary work.
> > > > > > > > >
> > > > > > > > > Karl
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Nov 13, 2015 at 5:45 AM, Rafa Haro <
> > > > rharoapache@gmail.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I must agree with Antonio. When I started to work on this I
> > > > > > > > > > was expecting the connector to work by just extracting the
> > > > > > > > > > entities and entity metadata and putting them as plain
> > > > > > > > > > metadata of the documents, probably following an LDPath
> > > > > > > > > > queries configuration.
> > > > > > > > > >
> > > > > > > > > > This is probably OK for Sensefy, but I don't think this could
> > > > > > > > > > be suitable for inclusion in the project. But this is only my
> > > > > > > > > > opinion. Of course, a version of the connector that fully
> > > > > > > > > > respects the ManifoldCF architecture would be more than
> > > > > > > > > > welcome, in my opinion.
> > > > > > > > > >
> > > > > > > > > > On Fri, Nov 13, 2015 at 11:38 AM, Antonio David Pérez
> > Morales
> > > > > > > > > > <adperezmorales@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi
> > > > > > > > > > > The removal of the SolrWrapper is a must. It was a
> > > > > > > > > > > requirement for an internal project which has nothing to do
> > > > > > > > > > > with the normal operation of Manifold, so forcing users to
> > > > > > > > > > > use Solr does not fit the Manifold philosophy.
> > > > > > > > > > > In my opinion, at this moment, a Stanbol connector with such
> > > > > > > > > > > a big dependency, which will fit almost no use case, is not
> > > > > > > > > > > very useful.
> > > > > > > > > > > You should think of a way to convert the Stanbol connector
> > > > > > > > > > > into a normal Transformation connector without assuming that
> > > > > > > > > > > a specific output connector will be used.
> > > > > > > > > > > Regards
> > > > > > > > > > > 2015-11-13 11:20 GMT+01:00 Dileepa Jayakody <djayakody@zaizi.com>:
> > > > > > > > > > >> Hi guys,
> > > > > > > > > > >>
> > > > > > > > > > >> I have developed a Stanbol connector for MCF. You can check
> > > > > > > > > > >> it out from our github repo here:
> > > > > > > > > > >>
> > > > > > > > > > >> https://github.com/zaizi/sensefy-connectors/tree/master/transformation/mcf-stanbol-connector
> > > > > > > > > > >>
> > > > > > > > > > >> It requires the SolrWrapper output connector, which indexes
> > > > > > > > > > >> enhanced documents, entities and entityTypes in separate Solr
> > > > > > > > > > >> cores. Basically it requires 3 separate Solr cores configured
> > > > > > > > > > >> with a specific Solr schema for primary documents, entities
> > > > > > > > > > >> and entityTypes respectively. This was done for our specific
> > > > > > > > > > >> use-case.
> > > > > > > > > > >>
> > > > > > > > > > >> The SolrWrapper code is here:
> > > > > > > > > > >>
> > > > > > > > > > >> https://github.com/zaizi/sensefy-connectors/tree/master/output/mcf-solrwrapperconnector
> > > > > > > > > > >>
> > > > > > > > > > >> Perhaps we can discuss and remove the Stanbol connector's
> > > > > > > > > > >> dependency on SolrWrapper and have it working with any output
> > > > > > > > > > >> connector.
> > > > > > > > > > >> Please note that the Stanbol connector currently has a bug in
> > > > > > > > > > >> the UI (editSpecification) which I'm working on at the moment.
> > > > > > > > > > >> After fixing that I will update here. I will also provide
> > > > > > > > > > >> documentation for configuring the connector.
> > > > > > > > > > >>
> > > > > > > > > > >> Thanks,
> > > > > > > > > > >> Dileepa
> > > > > > > > > > >>
> > > > > > > > > > >> On Thu, Jul 9, 2015 at 8:36 PM, Antonio David Pérez
> > > Morales
> > > > <
> > > > > > > > > > >> adperezmorales@gmail.com> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> > Hi Joshua
> > > > > > > > > > >> >
> > > > > > > > > > >> > It is not the list for that, but Marmotta is already
> > > > > > > > > > >> > integrated in Apache Stanbol. You can take a look at this
> > > > > > > > > > >> > issue:
> > > > > > > > > > >> > https://issues.apache.org/jira/browse/STANBOL-1165
> > > > > > > > > > >> >
> > > > > > > > > > >> > Anyway, as I said this is not the list for that, so let's
> > > > > > > > > > >> > use the proper list for these things.
> > > > > > > > > > >> >
> > > > > > > > > > >> > Regards
> > > > > > > > > > >> >
> > > > > > > > > > >> >
> > > > > > > > > > >> >
> > > > > > > > > > >> > 2015-07-09 15:29 GMT+02:00 Joshua Dunham <
> > > > > > > joshua.dunham@gmail.com
> > > > > > > > >:
> > > > > > > > > > >> >
> > > > > > > > > > >> > > Hey Dileepa,
> > > > > > > > > > >> > >
> > > > > > > > > > >> > >       In case you were interested, I pinged the list a
> > > > > > > > > > >> > > few days ago asking for integration tips for Apache
> > > > > > > > > > >> > > Marmotta.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > I got some great tips on how to do this which could help
> > > > > > > > > > >> > > you. Since Marmotta is a drop-in replacement for Clerezza
> > > > > > > > > > >> > > on Stanbol it may be easier for you to go this way.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > I'm not a Java programmer but I'm bringing this problem to
> > > > > > > > > > >> > > the development staff at my company for assistance. If you
> > > > > > > > > > >> > > like the Marmotta approach we may gain more traction
> > > > > > > > > > >> > > solving the same integration.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > I'm also integrating Marmotta with Stanbol so the effect
> > > > > > > > > > >> > > would be the same except not using the Stanbol API for
> > > > > > > > > > >> > > data import in favor of Marmotta.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Best,
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > -J
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > > On Jul 9, 2015, at 1:03 AM, Dileepa Jayakody <
> > > > > > > > > djayakody@zaizi.com
> > > > > > > > > > >
> > > > > > > > > > >> > > wrote:
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Hi all,
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Thank you for the feedback and for offering your help
> > > > > > > > > > >> > > > with this.
> > > > > > > > > > >> > > > Let me get back to you on where to start the code base.
> > > > > > > > > > >> > > > As the first step, I would like to start by creating an
> > > > > > > > > > >> > > > architecture diagram for the connector.
> > > > > > > > > > >> > > > I will send the diagram for your review soon.
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Thanks,
> > > > > > > > > > >> > > > Dileepa
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > --
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > ------------------------------
> > > > > > > > > > >> > > > This message should be regarded as confidential. If you
> > > > > > > > > > >> > > > have received this email in error please notify the
> > > > > > > > > > >> > > > sender and destroy it immediately. Statements of intent
> > > > > > > > > > >> > > > shall only become binding when confirmed in hard copy by
> > > > > > > > > > >> > > > an authorised signatory.
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Zaizi Ltd is registered in England and Wales with the
> > > > > > > > > > >> > > > registration number 6440931. The Registered Office is
> > > > > > > > > > >> > > > Brook House, 229 Shepherds Bush Road, London W6 7AN.
