manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: ManifoldCF transformation connector for Apache Stanbol
Date Mon, 07 Dec 2015 10:53:34 GMT
Hi Dileepa,

Repository connectors have an abstraction that allows them to generate
compound documents (where a document has a primary identifier, and there
are subdocuments that share that primary identifier and have a secondary
identifier).  This sounds a bit like what you are describing.  Does Stanbol
work by decorating an existing document, or does it work by generating all
content for a document?

Karl
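
The compound-document idea described above (one primary identifier, with subdocuments that share it and add a secondary identifier) can be sketched roughly as follows. This is only an illustration in plain Java: `CompoundDocumentSketch`, `Part`, `compositeKey`, `split`, and the `#` separator are hypothetical names chosen for the example, not ManifoldCF classes or conventions, and the assumption that each entity becomes one subdocument is ours.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a compound document; none of these types are ManifoldCF classes. */
final class CompoundDocumentSketch {

    /** One indexable unit: a primary identifier plus an optional secondary one. */
    static final class Part {
        final String primaryId;
        final String componentId; // null for the primary document itself
        final String content;

        Part(String primaryId, String componentId, String content) {
            this.primaryId = primaryId;
            this.componentId = componentId;
            this.content = content;
        }

        /** Key that stays unique when all parts land in the same index. */
        String compositeKey() {
            return componentId == null ? primaryId : primaryId + "#" + componentId;
        }
    }

    /** Returns the primary document followed by one subdocument per entity name. */
    static List<Part> split(String primaryId, String body, List<String> entityNames) {
        List<Part> parts = new ArrayList<>();
        parts.add(new Part(primaryId, null, body));
        for (int i = 0; i < entityNames.size(); i++) {
            // Assumed scheme: the secondary identifier is just the entity's position.
            parts.add(new Part(primaryId, "entity-" + i, entityNames.get(i)));
        }
        return parts;
    }
}
```

With a primary identifier of `doc-1` and two entity names, `split` returns three parts whose composite keys are `doc-1`, `doc-1#entity-0`, and `doc-1#entity-1`. Whether this shape suits Stanbol depends on the answer to the question above: it only helps if the entities are meant to be indexed as separate units rather than as decoration on the original document.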


On Mon, Dec 7, 2015 at 5:12 AM, Dileepa Jayakody <djayakody@zaizi.com>
wrote:

> Hi All,
>
>
> While thanking you all for your input on Stanbol connector requirement, I
> would like to continue with modifying the Stanbol connector to be
> compatible with any output connector. If you guys can give some guidance on
> how the entity metadata should be added to the repository document I can
> modify the stanbol connector accordingly.
>
> From Rafa's comments, I gathered we can add the entity metadata to the
> repo.doc as key value pairs.
> However this idea is not yet clear to me. There could be 'N' number of
> entities in a document and each of them will have some common attributes
> such as name, id, type and specific attributes for particular entity type.
> I'm not clear on how to maintain that structure of N number of entities
> with their attributes in a repo.document as key value pairs and make them
> LDPath compatible for retrieval in an output connector.
>
> @Rafa
> If you can please elaborate on your suggestion it would be greatly helpful
> to me.
> All other suggestions are also welcome.
>
> Thanks,
> Dileepa
>
>
> On Fri, Nov 13, 2015 at 7:00 PM, Karl Wright <daddywri@gmail.com> wrote:
>
> > I, too, agree.  Somebody will need to turn this connector into one that
> > plays by the rules.  It may be possible for someone on the team here to do
> > that, but it won't be me; I'm seriously overextended at the moment.  It
> > would be best if someone who knew the connector well could do the necessary
> > work.
> >
> > Karl
> >
> >
> > On Fri, Nov 13, 2015 at 5:45 AM, Rafa Haro <rharoapache@gmail.com> wrote:
> >
> > > I must agree with Antonio. When I started to work on this I was expecting
> > > the connector to work by just extracting the entities and their metadata
> > > and putting them as plain metadata of the documents, probably following an
> > > LDPath query configuration
> > >
> > >
> > >
> > >
> > > This is probably ok for Sensefy but I don’t think this could be suitable
> > > to be included in the project. But this is only my opinion. Of course, a
> > > version of the connector that fully respects the ManifoldCF architecture
> > > would be more than welcome in my opinion
> > >
> > > On Fri, Nov 13, 2015 at 11:38 AM, Antonio David Pérez Morales
> > > <adperezmorales@gmail.com> wrote:
> > >
> > > > Hi
> > > > The removal of the SolrWrapper is a must. It was a requirement for an
> > > > internal project which has nothing to do with the normal operation of
> > > > Manifold, so forcing the users to use Solr does not fit the Manifold
> > > > philosophy.
> > > > In my opinion, at this moment, a Stanbol connector with such a big
> > > > dependency which will not fit almost any use case is not very useful.
> > > > You should think of a way to convert the Stanbol connector into a normal
> > > > Transformation connector without assuming that a specific output connector
> > > > will be used.
> > > > Regards
> > > > 2015-11-13 11:20 GMT+01:00 Dileepa Jayakody <djayakody@zaizi.com>:
> > > >> Hi guys,
> > > >>
> > > >> I have developed a Stanbol connector for MCF. You can check it out from our
> > > >> github repo here:
> > > >>
> > > >> https://github.com/zaizi/sensefy-connectors/tree/master/transformation/mcf-stanbol-connector
> > > >>
> > > >> It requires the SolrWrapper output connector which indexes enhanced
> > > >> documents, entities and entityTypes in separate Solr cores. Basically it
> > > >> requires 3 separate solr cores configured with a specific Solr schema for
> > > >> primary documents, entities and entityTypes separately. This was done for
> > > >> our specific use-case.
> > > >>
> > > >> The SolrWrapper code is here :
> > > >>
> > > >> https://github.com/zaizi/sensefy-connectors/tree/master/output/mcf-solrwrapperconnector
> > > >>
> > > >> Perhaps we can discuss and remove the Stanbol connector's dependency on
> > > >> SolrWrapper and have it working with any output connector.
> > > >> Please note that the Stanbol connector currently has a bug in the UI
> > > >> (editSpecification) which I'm working on at the moment. After fixing that I
> > > >> will update here. And also I will provide documentation for configuring
> > > >> the connector.
> > > >>
> > > >> Thanks,
> > > >> Dileepa
> > > >>
> > > >> On Thu, Jul 9, 2015 at 8:36 PM, Antonio David Pérez Morales <
> > > >> adperezmorales@gmail.com> wrote:
> > > >>
> > > >> > Hi Joshua
> > > >> >
> > > >> > It is not the list for that, but Marmotta is already integrated in Apache
> > > >> > Stanbol. You can take a look at this issue
> > > >> > https://issues.apache.org/jira/browse/STANBOL-1165 .
> > > >> >
> > > >> > Anyway, as I said this is not the list for that, so let's use the proper
> > > >> > list for these things.
> > > >> >
> > > >> > Regards
> > > >> >
> > > >> >
> > > >> >
> > > >> > 2015-07-09 15:29 GMT+02:00 Joshua Dunham <joshua.dunham@gmail.com>:
> > > >> >
> > > >> > > Hey Dileepa,
> > > >> > >
> > > >> > > In case you were interested, I pinged the list a few days ago asking
> > > >> > > for integration tips for Apache Marmotta.
> > > >> > >
> > > >> > > I got some great tips on how to do this which could help you. Since
> > > >> > > Marmotta is a drop in replacement for Clerezza on Stanbol it may be easier
> > > >> > > for you to take this way.
> > > >> > >
> > > >> > > I'm not a Java programmer but I'm bringing this problem to the development
> > > >> > > staff at my company for assistance. If you like the Marmotta approach we
> > > >> > > may gain more traction solving the same integration.
> > > >> > >
> > > >> > > I'm also integrating Marmotta with Stanbol so the effect would be the same
> > > >> > > except not using the Stanbol API for data import in favor of Marmotta.
> > > >> > >
> > > >> > > Best,
> > > >> > >
> > > >> > > -J
> > > >> > >
> > > >> > > > On Jul 9, 2015, at 1:03 AM, Dileepa Jayakody <djayakody@zaizi.com> wrote:
> > > >> > > >
> > > >> > > > Hi all,
> > > >> > > >
> > > >> > > > Thank you for the feedback and for offering your help in this.
> > > >> > > > Let me get back to you on where to start the code base.
> > > >> > > > As the first step, I would like to start by creating an architecture diagram
> > > >> > > > for the connector.
> > > >> > > > I will send the diagram for your review soon.
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > Dileepa
> > > >> > > >
> > > >> > > > --
> > > >> > > >
> > > >> > > > ------------------------------
> > > >> > > > This message should be regarded as confidential. If you have received this
> > > >> > > > email in error please notify the sender and destroy it immediately.
> > > >> > > > Statements of intent shall only become binding when confirmed in hard copy
> > > >> > > > by an authorised signatory.
> > > >> > > >
> > > >> > > > Zaizi Ltd is registered in England and Wales with the registration number
> > > >> > > > 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
> > > >> > > > London W6 7AN.
> > > >> > >
> > > >> >
> > > >>
> > > >>
> > >
> >
>
>
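
For the question raised earlier in the quoted thread (how to keep N entities, each with common attributes such as name, id and type plus type-specific ones, as key/value pairs on a repository document), one possible flattening is sketched below. Everything in it is an assumption made for illustration: the `EntityFlattener` and `Entity` types, the `entity.<index>.<attribute>` key scheme, and the `entity.count` field are hypothetical and are not part of ManifoldCF or Stanbol. The sketch also does not address LDPath compatibility, which the thread leaves open.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; the key naming scheme is an assumption, not a ManifoldCF convention. */
final class EntityFlattener {

    /** Assumed entity shape: common attributes plus type-specific extras. */
    static final class Entity {
        final String id;
        final String name;
        final String type;
        final Map<String, String> extra;

        Entity(String id, String name, String type, Map<String, String> extra) {
            this.id = id;
            this.name = name;
            this.type = type;
            this.extra = extra;
        }
    }

    /** Flattens the entities into ordered key/value pairs, one key prefix per entity. */
    static Map<String, String> flatten(List<Entity> entities) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Lets a consumer know how many indexed prefixes to expect.
        fields.put("entity.count", Integer.toString(entities.size()));
        for (int i = 0; i < entities.size(); i++) {
            Entity e = entities.get(i);
            String prefix = "entity." + i + ".";
            fields.put(prefix + "id", e.id);
            fields.put(prefix + "name", e.name);
            fields.put(prefix + "type", e.type);
            // Type-specific attributes share the same prefix as the common ones.
            for (Map.Entry<String, String> attr : e.extra.entrySet()) {
                fields.put(prefix + attr.getKey(), attr.getValue());
            }
        }
        return fields;
    }
}
```

Each resulting pair could then be copied onto the repository document as an ordinary metadata field, so that any output connector sees plain fields. The trade-off of an indexed prefix is that the grouping of attributes per entity survives, at the cost of field names that vary from document to document.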
