uima-user mailing list archives

From Benedict Holland <benedict.m.holl...@gmail.com>
Subject Re: UIMA on Spark mimicking CPE pipelines
Date Thu, 28 Sep 2017 18:39:38 GMT
That is a great suggestion. I will add it to the list of project tasks
since that would also be a smart extension to get working soon.

Thanks,
~Ben
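
A minimal sketch of the kind of problem the suggestion quoted below is meant to surface, assuming the default Java serialization: work handed to an executor in another JVM has to survive a serialization round trip, so a local round trip through ObjectOutputStream and ObjectInputStream can flag fields that cannot be shipped. SerializationCheck, ConfigOnlyTask, and ResourceHoldingTask are hypothetical names for illustration only; they are not part of Spark, UIMA, or the application discussed in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper: checks whether an object survives Java serialization.
final class SerializationCheck {

    // Stand-in for a task that carries only serializable configuration.
    static final class ConfigOnlyTask implements Serializable {
        private static final long serialVersionUID = 1L;
        final String descriptorPath;

        ConfigOnlyTask(String descriptorPath) {
            this.descriptorPath = descriptorPath;
        }
    }

    // Stand-in for a task that holds a live resource which is not serializable.
    static final class ResourceHoldingTask implements Serializable {
        private static final long serialVersionUID = 1L;
        // java.lang.Object does not implement Serializable, so writing this field fails.
        final Object liveResource = new Object();
    }

    // Writes the value to a byte buffer and reads it back.
    // Throws IllegalStateException if either direction fails.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("value did not survive serialization: " + e, e);
        }
    }

    // Returns true when the value can be serialized and deserialized.
    static boolean isShippable(Serializable value) {
        try {
            roundTrip(value);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isShippable(new ConfigOnlyTask("desc/pipeline.xml"))); // prints true
        System.out.println(isShippable(new ResourceHoldingTask()));               // prints false
    }
}
```

A check like this only covers Java serialization of the objects passed in; it does not replace a run against a separate master and worker.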

On Thu, Sep 28, 2017 at 12:24 PM, Nicolas Paris <niparisco@gmail.com> wrote:

> Hi Ben,
>
> You can mimic a YARN instance by creating a master and a slave. This
> would confirm that no serialization problems are involved.
>
> On 28 Sept. 2017 at 16:55, Benedict Holland wrote:
> > Hello All,
> >
> > It does, in fact, look like it works with standalone instances. We don't
> > have an environment to test with YARN, but given how it works, it looks
> > like it should work fine. The only thing is, each node will have to have
> > access to the database that the CPE runs over. I was actually thinking
> > about creating the Dataset<Row> collection from the CPE getNext() method,
> > calling it until hasNext() returns false, but I think that will cause
> > memory problems with huge databases.
> >
> > Hopefully, I will have more information on exactly what I can release
> > over the next few days. I am pushing to provide a minimum working example
> > with a MySQL schema and a small setup guide.
> >
> > ~Ben
> >
> > On Thu, Sep 28, 2017 at 3:29 AM, Nicolas Paris <niparisco@gmail.com> wrote:
> >
> > > Hey ben
> > >
> > > Thanks for the feedback, it looks like an interesting approach.
> > > Have you validated your approach on both standalone and YARN Spark
> > > instances?
> > >
> > > thanks
> > > On 26 Sept. 2017 at 21:02, Benedict Holland wrote:
> > > > Hello all,
> > > >
> > > > I have a working application that essentially implements the CPE
> > > > within a Spark context. The best part about this is that it does not
> > > > use UIMAFit or any third-party applications. It simply uses Hadoop,
> > > > Spark, UIMA, and OpenNLP.
> > > >
> > > > Users are able to configure, design, and build the UIMA pipeline
> > > > using all of the Eclipse XML plugin applications. Instead of running
> > > > the application via the CPE.process() driver from a main class, it
> > > > will run from the foreach() function on the Dataset<Row> object.
> > > >
> > > > Oh also, it plugs into a database to get the text and to write
> > > > results.
> > > >
> > > > Would the UIMA community be interested in getting a working example
> > > > put together? If so, please feel free to contact me. I think this
> > > > could be an excellent example of what people would like to use and
> > > > your examples are particularly good.
> > > >
> > > > Thanks,
> > > > ~Ben
> > >
>
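
The memory concern raised in the thread (building the whole collection from getNext() before any processing) can be sketched in plain Java. Assuming a reader that exposes hasNext() and getNext(), documents can be handed on in fixed-size batches so that only one batch is held at a time. BatchedReader, DocumentSource, and drainInBatches are hypothetical names for illustration; DocumentSource is a simplified stand-in, not the UIMA CollectionReader interface, and nothing here is taken from the application described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: drains a hasNext()/getNext() style reader in bounded batches.
final class BatchedReader {

    // Simplified stand-in for a document reader; not a UIMA interface.
    interface DocumentSource {
        boolean hasNext();
        String getNext();
    }

    // Adapts a plain iterator so the sketch can be run without a database.
    static DocumentSource fromIterator(Iterator<String> documents) {
        return new DocumentSource() {
            @Override public boolean hasNext() { return documents.hasNext(); }
            @Override public String getNext() { return documents.next(); }
        };
    }

    // Reads until hasNext() returns false, passing each full batch (and the
    // final partial batch, if any) to the sink. Returns the number of batches.
    static int drainInBatches(DocumentSource source, int batchSize,
                              Consumer<List<String>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<String> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.getNext());
            if (batch.size() == batchSize) {
                sink.accept(batch);
                batches++;
                batch = new ArrayList<>(batchSize); // start a fresh batch
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        DocumentSource source = fromIterator(
            Arrays.asList("doc1", "doc2", "doc3", "doc4", "doc5").iterator());
        int batches = drainInBatches(source, 2, batch -> System.out.println(batch));
        System.out.println(batches + " batches"); // prints "3 batches"
    }
}
```

Whether batching like this fits the Spark side depends on how the Dataset<Row> is actually built, which the thread does not specify.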
