uima-user mailing list archives

From Baker James D <JDBA...@mail.dstl.gov.uk>
Subject RE: [UK OFFICIAL] Baleen - UIMA Based Text Analytics Framework
Date Tue, 06 Oct 2015 07:49:21 GMT
Classification: UK OFFICIAL

Hi Jens,

I haven't tried, but I suspect it wouldn't be as straightforward as taking a Baleen component
and using it with another UIMA-based system. Whilst Baleen is UIMA-based, we did have to augment
UIMA with a lot of additional functionality to get it to do what we wanted. As that additional
functionality doesn't exist in those other pipelines (or, at least, not in the same form),
it's unlikely that the components will work without modification.

It's more likely though that components could go the other way (e.g. a vanilla UIMA component
working in Baleen), although again we haven't tried that.

James

-----Original Message-----
From: jens@grivolla.net [mailto:jens@grivolla.net] On Behalf Of Jens Grivolla
Sent: 05 October 2015 15:13
To: user@uima.apache.org
Subject: Re: [UK OFFICIAL] Baleen - UIMA Based Text Analytics Framework

Hi James, this looks interesting and there seem to be quite a few components available.

How interoperable is it with e.g. DKPro (or other UIMA components), i.e.
could I just take AEs from Baleen and use them within a DKPro pipeline?

Thanks,
Jens

On Mon, Oct 5, 2015 at 1:14 PM, Baker James D <JDBAKER@mail.dstl.gov.uk>
wrote:

> Classification: UK OFFICIAL
>
> Afternoon everyone,
>
> In response to Petr's comments, we have added some additional 
> information to the Wiki section of the Baleen GitHub repo. We haven't 
> added any new information (yet), but we have collated information that 
> is already available into one place to make it more accessible. If 
> there are any specific areas that people feel could do with more 
> attention, please let us know and we'll see what we can do.
>
> https://github.com/dstl/baleen/wiki
>
> Thanks,
> James
>
>
> -----Original Message-----
> From: Petr Baudis [mailto:pasky@ucw.cz]
> Sent: 28 September 2015 21:23
> To: Baker James D
> Cc: user@uima.apache.org
> Subject: Re: [UK OFFICIAL] Baleen - UIMA Based Text Analytics Framework
>
>   Hi!
>
> On Mon, Sep 28, 2015 at 02:31:03PM +0100, Baker James D wrote:
> > I would like to draw your attention to a text analytics framework
> > that has just been released by Dstl (part of the UK Ministry of
> > Defence). It uses UIMA as part of its underlying architecture but
> > provides additional functionality on top of that, and simplifies
> > much of the user configuration and experience, as well as the
> > development process. A number of collection readers, annotators and
> > consumers are included as part of the framework.
> >
> > The tool is called Baleen, and is released under Apache Software
> > License 2.
> >
> > There is more information about the tool on the press release
> > (https://www.gov.uk/government/news/dstl-adds-to-open-source-software)
> > and on the GitHub page (https://github.com/dstl/baleen)
>
>   Thanks for the heads up.  However, I haven't found any clear summary
> of what the framework is capable of right now - I think you might want
> to expand the generic description a bit with some examples and
> use-cases.  I have been looking around a bit and it seems like e.g.
>
>
> https://github.com/dstl/baleen/blob/master/baleen/baleen-annotators/src/main/java/uk/gov/dstl/baleen/annotators/cleaners/MergeAdjacentQuantities.java
>
> is something that could be pretty useful, but you might want to make 
> it easier to discover the capabilities to get more users / contributors.
>
>   Best,
>
>                                 Petr Baudis
>
> "This e-mail and any attachment(s) is intended for the recipient only.
>  Its unauthorised use,
> disclosure, storage or copying is not permitted.  Communications with 
> Dstl are monitored and/or recorded for system efficiency and other 
> lawful purposes, including business intelligence, business metrics and 
> training.  Any views or opinions expressed in this e-mail do not 
> necessarily reflect Dstl policy."
>
> "If you are not the intended recipient, please remove it from your 
> system and notify the author of the email and centralenq@dstl.gov.uk"
>