Subject: Re: uima-fit and uima annotators (in my case Whitespace annotator)
From: Richard Eckart de Castilho
Date: Thu, 23 Jan 2014 12:13:30 -0200
To: user@uima.apache.org

Hi,

can you provide the full code for your sample pipeline? I think that
would make it easier to help.

With the present information, I can only give some general advice.

- it is not mandatory to have the type system java classes (JCas
  wrappers) present in a project if none of your components (Readers, AEs,
  CCs) use them.

- it is possible to manually load a type system description (TSD) and
  pass it to the components. But then the TSD is the second argument to
  the createXXXDescription call, e.g.

  createEngineDescription(SimpleCC.class, tsd,
    SimpleCC.PARAM_OUTPUT_DIR, "…");

- the type systems of all components in a pipeline are automatically
  merged when a pipeline is run (e.g. using SimplePipeline.runPipeline).
  Thus, it would also work to pass a TSD with all types used in the
  pipeline only to the reader, but not to any of the subsequent
  components.

- alternatively, it is possible to have uimaFIT automatically detect
  your types [1]. If you do that, there is no need at all to pass the TSD
  to the component - it happens automatically.

  createEngineDescription(SimpleCC.class,
    SimpleCC.PARAM_OUTPUT_DIR, "…");

- if you want to retrieve annotations from the CAS without using the JCas
  wrappers, you can have a look at the CasUtil class. E.g.

  CasUtil.select(cas, CasUtil.getType(cas, "my.package.name.MyType"))

  Mind, this call works only if "MyType" inherits from the built-in
  "Annotation" type. Otherwise, you would use "selectFS" instead of
  "select".

I would recommend using the CAS/CasUtil only if you want to implement a
generic component that can be configured to work with different types.
If your component is fixed to a certain type system, then using the
JCas/JCasUtil is much more convenient.

-- Richard

[1] http://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#ugr.tools.uimafit.typesystem

On 23.01.2014, at 06:21, Luca Foppiano wrote:

> Hi Everybody,
> I'm starting to play with uima-fit and I'm trying to integrate the
> whitespace annotator into my simple pipeline, composed of a collection
> reader and a simple AE (plays with the text, doesn't annotate), and I want to
> add a whitespace annotator to be applied to the text.
>
> I've downloaded the trunk version of the Whitespace annotator on github, I've
> extracted the type system definition from the descriptor XML and referenced
> it from uimafit. The pipeline worked without crashing.
>
> Now I want to add an AE that takes the annotations and does something with
> them (print them, for example).
>
> I could not find a way to work around the fact that the type system Java classes
> were not present in the project. Is this a mandatory requirement?
>
> What I've tried is to do something like:
>
> // Get the autogenerated type system types (SentenceAnnotation,
> // TokenAnnotation)
> TypeDescription[] types = tsd.getTypes();
>
> [...]
> // ..and try to pass them to my annotator
> AnalysisEngineDescription casConsumer =
>     AnalysisEngineFactory.createEngineDescription(SimpleCC.class,
>         SimpleCC.OUTPUT_DIR_PARAM,
>         "/home/lf84914/development/epo/apl/data/out",
>         *types, null*);
>
> but then, in the AE's code, I have no idea how to use them.
>
> Any suggestions?
>
> Thanks, everybody, in advance.
> --
> Luca Foppiano
>
> Software Engineer
> +31615253280
> luca@foppiano.org
> www.foppiano.org