From: Anirban Chakraborti
To: dev@ctakes.apache.org
Reply-To: dev@ctakes.apache.org
Date: Sat, 17 May 2014 23:03:31 +0530
Subject: Re: markable types
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a11c2a260cb90d304f99bed32

--001a11c2a260cb90d304f99bed32
Content-Type: text/plain; charset=UTF-8

Thanks Timothy,

I get the point, but it would be very helpful if you had an illustrative
example of a tree structure showing the branches and nodes generated by
cTAKES. I have got the hang of how to parse the tree now.

On Thu, May 15, 2014 at 5:03 PM, Miller, Timothy <
Timothy.Miller@childrens.harvard.edu> wrote:

> Anir -- I'm not sure I understand your question but from your example it
> doesn't sound like a tree exactly. If you just want a list of medication
> concepts you can do something like:
>
> List<MedicationMention> meds = new ArrayList<>(JCasUtil.select(jcas,
>     MedicationMention.class));
> (I believe MedicationMention is the correct class but check your output.)
>
> If you really do want to put them into a syntax tree, there are also
> methods for doing that in the AnnotationTreeUtils class.
>
> getAnnotationTree(JCas, Annotation) will give you the tree for the whole
> sentence containing the annotation you give it.
> annotationNode(JCas, Annotation) will give you the smallest subtree
> covering the annotation you give it.
> insertAnnotationNode(JCas, TopTreebankNode, Annotation, String) will
> insert a node into the tree specified at the level specified by the
> annotation with the category specified by the string.
> So for example, if you had meds as above you could then do:
>
> for(MedicationMention med : meds){
>     AnnotationTreeUtils.insertAnnotationNode(jcas,
>         AnnotationTreeUtils.getAnnotationTree(jcas, med), med, "MEDICATION");
> }
>
> which would insert a new node into every tree with the label "MEDICATION"
> in every position where a medication was found.
>
> One caveat to the above code is that these methods actually will change
> the tree in the cas. That might be ok for some use cases but for many you
> want to work on a tree outside the cas, so that's why there are also methods:
> getTreeCopy(JCas, TopTreebankNode)
> getTreeCopy(JCas, TreebankNode)
>
> If you use the getAnnotationTree method to obtain the tree you want, then
> you can get a copy from these methods, then use the insert methods and do
> something with them immediately (like print them out), without altering the
> originals in the cas if other AEs may use them.
>
> Tim
>
>
>
> ________________________________________
> From: Anirban Chakraborti [chakraborti.anirban@googlemail.com]
> Sent: Sunday, May 11, 2014 9:15 AM
> To: dev@ctakes.apache.org
> Subject: Re: markable types
>
> Steven,
>
> Would you have any example code for a tree parser so the output can be
> arranged as needed? I mean, after successful annotation, I want to
> extract certain concepts like medication only and arrange them in a new
> tree so that all annotations referring to the medication concept and their
> sources are listed together.
>
> Anir
>
>
> On Sun, May 11, 2014 at 3:55 PM, Steven Bethard wrote:
>
> > I don't think "not something anyone would want extracted" should be an
> > argument against anything. We already have constituent and dependency
> > parse trees in the type system, and those would fall under that
> > category.
> >
> > So +1 on markables in the type system. (In general, +1 on moving
> > module-specific types to the standard type system. I'm not sure what
> > the real benefit of splitting them out is...)
> >
> > Steve
> >
> > On Fri, May 9, 2014 at 11:53 AM, Miller, Timothy wrote:
> > > What do people think about taking the "markable" types out of the
> > > coreference project and adding them to the standard type system? This
> > > is a pretty standard concept in coreference that doesn't really have a
> > > great natural representation in the current type system -- it
> > > encompasses IdentifiedAnnotations as well as pronouns ("It", "him",
> > > "her") and some determiners ("this").
> > >
> > > The drawback I can see is that it is probably not something anyone
> > > would want extracted -- ultimately you want the actual coref pairs or
> > > chains. But it is useful for things like representing gold standard
> > > input or splitting coreference resolution into separate markable
> > > recognition and relation classification steps.
> > >
> > > Tim
> > >
> > >

--001a11c2a260cb90d304f99bed32--
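
For the tree example requested at the top of this thread, here is a rough,
self-contained sketch of the copy-then-insert idea described in the quoted
reply. It deliberately does not use the cTAKES or UIMA classes: TreeSketch,
Node, insert and sample are hypothetical names invented for this illustration,
and the sentence and its parse are made up. Token positions stand in for the
character offsets a real annotation would carry.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy constituency tree. Hypothetical names; NOT the cTAKES TreebankNode API. */
final class TreeSketch {

    /** A node is either a phrase (has children) or a leaf (has a word). */
    static final class Node {
        final String label;                          // e.g. "S", "NP", "NN"
        final String word;                           // non-null only for leaves
        final List<Node> children = new ArrayList<>();

        Node(String label, String word) { this.label = label; this.word = word; }

        static Node leaf(String label, String word) { return new Node(label, word); }

        static Node phrase(String label, Node... kids) {
            Node n = new Node(label, null);
            for (Node k : kids) n.children.add(k);
            return n;
        }

        boolean isLeaf() { return word != null; }

        /** Number of tokens (leaves) under this node. */
        int size() {
            if (isLeaf()) return 1;
            int total = 0;
            for (Node c : children) total += c.size();
            return total;
        }

        /** Deep copy, so that edits leave the original tree untouched. */
        Node copy() {
            Node n = new Node(label, word);
            for (Node c : children) n.children.add(c.copy());
            return n;
        }

        /** Penn-Treebank-style bracketing. */
        @Override public String toString() {
            if (isLeaf()) return "(" + label + " " + word + ")";
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /**
     * Wraps the nodes covering tokens [begin, end) in a new node with the given
     * label, as deep in the tree as possible. Returns false if the span does
     * not line up with constituent boundaries.
     */
    static boolean insert(Node node, int begin, int end, String label) {
        if (node.isLeaf()) return false;
        int offset = 0, first = -1, last = -1;
        for (int i = 0; i < node.children.size(); i++) {
            Node c = node.children.get(i);
            int cEnd = offset + c.size();
            // Prefer the deepest phrase that still contains the whole span.
            if (!c.isLeaf() && offset <= begin && end <= cEnd
                    && insert(c, begin - offset, end - offset, label)) {
                return true;
            }
            if (offset == begin) first = i;
            if (cEnd == end) last = i;
            offset = cEnd;
        }
        if (first < 0 || last < first) return false;
        // Move the run of children under a new wrapper node.
        List<Node> run = node.children.subList(first, last + 1);
        Node wrapper = new Node(label, null);
        wrapper.children.addAll(run);
        run.clear();
        node.children.add(first, wrapper);
        return true;
    }

    /** Made-up parse of "Patient takes aspirin daily". */
    static Node sample() {
        return Node.phrase("S",
            Node.phrase("NP", Node.leaf("NN", "Patient")),
            Node.phrase("VP",
                Node.leaf("VBZ", "takes"),
                Node.phrase("NP", Node.leaf("NN", "aspirin")),
                Node.phrase("ADVP", Node.leaf("RB", "daily"))));
    }

    public static void main(String[] args) {
        Node original = sample();
        Node working = original.copy();          // work on a copy, not the original
        insert(working, 2, 3, "MEDICATION");     // token 2 is "aspirin"
        System.out.println(original);
        System.out.println(working);
    }
}
```

Running main prints the untouched original first and the edited copy second:

(S (NP (NN Patient)) (VP (VBZ takes) (NP (NN aspirin)) (ADVP (RB daily))))
(S (NP (NN Patient)) (VP (VBZ takes) (NP (MEDICATION (NN aspirin))) (ADVP (RB daily))))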