From: Peter Klügl
Date: Fri, 09 Jan 2009 18:37:35 +0100
To: uima-user@incubator.apache.org
Subject: Re: CAS Viewer
Message-ID: <49678B5F.8040203@ki.informatik.uni-wuerzburg.de>
In-Reply-To: <991df1420901090849n24bcbbeex704fbd89f13bdcba@mail.gmail.com>

Hi Tong,

> When processing input files that contain HTML tags, most of annotators will
> "clean-up" the HTML tags before doing any further processing. As the result
> of that, the xmiCAS doesn't contain the original HTML text anymore.
>
Ah, OK. Visual and layout information is quite important for my
extraction tasks. My rule language can dynamically filter all kinds and
combinations of markup and annotation types. Therefore, the original
HTML text stays the main artifact in the xmiCAS, even if the tags
contain no valuable information. I also plan to integrate "external"
annotators, with restrictions, in the same manner.

> I think the most useful feature of your plug-in is its capability to allow
> users to edit the xmiCAS in the browser window similar to editing the HTML
> page with an HTML Editor (Please correct me if I am wrong).
>
I am not sure I understand you. The CEV plugin cannot modify the
structure or text of the HTML (the rule language does such things). I
think its only real advantage over the CAS Viewer and the CAS Editor is
that the CEV can display the annotations of an HTML artifact in a kind
of browser, and the user can create new annotations in that browser.
Reviewing or editing annotations in the HTML source is really painful.
If you are just processing plain text, there is probably no reason
(except maybe the extension point) to use the CEV plugin instead of the
CAS Viewer.

> Having some xmiCAS samples will help us to understand the plug-in's
> capability.
>
Yes, I will provide a simple example next week.

Have a nice weekend!

Peter