Date: Thu, 30 Dec 2010 11:24:53 -0600
Subject: Re: basic question on sharing results from ./documentAnalyzer.sh demo
From: Ted Pedersen
To: user@uima.apache.org

PS Just a few details on a second experiment, as there was an
interesting little twist that initially confused me.

This time I used only

Analysis Engine : PersonTitleAnnotator.xml

and ran as described below. What was nice about this was that all the
possible titles defined in the XML file were shown to me in the CPE
GUI, so I could review them and remove or add titles as needed.

But initially I did not get any titles identified! Instead I got the
following error:

No output is being produced by the PersonTitleAnnotator because the
Result Specification did not contain a request for the type
example.PersonTitle with the language 'x-unspecified' (Note: this
message will only be shown once.)

So, on a hunch, I specified the language as English (en) via the field
provided for that in the CPE (which seems to be blank by default),
then re-ran and got results. Note that before getting results I added
Professor to the list of titles (via the CPE).

Anyway, after doing the above with the PersonTitle Analysis Engine, I
got the following results...

<++++NEW DOCUMENT++++>
DOCUMENT URI:file:/home/ted/data/test.txt

uima.tcas.DocumentAnnotation Professor Jimmy Smith and Mr. John Smith
are friends. They both live in Mankato and like the Minnesota
Gophers, but they aren't too happy with Coach Jones.
example.PersonTitle Professor
org.apache.uima.examples.SourceDocumentInformation
example.PersonTitle Mr.

So...very nice.

Thanks!
Ted

On Thu, Dec 30, 2010 at 10:56 AM, Ted Pedersen wrote:
> Thank you!!! Mission accomplished. :)
>
> Just to make a few notes on how I did this (in the event anyone else
> ever wonders, and to make sure I didn't do this in a weird way...)
>
> I created a plain text input file that consisted of the following....
>
> Professor Jimmy Smith and Mr. John Smith are friends. They both live
> in Mankato and like the Minnesota Gophers, but they aren't too happy
> with Coach Jones.
>
> Then, I started
>
> bin/cpeGui.sh
>
> to get the Collection Processing Engine Configurator going...When that
> was running, I loaded the directory in which my file was found, as
> well as the following (all found in the examples/descriptors
> directory):
>
> Collection Reader : FileSystemCollectionReader.xml
> Analysis Engine : NamesAndPersonTitles_TAE.xml
> CAS Consumer : AnnotationPrinter.xml
>
> And I clicked. Then I found the following in my output directory in a
> file called annotprint.
>
> <++++NEW DOCUMENT++++>
> DOCUMENT URI:file:/home/ted/data/test.txt
>
> uima.tcas.DocumentAnnotation Professor Jimmy Smith and Mr. John Smith
> are friends. They both live in Mankato and like the Minnesota
> Gophers, but they aren't too happy with Coach Jones.
> example.Name Professor Jimmy Smith
> org.apache.uima.examples.SourceDocumentInformation
> example.Name Mr. John Smith
> example.Name Minnesota Gophers
> example.Name Coach Jones
>
> Which is exactly the sort of information I wanted, and note, I can
> send it to you in an email message.
> :)
>
> As you can tell, I'm pretty new at this - given that, I feel like I
> should ask if this is the standard way to set this up, or is
> there another way to go that is more common? (That said, I'm pretty
> content with what I did here, so asking mostly out of curiosity).
>
> Thanks!
> Ted
>
> On Thu, Dec 30, 2010 at 9:19 AM, Eddie Epstein wrote:
>> Try adding the following sample annotator to the end of your pipeline:
>> $UIMA_HOME/examples/descriptors/cas_consumer/AnnotationPrinter.xml
>>
>> Eddie
>>
>> On Wed, Dec 29, 2010 at 1:09 PM, Ted Pedersen wrote:
>>> Greetings all,
>>>
>>> I'm fairly new to UIMA, and to get myself oriented I've been running
>>> the documentAnalyzer.sh demo/samples, and it's proven to be pretty
>>> easy to use and quite informative (about what you can do with UIMA).
>>>
>>> One thing I'd like to be able to do is cut some output and send that
>>> to colleagues who aren't necessarily using UIMA, so as to say - look!
>>> I gave this input file to the NamesAndPersonTitles_TAE.xml
>>> function/descriptor, and this is what I got!
>>>
>>> Let's assume they don't have UIMA installed, and that I don't want to
>>> send them a screen shot (yes, I'm old school in that regard). Rather,
>>> I'd just like to send them a text based file they can read in a
>>> relatively simple way.
>>>
>>> It doesn't have to be exactly this format, but just to give you an idea...
>>>
>>> If my input is...
>>>
>>> Mr. Smith works at IBM.
>>>
>>> Then I'd like to send something like....
>>>
>>> Mr. Smith works at IBM.
>>>
>>> (Actual results, doesn't seem to recognize IBM. :) Note that I just
>>> wrote the above manually....
>>>
>>> Anyway, I'd just like to have these results in a somewhat simple,
>>> readable, mailable form. I would even settle for being able to cut and
>>> paste from the right hand column where the annotation details are
>>> shown, to get something like....
>>>
>>> Person Title ("Mr.")
>>> begin=0
>>> end=3
>>> Name ("Mr. Smith")
>>> begin = 0
>>> end = 9
>>>
>>> Note that I had to do that manually...anyway, the specific format
>>> doesn't actually matter (doesn't need to be either of the above
>>> precisely) just something that conveys the output of UIMA in a way
>>> that can be read by a human and sent via email...
>>>
>>> BTW, I did see the HTML and XML options on the Results Display Format
>>> buttons on Analysis Results, but when I try to use those to see what
>>> they do, it just seems to hang and nothing is displayed. I saw some
>>> output directories interactive_temp and interactive_out, but those
>>> just contained the input text and the .xmi output (which I don't find
>>> particularly readable. :)
>>>
>>> Any thoughts, suggestions, arguments as to why this is a bad idea,
>>> etc. are of course welcome.
>>>
>>> Cordially,
>>> Ted
>>>
>>> --
>>> Ted Pedersen
>>> http://www.d.umn.edu/~tpederse
>>>
>>
>
>
>
> --
> Ted Pedersen
> http://www.d.umn.edu/~tpederse
>

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
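
For anyone who wants the begin/end style of listing from the original question without typing it by hand, here is a minimal Python sketch. It is not part of UIMA and does not call any UIMA API: the names format_annotations and Annotation are made up for illustration, and the offsets are typed in by hand here, whereas in practice they would have to be taken from whatever UIMA produced (for example the .xmi output).

```python
from typing import List, Tuple

# Hypothetical record for this sketch only: (type name, begin offset, end offset).
# Offsets follow the begin/end convention used in the thread (end is exclusive).
Annotation = Tuple[str, int, int]


def format_annotations(text: str, annotations: List[Annotation]) -> str:
    """Render annotations as a plain-text listing that can be pasted into email."""
    lines = []
    # Sort by begin offset, then end offset, so shorter spans that start at the
    # same position (the title) come before longer ones (the full name).
    for type_name, begin, end in sorted(annotations, key=lambda a: (a[1], a[2])):
        covered = text[begin:end]  # the text the annotation spans
        lines.append(f'{type_name} ("{covered}")')
        lines.append(f"  begin={begin}")
        lines.append(f"  end={end}")
    return "\n".join(lines)


# Hand-entered example matching the sentence from the original question.
text = "Mr. Smith works at IBM."
annotations = [("Name", 0, 9), ("PersonTitle", 0, 3)]
report = format_annotations(text, annotations)
print(report)
```

The listing shows the covered text next to the offsets on purpose, so a reader without UIMA installed can check each span against the sentence by eye.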