Date: Mon, 15 Feb 2016 15:36:23 -0500
Subject: LVG documentation
From: Jessica Glover
To: dev@ctakes.apache.org

Hello,

I would like to add a brief explanation and an example in the LVG documentation as to why it says in the Component Use Guide that LVG is effectively required for good results in dictionary lookup, but before I do, I'd like to understand it a bit better myself.

I have an example sentence that yielded different results when I ran it through CuisOnlyUMLSProcessor with and without LVGAnnotator enabled.

"Nasal canals are free of masses or apparent polyps."

No LVG:
Identified Annotations: "Nasal", "polyps"

With LVG:
Identified Annotations: "Nasal", "canals", "masses", "polyps"

My guess would be that the canonical (in this case, singular) form of these words is in the UMLS dictionary but the word tokens themselves are not. Can I generalize to say that using LVG gives a better chance of getting a dictionary hit for a missed word token by also looking up relevant variants of that token?

Thanks,
Jessica
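
P.S. To make the guess above concrete, here is a toy sketch in Python of the behavior I am hypothesizing. Everything in it is an assumption for illustration: `TOY_DICTIONARY` and `TOY_CANONICAL` are hypothetical stand-ins chosen to reproduce the result above, not actual UMLS content or LVG output, and `lookup` is not how cTAKES implements dictionary lookup.

```python
# Toy illustration only: the dictionary and the variant table below are
# hypothetical stand-ins, not real UMLS entries or real LVG output.

# Hypothetical dictionary. "polyps" is listed in its plural form so that
# it matches even when no variants are consulted.
TOY_DICTIONARY = {"nasal", "canal", "mass", "polyp", "polyps"}

# Hypothetical table mapping a token to its canonical (singular) form.
TOY_CANONICAL = {"canals": "canal", "masses": "mass", "polyps": "polyp"}


def tokenize(sentence):
    """Split on whitespace and strip surrounding punctuation."""
    return [word.strip(".,;:") for word in sentence.split()]


def lookup(sentence, use_variants):
    """Return the tokens that hit the toy dictionary.

    With use_variants=True, the canonical form of each token is tried
    in addition to the token itself.
    """
    hits = []
    for token in tokenize(sentence):
        lowered = token.lower()
        candidates = {lowered}
        if use_variants:
            candidates.add(TOY_CANONICAL.get(lowered, lowered))
        if candidates & TOY_DICTIONARY:
            hits.append(token)
    return hits


sentence = "Nasal canals are free of masses or apparent polyps."
without_lvg = lookup(sentence, use_variants=False)
with_lvg = lookup(sentence, use_variants=True)
print(without_lvg)  # ['Nasal', 'polyps']
print(with_lvg)     # ['Nasal', 'canals', 'masses', 'polyps']
```

If this picture is roughly right, the extra hits come entirely from trying the canonical form alongside the surface token, which is the generalization I would like to confirm before writing it into the documentation.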