From: Eddie Epstein
To: user@uima.apache.org
Subject: Re: Restrictions on sofa data array
Date: Wed, 28 Apr 2010 07:51:15 -0400

It did occur to me after that the string array may not be supported, but
this could be considered a lack of imagination in the original implementation
rather than any fundamental design limitation :)

The changes to allow use of a string array for Sofa data would not be much.
Additional work probably needed for delta-CAS serialization used with remote
UIMA AS service replies.

Regards,
Eddie

On Wed, Apr 28, 2010 at 3:00 AM, Klaus Rothenhäusler wrote:
> Eddie Epstein writes:
>
>>
>> On Tue, Apr 27, 2010 at 10:59 AM, Thilo Goetz wrote:
>> > My understanding is that he wants the tokens as primitives,
>> > not the characters.  Annotation offsets could then be token
>> > offsets, not character offsets.  That's perfectly reasonable
>> > for some tasks.  We usually create annotations with the start
>> > offset being the start of some token, and the end offset the
>> > end of some token.  Then it's hard to find the tokens that
>> > are "covered" by the annotation, which is why we have
>> > subiterators, which are not super efficient.  And so on.
>> > I like the idea, but I have no idea how compatible it is with
>> > UIMA's idea of views and sofas.
>>
>> A StringArrayFS can be used as Sofa data.
>
> Are you sure?
> The documentation explicitly denies that
> (http://uima.apache.org/downloads/releaseDocs/2.3.0-incubating/
> docs/html/tutorials_and_users_guides/
> tutorials_and_users_guides.html#ugr.tug.aas.setting_accessing_sofa_data):
>
> [...]
> aCas.setSofaDataArray(feature_structure_primitive_array,
>                       mime_type_string);
> [...]
> Feature Structure primitive arrays are all the UIMA Array types except
> arrays of Feature Structures, Strings, and Booleans. Typically, these
> are arrays of bytes, but can be other types, such as floats, longs,
> etc.
> [...]
>
> And as far as I can see in the implementation for
> setLocalSofaData(FeatureStructure aFS) in SofaFSImpl.java this
> restriction is endorsed through following lines (77-92):
> [...]
> Type type = aFS.getType();
> if (!type.isArray()) {
>   CASRuntimeException e = new CASRuntimeException(...);
>   throw e;
> }
> if (   !type.getName().equals(CAS.TYPE_NAME_BYTE_ARRAY)
>     && !type.getName().equals(CAS.TYPE_NAME_DOUBLE_ARRAY)
>     && !type.getName().equals(CAS.TYPE_NAME_FLOAT_ARRAY)
>     && !type.getName().equals(CAS.TYPE_NAME_INTEGER_ARRAY)
>     && !type.getName().equals(CAS.TYPE_NAME_LONG_ARRAY)
>     && !type.getName().equals(CAS.TYPE_NAME_SHORT_ARRAY)) {
>   CASRuntimeException e = new CASRuntimeException(
>     CASRuntimeException.INAPPROP_TYPE,
>     new String[] {
>     "Byte/Float/Integer/Short/String/Long/Double Array",
>     type.getName() });
>   throw e;
> }
> [...]
>
> Even though the exception message suggests differently.
>
> Am I missing something?
>
> Regards
> --Klaus
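
For illustration only, the type check quoted from SofaFSImpl.java can be
sketched in plain Java without UIMA on the classpath. This is not the UIMA
code: the class SofaArrayTypeCheck, the method isAllowedSofaArrayType, and
the literal type-name strings (assumed to match the CAS.TYPE_NAME_*
constants) are all hypothetical stand-ins. The sketch shows why a string
array is rejected even though the exception message mentions it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the array-type check quoted from SofaFSImpl.java.
final class SofaArrayTypeCheck {

    // Assumed values of the CAS.TYPE_NAME_* constants tested in the quoted code.
    // "uima.cas.StringArray" is deliberately absent, mirroring the quoted check.
    private static final Set<String> ALLOWED = new HashSet<>(Arrays.asList(
            "uima.cas.ByteArray",
            "uima.cas.DoubleArray",
            "uima.cas.FloatArray",
            "uima.cas.IntegerArray",
            "uima.cas.LongArray",
            "uima.cas.ShortArray"));

    // Returns true if the given type name passes the whitelist.
    static boolean isAllowedSofaArrayType(String typeName) {
        return ALLOWED.contains(typeName);
    }

    public static void main(String[] args) {
        System.out.println(isAllowedSofaArrayType("uima.cas.ByteArray"));   // true
        System.out.println(isAllowedSofaArrayType("uima.cas.StringArray")); // false
    }
}
```

Under these assumptions, accepting string arrays at this check would mean
adding one more entry to the allowed set. Any further work, such as the
delta-CAS serialization mentioned above, is outside what this sketch covers.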