From user-return-7942-archive-asf-public=cust-asf.ponee.io@uima.apache.org Thu Jul 5 15:35:28 2018
Mailing-List: contact user-help@uima.apache.org; run by ezmlm
Reply-To: user@uima.apache.org
MIME-Version: 1.0
References: <5B2201AA.4010105@orkash.com>
From: Eddie Epstein
Date: Thu, 5 Jul 2018 09:35:11 -0400
Subject: Re: Problem in running DUCC Job for Arabic Language
To: user@uima.apache.org
Content-Type: multipart/alternative;
 boundary="000000000000d603760570409d49"

--000000000000d603760570409d49
Content-Type: text/plain; charset="UTF-8"

So if you run the AE as a DUCC UIMA-AS service and send it CASes from some
UIMA-AS client, it works OK?

The full environment for every process that DUCC launches is available via
ducc-mon under the Specification or Registry tab for that job, managed
reservation, or service. Please see if the LANG setting for the service is
different from the LANG setting for the job.

One can also see the LANG setting for a Linux process-id by doing:

cat /proc//environ

The LANG to be used for a DUCC process can be set by adding "LANG=xxx" to
the --environment argument as needed.

Thanks,
Eddie

On Thu, Jul 5, 2018 at 6:47 AM, rohit14csu173@ncuindia.edu <
rohit14csu173@ncuindia.edu> wrote:

> Hey,
> Yeah, you got it right: the first snippet runs in the CR, before the data
> goes into the CAS.
> The second snippet is in the first annotator, or analysis engine (AE), of
> my Aggregate Descriptor.
> I am pretty sure this is an issue with the CAS used by DUCC, because if I
> use a DUCC service, where we send the CAS and receive the same CAS back
> with added features, I get accurate results.
>
> The problem only comes when submitting a job where the CAS is generated
> by DUCC.
> This could also be an issue with the environment (language) of DUCC,
> because the default language is English.
>
> Best Regards
> Rohit
>
> On 2018/07/03 13:11:50, Eddie Epstein wrote:
> > Rohit,
> >
> > Before sending the data into jcas if I force encode it:
> > >
> > > String content2 = null;
> > > content2 = new String(content.getBytes("UTF-8"), "ISO-8859-1");
> > > jcas.setDocumentText(content2);
> > >
> >
> > Where is this code, in the job CR?
> >
> > And when I go in my first annotator I force decode it:
> > >
> > > String content = null;
> > > content = new String(jcas.getDocumentText().getBytes("ISO-8859-1"),
> > > "UTF-8");
> > >
> >
> > And is this in the first annotator of the job process, i.e. the CM?
> >
> > Please be as specific as possible.
> >
> > Thanks,
> > Eddie
> >
>

--000000000000d603760570409d49--
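
For anyone following the thread, here is a minimal Java sketch of the two checks discussed above. It is an illustration only, not DUCC or UIMA code: the class LangCheck and its methods are hypothetical names. It assumes the elided segment of the /proc path is a process id, and that environ holds NUL-separated KEY=VALUE entries, as it does on Linux. The US-ASCII decode is only a stand-in for a JVM whose default charset is not UTF-8, which is one possible effect of a different LANG and not a confirmed cause of the problem in this thread.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of DUCC or UIMA. */
class LangCheck {

    /** Parses the NUL-separated KEY=VALUE content of a Linux environ file. */
    static Map<String, String> parseEnviron(byte[] raw) {
        Map<String, String> env = new LinkedHashMap<>();
        for (String entry : new String(raw, StandardCharsets.UTF_8).split("\0")) {
            int eq = entry.indexOf('=');
            if (eq > 0) {
                env.put(entry.substring(0, eq), entry.substring(eq + 1));
            }
        }
        return env;
    }

    /**
     * The workaround quoted in the thread: carry the UTF-8 bytes inside an
     * ISO-8859-1 string, then reverse the step. ISO-8859-1 maps every byte
     * to exactly one char, so the round trip loses nothing.
     */
    static String roundTrip(String text) {
        String carried = new String(text.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1);
        return new String(carried.getBytes(StandardCharsets.ISO_8859_1),
                          StandardCharsets.UTF_8);
    }

    /** Decodes the UTF-8 bytes of text with some other charset. */
    static String decodeAs(String text, Charset other) {
        return new String(text.getBytes(StandardCharsets.UTF_8), other);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the first argument is a process id; "self" is this JVM.
        String pid = args.length > 0 ? args[0] : "self";
        byte[] raw = Files.readAllBytes(Paths.get("/proc", pid, "environ"));
        System.out.println("LANG=" + parseEnviron(raw).get("LANG"));
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // Arabic sample written with Unicode escapes so that the source file
        // itself does not depend on the compiler's encoding.
        String arabic = "\u0645\u0631\u062D\u0628\u0627";
        System.out.println("round trip intact: " + roundTrip(arabic).equals(arabic));
        System.out.println("US-ASCII decode intact: "
                + decodeAs(arabic, StandardCharsets.US_ASCII).equals(arabic));
    }
}
```

Running this once against the pid of the service process and once against the pid of the job process should make a LANG difference visible. The round trip only hides the bytes from whatever conversion is going wrong in between, so it is a workaround and not a fix for the underlying setting.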