Subject: Re: How to correclty implement delta serialization in locally deployed CPE pipeline?
To: user@uima.apache.org
From: Marshall Schor
Message-ID: <560D48D1.4050401@schor.com>
Date: Thu, 1 Oct 2015 10:53:05 -0400

Hi,

A little more detail of what you're doing may help us figure out what's
happening. What API(s) are you using to do the serialization?

-Marshall

On 9/29/2015 2:57 PM, José Tomás Atria wrote:
> Hello all,
>
> I've been trying to wrap my head around this for a while, and I can't seem
> to get it to work. Could someone please explain what is the most
> straightforward way of implementing delta serialization in a local,
> multithreaded CPE pipeline?
>
> So far, I've tried using a collection reader that uses a
> SharedSerializationData that is stored in the current UIMA session, and
> creates a CAS marker that is also stored in a map in the current UIMA
> session under a CAS identifier key, and then using this
> SharedSerializationData object and the marker retrieved from the UIMA
> session by the CAS identifier to serialize the delta to disk, but this
> procedure causes an OutOfMemory exception if I try to process all of my
> data (not that much in my opinion, ~2000 CASes).
>
> I assume that I'm missing some basic aspect of the API, but after trying to
> deal with it for a while I just gave up...
>
> A more specific version, as far as I could understand: Delta serialization
> requires a SharedSerializationData object and a CAS marker. What is the
> correct way to create, store and retrieve these in a simple,
> multi-threaded, locally deployed CPE processing pipeline? (i.e. no need to
> support AS or DUCC facilities, etc.)
>
> Any help would be greatly appreciated.
> Thanks!
> jta
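
For reference, the bookkeeping described in the quoted message (per-CAS state
stored in a map under a CAS identifier key, then looked up again at write
time) can be sketched roughly as below. This is only an illustration under
stated assumptions, not the poster's code and not a UIMA API: DeltaState and
DeltaStateRegistry are hypothetical names, and plain Object fields stand in
for the shared serialization data and the CAS marker so that the sketch has
no UIMA dependency. The one design point it shows is that the writer removes
the entry when it retrieves it, so the map cannot grow with the number of
CASes processed. Whether retained entries are what causes the reported
OutOfMemory is a guess, not something the thread establishes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical holder for the per-CAS state the quoted message describes.
// In a real pipeline the two fields would hold the shared serialization
// data object and the CAS marker; Object is used here as a stand-in.
final class DeltaState {
    final Object sharedData;
    final Object marker;

    DeltaState(Object sharedData, Object marker) {
        this.sharedData = sharedData;
        this.marker = marker;
    }
}

// Hypothetical thread-safe registry keyed by a CAS identifier.
final class DeltaStateRegistry {
    private final ConcurrentMap<String, DeltaState> states = new ConcurrentHashMap<>();

    // Called by the reader once it has filled a CAS and created its marker.
    void register(String casId, DeltaState state) {
        if (states.putIfAbsent(casId, state) != null) {
            throw new IllegalStateException("State already registered for CAS " + casId);
        }
    }

    // Called by the writer: returns the state and removes it from the map,
    // so the registry only ever holds entries for CASes still in flight.
    DeltaState take(String casId) {
        DeltaState state = states.remove(casId);
        if (state == null) {
            throw new IllegalStateException("No state registered for CAS " + casId);
        }
        return state;
    }

    // Number of CASes whose state is still being held.
    int size() {
        return states.size();
    }
}
```

With this shape, the reader would call register(casId, ...) and the consumer
that writes the delta would call take(casId) exactly once per CAS. After the
whole collection has been processed, size() should be back to zero.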