Message-ID: <46CBF3DA.6070609@gmx.de>
Date: Wed, 22 Aug 2007 10:29:14 +0200
From: Thilo Goetz
To: uima-user@incubator.apache.org
Subject: Re: Questions regarding the Heap class and the heap size.
References: <46CA4DC0.7070905@schor.com> <46CADE9C.1030204@gmx.de>

Danai Wiriyayanyongsuk wrote:
> Thanks Marshall and Thilo for shedding some light.
>
> Besides the instances of feature structures (which I guess usually
> do not require much of the "Heap.heap" space), are there any kinds of
> information that might require big chunks of the "Heap.heap" space, e.g.
> arrays with hundreds of elements, that I should be aware of?
[...]

All the data that your analysis generates (with a few exceptions) lives
on the heap. So depending on how many annotations you create, the heap
may grow very large. It is usually several times the size of the input
document. I've personally had applications where the CAS (most of which
is the heap) would on average be about 50 times the size of the input
document.

Unfortunately there is no good way to get at this data via APIs. I got
these numbers by triggering Java heap dumps and looking at the size of
the data structures on the Java heap.

HTH,
Thilo
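
As a rough illustration of the heap-dump approach described above, here is a
minimal Java sketch. It is not part of UIMA: the class CasHeapSketch and its
methods are hypothetical names chosen for this example. The factor of 50 is
only the average reported for one set of applications, not a general rule.
The dump relies on the HotSpot-specific HotSpotDiagnosticMXBean, so it
assumes a HotSpot-based JVM.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; none of these names come from UIMA.
final class CasHeapSketch {

    // Back-of-the-envelope estimate: input size times an observed growth factor.
    // The factor is an assumption you have to measure for your own pipeline.
    static long estimateCasBytes(long documentBytes, double growthFactor) {
        if (documentBytes < 0 || growthFactor < 0) {
            throw new IllegalArgumentException("sizes and factors must be non-negative");
        }
        return Math.round(documentBytes * growthFactor);
    }

    // Writes a heap dump of the running JVM and returns the dump file size.
    // The target file must not exist yet; recent JDKs also require a
    // ".hprof" file extension.
    static long dumpHeap(Path target) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            throw new UnsupportedOperationException("not a HotSpot-based JVM");
        }
        // live = true: dump only objects that are still reachable.
        bean.dumpHeap(target.toString(), true);
        return Files.size(target);
    }

    public static void main(String[] args) throws IOException {
        long documentBytes = 2L * 1024 * 1024; // example: a 2 MiB input document
        System.out.println("Estimated CAS size in bytes: "
                + estimateCasBytes(documentBytes, 50.0));
        if (args.length > 0) {
            // Only dump when a target path is given, e.g. cas-after-analysis.hprof
            Path target = Paths.get(args[0]);
            System.out.println("Heap dump size in bytes: " + dumpHeap(target));
        }
    }
}
```

Calling dumpHeap right after the analysis has processed a document, and then
opening the resulting file in a heap analysis tool, lets you compare the size
of the CAS data structures with the size of the input document.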