From: Tamas Jambor
Date: Fri, 18 Jun 2010 15:13:22 +0100
To: user@mahout.apache.org
Subject: Re: out for memory

Thanks. In the end, I got away with 7.3GB, I guess this is as good as it gets for now.

On 18/06/2010 14:52, Sean Owen wrote:
> GDM needs roughly 28 bytes per preference, all told. I'd expect it
> alone consumes "just" 2.8GB of heap on the Netflix data set. The rest
> may be consumed by SVD matrices and your application storage and other
> JVM stuff.
>
> 10GB seems very large. Are you setting the new generation size to be
> relatively small? otherwise the defaults will probably waste a couple
> gigabytes of heap.
>
> Do what works best for you though.
>
> On Fri, Jun 18, 2010 at 2:41 PM, Tamas Jambor wrote:
>
>> I mean I used to run SVD with my implementation storing only three arrays
>> (user, item, rating), that I was able to fit in 3GB memory. I guess
>> GenericDataModel takes up quite a lot of memory, because the data is indexed
>> by users and by items, which is not necessary for SVD
>>
>> On 18/06/2010 14:21, Sean Owen wrote:
>>
>>> Memory requirements may be much higher for this algorithm as it builds
>>> large intermediate data structures to compute the SVD. Yes I think the
>>> simple data fits in 3GB or so. Sounds like you have solved your
>>> problem by supplying more memory.
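
For anyone checking the numbers in this thread, here is a rough back-of-the-envelope sketch in Java. It is only arithmetic, not Mahout code. The class and method names are made up for illustration. The preference count is an assumption (the Netflix training set has roughly 100 million ratings), and the 12 bytes per rating for the three-array layout assumes int user IDs, int item IDs and float ratings, which the thread does not actually state.

```java
// Hypothetical helper for illustration only; not part of Mahout.
class HeapEstimate {

    // Assumption: roughly 100 million ratings in the Netflix training set.
    static final long PREFERENCES = 100_000_000L;

    // Bytes needed for a given number of preferences at a fixed per-preference cost.
    static long estimateBytes(long preferences, int bytesPerPreference) {
        return preferences * bytesPerPreference;
    }

    public static void main(String[] args) {
        // Figure quoted in the thread: about 28 bytes per preference for GenericDataModel.
        long indexedModel = estimateBytes(PREFERENCES, 28);

        // Assumed layout for the three parallel arrays: int user, int item, float rating.
        long threeArrays = estimateBytes(PREFERENCES, 4 + 4 + 4);

        System.out.println("Indexed model (28 B/pref): " + indexedModel + " bytes");
        System.out.println("Three arrays (12 B/pref):  " + threeArrays + " bytes");
    }
}
```

Under these assumptions the indexed model comes to about 2.8GB, matching the figure quoted above, while the bare arrays need well under the 3GB mentioned. Neither number includes the SVD matrices or other JVM overhead.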