From: Sebastian Schelter
Reply-To: ssc@apache.org
To: user@mahout.apache.org
Subject: Re: Performance issues in Mahout recommendations
Date: Fri, 06 Jun 2014 11:12:41 +0200
Message-ID: <53918609.3050203@apache.org>

You should not use Hadoop for such a tiny dataset. Use the
GenericItemBasedRecommender on a single machine in Java.

--sebastian

On 06/06/2014 11:10 AM, Warunika Ranaweera wrote:
> Hi,
>
> I am using Mahout's recommenditembased algorithm on a data set with nearly
> 10,000 (implicit) user ratings. This is the command I used:
> *mahout recommenditembased --input ratings.csv --output recommendation
> --usersFile users.dat --tempDir temp --similarityClassname
> SIMILARITY_LOGLIKELIHOOD --numRecommendations 3 *
>
> Although the output is successfully generated, this process takes nearly 7
> minutes to produce recommendations for a single user. The Hadoop cluster
> has 8 nodes and the machine on which Mahout is invoked is an AWS EC2
> c3.2xlarge server. When I tracked the mapreduce jobs, I noticed that more
> than one machine is *not* utilized at a time, and the *recommenditembased*
> command takes 9 mapreduce jobs altogether with approx. 45 seconds taken per
> job.
>
> Since the performance is too slow for real-time recommendations, it would
> be really helpful to know whether I'm missing any additional commands
> or configurations that enable faster performance.
>
> Thanks,
> Warunika
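
To illustrate the single-machine suggestion at the top of this reply: with
roughly 10,000 implicit ratings, the whole item-to-item computation fits
comfortably in memory, so there is no need for a chain of MapReduce jobs. With
Mahout on the classpath one would build a GenericItemBasedRecommender from a
DataModel and an ItemSimilarity such as LogLikelihoodSimilarity. The sketch
below is NOT that code. It is a dependency-free approximation of the same
idea (item-based scoring with a log-likelihood-ratio similarity), written so
it runs with nothing but the JDK. The class name ItemBasedSketch, its methods,
the choice to sum similarities as the score, and the toy data are all
assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory item-based recommender for implicit (boolean) data. */
final class ItemBasedSketch {

    // x * ln(x), with the convention 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized Shannon entropy of a set of counts.
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            sum += c;
            parts += xLogX(c);
        }
        return xLogX(sum) - parts;
    }

    // Log-likelihood ratio of a 2x2 contingency table
    // (k11 = both items, k12 / k21 = only one of them, k22 = neither).
    static double llr(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + colEntropy - matEntropy));
    }

    // Scores each unseen item by summing its similarity to the user's items.
    static List<Long> recommend(Map<Long, Set<Long>> itemsByUser, long userId, int howMany) {
        // Invert the data: item -> users who interacted with it.
        Map<Long, Set<Long>> usersByItem = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> e : itemsByUser.entrySet()) {
            for (Long item : e.getValue()) {
                usersByItem.computeIfAbsent(item, k -> new HashSet<>()).add(e.getKey());
            }
        }
        long numUsers = itemsByUser.size();
        Set<Long> seen = itemsByUser.getOrDefault(userId, Collections.emptySet());
        Map<Long, Double> scores = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> candidate : usersByItem.entrySet()) {
            if (seen.contains(candidate.getKey())) {
                continue; // never recommend something the user already has
            }
            double score = 0.0;
            for (Long owned : seen) {
                Set<Long> ownedUsers = usersByItem.get(owned);
                long k11 = 0;
                for (Long u : candidate.getValue()) {
                    if (ownedUsers.contains(u)) {
                        k11++;
                    }
                }
                if (k11 == 0) {
                    continue; // no co-occurrence: treat as no evidence
                }
                long k12 = candidate.getValue().size() - k11;
                long k21 = ownedUsers.size() - k11;
                long k22 = numUsers - k11 - k12 - k21;
                // Map the LLR into [0, 1) so scores stay bounded per item pair.
                score += 1.0 - 1.0 / (1.0 + llr(k11, k12, k21, k22));
            }
            if (score > 0.0) {
                scores.put(candidate.getKey(), score);
            }
        }
        // Highest score first; ties broken by item id for determinism.
        List<Long> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : Long.compare(a, b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(howMany, ranked.size())));
    }

    public static void main(String[] args) {
        // Toy data: user id -> set of item ids (made up for this example).
        Map<Long, Set<Long>> data = new HashMap<>();
        data.put(1L, new HashSet<>(Arrays.asList(1L, 2L)));
        data.put(2L, new HashSet<>(Arrays.asList(1L, 2L)));
        data.put(3L, new HashSet<>(Arrays.asList(1L, 2L)));
        data.put(4L, new HashSet<>(Arrays.asList(3L)));
        data.put(5L, new HashSet<>(Arrays.asList(3L)));
        data.put(6L, new HashSet<>(Arrays.asList(3L, 4L)));
        data.put(7L, new HashSet<>(Arrays.asList(1L)));
        System.out.println(recommend(data, 7L, 3)); // prints [2]
    }
}
```

On data of the size described in the quoted message, a loop like this should
finish far faster than the job start-up overhead of even one MapReduce stage,
which is the point of the advice. For real use, the Mahout classes named above
are the better choice, since they also handle file loading and caching.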