Subject: Re: Why do userid & itemid have to be long?
From: Sean Owen
To: user@mahout.apache.org
Date: Wed, 1 Jun 2011 08:40:33 +0100

It is for performance: it used to allow any Comparable type, but the object
overhead slowed things down by 2-3x.

It looks like you are using integer values already in Mongo; am I reading
that right? Those look like 12-byte hex values. Is it a question of reading
and writing them as such, rather than treating them as strings in Mongo?

If you really have to convert such a thing to/from String, I bet that
writing your own simple encoder/decoder would run much faster.

On Wed, Jun 1, 2011 at 3:50 AM, Mike Khristo wrote:

> Rather, how can I use string-based userid/itemid's without having to deal
> with the slowness associated with mapping them to a long?
>
> In the MongoDataModel, for example, significant time/overhead goes into
> converting the unique id's to long... I'm still getting my head wrapped
> around Mahout, but this seems like a significant limitation. I have to
> assume there's some logic behind the decision to restrict them to long, but
> I didn't find anything about it in Mahout in Action or on the list.
>
> Thanks.
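
For illustration, the "simple encoder/decoder" suggested in the reply might look something like the sketch below. It is not Mahout code: the class name HexIdCodec and both method names are hypothetical. It also rests on an assumption that should be stated loudly: a long holds 64 bits, so only hex IDs of at most 16 digits can round-trip without loss. A full 12-byte value is 96 bits and will not fit, so IDs of that size would still need some kind of lookup or a lossy hash.

```java
// Hypothetical helper, not part of Mahout: converts short hex string IDs
// to long values and back without allocating intermediate objects.
final class HexIdCodec {

    // A long has 64 bits, which is exactly 16 hex digits.
    static final int MAX_HEX_DIGITS = 16;

    private HexIdCodec() {}

    // Parses 1 to 16 hex digits into a long. IDs that use all 16 digits
    // may come out negative, which is fine for use as an opaque ID.
    static long encode(String hex) {
        int length = hex.length();
        if (length == 0 || length > MAX_HEX_DIGITS) {
            throw new IllegalArgumentException(
                "expected 1 to 16 hex digits, got " + length);
        }
        long value = 0L;
        for (int i = 0; i < length; i++) {
            int digit = Character.digit(hex.charAt(i), 16);
            if (digit < 0) {
                throw new IllegalArgumentException(
                    "not a hex digit at index " + i);
            }
            value = (value << 4) | digit;
        }
        return value;
    }

    // Formats the low (width * 4) bits of id as lowercase hex,
    // zero-padded to exactly width digits.
    static String decode(long id, int width) {
        if (width <= 0 || width > MAX_HEX_DIGITS) {
            throw new IllegalArgumentException(
                "width must be between 1 and 16, got " + width);
        }
        char[] out = new char[width];
        for (int i = width - 1; i >= 0; i--) {
            out[i] = Character.forDigit((int) (id & 0xF), 16);
            id >>>= 4;
        }
        return new String(out);
    }
}
```

With this sketch, HexIdCodec.encode("00ff") yields 255 and HexIdCodec.decode(255L, 4) yields "00ff" again. A 24-digit string, as a 12-byte value would produce, is rejected rather than silently truncated.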