Mailing-List: contact user-help@mahout.apache.org; run by ezmlm
Reply-To: user@mahout.apache.org
MIME-Version: 1.0
References: <624845ab.e2ad.12a1f6abdf6.Coremail.woshidustin@126.com>
Date: Fri, 30 Jul 2010 16:30:04 +0200
Subject: Re: Could you improve the AbstractJDBCDataModel?
From: Florent Empis
To: user@mahout.apache.org
Content-Type: multipart/alternative; boundary=00151750e552ce3686048c9baf86

--00151750e552ce3686048c9baf86
Content-Type: text/plain; charset=ISO-8859-1

Actually, a while back I discussed creating a caching strategy for the
DataModel, through Memcache. I did begin an implementation, but other stuff
(namely, a new job) got in the way...

Another option would be to go for a library such as Hibernate or iBATIS and
throw efficient caching (such as EhCache) at it. You'd maybe lose some speed
(particularly with the added abstraction layer of Hibernate), but it would
probably not be too big a job to implement (particularly with iBATIS, where
it would be a matter of a few days: take the existing statements, convert
them to plain SQL statements, write the caching configuration and you're
all set).

2010/7/29 Sean Owen

> That's rather the point of the JDBC-backed models -- they're for the
> case where you can't load data in memory. If you can, then load them
> in memory.
>
> Yes it's not fast to use JDBC. It only really makes sense with
> algorithms that, through their nature or caching, don't read a lot of
> data. You want to use the Caching* wrappers everywhere.
>
> You might write some kind of DataModel wrapper which caches some
> number of user preferences to mitigate this effect, yes.
>
> 2010/7/29 Young:
> > Hi All,
> >
> > I use MySQLJDBCDataModel with ItemBasedRecommender for online
> > recommendation. I have put the other data in memory and precompute
> > the item similarity matrix.
> >
> > I find that when I do one single recommendation, the
> > getPreferencesFromUser(long id) method (in ) is called thousands of
> > times. I think that is okay with an in-memory DataModel. But with
> > AbstractJDBCDataModel it takes much more time, because the *same*
> > MySQL queries are executed thousands of times.
> >
> > Thanks.
> > ---Young

--00151750e552ce3686048c9baf86--
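
For illustration, the "DataModel wrapper which caches some number of user
preferences" idea from the thread could look roughly like the sketch below.
It is not Mahout code and does not replace the Caching* wrappers: the class
name UserPreferenceCache, the maxUsers parameter and the loader function
(standing in for whatever runs the JDBC query) are all hypothetical, and
plain JDK types are used so that the example runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongFunction;

/**
 * Hypothetical LRU cache in front of a slow per-user lookup (for example a
 * JDBC query). P stands for whatever type holds one user's preferences.
 */
final class UserPreferenceCache<P> {

    private final LongFunction<P> loader; // assumed to perform the expensive lookup
    private final Map<Long, P> cache;

    UserPreferenceCache(LongFunction<P> loader, final int maxUsers) {
        if (maxUsers <= 0) {
            throw new IllegalArgumentException("maxUsers must be positive");
        }
        this.loader = loader;
        // accessOrder = true keeps entries ordered from least to most recently
        // used, so the "eldest" entry is the least recently used one.
        this.cache = new LinkedHashMap<Long, P>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, P> eldest) {
                return size() > maxUsers;
            }
        };
    }

    /** Returns the cached value, calling the loader only on a cache miss. */
    synchronized P getPreferencesFromUser(long userID) {
        P prefs = cache.get(userID);
        if (prefs == null) {
            // The lock is held while loading: simple, but concurrent callers wait.
            prefs = loader.apply(userID);
            cache.put(userID, prefs);
        }
        return prefs;
    }

    /** Drops all cached entries, e.g. after the underlying tables change. */
    synchronized void clear() {
        cache.clear();
    }
}
```

With something like this in front of the database, the thousands of
repeated lookups for one user during a single recommendation would reach
the loader once. The trade-off is staleness: cached preferences do not see
later database updates until clear() is called or the entry is evicted.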