Date: Thu, 11 Dec 2008 17:37:37 +0900
From: "Edward J. Yoon"
Sender: edward@udanax.org
To: hama-dev@incubator.apache.org
Subject: Re: blocking_mapred() speed
In-Reply-To: <25aacb800812110030j71c588cbq7620a4a4c39af294@mail.gmail.com>

Oh, one question. Why do we need to fill C with zeros?

  public SubMatrix mult(SubMatrix b) {
    double[][] C = new double[this.getRows()][b.getColumns()];
    for (int i = 0; i < this.getRows(); i++) {
      Arrays.fill(C[i], 0);
    }

    for (int i = 0; i < this.getRows(); i++) {
      for (int j = 0; j < b.getColumns(); j++) {
        for (int k = 0; k < this.getColumns(); k++) {
          C[i][j] += this.get(i, k) * b.get(k, j);
        }
      }
    }

    return new SubMatrix(C);
  }

On Thu, Dec 11, 2008 at 5:30 PM, Samuel Guo wrote:
> On Thu, Dec 11, 2008 at 2:36 PM, Edward J. Yoon wrote:
>
>> If we remove 'reduce phase', I guess we can reduce the disk I/O operations.
>
> Yes.
>
>> In the map, read { Constants.BLOCK_STARTROW, Constants.BLOCK_ENDROW,
>> Constants.BLOCK_STARTCOLUMN, Constants.BLOCK_ENDCOLUMN } instead of {
>> Constants.COLUMN }, and write blocks directly.
>
> Two methods to be considered:
> 1) We need an InputFormat that partitions the matrix table according to the
> row boundaries of the blocks.
> This should be done carefully, to make sure a single block will not be
> divided between two or more mappers.
>
> 2) Like what RandomMatrixMap does, we just tell the mappers the row/column
> boundaries of the blocks of a matrix-table.
> Scanning that portion of the table will be done in a mapper.
>
> I think 1) may be better than 2).
> An InputFormat can get the locality of a range of the table to let MR know
> how to move the MR computations close to it.
> In 2), if we do it like RandomMatrixMap, we may lose some locality
> information about the table, so that the network transfer overhead may
> increase.
>
> It is just my guess and thoughts.
>
>> What do you think?
>>
>> --
>> Best Regards, Edward J. Yoon @ NHN, corp.
>> edwardyoon@apache.org
>> http://blog.udanax.org

-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org
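
A note on the zero-fill question above: Java gives every element of a newly allocated double array the default value 0.0, so the explicit Arrays.fill pass should be redundant. The sketch below illustrates this with plain double[][] arrays. The class name MultSketch and the method multiply are hypothetical stand-ins for illustration only; they are not part of Hama's SubMatrix API.

```java
import java.util.Arrays;

class MultSketch {
    // Hypothetical helper: multiplies two dense matrices held as double[][].
    // There is no Arrays.fill call, because "new double[rows][cols]" already
    // yields an array whose elements are all 0.0.
    static double[][] multiply(double[][] a, double[][] b) {
        int rows = a.length;
        int inner = b.length;
        int cols = b[0].length;
        double[][] c = new double[rows][cols]; // elements start at 0.0

        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                for (int k = 0; k < inner; k++) {
                    c[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] fresh = new double[2][2];
        System.out.println(fresh[1][1]); // prints 0.0

        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // prints [[19.0, 22.0], [43.0, 50.0]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

If the same holds for mult(), dropping the fill loop would save one pass over C per block multiplication without changing the result.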