Date: Mon, 21 May 2012 03:25:25 -0400
Subject: Re: Hama receive queue
From: Aditya Sarawgi
To: hama-dev@incubator.apache.org

Thanks again Thomas.
Yeah, basically I need to make some changes before we can test it in full
distributed mode.
Let me know if you have some other suggestions as well.

On Fri, May 18, 2012 at 3:28 AM, Thomas Jungblut <thomas.jungblut@googlemail.com> wrote:

> Cool, I'd be glad to help you on the way ;)
> Just have a few notes:
>
>     procId = Integer.parseInt(bspPeer.getPeerName().split(":")[1]);
>
> This is a good observation, but in other modes than the local mode this is
> a host:port tuple. So your "hack" won't work, but the peerNames array
> returned by "bspPeer.getAllPeerNames()" is sorted on each task, so you just
> have to get the index of your peer name. e.G. with:
>
>     procId = Arrays.binarySearch(bspPeer.getAllPeerNames(), bspPeer.getPeerName());
>
> As told in the mail before, I think you will need a row partitioning of the
> matrix. I made a very simplistic matrix multiplication in BSP [1], if you
> scroll down, you will see a partitioner based on row number.
> So your input file (I recommend sequencefiles) have to be
> ArrayWritable/your ArrayMessage> as input type.
> The partitioner will take care of splitting the files accordingly and give
> it a task.
>
> [1]
> https://github.com/thomasjungblut/thomasjungblut-common/blob/master/src/de/jungblut/math/bsp/MatrixMultiplicationBSP.java
>
> 2012/5/18 Aditya Sarawgi
>
> > Hi,
> >
> > The main optimization step is still left, I wanted to be sure that I get
> > ICF right before moving ahead. And the time complexity of the entire
> > algorithm is dominated by ICF decomposition.
> > Will update you guys soon when I have the final implementation done, I
> > am eager to try it on datasets as well :)
> >
> > On Fri, May 18, 2012 at 1:44 AM, Thomas Jungblut <thomas.jungblut@googlemail.com> wrote:
> >
> > > Thanks for the explanation!
> > > I have plenty of time today so I can clone your stuff and play arround with
> > > it.
> > > Are there any steps left to use this as SVM? I wanted to try it out on the
> > > mushroom set.
> > >
> > > 2012/5/18 Aditya Sarawgi
> > >
> > > > @Edward its not urgent, I am ready when you are :)
> > > >
> > > > @Thomas Thanks for the feedback and help. Sure, you can use the code
> > > > for the jiras. But do remember it is slightly different from the actual icf
> > > > in the sense
> > > > that here the dimension of the result matrix would n x p ( where p is
> > > > typically sqrt(n) )
> > > > and the approximation error changes with what p. If p is close to n the
> > > > error is low.
> > > >
> > > > It seems to work on smaller matrices pretty well. I tried it by varying the
> > > > values of p and
> > > > as p approaches n, the decomposition has less error.
> > > > I have to do some more testing though.
> > > >
> > > > On Thu, May 17, 2012 at 11:06 AM, Thomas Jungblut <thomas.jungblut@googlemail.com> wrote:
> > > >
> > > > > instanceof is slow as hell, but if you have no other solution then this is
> > > > > okay.
> > > > >
> > > > > > 2) What is like the standard way to load matrices in different nodes with a
> > > > > > custom partitioning scheme
> > > > >
> > > > > It is depending on your algorithm needs, but I think you will need to
> > > > > implement your own partitioner, since HashPartitioning may not apply to
> > > > > this ICF.
> > > > > Generally you need to use the input system to read a part of a matrix into
> > > > > each peer.
> > > > >
> > > > > We also script a mapreduce job that will create random input for x GB to
> > > > > check scalability.
> > > > > Here is that for graphs: https://issues.apache.org/jira/browse/HAMA-558
> > > > > But I think this is easily extendable to matrices. There is an issue for
> > > > > that as well, I don't know how far Mikalai came with that.
> > > > >
> > > > > BTW your code looks good ;)
> > > > >
> > > > > Can we use this for https://issues.apache.org/jira/browse/HAMA-94 or
> > > > > https://issues.apache.org/jira/browse/HAMA-553 ? Would be a great addition
> > > > > if it works!
> > > > >
> > > > > Greetings from Germany,
> > > > > Thomas
> > > > >
> > > > > 2012/5/17 Aditya Sarawgi
> > > > >
> > > > > > Thanks Thomas.
> > > > > > I am actually using tags for something else. So for now using instanceof is
> > > > > > just fine with me.
> > > > > >
> > > > > > I had a couple of more questions, regarding benchmarking stuff on hama. I
> > > > > > have a working implementation of
> > > > > > Parallel row based icf that given a n x n matrix returns a decomposed n x p
> > > > > > matrix.
> > > > > >
> > > > > > https://github.com/truncs/hello-world/blob/master/src/main/java/edu/sunysb/cs/Icf.java
> > > > > >
> > > > > > Now I would like to test this on a big input and possibly in full
> > > > > > distributed mode, so I was wondering how do
> > > > > > people usually do these sort of benchmarking.
> > > > > >
> > > > > > Specifically,
> > > > > > 1) Do they setup a cluster on AWS ?
> > > > > > 2) What is like the standard way to load matrices in different nodes with a
> > > > > > custom partitioning scheme
> > > > > > 3) Is there anything else that I should know
> > > > > >
> > > > > > On Thu, May 17, 2012 at 3:20 AM, Thomas Jungblut <thomas.jungblut@googlemail.com> wrote:
> > > > > >
> > > > > > > Hi Aditya,
> > > > > > >
> > > > > > > that's where the concept of Message Tagging comes into play. You have tags
> > > > > > > in each message which are hardcoded as Strings.
> > > > > > > But as Edward told you can use GenericWritable or ObjectWritable instead,
> > > > > > > so they will tag your messages with the classnames and give you the correct
> > > > > > > class.
> > > > > > >
> > > > > > >     Is there any way by which I can pop from the receive queue ?
> > > > > > >
> > > > > > > peer.getCurrentMessage() is popping from the received queue.
> > > > > > >
> > > > > > > 2012/5/17 Aditya Sarawgi
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > But thats not the only problem, consider this case
> > > > > > > > that there are variable number of messages being sent, so I would have to
> > > > > > > > maintain
> > > > > > > > counts for each peer pointing to the last unread message.
> > > > > > > >
> > > > > > > > Is there any way by which I can pop from the receive queue ?
> > > > > > > >
> > > > > > > > On Wed, May 16, 2012 at 10:23 PM, Suraj Menon <surajsmenon@apache.org> wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > Please take a look at this snippet of code copied and modified from
> > > > > > > > > Mapper class to implement your scenario. -
> > > > > > > > >
> > > > > > > > > https://github.com/ssmenon/hama/edit/master/hama-mapreduce/src/org/apache/hama/computemodel/mapreduce/Trials.java
> > > > > > > > > Between lines 233 to 245 I am able to send different type of messages.
> > > > > > > > > With type checks and generics you shouldn't be encountering Classcast
> > > > > > > > > exception at receiving end too. I am yet to test the next superstep,
> > > > > > > > > shall update you with sample code for the next superstep mimicking
> > > > > > > > > your scenario for receiving.
> > > > > > > > >
> > > > > > > > > For elegance, we have an experimental Superstep#compute
> > > > > > > > > API(org.apache.hama.bsp.Superstep). I have encountered an issue in job
> > > > > > > > > submission framework with this method in distributed mode; fix for
> > > > > > > > > this would be pushed to trunk in next few hours. You can still run it
> > > > > > > > > using LocalBSPRunner for now.
> > > > > > > > >
> > > > > > > > > -Suraj
> > > > > > > > >
> > > > > > > > > On Wed, May 16, 2012 at 9:18 PM, Aditya Sarawgi wrote:
> > > > > > > > > > Hi Edward,
> > > > > > > > > >
> > > > > > > > > > Yes that is what I did
> > > > > > > > > > I wrote an ArrayMessage class (doesn't use generics for now but can be
> > > > > > > > > > converted easily)
> > > > > > > > > >
> > > > > > > > > > https://github.com/truncs/hello-world/blob/master/src/main/java/edu/sunysb/cs/ArrayMessage.java
> > > > > > > > > >
> > > > > > > > > > But the problem is that I am sending a IntegerMessage before and after
> > > > > > > > > > reading the IntegerMessage I am sending
> > > > > > > > > > an ArrayMessage but the previous IntegerMessage is still there.
> > > > > > > > > >
> > > > > > > > > > On Wed, May 16, 2012 at 8:34 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > >
> > > > > > > > > > > To send or receive multiple Message types, I think you can use
> > > > > > > > > > > GenericWritable. You can also implement your own GenericMessage and
> > > > > > > > > > > contribute it to our project!
> > > > > > > > > > >
> > > > > > > > > > > Hope this helps you.
> > > > > > > > > > >
> > > > > > > > > > > On Thu, May 17, 2012 at 7:48 AM, Aditya Sarawgi wrote:
> > > > > > > > > > > > Hi Guys,
> > > > > > > > > > > >
> > > > > > > > > > > > I am wondering how do the receive queues in hama work. Consider this case
> > > > > > > > > > > > that I want to sent a different type of BSPMessage in 2 consecutive
> > > > > > > > > > > > superstep.
> > > > > > > > > > > > In this first superstep I am sending IntMessage and in the next one I am
> > > > > > > > > > > > sending a ArrayMessage ( custom message class).
> > > > > > > > > > > >
> > > > > > > > > > > > Now in the second super step when I do a
> > > > > > > > > > > >     while ((arrayMessage = (ArrayMessage) peer.getCurrentMessage()) != null) {
> > > > > > > > > > > >
> > > > > > > > > > > > it is throwing a java.lang.ClassCastException, which is obvious since its
> > > > > > > > > > > > trying to cast IntMessage to ArrayMessage.
> > > > > > > > > > > > I thought the message is dropped from the queue after it is read, is this
> > > > > > > > > > > > not the case ?
> > > > > > > > > > > > And if it is not, how can this be handled elegantly ?
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > > Aditya Sarawgi
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Best Regards, Edward J. Yoon
> > > > > > > > > > > @eddieyoon
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Cheers,
> > > > > > > > > > Aditya Sarawgi
> > > > > > > >
> > > > > > > > --
> > > > > > > > Cheers,
> > > > > > > > Aditya Sarawgi
> > > > > > >
> > > > > > > --
> > > > > > > Thomas Jungblut
> > > > > > > Berlin
> > > > > >
> > > > > > --
> > > > > > Cheers,
> > > > > > Aditya Sarawgi
> > > > >
> > > > > --
> > > > > Thomas Jungblut
> > > > > Berlin
> > > >
> > > > --
> > > > Cheers,
> > > > Aditya Sarawgi
> > >
> > > --
> > > Thomas Jungblut
> > > Berlin
> >
> > --
> > Cheers,
> > Aditya Sarawgi
>
> --
> Thomas Jungblut
> Berlin

--
Cheers,
Aditya Sarawgi
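
For anyone reading this thread later, here is a small plain-Java sketch of the two suggestions discussed above: deriving a peer's index with Arrays.binarySearch on the sorted peer names, and draining a receive queue that holds mixed message types with an instanceof check instead of a blind cast. It deliberately does not use any Hama classes. ReceiveQueueSketch, peerIndex and drainArrays are hypothetical names, a java.util.Queue stands in for the receive queue, and Integer / int[] stand in for IntMessage / ArrayMessage. Treat it as an illustration under those assumptions, not as the code from the linked repositories.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Plain-Java illustration only; none of these names are Hama APIs.
class ReceiveQueueSketch {

    // Returns the position of ownName in the sorted array of peer names.
    // This works for "host:port" names too, provided the array is sorted
    // in natural String order (as the thread says the peer names are).
    static int peerIndex(String[] sortedPeerNames, String ownName) {
        int idx = Arrays.binarySearch(sortedPeerNames, ownName);
        if (idx < 0) {
            throw new IllegalArgumentException("unknown peer: " + ownName);
        }
        return idx;
    }

    // Polls every message off the queue and keeps only the int[] payloads.
    // Messages of any other type (for example a leftover Integer from an
    // earlier superstep) are consumed and skipped rather than cast.
    static List<int[]> drainArrays(Queue<Object> received) {
        List<int[]> arrays = new ArrayList<>();
        Object msg;
        while ((msg = received.poll()) != null) {
            if (msg instanceof int[]) {
                arrays.add((int[]) msg);
            }
        }
        return arrays;
    }

    public static void main(String[] args) {
        String[] peers = {"node1:61000", "node2:61000", "node3:61000"};
        System.out.println("index = " + peerIndex(peers, "node2:61000"));

        Queue<Object> queue = new ArrayDeque<>();
        queue.add(42);                   // stand-in for an IntMessage
        queue.add(new int[] {1, 2, 3});  // stand-in for an ArrayMessage
        System.out.println("arrays = " + drainArrays(queue).size());
    }
}
```

The type check costs a little per message, which is the trade-off Thomas mentions; wrapping messages in a tagged container such as GenericWritable is the alternative raised in the thread.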