Mailing-List: contact common-user-help@hadoop.apache.org; run by ezmlm
Reply-To: common-user@hadoop.apache.org
From: Ted Dunning
Date: Sun, 27 Feb 2011 20:38:30 -0800
Subject: Re: Hadoop Case Studies?
To: common-user@hadoop.apache.org, tpederse@d.umn.edu

At any large company that makes heavy use of Hadoop, you aren't going to find
any concise description of all the ways that Hadoop is used.

That said, here is a concise description of some of the ways that Hadoop is
(was) used at Yahoo:

http://www.slideshare.net/ydn/hadoop-yahoo-internet-scale-data-processing

On Sun, Feb 27, 2011 at 7:31 PM, Ted Pedersen wrote:

> Thanks for all these great ideas. These are really very helpful.
>
> What I'm also hoping to find are articles or papers that describe what
> particular companies or organizations have done with Hadoop. How does
> Facebook use Hadoop for example (that's one of the case studies in the
> White book), or how does last.fm use Hadoop (another of the case
> studies in the White book).
>
> One interesting resource is the list of "powered by Hadoop" projects
> available here:
>
> http://wiki.apache.org/hadoop/PoweredBy
>
> Some of these entries provide links to more detailed discussions of
> what an organization is doing, as in the following from Twitter:
> http://www.slideshare.net/kevinweil/hadoop-pig-and-twitter-nosql-east-2009
>
> So any additional descriptions of what specific organizations are
> doing with Hadoop (to the extent they are willing to share) would be
> really helpful (these sorts of "real world" cases tend to be
> particularly motivating).
>
> Cordially,
> Ted
>
> On Sun, Feb 27, 2011 at 9:23 PM, Simon wrote:
> > I think you can also simulate PageRank Algorithm with Hadoop.
> >
> > Simon -
> >
> > On Sun, Feb 27, 2011 at 9:20 PM, Lance Norskog wrote:
> >
> >> This is an exercise that will appeal to undergrads: pull the Craigslist
> >> personals ads from several cities, and do text classification. Given a
> >> training set of all the cities, attempt to classify test ads by city.
> >> (If Peter Harrington is out there, I stole this from you.)
> >>
> >> Lance
> >>
> >> On Sun, Feb 27, 2011 at 4:55 PM, Ted Dunning wrote:
> >> > Ted,
> >> >
> >> > Greetings back at you. It has been a while.
> >> >
> >> > Check out Jimmy Lin and Chris Dyer's book about text processing with
> >> > Hadoop:
> >> >
> >> > http://www.umiacs.umd.edu/~jimmylin/book.html
> >> >
> >> >
> >> > On Sun, Feb 27, 2011 at 4:34 PM, Ted Pedersen wrote:
> >> >
> >> >> Greetings all,
> >> >>
> >> >> I'm teaching an undergraduate Computer Science class that is using
> >> >> Hadoop quite heavily, and would like to include some case studies at
> >> >> various points during this semester.
> >> >>
> >> >> We are using Tom White's "Hadoop: The Definitive Guide" as a text, and
> >> >> that includes a very nice chapter of case studies which might even
> >> >> provide enough material for my purposes.
> >> >>
> >> >> But, I wanted to check and see if there were other case studies out
> >> >> there that might provide motivating and interesting examples of how
> >> >> Hadoop is currently being used. The idea is to find material that goes
> >> >> beyond simply saying "X uses Hadoop" to explaining in more detail how
> >> >> and why X are using Hadoop.
> >> >>
> >> >> Any hints would be very gratefully received.
> >> >>
> >> >> Cordially,
> >> >> Ted
> >> >>
> >> >> --
> >> >> Ted Pedersen
> >> >> http://www.d.umn.edu/~tpederse
> >> >>
> >> >
> >>
> >> --
> >> Lance Norskog
> >> goksron@gmail.com
> >>
> >
> > --
> > Regards,
> > Simon
> >
>
> --
> Ted Pedersen
> http://www.d.umn.edu/~tpederse