From: Ted Pedersen
Reply-To: tpederse@d.umn.edu
To: common-user@hadoop.apache.org
Date: Wed, 2 Mar 2011 12:58:03 -0600
Subject: Re: Hadoop Case Studies?

Greetings all,

Since posting my original request I ran across the following, which is
a nice example of what I'd call a case study. It gives at least a few
details and is an interesting, creative use of Hadoop:

http://engineering.foursquare.com/2011/02/28/how-we-found-the-rudest-cities-in-the-world-analytics-foursquare/

Enjoy,
Ted

On Sun, Feb 27, 2011 at 9:31 PM, Ted Pedersen wrote:
> Thanks for all these great ideas. These are really very helpful.
>
> What I'm also hoping to find are articles or papers that describe what
> particular companies or organizations have done with Hadoop. How does
> Facebook use Hadoop for example (that's one of the case studies in the
> White book), or how does last.fm use Hadoop (another of the case
> studies in the White book).
>
> One interesting resource is the list of "powered by Hadoop" projects
> available here:
>
> http://wiki.apache.org/hadoop/PoweredBy
>
> Some of these entries provide links to more detailed discussions of
> what an organization is doing, as in the following from Twitter
> http://www.slideshare.net/kevinweil/hadoop-pig-and-twitter-nosql-east-2009
>
> So any additional descriptions of what specific organizations are
> doing with Hadoop (to the extent they are willing to share) would be
> really helpful (these sorts of "real world" cases tend to be
> particularly motivating).
>
> Cordially,
> Ted
>
> On Sun, Feb 27, 2011 at 9:23 PM, Simon wrote:
>> I think you can also simulate PageRank Algorithm with hadoop.
>>
>> Simon -
>>
>> On Sun, Feb 27, 2011 at 9:20 PM, Lance Norskog wrote:
>>
>>> This is an exercise that will appeal to undergrads: pull the Craigslist
>>> personals ads from several cities, and do text classification. Given a
>>> training set of all the cities, attempt to classify test ads by city.
>>> (If Peter Harrington is out there, I stole this from you.)
>>>
>>> Lance
>>>
>>> On Sun, Feb 27, 2011 at 4:55 PM, Ted Dunning
>>> wrote:
>>> > Ted,
>>> >
>>> > Greetings back at you. It has been a while.
>>> >
>>> > Check out Jimmy Lin and Chris Dyer's book about text processing with
>>> > hadoop:
>>> >
>>> > http://www.umiacs.umd.edu/~jimmylin/book.html
>>> >
>>> >
>>> > On Sun, Feb 27, 2011 at 4:34 PM, Ted Pedersen
>>> wrote:
>>> >
>>> >> Greetings all,
>>> >>
>>> >> I'm teaching an undergraduate Computer Science class that is using
>>> >> Hadoop quite heavily, and would like to include some case studies at
>>> >> various points during this semester.
>>> >>
>>> >> We are using Tom White's "Hadoop The Definitive Guide" as a text, and
>>> >> that includes a very nice chapter of case studies which might even
>>> >> provide enough material for my purposes.
>>> >>
>>> >> But, I wanted to check and see if there were other case studies out
>>> >> there that might provide motivating and interesting examples of how
>>> >> Hadoop is currently being used. The idea is to find material that goes
>>> >> beyond simply saying "X uses Hadoop" to explaining in more detail how
>>> >> and why X are using Hadoop.
>>> >>
>>> >> Any hints would be very gratefully received.
>>> >>
>>> >> Cordially,
>>> >> Ted
>>> >>
>>> >> --
>>> >> Ted Pedersen
>>> >> http://www.d.umn.edu/~tpederse
>>> >>
>>> >
>>>
>>>
>>>
>>> --
>>> Lance Norskog
>>> goksron@gmail.com
>>>
>>
>>
>>
>> --
>> Regards,
>> Simon
>>
>
>
>
> --
> Ted Pedersen
> http://www.d.umn.edu/~tpederse
>

-- 
Ted Pedersen
http://www.d.umn.edu/~tpederse