hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "BristolHadoopWorkshopSpring2010" by SteveLoughran
Date Mon, 22 Mar 2010 14:07:57 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "BristolHadoopWorkshopSpring2010" page has been changed by SteveLoughran.
The comment on this change is: BBC.
http://wiki.apache.org/hadoop/BristolHadoopWorkshopSpring2010?action=diff&rev1=2&rev2=3

--------------------------------------------------

   
  HDFS has been used as a filestore at some of the US CMS Tier-2 sites; the new work that
James discussed is treating physics problems as MapReduce jobs. They are bringing up a
cluster of machines with storage for this, but would also like to use idle CPU time on other
machines in the datacentre, and there was some discussion of how to do this. MAPREDUCE-1603
is now a feature request asking for the assessment of slot availability to be pluggable.
This would allow someone to write a plugin that looks at the non-Hadoop workload of a machine
and reduces the number of Hadoop slots it reports as available when it is busy with other work.
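
The slot-reduction idea can be sketched in a few lines. This is only an illustration of the arithmetic such a plugin might perform, not the interface proposed in MAPREDUCE-1603: the function name `slots_to_report` and the idea of measuring non-Hadoop load as a fraction of CPU are assumptions made for this example.

```python
def slots_to_report(max_slots: int, external_load: float) -> int:
    """Return how many task slots a node might advertise.

    max_slots: slots configured for this node (hypothetical parameter).
    external_load: fraction of CPU in use by non-Hadoop work, 0.0 to 1.0
                   (hypothetical measurement; how it is obtained is not
                   covered here).
    """
    if max_slots < 0:
        raise ValueError("max_slots must be non-negative")
    # Clamp the measurement so a bad reading cannot yield a negative
    # slot count or more slots than are configured.
    load = min(max(external_load, 0.0), 1.0)
    # Scale the configured slots by the idle fraction, rounding down.
    return int(max_slots * (1.0 - load))
```

With four configured slots, a node that is half busy with other work would report two slots, and a fully busy node would report none.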
  
+ == Leo Simons: The BBC ==
+ Leo spoke about the CouchDB back end for the BBC web site.
+  * [[http://vis.cs.ucdavis.edu/~ogawa/codeswarm/|Codeswarm]]: live graphics of their repository work.
+  * There's a new BBC homepage: [[http://www.live.bbc.co.uk]]
+  * The web page is integrated with iPlayer.
+  * Friday afternoons are busy iPlayer times. People either skive off work or watch TV from their desks.
+  * The homepage lets you change your preferences with no need to log in; the preferences are just bound to cookies.
+  * Uses a hash of the JSON to drive the CouchDB lookup; this lets them stay with 4M docs rather than 60M docs.
+  * They reach consistency in 40 ms or so; there is no need for microsecond consistency, as the homepage changes more slowly than that.
+  * Compaction turned the status display "blue" rather than green, which had everyone panicking although there was no visible change in behaviour. Moral: use light green instead.
+ They had lots of fun with incomplete resharding causing intermittent replication failures. When an app saw a 404, it treated this as expected, created a new doc and kept going; this created extra load and resulted in a 7h replication.
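
The hash-of-JSON lookup mentioned in the list above could look something like the following. The notes do not say which hash or serialisation the BBC uses, so SHA-1, the sorted-key canonical form and the function name `prefs_doc_id` are all assumptions for illustration. One plausible reading of the 4M versus 60M figures is that users with identical preferences end up sharing a single document.

```python
import hashlib
import json


def prefs_doc_id(prefs: dict) -> str:
    """Derive a document ID from the content of a preferences dict.

    Hypothetical helper: identical preferences always give the same ID,
    so they would map to one stored document instead of one per user.
    """
    # Serialise canonically: sorted keys and fixed separators mean that
    # key order and whitespace do not change the resulting hash.
    canonical = json.dumps(prefs, sort_keys=True, separators=(",", ":"))
    # SHA-1 is an assumption; any stable hash would serve the same purpose.
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()
```

Two cookies carrying the same settings in a different key order would produce the same ID, while any change to a setting would produce a different one.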
+ 
