hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "IdeasOnLdapConfiguration" by SomeOtherAccount
Date Wed, 12 May 2010 19:03:03 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "IdeasOnLdapConfiguration" page has been changed by SomeOtherAccount.
http://wiki.apache.org/hadoop/IdeasOnLdapConfiguration?action=diff&rev1=3&rev2=4

--------------------------------------------------

  
  and in our LDAP server, we have placed the following objects:
  
- { { {
+ {{{
  hostname=myhost1
  objectclass=node
  domain=example.com
@@ -24, +24 @@

  hostname=myhost2
  objectclass=node
  domain=example.com
- } } }
+ }}}
  
  We can now do an LDAP search with (&(objectclass=node)(hostname=myhost1)) to find the
'myhost1' object.  Similarly, we can search with (&(objectclass=node)(domain=example.com)) to find both
the myhost1 and myhost2 objects.
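
As a rough illustration of the two searches above, here is a minimal Python sketch that evaluates AND-of-equality filters against the two example objects held in memory. It does not talk to a real LDAP server; `ENTRIES`, `parse_and_filter`, and `search` are hypothetical names introduced only for this example, and only the `(&(attr=value)...)` filter shape used in the text is handled.

```python
import re

# The two example objects from the text, as attribute dictionaries.
ENTRIES = [
    {"hostname": "myhost1", "objectclass": "node", "domain": "example.com"},
    {"hostname": "myhost2", "objectclass": "node", "domain": "example.com"},
]


def parse_and_filter(ldap_filter):
    """Split a filter of the form (&(attr=value)(attr=value)...) into pairs."""
    if not (ldap_filter.startswith("(&") and ldap_filter.endswith(")")):
        raise ValueError("this sketch only supports (&...) filters")
    return re.findall(r"\(([^=()]+)=([^()]*)\)", ldap_filter)


def search(entries, ldap_filter):
    """Return every entry whose attributes satisfy all terms of the filter."""
    terms = parse_and_filter(ldap_filter)
    return [e for e in entries if all(e.get(a) == v for a, v in terms)]


one_host = search(ENTRIES, "(&(objectclass=node)(hostname=myhost1))")
whole_domain = search(ENTRIES, "(&(objectclass=node)(domain=example.com))")
```

The first search yields only the myhost1 object, while the second yields both objects because they share the same domain attribute.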
  
  Let's apply these ideas to Hadoop.  Here are some rough objectclasses that we can use for
demonstration purposes:
  
+ {{{
  generic properties: hadoopGlobalConfig
  hadoop.tmp.dir: string
  fs.default.name: string
@@ -64, +65 @@

  dfs.http.address: string
  hostname: string
  dfs.name.dir: multi-string
+ }}}
+ 
  
  Let's define a simple grid:
  
+ {{{
  clusterName=red
  objectclass=hadoopGlobalConfig
  hadoop.tmp.dir=/tmp
@@ -94, +98 @@

  mapred.local.dir: /mr1,/mr2,/mr3
  mapred.tasktracker.map.tasks.maximum: 4
  mapred.tasktracker.reduce.tasks.maximum: 4
+ }}}
  
+ Let's say we fire up node1. The local config would name the LDAP server, the credentials
needed to talk to it, and so on. It might also say that the node is part of the red cluster in order
to speed up startup.  From there, it would do the following:
+ 
+ Get all the global config for the red cluster, using the search filter (&(objectclass=hadoopGlobalConfig)(clusterName=red)).
 We now know hadoop.tmp.dir, fs.default.name, etc.
+ 
+ Are we a namenode?  (&(objectclass=hadoopNameNode)(hostname=node1)).  Empty.  Drats!
+ 
+ Are we a datanode?  (&(objectclass=hadoopDataNode)(commonname=node1)).  We got an object
back!  Grab that info, and we can now start up the datanode process.
+ 
+ Are we a jobtracker?  (&(objectclass=hadoopJobTracker)(hostname=node1)).  Empty.  Drats!
+ 
+ Are we a tasktracker?  (&(objectclass=hadoopTaskTracker)(hostname=node1)).  We got an
object back!  Fire up the task tracker with that object's info.
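
The startup sequence above could be sketched roughly as follows. This is an in-memory simulation rather than a real LDAP client: the `DIRECTORY` contents are illustrative stand-ins (only `hadoop.tmp.dir=/tmp` and the task tracker values appear in the text), every role lookup here matches on `hostname` as a simplifying assumption, and `matches` and `plan_startup` are hypothetical helper names.

```python
ROLE_CLASSES = [
    "hadoopNameNode",
    "hadoopDataNode",
    "hadoopJobTracker",
    "hadoopTaskTracker",
]

# Illustrative directory contents for the red cluster (assumed, not from the wiki page).
DIRECTORY = [
    {"objectclass": "hadoopGlobalConfig", "clusterName": "red",
     "hadoop.tmp.dir": "/tmp"},
    {"objectclass": "hadoopDataNode", "hostname": "node1",
     "dfs.data.dir": ["/hdfs1", "/hdfs2", "/hdfs3"]},
    {"objectclass": "hadoopTaskTracker", "hostname": "node1",
     "mapred.local.dir": ["/mr1", "/mr2", "/mr3"],
     "mapred.tasktracker.map.tasks.maximum": 4},
]


def matches(entry, **wanted):
    """True if each wanted value equals, or is one of, the entry's values."""
    for attr, value in wanted.items():
        have = entry.get(attr)
        values = have if isinstance(have, list) else [have]
        if value not in values:
            return False
    return True


def plan_startup(directory, cluster, hostname):
    """Fetch the cluster-wide config, then probe each role for this host."""
    global_cfg = [e for e in directory
                  if matches(e, objectclass="hadoopGlobalConfig", clusterName=cluster)]
    roles = {}
    for role in ROLE_CLASSES:
        hits = [e for e in directory
                if matches(e, objectclass=role, hostname=hostname)]
        if hits:  # an empty result means this host does not run that daemon
            roles[role] = hits[0]
    return global_cfg, roles


global_cfg, roles = plan_startup(DIRECTORY, "red", "node1")
```

With this sample data, node1 gets back objects for the datanode and task tracker roles only, so those are the two daemons it would start.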
+ 
+ From these base definitions, we can do more complex things:
+ 
+ {{{
+ commonname=simplecomputenode1,cluster=red
+ objectclass=hadoopDataNode,hadoopTaskTracker
+ hostname:  node1,node2,node3
+ dfs.data.dir: /hdfs1, /hdfs2, /hdfs3
+ dfs.datanode.du.reserved: 10
+ mapred.job.tracker: commonname=jobtracker,cluster=red
+ mapred.local.dir: /mr1,/mr2,/mr3
+ mapred.tasktracker.map.tasks.maximum: 4
+ mapred.tasktracker.reduce.tasks.maximum: 4
+ 
+ commonname=simplecomputenode2,cluster=red
+ objectclass=hadoopDataNode,hadoopTaskTracker
+ hostname:  node4,node5,node6
+ dfs.data.dir: /hdfs1, /hdfs2, /hdfs3, /hdfs4
+ dfs.datanode.du.reserved: 10
+ mapred.job.tracker: commonname=jobtracker,cluster=red
+ mapred.local.dir: /mr1,/mr2,/mr3,/mr4
+ mapred.tasktracker.map.tasks.maximum: 8
+ mapred.tasktracker.reduce.tasks.maximum: 4
+ }}}
+ 
+ We can create multiple definitions for the same grid.  This is important when you consider
that small to medium-sized grids are likely to have a mix of nodes.  For example, some nodes
may have 8 cores with four disks and some nodes may have 6 cores with eight disks.  If they
are part of the same cluster, they will need different mapred-site.xml settings in order to
get the most out of the hardware.
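
To make the idea of per-group definitions concrete, here is a small Python sketch that picks the definition covering a given host, using the two example objects from the text. The lookup runs against in-memory dictionaries rather than an LDAP server; `DEFINITIONS` and `definition_for` are hypothetical names, and treating a host listed in more than one definition as an error is an assumption, not something the page specifies.

```python
# The two group definitions from the text; multi-valued attributes become lists.
DEFINITIONS = [
    {
        "commonname": "simplecomputenode1",
        "cluster": "red",
        "objectclass": ["hadoopDataNode", "hadoopTaskTracker"],
        "hostname": ["node1", "node2", "node3"],
        "dfs.data.dir": ["/hdfs1", "/hdfs2", "/hdfs3"],
        "mapred.local.dir": ["/mr1", "/mr2", "/mr3"],
        "mapred.tasktracker.map.tasks.maximum": 4,
        "mapred.tasktracker.reduce.tasks.maximum": 4,
    },
    {
        "commonname": "simplecomputenode2",
        "cluster": "red",
        "objectclass": ["hadoopDataNode", "hadoopTaskTracker"],
        "hostname": ["node4", "node5", "node6"],
        "dfs.data.dir": ["/hdfs1", "/hdfs2", "/hdfs3", "/hdfs4"],
        "mapred.local.dir": ["/mr1", "/mr2", "/mr3", "/mr4"],
        "mapred.tasktracker.map.tasks.maximum": 8,
        "mapred.tasktracker.reduce.tasks.maximum": 4,
    },
]


def definition_for(definitions, cluster, hostname):
    """Return the single definition listing this host, or None if there is none."""
    hits = [d for d in definitions
            if d["cluster"] == cluster and hostname in d["hostname"]]
    if len(hits) > 1:
        # Assumption for this sketch: overlapping definitions are a config error.
        raise ValueError("host %s appears in more than one definition" % hostname)
    return hits[0] if hits else None


node2_cfg = definition_for(DEFINITIONS, "red", "node2")
node5_cfg = definition_for(DEFINITIONS, "red", "node5")
```

Here node2 resolves to the definition with four map slots and three data directories, while node5 resolves to the one with eight map slots and four data directories, even though both hosts belong to the same cluster.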
+ 
