hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "FAQ" by LewisJohnMcgibbney
Date Thu, 20 Mar 2014 08:46:48 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "FAQ" page has been changed by LewisJohnMcgibbney:
https://wiki.apache.org/hadoop/FAQ?action=diff&rev1=109&rev2=110

  Text files are handled similarly, using newlines instead of sync marks.
  
  == How do I change the final output file names to a desired name instead of partition names
like part-00000, part-00001? ==
- You can subclass the [[http://svn.apache.org/viewvc/hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/OutputFormat.java?view=markup|OutputFormat.java]]
class and write your own. You can look at the code of [[http://svn.apache.org/viewvc/hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/TextOutputFormat.java?view=markup|TextOutputFormat]]
[[http://svn.apache.org/viewvc/hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/MultipleOutputFormat.java?view=markup|MultipleOutputFormat.java]]
etc. for reference. It might be the case that you only need to do minor changes to any of
the existing Output Format classes. To do that you can just subclass that class and override
the methods you need to change.
+ You can subclass the [[http://hadoop.apache.org/docs/current/api/index.html?org/apache/hadoop/mapred/OutputFormat.html|OutputFormat.java]]
class and write your own. You can locate and browse the code of [[http://hadoop.apache.org/docs/current/api/index.html?org/apache/hadoop/mapred/TextOutputFormat.html|TextOutputFormat]]
[[http://hadoop.apache.org/docs/current/api/index.html?org/apache/hadoop/mapred/lib/MultipleOutputFormat.html]]
etc. for reference. You may only need to make minor changes to one of
the existing OutputFormat classes. In that case, subclass that class and override
only the methods you need to change.
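
The subclass-and-override approach described above can be sketched in plain Java. Note that `SketchTextOutputFormat`, `NamedOutputFormat`, `fileNameFor` and `describeOutput` are hypothetical stand-ins invented for this illustration; they are not Hadoop classes or methods. In the old `mapred` API, `MultipleTextOutputFormat` offers a comparable override point in `generateFileNameForKeyValue`, but the exact hook to use depends on the Hadoop version, so check the Javadoc linked above.

```java
import java.util.Locale;

// Hypothetical stand-in for an existing output format class (NOT a Hadoop class).
class SketchTextOutputFormat {
    // Default naming, mirroring the part-00000 style mentioned in the question.
    protected String fileNameFor(int partition) {
        return String.format(Locale.ROOT, "part-%05d", partition);
    }

    // Behaviour the subclass inherits unchanged.
    public String describeOutput(String outputDir, int partition) {
        return outputDir + "/" + fileNameFor(partition);
    }
}

// Hypothetical subclass: overrides only the one method that decides the name.
class NamedOutputFormat extends SketchTextOutputFormat {
    private final String baseName;

    NamedOutputFormat(String baseName) {
        this.baseName = baseName;
    }

    @Override
    protected String fileNameFor(int partition) {
        return String.format(Locale.ROOT, "%s-%05d.txt", baseName, partition);
    }
}
```

Everything except the naming method is reused from the parent class, which is why a small subclass is usually enough.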
  
  == When writing a new InputFormat, what is the format for the array of strings returned by
InputSplit#getLocations()? ==
  It appears that DatanodeID.getHost() is the standard place to retrieve this name, and the
machineName variable, populated in DataNode.java#startDataNode, is where the name is first
set. The code first tries to read "slave.host.name" from the configuration; if that
is not set, DNS.getDefaultHost is used instead.
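
The lookup order described above (configuration first, then a DNS-based default) can be sketched as follows. This is a simplified illustration and not the DataNode source: `HostNameResolver` is a hypothetical class, a plain `Map` stands in for the Hadoop configuration, and `InetAddress.getLocalHost()` stands in for `DNS.getDefaultHost`, whose real behaviour may differ.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;

// Hypothetical helper illustrating the fallback order; not part of Hadoop.
class HostNameResolver {
    static final String HOST_KEY = "slave.host.name";

    // Prefer the configured name; otherwise fall back to a default host lookup.
    static String resolve(Map<String, String> conf) {
        String configured = conf.get(HOST_KEY);
        if (configured != null && !configured.isEmpty()) {
            return configured;
        }
        return defaultHost();
    }

    // Assumption: stand-in for DNS.getDefaultHost using the JDK's local host lookup.
    static String defaultHost() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "localhost";
        }
    }
}
```

If this reading is right, the strings returned by a custom `InputSplit#getLocations()` should match the host names that the datanodes report through this lookup.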
