hadoop-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Please help on providing correct answers
Date Wed, 07 Nov 2012 17:27:13 GMT
Ok... 
Where are you pulling these questions from? 

Seriously. 


On Nov 7, 2012, at 11:21 AM, Ramasubramanian Narayanan <ramasubramanian.narayanan@gmail.com>
wrote:

> Hi,
> 
>    I came across the following questions on some sites, and the answers given there
> seem wrong to me... I might be wrong... Could someone help confirm the right answers
> for these 11 questions, please? An explanation would be much appreciated if you could
> provide one...
> 
> *******************************************************************************
> You are running a job that will process a single InputSplit on a cluster which has
> no other jobs currently running. Each node has an equal number of open Map slots.
> On which node will Hadoop first attempt to run the Map task?
> A. The node with the most memory
> B. The node with the lowest system load
> C. The node on which this InputSplit is stored
> D. The node with the most free local disk space
> 
> My Answer            : C 
> Answer Given in site : A
> 
> *******************************************************************************
> What is a Writable?
> A. Writable is an interface that all keys and values in MapReduce must implement.
> Classes implementing this interface must implement methods for serializing and
> deserializing themselves.
> B. Writable is an abstract class that all keys and values in MapReduce must extend.
> Classes extending this abstract base class must implement methods for serializing
> and deserializing themselves.
> C. Writable is an interface that all keys, but not values, in MapReduce must implement.
> Classes implementing this interface must implement methods for serializing and
> deserializing themselves.
> D. Writable is an abstract class that all keys, but not values, in MapReduce must extend.
> Classes extending this abstract base class must implement methods for serializing and
> deserializing themselves.
> 
> My Answer            : A
> Answer Given in site : B
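> 
> For what it's worth, a minimal sketch of a custom Writable (PairWritable is a made-up
> name for illustration; note that keys would additionally implement WritableComparable
> so they can be sorted):
> 
>     import java.io.DataInput;
>     import java.io.DataOutput;
>     import java.io.IOException;
>     import org.apache.hadoop.io.Writable;
> 
>     // Hypothetical value class that serializes/deserializes its own fields.
>     public class PairWritable implements Writable {
>         private long offset;
>         private String fileName;
> 
>         @Override
>         public void write(DataOutput out) throws IOException {
>             out.writeLong(offset);
>             out.writeUTF(fileName);
>         }
> 
>         @Override
>         public void readFields(DataInput in) throws IOException {
>             offset = in.readLong();
>             fileName = in.readUTF();
>         }
>     }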
> 
> *******************************************************************************
> 
> You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses
> TextInputFormat and the IdentityReducer: the mapper applies a regular expression over
> input values and emits key-value pairs with the key consisting of the matching text,
> and the value containing the filename and byte offset. Determine the difference between
> setting the number of reducers to zero and setting the number of reducers to one.
> A. There is no difference in output between the two settings.
> B. With zero reducers, no reducer runs and the job throws an exception. With one reducer,
> instances of matching patterns are stored in a single file on HDFS.
> C. With zero reducers, all instances of matching patterns are gathered together in one
> file on HDFS. With one reducer, instances of matching patterns are stored in multiple
> files on HDFS.
> D. With zero reducers, instances of matching patterns are stored in multiple files on
> HDFS. With one reducer, all instances of matching patterns are gathered together in one
> file on HDFS.
> 
> My Answer            : D
> Answer Given in site : C
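> 
> The setting in question is one line on the Job object (a sketch, assuming the
> org.apache.hadoop.mapreduce.Job API and an existing Configuration named conf):
> 
>     Job job = Job.getInstance(conf, "grep");
>     // 0 = map-only job: mapper output is written straight to HDFS, one part
>     //     file per map task, and the sort/shuffle phase is skipped entirely.
>     job.setNumReduceTasks(0);
>     // 1 = every intermediate pair is shuffled to a single reducer, which
>     //     writes one output file (part-r-00000) to HDFS.
>     // job.setNumReduceTasks(1);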
> 
> *******************************************************************************
> 
> During the standard sort and shuffle phase of MapReduce, keys and values are passed to
> reducers. Which of the following is true?
> A. Keys are presented to a reducer in sorted order; values for a given key are not sorted.
> B. Keys are presented to a reducer in sorted order; values for a given key are sorted in
> ascending order.
> C. Keys are presented to a reducer in random order; values for a given key are not sorted.
> D. Keys are presented to a reducer in random order; values for a given key are sorted in
> ascending order.
> 
> My Answer            : A
> Answer Given in site : D
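> 
> For context, this is what the reduce signature looks like (a sketch using the
> org.apache.hadoop.mapreduce API): the framework calls reduce() once per key, in
> sorted key order, with no guaranteed ordering among that key's values.
> 
>     import java.io.IOException;
>     import org.apache.hadoop.io.IntWritable;
>     import org.apache.hadoop.io.Text;
>     import org.apache.hadoop.mapreduce.Reducer;
> 
>     public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
>         @Override
>         protected void reduce(Text key, Iterable<IntWritable> values, Context context)
>                 throws IOException, InterruptedException {
>             // Called once per key, in sorted key order; the values for a key
>             // arrive in no particular order unless you implement secondary
>             // sort yourself.
>             int sum = 0;
>             for (IntWritable v : values) {
>                 sum += v.get();
>             }
>             context.write(key, new IntWritable(sum));
>         }
>     }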
> 
> *******************************************************************************
> 
> Which statement best describes the data path of intermediate key-value pairs (i.e.,
> output of the mappers)?
> A. Intermediate key-value pairs are written to HDFS. Reducers read the intermediate
> data from HDFS.
> B. Intermediate key-value pairs are written to HDFS. Reducers copy the intermediate
> data to the local disks of the machines running the reduce tasks.
> C. Intermediate key-value pairs are written to the local disks of the machines running
> the map tasks, and then copied to the machines running the reduce tasks.
> D. Intermediate key-value pairs are written to the local disks of the machines running
> the map tasks, and are then copied to HDFS. Reducers read the intermediate data from
> HDFS.
> 
> My Answer            : C
> Answer Given in site : B
> 
> *******************************************************************************
> 
> You are developing a combiner that takes as input Text keys, IntWritable values, and
> emits Text keys, IntWritable values. Which interface should your class implement?
> A. Mapper <Text, IntWritable, Text, IntWritable>
> B. Reducer <Text, Text, IntWritable, IntWritable>
> C. Reducer <Text, IntWritable, Text, IntWritable>
> D. Combiner <Text, IntWritable, Text, IntWritable>
> E. Combiner <Text, Text, IntWritable, IntWritable>
> 
> My Answer            : D
> Answer Given in site : C
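> 
> In the new API there is no separate Combiner interface; a combiner is just a Reducer
> wired in with setCombinerClass. A sketch, reusing the SumReducer from above and the
> same job object as in the earlier snippet:
> 
>     // The combiner consumes map output, so its input types must match the map
>     // output types, and its output types must match the reducer's input types.
>     job.setMapOutputKeyClass(Text.class);
>     job.setMapOutputValueClass(IntWritable.class);
>     job.setCombinerClass(SumReducer.class);
>     job.setReducerClass(SumReducer.class);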
> 
> *******************************************************************************
> 
> What happens in a MapReduce job when you set the number of reducers to one?
> A. A single reducer gathers and processes all the output from all the mappers. The
> output is written in as many separate files as there are mappers.
> B. A single reducer gathers and processes all the output from all the mappers. The
> output is written to a single file in HDFS.
> C. Setting the number of reducers to one creates a processing bottleneck, and since
> the number of reducers as specified by the programmer is used as a reference value
> only, the MapReduce runtime provides a default setting for the number of reducers.
> D. Setting the number of reducers to one is invalid, and an exception is thrown.
> 
> My Answer            : B
> Answer Given in site : C
> 
> *******************************************************************************
> 
> In the standard word count MapReduce algorithm, why might using a combiner reduce the
> overall job running time?
> A. Because combiners perform local aggregation of word counts, thereby allowing the
> mappers to process input data faster.
> B. Because combiners perform local aggregation of word counts, thereby reducing the
> number of mappers that need to run.
> C. Because combiners perform local aggregation of word counts, and then transfer that
> data to reducers without writing the intermediate data to disk.
> D. Because combiners perform local aggregation of word counts, thereby reducing the
> number of key-value pairs that need to be shuffled across the network to the reducers.
> 
> My Answer            : C
> Answer Given in site : A
> 
> *******************************************************************************
> 
> You need to create a GUI application to help your company's sales people add and edit
> customer information. Would HDFS be appropriate for this customer information file?
> A. Yes, because HDFS is optimized for random access writes.
> B. Yes, because HDFS is optimized for fast retrieval of relatively small amounts of data.
> C. No, because HDFS can only be accessed by MapReduce applications.
> D. No, because HDFS is optimized for write-once, streaming access for relatively large
> files.
> 
> My Answer            : D
> Answer Given in site : A
> 
> *******************************************************************************
> 
> You need to create a job that does frequency analysis on input data. You will do this
> by writing a Mapper that uses TextInputFormat and splits each value (a line of text
> from an input file) into individual characters. For each one of these characters, you
> will emit the character as a key and an IntWritable as the value. Since this will
> produce proportionally more intermediate data than input data, which resources could
> you expect to be likely bottlenecks?
> A. Processor and RAM
> B. Processor and disk I/O
> C. Disk I/O and network I/O
> D. Processor and network I/O
> 
> My Answer            : D
> Answer Given in site : B
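> 
> A sketch of such a mapper; the cost is in the flood of (character, 1) pairs it emits,
> which all have to be spilled to local disk and shuffled across the network:
> 
>     import java.io.IOException;
>     import org.apache.hadoop.io.IntWritable;
>     import org.apache.hadoop.io.LongWritable;
>     import org.apache.hadoop.io.Text;
>     import org.apache.hadoop.mapreduce.Mapper;
> 
>     public class CharFrequencyMapper
>             extends Mapper<LongWritable, Text, Text, IntWritable> {
>         private static final IntWritable ONE = new IntWritable(1);
>         private final Text ch = new Text();
> 
>         @Override
>         protected void map(LongWritable offset, Text line, Context context)
>                 throws IOException, InterruptedException {
>             // One (character, 1) pair per input character: the intermediate
>             // data ends up several times larger than the input, so disk I/O
>             // (map-side spills) and network I/O (shuffle) dominate.
>             for (char c : line.toString().toCharArray()) {
>                 ch.set(String.valueOf(c));
>                 context.write(ch, ONE);
>             }
>         }
>     }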
> 
> *******************************************************************************
> 
> Which of the following statements best describes how a large (100 GB) file is stored
> in HDFS?
> A. The file is divided into variable-size blocks, which are stored on multiple
> datanodes. Each block is replicated three times by default.
> B. The file is replicated three times by default. Each copy of the file is stored on a
> separate datanode.
> C. The master copy of the file is stored on a single datanode. The replica copies are
> divided into fixed-size blocks, which are stored on multiple datanodes.
> D. The file is divided into fixed-size blocks, which are stored on multiple datanodes.
> Each block is replicated three times by default. Multiple blocks from the same file
> might reside on the same datanode.
> E. The file is divided into fixed-size blocks, which are stored on multiple datanodes.
> Each block is replicated three times by default. HDFS guarantees that different blocks
> from the same file are never on the same datanode.
> 
> My Answer            : D
> Answer Given in site : B
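> 
> You can inspect the blocks and their replica locations yourself (a sketch using the
> FileSystem API; /data/bigfile is just an example path):
> 
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.BlockLocation;
>     import org.apache.hadoop.fs.FileStatus;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
> 
>     public class ListBlocks {
>         public static void main(String[] args) throws Exception {
>             FileSystem fs = FileSystem.get(new Configuration());
>             FileStatus st = fs.getFileStatus(new Path("/data/bigfile"));
>             for (BlockLocation b : fs.getFileBlockLocations(st, 0, st.getLen())) {
>                 // Each fixed-size block is replicated (3x by default); nothing
>                 // prevents two blocks of one file from landing on the same datanode.
>                 System.out.println(b.getOffset() + " len=" + b.getLength()
>                         + " hosts=" + String.join(",", b.getHosts()));
>             }
>         }
>     }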
> 
> *******************************************************************************
> 
> regards,
> Rams

