hadoop-user mailing list archives

From "Kartashov, Andy" <Andy.Kartas...@mpac.ca>
Subject RE: using sequencefile generated by Sqoop in Mapreduce
Date Tue, 09 Oct 2012 20:58:00 GMT
Gents, please ignore my message below. Everything works like a glove.

conf.setInputFormat(SequenceFileInputFormat.class) does indeed work well with Sqoop-generated sequence files.

The reason I was getting only the last line in my output is that I failed to notice
I was using fs.create() instead of fs.append(). *blush*
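
For readers who hit the same symptom, here is a minimal sketch of the pitfall. It is an analogy on the local filesystem using java.nio, not Hadoop code: the class and method names (OverwriteVsAppend, writeTruncating, writeAppending) are made up for illustration. Hadoop's fs.create(path) overwrites an existing file by default, which is roughly what the TRUNCATE_EXISTING option does here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

// Local-filesystem analogy only: java.nio stands in for Hadoop's FileSystem.
class OverwriteVsAppend {

    // Mirrors re-creating the file for every record: each write truncates first.
    static void writeTruncating(Path file, List<String> records) throws IOException {
        for (String record : records) {
            Files.write(file, (record + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING,
                    StandardOpenOption.WRITE);
        }
    }

    // Mirrors appending for every record: earlier records are preserved.
    static void writeAppending(Path file, List<String> records) throws IOException {
        for (String record : records) {
            Files.write(file, (record + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE,
                    StandardOpenOption.APPEND);
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> records = Arrays.asList("alpha!", "beta!", "gamma!");

        Path truncated = Files.createTempFile("truncated", ".txt");
        writeTruncating(truncated, records);
        System.out.println(Files.readAllLines(truncated)); // [gamma!]

        Path appended = Files.createTempFile("appended", ".txt");
        writeAppending(appended, records);
        System.out.println(Files.readAllLines(appended)); // [alpha!, beta!, gamma!]
    }
}
```

In a real job, opening the output stream once and writing every record to it would probably be a cleaner fix than appending per record, but that depends on how the surrounding code is structured.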

Andy Kartashov
Architecture R&D, Co-op
1340 Pickering Parkway, Pickering, L1V 0C4
* Phone : (905) 837 6269
* Mobile: (416) 722 1787

From: Kartashov, Andy
Sent: Tuesday, October 09, 2012 1:19 PM
To: user@hadoop.apache.org
Subject: using sequencefile generated by Sqoop in Mapreduce


I am having trouble using a sequence file in MapReduce. The only output I get is the very last record.

I am creating the sequence file while importing a MySQL table into Hadoop using:
$Sqoop import...... --as-sequencefile

I am then trying to read from this file in the mapper, creating keys from the objects'
ids and using the actual objects (with the attributes of each table record) as values.
In the reducer I am iterating over those objects and writing each object's attributes to a .txt file.
My Mapreduce code:

public static void main(String[] args) throws Exception {
    conf.setOutputKeyClass(Text.class); // this will be one of the fields of the exported table, say ids
    conf.setOutputValueClass(Sqoop_class.class); // Sqoop_class is the class generated during the import
    conf.setOutputFormat(NullOutputFormat.class); // output will be to a .txt file
} // end of main()

public static class MyMapper extends MapReduceBase
        implements Mapper<LongWritable, Sqoop_class, Text, Sqoop_class> {
    public void map(LongWritable key, Sqoop_class value,
            OutputCollector<Text, Sqoop_class> output, Reporter reporter) throws IOException {
        output.collect(new Text(value.get_foo_id().toString()), value);
    } // end of map()
} // end of static class MyMapper

public static class MyReducer extends MapReduceBase
        implements Reducer<Text, Sqoop_class, Text, Sqoop_class> {

    public void reduce(Text key, Iterator<Sqoop_class> values,
            OutputCollector<Text, Sqoop_class> output, Reporter reporter) throws IOException {
        while (values.hasNext()) {
            output.collect(key, epa); // output is to Null...
            out.writeBytes(values.next().get_foo_name() + "!\n");
        } // end of while loop
    } // end of reduce()
} // end of static class MyReducer

Would the Mapper not create keys from each Sqoop_class id value, with the values being the
instances of Sqoop_class?
We then group those instances in the reducer and iterate through them, retrieving the
attributes from each object.
Somehow only the values of the last instance are written.

Should conf.setInputFormat(SequenceFileInputFormat.class) and Sqoop_class.class
not work together when reading from a sequence file?
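
The flow described above can be sketched in plain Java, without any Hadoop classes, to check the reasoning. Everything in it is hypothetical: Row stands in for the Sqoop-generated class, and mapAndGroup and reduce only imitate what the framework does between the mapper and the reducer. Under those assumptions every record reaches the output, which suggests the key and value logic by itself would not drop records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of map -> group by key -> reduce; not Hadoop code.
class MapReduceFlowSketch {

    // Hypothetical stand-in for the Sqoop-generated record class.
    static final class Row {
        final int fooId;
        final String fooName;

        Row(int fooId, String fooName) {
            this.fooId = fooId;
            this.fooName = fooName;
        }
    }

    // "Map" step: key each record by its id, then group records sharing a key.
    static Map<String, List<Row>> mapAndGroup(List<Row> rows) {
        Map<String, List<Row>> grouped = new TreeMap<>();
        for (Row row : rows) {
            grouped.computeIfAbsent(String.valueOf(row.fooId), k -> new ArrayList<>())
                   .add(row);
        }
        return grouped;
    }

    // "Reduce" step: iterate over each group and emit one line per record.
    static List<String> reduce(Map<String, List<Row>> grouped) {
        List<String> lines = new ArrayList<>();
        for (List<Row> group : grouped.values()) {
            for (Row row : group) {
                lines.add(row.fooName + "!");
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(2, "beta"), new Row(1, "alpha"), new Row(2, "gamma"));
        System.out.println(reduce(mapAndGroup(rows))); // [alpha!, beta!, gamma!]
    }
}
```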

From: nagarjuna kanamarlapudi [mailto:nagarjuna.kanamarlapudi@gmail.com]
Sent: Tuesday, October 09, 2012 11:03 AM
To: user@hadoop.apache.org
Subject: Re: Hive-Site XML changing any proprty.

Restarting the Hive server should solve your problem.

I am not sure whether there are any ways of starting the Hive server other than these:

1. bin/hive --service hiveserver

2. HIVE_PORT=xxxx ./hive --service hiveserver


On Tue, Oct 9, 2012 at 8:10 PM, Uddipan Mukherjee <Uddipan_Mukherjee@infosys.com> wrote:
Hi Hadoop, Hive gurus,

    I have a requirement to change the path of Hive's scratch folder. I have therefore added the
following property to hive-site.xml and changed its value as required.

  <description>Scratch space for Hive jobs</description>

But the change is still not being reflected. Do I need to restart the Hive server for it to read the updated
value in the file?

Also, is there any way to do this other than restarting the Hive server?

Any pointers will be helpful.

Thanks And Regards
Uddipan Mukherjee
