Subject: Re: MapReduce job runs fine, but nothing is written to HTable
From: Ted Yu <yuzhihong@gmail.com>
To: user@hbase.apache.org
Date: Thu, 17 Jun 2010 11:23:03 -0700

You can override these methods of the org.apache.hadoop.mapreduce.Mapper class:

    /**
     * Called once at the beginning of the task.
     */
    protected void setup(Context context) throws IOException, InterruptedException {
      // NOTHING
    }

    /**
     * Called once at the end of the task.
     */
    protected void cleanup(Context context) throws IOException, InterruptedException {
      // NOTHING
    }
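For example, a rough (untested) sketch of what Dave suggested -- create the HTable once
in setup() and flush in cleanup(), rather than constructing a new HTable on every map()
call -- could look like the following. The "blogposts" table, the post:title column and
the 0.20-style HTable/HBaseConfiguration calls are taken from your snippet; the class
name ImportMapper is just illustrative:

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class ImportMapper
        extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

      private HTable table;

      @Override
      protected void setup(Context context) throws IOException, InterruptedException {
        // Open the table once per task, not once per record.
        table = new HTable(new HBaseConfiguration(), "blogposts");
        // Buffer Puts client-side so the flush in cleanup() does the real work.
        table.setAutoFlush(false);
      }

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        // Input line is: rowID <TAB> title (the key is the byte offset from TextInputFormat).
        String[] splits = value.toString().split("\t");
        Put p = new Put(Bytes.toBytes(splits[0]));
        p.add(Bytes.toBytes("post"), Bytes.toBytes("title"), Bytes.toBytes(splits[1]));
        table.put(p);
      }

      @Override
      protected void cleanup(Context context) throws IOException, InterruptedException {
        // Push any buffered Puts to the region servers before the task exits.
        table.flushCommits();
      }
    }

A sketch of the job wiring itself (Dave's point about the input format) is at the bottom
of this mail, below the quoted thread.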
On Thu, Jun 17, 2010 at 10:44 AM, Sharma, Avani wrote:

> Thanks, Dave. I have only 2 records in my HDFS file for testing.
> Could you give an example of which setup and cleanup functions you are
> referring to? This is my first MR HBase job using the new API.
> The other commented code in the email thread below runs fine, but it is
> non-MR. Is there any other setting needed for an MR job to update HBase?
>
> The MR jobs are run with `hadoop jar`, and the non-MR HBase jobs simply run
> with `hbase`. I am suspecting that I am missing some setting. I made sure
> that the CLASSPATHs are all good.
>
> This is how I configure the MR job:
>
>     public int run(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>         conf.set(TableOutputFormat.OUTPUT_TABLE, "blogposts");
>
>         Job job = new Job(conf, NAME);
>         FileInputFormat.addInputPath(job, new Path(args[0]));
>         job.setJarByClass(mapRedImport_from_hdfs.class);
>         job.setMapperClass(myMap.class);
>         job.setNumReduceTasks(0);
>         job.setOutputFormatClass(NullOutputFormat.class);
>
>         job.waitForCompletion(true);
>
>         return 0;
>     }
>
>     public static void main(String[] args) throws Exception {
>         int errCode = ToolRunner.run(new mapRedImport_from_hdfs(), args);
>         System.exit(errCode);
>     }
>
> -----Original Message-----
> From: Buttler, David [mailto:buttler1@llnl.gov]
> Sent: Thursday, June 17, 2010 8:06 AM
> To: user@hbase.apache.org
> Subject: RE: MapReduce job runs fine, but nothing is written to HTable
>
> It looks to me as if you are not defining your input format correctly.
> Notice that you only had two map input records.
> Other issues:
> You are not flushing.
> You are creating a new HTable on each map. Put that in the setup and put
> the flush in the cleanup.
>
> Dave
>
> -----Original Message-----
> From: Sharma, Avani [mailto:agsharma@ebay.com]
> Sent: Wednesday, June 16, 2010 7:06 PM
> To: user@hbase.apache.org
> Subject: MapReduce job runs fine, but nothing is written to HTable
>
> Hi,
>
> I am running a job to write some data from an HDFS file to an HBase table
> using the new API. The job runs fine without any errors, but I do not see
> the rows added to the HBase table.
>
> This is what my code looks like (I am running this with `hadoop jar`):
>
>     private HTable table;
>
>     protected void map(ImmutableBytesWritable key, Text value, Context context)
>             throws IOException, InterruptedException {
>         table = new HTable(new HBaseConfiguration(), "blogposts");
>
>         // Split input line on tab character
>         String[] splits = value.toString().split("\t");
>         String rowID = splits[0];
>         String cellValue = splits[1];
>         Put p = new Put(Bytes.toBytes(rowID));
>         p.add(Bytes.toBytes("post"), Bytes.toBytes("title"), Bytes.toBytes(splits[1]));
>         table.put(p);
>         table.flushCommits();
>     }
>
>     /*
>     This commented code, when run separately in a main program, runs fine
>     and does update the table:
>
>     HTable table = new HTable(new HBaseConfiguration(), "blogposts");
>
>     Put p = new Put(Bytes.toBytes("post3"));
>     p.add(Bytes.toBytes("post"), Bytes.toBytes("title"), Bytes.toBytes("abx"));
>     p.add(Bytes.toBytes("post"), Bytes.toBytes("author"), Bytes.toBytes("hadings"));
>     p.add(Bytes.toBytes("image"), Bytes.toBytes("body"), Bytes.toBytes("123.jpg"));
>     p.add(Bytes.toBytes("image"), Bytes.toBytes("header"), Bytes.toBytes("7657.jpg"));
>
>     table.put(p);
>     */
>
> Run log:
>
> 10/06/16 19:00:35 WARN mapred.JobClient: Use GenericOptionsParser for
> parsing the arguments. Applications should implement Tool for the same.
> 10/06/16 19:00:35 INFO input.FileInputFormat: Total input paths to process : 1
> 10/06/16 19:00:36 INFO mapred.JobClient: Running job: job_201003301510_0157
> 10/06/16 19:00:37 INFO mapred.JobClient:  map 0% reduce 0%
> 10/06/16 19:00:45 INFO mapred.JobClient:  map 100% reduce 0%
> 10/06/16 19:00:47 INFO mapred.JobClient: Job complete: job_201003301510_0157
> 10/06/16 19:00:47 INFO mapred.JobClient: Counters: 5
> 10/06/16 19:00:47 INFO mapred.JobClient:   Job Counters
> 10/06/16 19:00:47 INFO mapred.JobClient:     Rack-local map tasks=1
> 10/06/16 19:00:47 INFO mapred.JobClient:     Launched map tasks=1
> 10/06/16 19:00:47 INFO mapred.JobClient:   FileSystemCounters
> 10/06/16 19:00:47 INFO mapred.JobClient:     HDFS_BYTES_READ=31
> 10/06/16 19:00:47 INFO mapred.JobClient:   Map-Reduce Framework
> 10/06/16 19:00:47 INFO mapred.JobClient:     Map input records=2
> 10/06/16 19:00:47 INFO mapred.JobClient:     Spilled Records=0
>
> Thanks,
> Avani
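Regarding Dave's comment about the input format: TextInputFormat gives the mapper a
LongWritable byte offset and a Text line, not an ImmutableBytesWritable key, so it is
worth double-checking that your map() signature actually overrides Mapper.map() for the
types your job is configured with; if it does not, the default pass-through map runs and
the job finishes cleanly without writing anything. Below is another rough, untested
sketch that writes through TableOutputFormat instead of an HTable inside the mapper. The
class names HdfsToBlogposts and ImportMapper are made up for illustration; "blogposts"
and the post:title column come from your code:

    import java.io.IOException;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    public class HdfsToBlogposts {

      // TextInputFormat hands the mapper (LongWritable offset, Text line).
      static class ImportMapper
          extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
          String[] splits = value.toString().split("\t");  // rowID <TAB> title
          Put p = new Put(Bytes.toBytes(splits[0]));
          p.add(Bytes.toBytes("post"), Bytes.toBytes("title"), Bytes.toBytes(splits[1]));
          // Emit the Put; TableOutputFormat (configured below) writes it to the table.
          context.write(new ImmutableBytesWritable(Bytes.toBytes(splits[0])), p);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = new Job(new HBaseConfiguration(), "import-blogposts");
        job.setJarByClass(HdfsToBlogposts.class);
        job.setInputFormatClass(TextInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        job.setMapperClass(ImportMapper.class);
        // Map-only job: this call points the job's output at the "blogposts" table
        // via TableOutputFormat, with no reducer class.
        TableMapReduceUtil.initTableReducerJob("blogposts", null, job);
        job.setNumReduceTasks(0);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }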