hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Hbase and Hadoop Config to run in Standalone mode
Date Thu, 23 Jul 2009 17:11:19 GMT
Ok, that explains it. The problem is that you extended IdentityTableMap
but tried to override its map method under the wrong name, so your
method was never called; the parent's map was called instead.

The error it's giving you now is pretty much self-explanatory and is
not related to Hadoop or HBase: you must override the map method, and
annotating it with @Override makes the compiler verify that you really
are overriding.

You should also take a look at this doc to learn how to build your
jobs: http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapred/package-summary.html

J-D
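[Editor's sketch of the fix described above. The classes below are simplified stand-ins so the example compiles on its own; the real types are HBase's ImmutableBytesWritable, RowResult, OutputCollector and Reporter, and the real parent is org.apache.hadoop.hbase.mapred.IdentityTableMap. The point is that the subclass method must keep the parent's exact name and parameter types, and @Override makes the compiler check that.]

```java
// Stand-ins for the HBase 0.19/0.20 mapred types used in this thread.
class RowKey {}
class RowValue {}
interface Collector<K, V> { void collect(K key, V value); }

// Stand-in for org.apache.hadoop.hbase.mapred.IdentityTableMap.
class IdentityMap {
    public void map(RowKey row, RowValue value, Collector<RowKey, RowValue> out) {
        out.collect(row, value);   // identity behaviour: pass the row through
    }
}

class DebugMap extends IdentityMap {
    // Naming this method "mapp", or changing any parameter type, would mean
    // it no longer overrides the parent; @Override turns that mistake into
    // a compile-time error instead of a silently ignored method.
    @Override
    public void map(RowKey row, RowValue value, Collector<RowKey, RowValue> out) {
        System.out.println(row);   // the debugging output the poster wanted
        out.collect(row, value);
    }
}
```

[If the method were renamed to mapp, the @Override line would become a compile error instead of the method being silently skipped at runtime.]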

On Thu, Jul 23, 2009 at 1:04 PM, bharath vissapragada <bharathvissapragada1990@gmail.com> wrote:
> I think this is the problem .. but when I changed it, it gave me a weird
> error:
>
>  name clash:
> map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>,org.apache.hadoop.mapred.Reporter)
> in MR_DS_Scan_Case1 and
> map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult>,org.apache.hadoop.mapred.Reporter)
> in org.apache.hadoop.hbase.mapred.IdentityTableMap have the same erasure,
> yet neither overrides the other
>
> I must override the map function in IdentityTableMap, but other
> classes also seem to have a map function, so what must I do?
>
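[Editor's note on the name clash quoted above: after type erasure both methods have the raw signature map(ImmutableBytesWritable, RowResult, OutputCollector, Reporter), so Java lets neither override the other nor both coexist. When the map output types genuinely need to differ from the parent's, one way out, a sketch rather than the thread's confirmed solution, is to implement the mapper interface with the desired type parameters instead of subclassing IdentityTableMap; in the 0.20-era API that interface would be org.apache.hadoop.hbase.mapred.TableMap. Simplified stand-in types are used below so the example is self-contained:]

```java
// Stand-in for the TableMap<K, V> idea: a table mapper whose output key/value
// types are chosen by the implementor, not fixed by a parent class.
interface Emitter<K, V> { void emit(K key, V value); }

interface TableMapper<K, V> {
    void map(byte[] row, byte[] value, Emitter<K, V> output);
}

// Choosing text-like (String) outputs is now just a type-parameter choice,
// so there is no inherited map method to clash with.
class TextScanMap implements TableMapper<String, String> {
    @Override
    public void map(byte[] row, byte[] value, Emitter<String, String> output) {
        output.emit(new String(row), new String(value));
    }
}
```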
> On Thu, Jul 23, 2009 at 10:26 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>
>> Think I found your problem: is this a typo?
>>
>>  public void mapp(ImmutableBytesWritable row, RowResult value,
>> OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>>
>> It should read map, not mapp.
>>
>> J-D
>>
>> On Thu, Jul 23, 2009 at 12:42 PM, bharath vissapragada <bharathvissapragada1990@gmail.com> wrote:
>> > I have tried Apache Commons Logging ...
>> >
>> > Instead of printing the row, I have written log.error(row), and
>> > even then I got the same output, as follows:
>> >
>> > 09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
>> > 09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
>> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>> > 09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
>> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>> > 09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
>> > 09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
>> > 09/07/24 03:41:40 INFO mapred.MapTask: data buffer = 79691776/99614720
>> > 09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
>> > 09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
>> > 09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> > 09/07/24 03:41:40 INFO mapred.Merger: Merging 1 sorted segments
>> > 09/07/24 03:41:40 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 333 bytes
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner: reduce > reduce
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
>> > 09/07/24 03:41:40 INFO mapred.JobClient: Job complete: job_local_0001
>> > 09/07/24 03:41:40 INFO mapred.JobClient: Counters: 11
>> > 09/07/24 03:41:40 INFO mapred.JobClient:   File Systems
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes read=38933
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes written=78346
>> > 09/07/24 03:41:40 INFO mapred.JobClient:   Map-Reduce Framework
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input groups=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Combine output records=0
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map input records=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce output records=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map output bytes=315
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map input bytes=0
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Combine input records=0
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map output records=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input records=8
>> >
>> >
>> > On Thu, Jul 23, 2009 at 9:32 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >
>> >> And you don't need any more config to run local MR jobs on HBase. But
>> >> you do need Hadoop when running MR jobs on HBase on a cluster.
>> >>
>> >> Also, your code is running fine, as you could see; the real question is
>> >> where the stdout goes when in local mode. When you ran your other
>> >> MR jobs, it was on a working Hadoop setup, right? So you were looking
>> >> at the logs in the web UI? One simple thing to do is to do your
>> >> debugging with a logger, so you are sure to see your output, as I
>> >> already proposed. Another simple thing is to get a pseudo-distributed
>> >> setup, run your HBase MR jobs with Hadoop, and get your logs like I'm
>> >> sure you did before.
>> >>
>> >> J-D
>> >>
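[Editor's sketch for the pseudo-distributed setup suggested above: a minimal hadoop-site.xml fragment. The property names are the Hadoop 0.19-era ones; the host and ports are placeholders to adjust for the local machine.]

```xml
<!-- hadoop-site.xml: minimal pseudo-distributed sketch (0.19-era names) -->
<configuration>
  <property>
    <name>fs.default.name</name>          <!-- NameNode address -->
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>       <!-- JobTracker address -->
    <value>localhost:9001</value>
  </property>
</configuration>
```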
>> >> On Thu, Jul 23, 2009 at 11:54 AM, bharath vissapragada <bharathvissapragada1990@gmail.com> wrote:
>> >> > I am really thankful to you, J-D, for replying to me in spite of your
>> >> > busy schedule. I am still at a learning stage and there are no good
>> >> > guides on HBase other than its own one, so please bear with me; I
>> >> > really appreciate your help.
>> >> >
>> >> > Now I got your point that there is no need of Hadoop while running
>> >> > HBase MR programs, but I am confused about the config. I have only set
>> >> > JAVA_HOME in "hbase-env.sh" and other than that I didn't do anything,
>> >> > so I wonder if my conf was wrong or there is some error in that simple
>> >> > code, because stdout worked for me while writing MapReduce programs.
>> >> >
>> >> > Thanks once again!
>> >> >
>> >> > On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >> >
>> >> >> The code itself is very simple; I was referring to your own
>> >> >> description of your situation. You say you use standalone HBase, yet
>> >> >> you talk about Hadoop configuration. You also talk about the
>> >> >> JobTracker web UI, which is of no use since you run local jobs
>> >> >> directly on HBase.
>> >> >>
>> >> >> J-D
>> >> >>
>> >> >> On Thu, Jul 23, 2009 at 11:41 AM, bharath vissapragada <bharathvissapragada1990@gmail.com> wrote:
>> >> >> > I used stdout for debugging while writing code in Hadoop MR
>> >> >> > programs, and it worked fine.
>> >> >> > Can you please tell me which part of the code you found confusing,
>> >> >> > so that I can explain it a bit more clearly?
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >> >> >
>> >> >> >> What you wrote is a bit confusing to me, sorry.
>> >> >> >>
>> >> >> >> The usual way to debug MR jobs is to define a logger and log at
>> >> >> >> either info or debug level, not sysout like you did. I'm not even
>> >> >> >> sure where the standard output is logged when using a local job.
>> >> >> >> Also, since this is local, you won't see anything in your
>> >> >> >> host:50030 web UI. So use Apache Commons Logging and you should
>> >> >> >> see your output.
>> >> >> >>
>> >> >> >> J-D
>> >> >> >>
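[Editor's sketch of the logger advice above. The reply suggests Apache Commons Logging; the example below uses the JDK's built-in java.util.logging instead so it runs without extra jars, but the pattern (one static logger per class, log at a chosen level rather than System.out) is the same.]

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// One logger per class instead of System.out: in a real MR task this output
// goes through the task's logging setup, which is why it stays visible in
// the task logs even when stdout is swallowed.
class ScanDebug {
    private static final Logger LOG = Logger.getLogger(ScanDebug.class.getName());

    static void logRow(String row) {
        LOG.log(Level.INFO, "map saw row: {0}", row);
    }
}
```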
>> >> >> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath vissapragada <bharathvissapragada1990@gmail.com> wrote:
>> >> >> >> > Thanks for your reply, J-D. I'm doing it from the command line,
>> >> >> >> > and I'm pasting some part of the code here:
>> >> >> >> >
>> >> >> >> >  public void mapp(ImmutableBytesWritable row, RowResult value,
>> >> >> >> > OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>> >> >> >> >                System.out.println(row);
>> >> >> >> > }
>> >> >> >> >
>> >> >> >> > public JobConf createSubmittableJob(String[] args) throws IOException {
>> >> >> >> >                JobConf c = new JobConf(getConf(), MR_DS_Scan_Case1.class);
>> >> >> >> >                c.set("col.name", args[1]);
>> >> >> >> >                c.set("operator.name", args[2]);
>> >> >> >> >                c.set("val.name", args[3]);
>> >> >> >> >                IdentityTableMap.initJob(args[0], args[1], this.getClass(), c);
>> >> >> >> >                c.setOutputFormat(NullOutputFormat.class);
>> >> >> >> >                return c;
>> >> >> >> > }
>> >> >> >> >
>> >> >> >> > As you can see, I'm just printing the value of row in the map,
>> >> >> >> > but I can't see it in the terminal.
>> >> >> >> > I only want the map phase, so I didn't write any reduce phase.
>> >> >> >> > Is my JobConf correct?
>> >> >> >> >
>> >> >> >> > Also, as I have already asked: how do I check the job logs and
>> >> >> >> > the web interface like "localhost:<port>/jobTracker.jsp", since
>> >> >> >> > I'm running in local mode?
>> >> >> >> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >> >> >> >
>> >> >> >> >> What output do you need exactly? I see that you have 8 output
>> >> >> >> >> records in your reduce task, so if you take a look in your
>> >> >> >> >> output folder or table (I don't know which sink you used) you
>> >> >> >> >> should see them.
>> >> >> >> >>
>> >> >> >> >> Also, did you run your MR inside Eclipse or on the command line?
>> >> >> >> >>
>> >> >> >> >> Thx,
>> >> >> >> >>
>> >> >> >> >> J-D
>> >> >> >> >>
>> >> >> >> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath vissapragada <bharat_v@students.iiit.ac.in> wrote:
>> >> >> >> >> > This is the output I got; everything seems fine, but no output!
>> >> >> >> >> >
>> >> >> >> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
>> >> >> >> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
>> >> >> >> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split: 0->localhost.localdomain:,
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 333 bytes
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <bharat_v@students.iiit.ac.in> wrote:
>> >> >> >> >> >
>> >> >> >> >> >> Since I haven't started the cluster, I can't even see the
>> >> >> >> >> >> details in "localhost:<port>/jobTracker.jsp". I didn't even
>> >> >> >> >> >> add anything to hadoop/conf/hadoop-site.xml.
>> >> >> >> >> >>
>> >> >> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <bharat_v@students.iiit.ac.in> wrote:
>> >> >> >> >> >>
>> >> >> >> >> >>> Hi all,
>> >> >> >> >> >>>
>> >> >> >> >> >>> I wanted to run HBase in standalone mode to check my HBase
>> >> >> >> >> >>> MR programs. I have downloaded a built version of hbase-0.20
>> >> >> >> >> >>> and I have hadoop 0.19.3.
>> >> >> >> >> >>>
>> >> >> >> >> >>> I have set JAVA_HOME in both of them; then I started HBase
>> >> >> >> >> >>> and inserted some tables using the Java API. Now I have
>> >> >> >> >> >>> written some MR programs on HBase, and when I run them they
>> >> >> >> >> >>> run perfectly without any errors and all the map-reduce
>> >> >> >> >> >>> statistics are displayed correctly, but I get no output.
>> >> >> >> >> >>>
>> >> >> >> >> >>> I have one doubt now: how does HBase recognize Hadoop in
>> >> >> >> >> >>> standalone mode (I haven't even started my Hadoop)? Even
>> >> >> >> >> >>> simple print statements do not work; no output is displayed
>> >> >> >> >> >>> on the screen. I doubt my config.
>> >> >> >> >> >>>
>> >> >> >> >> >>> Do I need to add some config to run them? Please reply.
>> >> >> >> >> >>>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>
