hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Hbase and Hadoop Config to run in Standalone mode
Date Thu, 23 Jul 2009 16:56:29 GMT
I think I found your problem; is this a typo?

 public void mapp(ImmutableBytesWritable row, RowResult value,
 OutputCollector<Text, Text> output, Reporter reporter) throws IOException {

It should read map, not mapp.

J-D
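[For reference, this is exactly the kind of slip the @Override annotation catches at compile time: a misspelled override declares a brand-new method, and the framework silently keeps calling the base class's identity map. A minimal, dependency-free sketch; the class names here are illustrative, not from the thread's code:]

```java
// A base class with a default "identity" hook, analogous to the map()
// that IdentityTableMap provides.
class BaseMapper {
    public String map(String row) {
        return row; // identity behavior: pass the row through untouched
    }
}

// Typo: "mapp" declares a NEW method instead of overriding map().
// Annotating it with @Override would make javac reject the misspelling.
class TypoMapper extends BaseMapper {
    public String mapp(String row) {
        return "processed:" + row;
    }
}

public class OverrideTypo {
    public static void main(String[] args) {
        BaseMapper m = new TypoMapper();
        // The framework only ever calls map(), so mapp() never runs:
        System.out.println(m.map("r1")); // prints "r1", not "processed:r1"
    }
}
```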

On Thu, Jul 23, 2009 at 12:42 PM, bharath
vissapragada<bharathvissapragada1990@gmail.com> wrote:
> I have tried Apache commons-logging.
>
> Instead of printing the row, I have written log.error(row), and even
> then I got the same output as follows:
>
> 09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> processName=JobTracker, sessionId=
> 09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set.  User classes
> may not be found. See JobConf(Class) or JobConf#setJar(String).
> 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
> 0->localhost.localdomain:,
> 09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
> 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
> 0->localhost.localdomain:,
> 09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
> 09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
> 09/07/24 03:41:40 INFO mapred.MapTask: data buffer = 79691776/99614720
> 09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
> 09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
> 09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0
> is done. And is in the process of commiting
> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
> 'attempt_local_0001_m_000000_0' done.
> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
> 09/07/24 03:41:40 INFO mapred.Merger: Merging 1 sorted segments
> 09/07/24 03:41:40 INFO mapred.Merger: Down to the last merge-pass, with 1
> segments left of total size: 333 bytes
> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0
> is done. And is in the process of commiting
> 09/07/24 03:41:40 INFO mapred.LocalJobRunner: reduce > reduce
> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
> 'attempt_local_0001_r_000000_0' done.
> 09/07/24 03:41:40 INFO mapred.JobClient: Job complete: job_local_0001
> 09/07/24 03:41:40 INFO mapred.JobClient: Counters: 11
> 09/07/24 03:41:40 INFO mapred.JobClient:   File Systems
> 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes read=38933
> 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes written=78346
> 09/07/24 03:41:40 INFO mapred.JobClient:   Map-Reduce Framework
> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input groups=8
> 09/07/24 03:41:40 INFO mapred.JobClient:     Combine output records=0
> 09/07/24 03:41:40 INFO mapred.JobClient:     Map input records=8
> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce output records=8
> 09/07/24 03:41:40 INFO mapred.JobClient:     Map output bytes=315
> 09/07/24 03:41:40 INFO mapred.JobClient:     Map input bytes=0
> 09/07/24 03:41:40 INFO mapred.JobClient:     Combine input records=0
> 09/07/24 03:41:40 INFO mapred.JobClient:     Map output records=8
> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input records=8
>
>
> On Thu, Jul 23, 2009 at 9:32 PM, Jean-Daniel Cryans <jdcryans@apache.org>wrote:
>
>> And you don't need any more config to run local MR jobs on HBase. But
>> you do need Hadoop when running MR jobs on HBase on a cluster.
>>
>> Also, your code is running fine, as you could see; the real question is
>> where stdout goes in local mode. When you ran your other MR jobs, it
>> was on a working Hadoop setup, right? So you were looking at the logs
>> in the web UI? One simple thing is to do your debugging with a logger,
>> as I already proposed, so you are sure to see your output. Another is
>> to get a pseudo-distributed setup, run your HBase MR jobs with Hadoop,
>> and get your logs like I'm sure you did before.
>>
>> J-D
>>
>> On Thu, Jul 23, 2009 at 11:54 AM, bharath
>> vissapragada<bharathvissapragada1990@gmail.com> wrote:
>> > I am really thankful to you, J-D, for replying in spite of your busy
>> > schedule. I am still in a learning stage and there are no good guides
>> > on HBase other than its own one, so please bear with me; I really
>> > appreciate your help.
>> >
>> > Now I got your point that there is no need of Hadoop while running
>> > HBase MR programs, but I am confused about the config. I have only
>> > set JAVA_HOME in "hbase-env.sh" and other than that I didn't do
>> > anything, so I wonder if my conf was wrong or if there is some error
>> > in that simple code, because stdout worked for me while writing
>> > MapReduce programs.
>> >
>> > Thanks once again!
>> >
>> > On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <jdcryans@apache.org
>> >wrote:
>> >
>> >> The code itself is very simple; I was referring to your own
>> >> description of your situation. You say you use standalone HBase, yet
>> >> you talk about Hadoop configuration. You also talk about the
>> >> JobTracker web UI, which is of no use since you run local jobs
>> >> directly on HBase.
>> >>
>> >> J-D
>> >>
>> >> On Thu, Jul 23, 2009 at 11:41 AM, bharath
>> >> vissapragada<bharathvissapragada1990@gmail.com> wrote:
>> >> > I used stdout for debugging while writing Hadoop MR programs and it
>> >> > worked fine.
>> >> > Can you please tell me which part of the code you found confusing,
>> >> > so that I can explain it a bit more clearly?
>> >> >
>> >> >
>> >> >
>> >> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans
>> >> > <jdcryans@apache.org> wrote:
>> >> >
>> >> >> What you wrote is a bit confusing to me, sorry.
>> >> >>
>> >> >> The usual way to debug MR jobs is to define a logger and post with
>> >> >> either info or debug level, not sysout like you did. I'm not even
>> >> >> sure where the standard output is logged when using a local job.
>> >> >> Also, since this is local, you won't see anything in your
>> >> >> host:50030 web UI. So use Apache commons-logging and you should
>> >> >> see your output.
>> >> >>
>> >> >> J-D
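[The thread's suggestion is commons-logging; as a dependency-free sketch of the same pattern (log from the mapper instead of relying on stdout), here is a stand-in using the JDK's built-in java.util.logging. The class name is hypothetical:]

```java
import java.util.logging.Logger;

// Sketch of the logging advice above: emit map-side debug output through
// a logger rather than System.out, so it reaches the configured log
// handlers even when a local job swallows stdout. Uses java.util.logging
// as a stand-in for commons-logging; the class name is illustrative.
public class LoggingMapper {
    private static final Logger LOG =
            Logger.getLogger(LoggingMapper.class.getName());

    // Returns the message it logs, so the behavior is easy to verify.
    public String map(String row) {
        String msg = "row: " + row;
        LOG.info(msg); // goes to the console handler by default
        return msg;
    }
}
```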
>> >> >>
>> >> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath
>> >> >> vissapragada<bharathvissapragada1990@gmail.com> wrote:
>> >> >> > Thanks for your reply, J-D. I'm doing it from the command line;
>> >> >> > I'm pasting some part of the code here:
>> >> >> >
>> >> >> >  public void mapp(ImmutableBytesWritable row, RowResult value,
>> >> >> >          OutputCollector<Text, Text> output, Reporter reporter)
>> >> >> >          throws IOException {
>> >> >> >      System.out.println(row);
>> >> >> >  }
>> >> >> >
>> >> >> >  public JobConf createSubmittableJob(String[] args) throws IOException {
>> >> >> >      JobConf c = new JobConf(getConf(), MR_DS_Scan_Case1.class);
>> >> >> >      c.set("col.name", args[1]);
>> >> >> >      c.set("operator.name", args[2]);
>> >> >> >      c.set("val.name", args[3]);
>> >> >> >      IdentityTableMap.initJob(args[0], args[1], this.getClass(), c);
>> >> >> >      c.setOutputFormat(NullOutputFormat.class);
>> >> >> >      return c;
>> >> >> >  }
>> >> >> >
>> >> >> > As you can see, I'm just printing the value of row in the map,
>> >> >> > but I can't see it in the terminal.
>> >> >> > I only want the map phase, so I didn't write any reduce phase. Is
>> >> >> > my JobConf correct?
>> >> >> >
>> >> >> > Also, as I have already asked, how do I check the job logs and
>> >> >> > the web interface like "localhost:<port>/jobTracker.jsp", since
>> >> >> > I'm running in local mode?
>> >> >> >
>> >> >> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans
>> >> >> > <jdcryans@apache.org> wrote:
>> >> >> >
>> >> >> >> What output do you need exactly? I see that you have 8 output
>> >> >> >> records in your reduce task, so if you take a look in your
>> >> >> >> output folder or table (I don't know which sink you used) you
>> >> >> >> should see them.
>> >> >> >>
>> >> >> >> Also, did you run your MR inside Eclipse or on the command line?
>> >> >> >>
>> >> >> >> Thx,
>> >> >> >>
>> >> >> >> J-D
>> >> >> >>
>> >> >> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath
>> >> >> >> vissapragada<bharat_v@students.iiit.ac.in> wrote:
>> >> >> >> > This is the output I got. It seems everything is fine, but
>> >> >> >> > there is no output!
>> >> >> >> >
>> >> >> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>> >> >> >> > processName=JobTracker, sessionId=
>> >> >> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User classes
>> >> >> >> > may not be found. See JobConf(Class) or JobConf#setJar(String).
>> >> >> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
>> >> >> >> > 0->localhost.localdomain:,
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
>> >> >> >> > 0->localhost.localdomain:,
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0
>> >> >> >> > is done. And is in the process of commiting
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> >> >> >> > 'attempt_local_0001_m_000000_0' done.
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass, with 1
>> >> >> >> > segments left of total size: 333 bytes
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0
>> >> >> >> > is done. And is in the process of commiting
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> >> >> >> > 'attempt_local_0001_r_000000_0' done.
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada
>> >> >> >> > <bharat_v@students.iiit.ac.in> wrote:
>> >> >> >> >
>> >> >> >> >> Since I haven't started the cluster, I can't even see the
>> >> >> >> >> details in "localhost:<port>/jobTracker.jsp". I didn't even
>> >> >> >> >> add anything to hadoop/conf/hadoop-site.xml.
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada
>> >> >> >> >> <bharat_v@students.iiit.ac.in> wrote:
>> >> >> >> >>
>> >> >> >> >>> Hi all,
>> >> >> >> >>>
>> >> >> >> >>> I wanted to run HBase in standalone mode to check my HBase
>> >> >> >> >>> MR programs. I have downloaded a built version of hbase-0.20
>> >> >> >> >>> and I have hadoop 0.19.3.
>> >> >> >> >>>
>> >> >> >> >>> I have set JAVA_HOME in both of them, then I started HBase
>> >> >> >> >>> and inserted some tables using the Java API. Now I have
>> >> >> >> >>> written some MR programs on HBase, and when I run them they
>> >> >> >> >>> run perfectly without any errors and all the map-reduce
>> >> >> >> >>> statistics are displayed correctly, but I get no output.
>> >> >> >> >>>
>> >> >> >> >>> I have one doubt now: how does HBase recognize Hadoop in
>> >> >> >> >>> standalone mode (I haven't even started my Hadoop)? Even
>> >> >> >> >>> simple print statements don't work; no output is displayed
>> >> >> >> >>> on the screen. I doubt my config.
>> >> >> >> >>>
>> >> >> >> >>> Do I need to add some config to run them? Please reply.
>> >> >> >> >>>
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>
