hbase-user mailing list archives

From Stuti Awasthi <stutiawas...@hcl.com>
Subject RE: Facing Issues with RowCounter
Date Fri, 18 Nov 2011 07:07:22 GMT
Hi JD,
I have applied the patch and tested it; it's working fine now. :) Thanks

-----Original Message-----
From: Stuti Awasthi 
Sent: Friday, November 18, 2011 11:27 AM
To: user@hbase.apache.org
Subject: RE: Facing Issues with RowCounter

Ok. 
Thanks for the update. I'll check the patch; otherwise I can write my own MR job for the row count.

Cheers
Stuti

-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: Friday, November 18, 2011 3:37 AM
To: user@hbase.apache.org
Subject: Re: Facing Issues with RowCounter

Ah! Took me a moment to figure it out, it's:

https://issues.apache.org/jira/browse/HBASE-4295 "rowcounter does not return the correct number
of rows in certain circumstances"

What made me think about it is that your counters do say that rows were taken into input,
but none counted because the values are empty.
That was the problem in 4295.

The patch is currently only in the tip of the 0.90 branch, so unless you patch it yourself
you'll have to wait for 0.90.5 (which may or may not get released, depends if someone wants
to do it).

J-D
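
[Editor's sketch] The behaviour J-D describes can be modelled without HBase at all. The class below is a simplified stand-in, not the RowCounter source and not the HBASE-4295 patch: the class name, the Map-based row model, and both counting methods are hypothetical. It assumes the counter is incremented only when a row has at least one non-empty cell value, and compares that with counting any row that has at least one cell. The sample data mirrors the 'Set' family from the scan quoted below, where every value is empty.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowCountSketch {

    // Hypothetical stand-in for one HBase row: column qualifier -> cell value.
    // Every value is an empty byte array, as in the quoted scan output.
    public static Map<String, byte[]> row(String... qualifiers) {
        Map<String, byte[]> cells = new LinkedHashMap<>();
        for (String qualifier : qualifiers) {
            cells.put(qualifier, new byte[0]);
        }
        return cells;
    }

    // The seven rows of 'Keyword' restricted to the 'Set' family.
    public static List<Map<String, byte[]>> sampleRows() {
        List<Map<String, byte[]>> rows = new ArrayList<>();
        rows.add(row("Fuse", "Hadoop", "Hive", "MySql", "PHP")); // Apache
        rows.add(row("Apache", "Hdfs"));                         // Fuse
        rows.add(row("Apache", "Hive"));                         // Hadoop
        rows.add(row("Fuse"));                                   // Hdfs
        rows.add(row("Apache", "Hadoop"));                       // Hive
        rows.add(row("Apache", "PHP"));                          // MySql
        rows.add(row("Apache", "MySql"));                        // PHP
        return rows;
    }

    // Assumed buggy behaviour: a row counts only if some cell value is non-empty.
    public static long countRowsWithNonEmptyValue(List<Map<String, byte[]>> rows) {
        long count = 0;
        for (Map<String, byte[]> cells : rows) {
            for (byte[] value : cells.values()) {
                if (value.length > 0) {
                    count++;
                    break; // one non-empty value is enough for this row
                }
            }
        }
        return count;
    }

    // One possible alternative: a row counts as soon as it has any cell.
    public static long countRowsWithAnyCell(List<Map<String, byte[]>> rows) {
        long count = 0;
        for (Map<String, byte[]> cells : rows) {
            if (!cells.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Map<String, byte[]>> rows = sampleRows();
        System.out.println("value-based count: " + countRowsWithNonEmptyValue(rows)); // 0
        System.out.println("cell-based count:  " + countRowsWithAnyCell(rows));       // 7
    }
}
```

Under these assumptions the value-based count is 0 even though 7 rows are read, which matches "Map input records=7" with no ROWS counter in the second job output below.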

On Wed, Nov 16, 2011 at 9:27 PM, Stuti Awasthi <stutiawasthi@hcl.com> wrote:
> Hi JD,
>
> Table 'Keyword' contains the 'Set' column family with 7 rows. Here is the output of the scan:
>
> hbase(main):001:0> scan 'Keyword',{COLUMNS=>['Set']}
> ROW      COLUMN+CELL
>  Apache  column=Set:Fuse, timestamp=1321506922206, value=
>  Apache  column=Set:Hadoop, timestamp=1321506922206, value=
>  Apache  column=Set:Hive, timestamp=1321506922206, value=
>  Apache  column=Set:MySql, timestamp=1321506922206, value=
>  Apache  column=Set:PHP, timestamp=1321506922206, value=
>  Fuse    column=Set:Apache, timestamp=1321506922206, value=
>  Fuse    column=Set:Hdfs, timestamp=1321506922209, value=
>  Hadoop  column=Set:Apache, timestamp=1321506922209, value=
>  Hadoop  column=Set:Hive, timestamp=1321506922212, value=
>  Hdfs    column=Set:Fuse, timestamp=1321506922212, value=
>  Hive    column=Set:Apache, timestamp=1321506922212, value=
>  Hive    column=Set:Hadoop, timestamp=1321506922214, value=
>  MySql   column=Set:Apache, timestamp=1321506922214, value=
>  MySql   column=Set:PHP, timestamp=1321506922216, value=
>  PHP     column=Set:Apache, timestamp=1321506922216, value=
>  PHP     column=Set:MySql, timestamp=1321506922218, value=
> 7 row(s) in 0.4120 seconds
>
> The RowCounter MR job does not report these rows.
>
> -----Original Message-----
> From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
> Sent: Wednesday, November 16, 2011 11:09 PM
> To: user@hbase.apache.org
> Subject: Re: Facing Issues with RowCounter
>
> What I can decipher from those outputs is that you have a total of 7 rows, and none of them have data in the "Set" column family. Is that the case or not? Without more info from you, it's hard to tell.
>
> J-D
>
> On Tue, Nov 15, 2011 at 11:41 PM, Stuti Awasthi <stutiawasthi@hcl.com> wrote:
>> Hi,
>> I tried to use the MR RowCounter to count the rows of a table with a specific column family, but it is not displaying the correct result.
>>
>> Command (only table name as argument): Hbase/hbase-0.90.3/bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter Keyword
>>
>> Output :
>> 11/11/16 13:04:31 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
>> 11/11/16 13:04:32 INFO mapred.JobClient:  map 100% reduce 0%
>> 11/11/16 13:04:32 INFO mapred.JobClient: Job complete: job_local_0001
>> 11/11/16 13:04:32 INFO mapred.JobClient: Counters: 6
>> 11/11/16 13:04:32 INFO mapred.JobClient:   org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
>> 11/11/16 13:04:32 INFO mapred.JobClient:     ROWS=7
>> 11/11/16 13:04:32 INFO mapred.JobClient:   FileSystemCounters
>> 11/11/16 13:04:32 INFO mapred.JobClient:     FILE_BYTES_READ=2373099
>> 11/11/16 13:04:32 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2411923
>> 11/11/16 13:04:32 INFO mapred.JobClient:   Map-Reduce Framework
>> 11/11/16 13:04:32 INFO mapred.JobClient:     Map input records=7
>> 11/11/16 13:04:32 INFO mapred.JobClient:     Spilled Records=0
>> 11/11/16 13:04:32 INFO mapred.JobClient:     Map output records=0
>>
>> Command (TableName, ColumnFamily): Hbase/hbase-0.90.3/bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter Keyword Set
>>
>> Output :
>> 11/11/16 13:05:33 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
>> 11/11/16 13:05:34 INFO mapred.JobClient:  map 100% reduce 0%
>> 11/11/16 13:05:34 INFO mapred.JobClient: Job complete: job_local_0001
>> 11/11/16 13:05:34 INFO mapred.JobClient: Counters: 5
>> 11/11/16 13:05:34 INFO mapred.JobClient:   FileSystemCounters
>> 11/11/16 13:05:34 INFO mapred.JobClient:     FILE_BYTES_READ=2373107
>> 11/11/16 13:05:34 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2411939
>> 11/11/16 13:05:34 INFO mapred.JobClient:   Map-Reduce Framework
>> 11/11/16 13:05:34 INFO mapred.JobClient:     Map input records=7
>> 11/11/16 13:05:34 INFO mapred.JobClient:     Spilled Records=0
>> 11/11/16 13:05:34 INFO mapred.JobClient:     Map output records=0
>>
>> Table Describe command Output is :
>> TABLE => {{NAME => 'Keyword', FAMILIES => [{NAME => 'Info', 
>> BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 
>> 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', 
>> IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'Set', 
>> BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 
>> 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', 
>> IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
>>
>> Am I running it the wrong way, or is this a bug?
>>
>> Regards,
>> Stuti Awasthi
>> HCL Comnet Systems and Services Ltd
>> F-8/9 Basement, Sec-3,Noida.
>>
>
