hbase-user mailing list archives

From Daniel Washusen <...@reactive.org>
Subject Re: Support for MultiGet / SQL In clause -- error in patch HBASE-1845
Date Sun, 24 Jan 2010 08:29:58 GMT
Sounds like it's some sort of reporting system. Have you considered
duplicating data into reporting tables?

Write all the game details into the main table then map reduce into
your reporting tables...
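
As a rough illustration of the idea (a sketch only, not HBase client code): the sorted maps below stand in for HBase tables, and a plain loop stands in for the map reduce job. The class name, the "gameid/rowkey" key layout and the sample rows are all assumptions made up for this example. The point is that once the rows are re-keyed by gameid, "give me details for a particular gameid" becomes a short range scan on the row key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch: TreeMaps stand in for HBase tables (both keep rows sorted by key).
class ReportingTableSketch {

    // Stand-in for the map reduce step: copy every main-table row into a
    // reporting table whose key starts with the gameid (assumed layout: gameid/user|date).
    static TreeMap<String, Map<String, String>> buildReportingTable(
            SortedMap<String, Map<String, String>> mainTable) {
        TreeMap<String, Map<String, String>> reporting = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> row : mainTable.entrySet()) {
            String gameId = row.getValue().get("gameid");
            if (gameId == null) {
                continue; // rows without a gameid are not reported on
            }
            reporting.put(gameId + "/" + row.getKey(), row.getValue());
        }
        return reporting;
    }

    // Range scan over the reporting table: all keys that start with "<gameid>/".
    // '0' is the character immediately after '/', so gameId + "0" is an exclusive stop key.
    static List<String> rowKeysForGame(
            TreeMap<String, Map<String, String>> reporting, String gameId) {
        return new ArrayList<>(reporting.subMap(gameId + "/", gameId + "0").keySet());
    }

    public static void main(String[] args) {
        // Main table keyed by user + date, as described in the thread (sample data is invented).
        TreeMap<String, Map<String, String>> mainTable = new TreeMap<>();
        mainTable.put("alice|20100120", Map.of("gameid", "g42", "opponent", "bob"));
        mainTable.put("alice|20100121", Map.of("gameid", "g43", "opponent", "carol"));
        mainTable.put("bob|20100120", Map.of("gameid", "g42", "opponent", "alice"));

        TreeMap<String, Map<String, String>> reporting = buildReportingTable(mainTable);
        System.out.println(rowKeysForGame(reporting, "g42"));
    }
}
```

The trade-off is that the data is stored twice and the reporting table is only as fresh as the last run of the job.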

On 24/01/2010, at 7:07 PM, "Sriram Muthuswamy Chittathoor"
<sriramc@ivycomptech.com> wrote:

> However, I'd only recommend using secondary index as a last resort.
> First I'd try doing everything I can to work with the index I get for
> free. The row key.  It sounds like you have done this already...
> --
>
> The only reason this is important to me is the following:
>
> 1.  I am storing at a minimum 1 year's worth of data (small rows,
> 10 billion of them)
>
> 2.  Row key is user + date (columns: gameid, opponent, etc.)
>
> 3.  Queries may be something like "give me the details for a
> particular gameid"
>
> 4.  To do step 3 I am assuming I need something like a secondary
> index; otherwise, given my row key, how else can I do it?
>
>
>
> -----Original Message-----
> From: Daniel Washusen [mailto:dan@reactive.org]
> Sent: Sunday, January 24, 2010 3:16 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
> HBASE-1845
>
> Well, it CAN be a RAM hog ;-). It depends what you're indexing. Each
> unique value in the indexed column resides in memory. If you index a
> column that contains 1 million random 1KB values then the index will
> require at least 1GB of memory. Also it *can* slow down writes,
> especially when bulk loading sequential keys.
>
> On the up side, it can make scans dramatically faster.
>
> However, I'd only recommend using secondary index as a last resort.
> First I'd try doing everything I can to work with the index I get for
> free. The row key.  It sounds like you have done this already...
>
> Cheers,
> Dan
>
> On 24/01/2010, at 7:02 AM, Stack <stack@duboce.net> wrote:
>
>> On Sat, Jan 23, 2010 at 2:52 AM, Sriram Muthuswamy Chittathoor
>> <sriramc@ivycomptech.com> wrote:
>>> Thanks all.  I messed it up when I was trying to upgrade to
>>> 0.20.3.  I deleted the data directory and formatted it thinking it
>>> will reset the whole cluster.
>>>
>>> I started fresh by deleting the data directory on all the nodes and
>>> then everything worked.  I was also able to create the indexed
>>> table using the 0.20.3 patch.  Let me run some tests on a few
>>> million rows and see how it holds up.
>>>
>>> BTW -- what is the right way to move between versions?  Do I
>>> run migrate scripts to migrate the data to newer versions?
>>>
>> Just install the new binaries everywhere and restart, or perform a
>> rolling restart -- see http://wiki.apache.org/hadoop/Hbase/RollingRestart --
>> if you want to avoid taking down your cluster during the upgrade.
>>
>> You'll be flagged on start if you need to run a migration, but the
>> general rule is that there should never be a need for a migration
>> between patch releases, e.g. from 0.20.2 to 0.20.3.  There may be a
>> need for migrations when moving between minor versions, e.g. from
>> 0.19 to 0.20.
>>
>> Let us know how IHBase (indexed hbase) works out for you.  It's a RAM
>> hog but the speed improvement finding matching cells can be startling.
>>
>> St.Ack
>>
>>> -----Original Message-----
>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
>>> Stack
>>> Sent: Saturday, January 23, 2010 5:00 AM
>>> To: hbase-user@hadoop.apache.org
>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>> HBASE-1845
>>>
>>> Check your master log.  Something is seriously off if you do not have
>>> a reachable .META. table.
>>> St.Ack
>>>
>>> On Fri, Jan 22, 2010 at 1:09 PM, Sriram Muthuswamy Chittathoor
>>> <sriramc@ivycomptech.com> wrote:
>>>> I applied the hbase-0.20.3 version / hadoop 0.20.1.  But after starting
>>>> hbase I keep getting the error below when I go to the hbase shell
>>>>
>>>> [ppoker@karisimbivir1 hbase-0.20.3]$ ./bin/hbase shell
>>>> HBase Shell; enter 'help<RETURN>' for list of supported commands.
>>>> Version: 0.20.3, r900041, Sat Jan 16 17:20:21 PST 2010
>>>> hbase(main):001:0> list
>>>> NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server null for region , row '', but failed after 7 attempts.
>>>> Exceptions:
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>> org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>
>>>>
>>>>
>>>> Also when I try to create a table programmatically I get this --
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Attempting connection to server localhost/127.0.0.1:2181
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:43775 remote=localhost/127.0.0.1:2181]
>>>> 10/01/22 15:48:23 INFO zookeeper.ClientCnxn: Server connection successful
>>>> Exception in thread "main" org.apache.hadoop.hbase.TableNotFoundException: .META.
>>>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:684)
>>>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:634)
>>>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601)
>>>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:675)
>>>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:638)
>>>>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:601)
>>>>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:128)
>>>>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:106)
>>>>       at test.CreateTable.main(CreateTable.java:36)
>>>>
>>>>
>>>>
>>>> Any clues ?
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Dan Washusen [mailto:dan@reactive.org]
>>>> Sent: Friday, January 22, 2010 4:53 AM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in patch
>>>> HBASE-1845
>>>>
>>>> If you want to give the "indexed" contrib package a try you'll need to
>>>> do the following:
>>>>
>>>>  1. Include the contrib jars (export HBASE_CLASSPATH=`find /path/to/hbase/hbase-0.20.3/contrib/indexed -name '*jar' | tr -s "\n" ":"`)
>>>>  2. Set the 'hbase.hregion.impl' property to
>>>>  'org.apache.hadoop.hbase.regionserver.IdxRegion' in your hbase-site.xml
>>>>
>>>> Once you've done that you can create a table with an index using:
>>>>
>>>>>    // define which qualifiers need an index (choosing the correct type)
>>>>>    IdxColumnDescriptor columnDescriptor = new IdxColumnDescriptor("columnFamily");
>>>>>    columnDescriptor.addIndexDescriptor(
>>>>>      new IdxIndexDescriptor("qualifier", IdxQualifierType.BYTE_ARRAY)
>>>>>    );
>>>>>
>>>>>    HTableDescriptor tableDescriptor = new HTableDescriptor("table");
>>>>>    tableDescriptor.addFamily(columnDescriptor);
>>>>>
>>>>
>>>> Then when you want to perform a scan with an index hint:
>>>>
>>>>>    Scan scan = new IdxScan(
>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>      );
>>>>>
>>>>
>>>> You have to keep in mind that the index hint is only a hint.  It
>>>> guarantees that your scan will get all rows that match the hint, but
>>>> you'll more than likely receive rows that don't.  For this reason I'd
>>>> suggest that you also include a filter along with the scan:
>>>>
>>>>>      Scan scan = new IdxScan(
>>>>>          new Comparison("columnFamily", "qualifier",
>>>>>              Comparison.Operator.EQ, Bytes.toBytes("foo"))
>>>>>      );
>>>>>      // the filter drops any non-matching rows the hint lets through
>>>>>      scan.setFilter(
>>>>>          new SingleColumnValueFilter(
>>>>>              Bytes.toBytes("columnFamily"), Bytes.toBytes("qualifier"),
>>>>>              CompareFilter.CompareOp.EQUAL,
>>>>>              new BinaryComparator(Bytes.toBytes("foo"))
>>>>>          )
>>>>>      );
>>>>>
>>>>
>>>> Cheers,
>>>> Dan
>>>>
>>>>
>>>> 2010/1/22 stack <stack@duboce.net>
>>>>
>>>>> http://people.apache.org/~jdcryans/hbase-0.20.3-candidate-2/
>>>>>
>>>>> There is a bit of documentation if you look at javadoc for the
>>>>> 'indexed' contrib (This is what hbase-2037 is called on commit).
>>>>>
>>>>> St.Ack
>>>>>
>>>>> P.S. We had a thread going named "HBase bulk load".  Did you get all
>>>>> the answers you need on that one?
>>>>>
>>>>> On Thu, Jan 21, 2010 at 11:19 AM, Sriram Muthuswamy Chittathoor
>>>>> <sriramc@ivycomptech.com> wrote:
>>>>>>
>>>>>> Great.  Can I migrate to 0.20.3RC2 easily?  I am on 0.20.2.  Can you
>>>>>> pass me the link?
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf
>>>>>> Of
>>>>>> stack
>>>>>> Sent: Friday, January 22, 2010 12:42 AM
>>>>>> To: hbase-user@hadoop.apache.org
>>>>>> Subject: Re: Support for MultiGet / SQL In clause -- error in
>>>>>> patch
>>>>>> HBASE-1845
>>>>>>
>>>>>> IIRC, hbase-1845 was a sketch only and not yet complete.  It's
>>>>>> probably rotted since anyway.
>>>>>>
>>>>>> Have you looked at hbase-2037, since committed and available in
>>>>>> 0.20.3RC2?  Would this help you with your original problem?
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>> On Thu, Jan 21, 2010 at 9:10 AM, Sriram Muthuswamy Chittathoor
>>>>>> <sriramc@ivycomptech.com> wrote:
>>>>>>
>>>>>>> I tried applying the patch to the hbase 0.20.2 source code and I
>>>>>>> get the errors below.  Do you know if this needs to be applied to a
>>>>>>> specific hbase version?  Is there a version which works with 0.20.2
>>>>>>> or later?
>>>>>>> Basically, patching HRegionServer and HTable fails.
>>>>>>>
>>>>>>>
>>>>>>> Thanks for the help
>>>>>>>
>>>>>>> patch -p0 -i batch.patch
>>>>>>>
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Get.java
>>>>>>> Hunk #1 succeeded at 61 (offset 2 lines).
>>>>>>> Hunk #2 succeeded at 347 (offset 31 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HConnection.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
>>>>>>> Hunk #3 succeeded at 1244 (offset 6 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/HTable.java
>>>>>>> Hunk #2 succeeded at 73 (offset 8 lines).
>>>>>>> Hunk #4 FAILED at 405.
>>>>>>> Hunk #5 succeeded at 671 with fuzz 2 (offset 26 lines).
>>>>>>> 1 out of 5 hunks FAILED -- saving rejects to file src/java/org/apache/hadoop/hbase/client/HTable.java.rej
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Multi.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/MultiCallable.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/MultiResult.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/client/Row.java
>>>>>>> patching file src/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
>>>>>>> Hunk #2 succeeded at 156 with fuzz 1 (offset 3 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
>>>>>>> Hunk #2 succeeded at 247 (offset 2 lines).
>>>>>>> patching file src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
>>>>>>> Hunk #1 succeeded at 78 (offset -1 lines).
>>>>>>> Hunk #2 FAILED at 2515.
>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
>>>>>>> patching file src/test/org/apache/hadoop/hbase/client/TestHTable.java
>>>>>>> Hunk #2 FAILED at 333.
>>>>>>> 1 out of 2 hunks FAILED -- saving rejects to file src/test/org/apache/hadoop/hbase/client/TestHTable.java.rej
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Marc Limotte [mailto:mslimotte@gmail.com]
>>>>>>> Sent: Tuesday, January 19, 2010 10:26 PM
>>>>>>> To: hbase-user@hadoop.apache.org
>>>>>>> Subject: Re: Support for MultiGet / SQL In clause
>>>>>>>
>>>>>>> Sriram,
>>>>>>>
>>>>>>> Would a secondary index help you?
>>>>>>>
>>>>>>> http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/client/tableindexed/package-summary.html#package_description
>>>>>>>
>>>>>>> The index is stored in a separate table, but the index is managed
>>>>>>> for you.
>>>>>>>
>>>>>>> I don't think you can do an arbitrary "in" query, though.  If the
>>>>>>> keys that you want to include in the "in" are reasonably close
>>>>>>> neighbors, you could do a scan and skip the ones that are
>>>>>>> uninteresting.  You could also try a batch Get by applying a
>>>>>>> separate patch, see
>>>>>>> http://issues.apache.org/jira/browse/HBASE-1845.
>>>>>>>
>>>>>>> Marc Limotte
>>>>>>>
>>>>>>> On Tue, Jan 19, 2010 at 8:45 AM, Sriram Muthuswamy Chittathoor
>>>>>>> <sriramc@ivycomptech.com> wrote:
>>>>>>>
>>>>>>>> Is there any support for this?  I want to do the following:
>>>>>>>>
>>>>>>>> 1.  Create a second table to maintain a mapping between the
>>>>>>>> secondary column and the row IDs of the primary table
>>>>>>>>
>>>>>>>> 2.  Use this second table to get the row IDs to look up from the
>>>>>>>> primary table, using a SQL IN-like clause
>>>>>>>>
>>>>>>>> Basically I am doing this to speed up querying by non-row-key
>>>>>>>> columns.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> Sriram C
>>>>>>>>
>>>>>>>>
>>>>>>>> This email is sent for and on behalf of Ivy Comptech Private
>>>>>>>> Limited. Ivy Comptech Private Limited is a limited liability
>>>>>>>> company.
>>>>>>>>
>>>>>>>> This email and any attachments are confidential, and may be legally
>>>>>>>> privileged and protected by copyright. If you are not the intended
>>>>>>>> recipient dissemination or copying of this email is prohibited. If
>>>>>>>> you have received this in error, please notify the sender by
>>>>>>>> replying by email and then delete the email completely from your
>>>>>>>> system.
>>>>>>>> Any views or opinions are solely those of the sender.  This
>>>>>>>> communication is not intended to form a binding contract on behalf
>>>>>>>> of Ivy Comptech Private Limited unless expressly indicated to the
>>>>>>>> contrary and properly authorised. Any actions taken on the basis of
>>>>>>>> this email are at the recipient's own risk.
>>>>>>>>
>>>>>>>> Registered office:
>>>>>>>> Ivy Comptech Private Limited, Cyber Spazio, Road No. 2, Banjara
>>>>>>>> Hills, Hyderabad 500 033, Andhra Pradesh, India. Registered number:
>>>>>>>> 37994. Registered in India. A list of members' names is available
>>>>>>>> for inspection at the registered office.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
