hbase-user mailing list archives

From Ram Kulbak <ram.kul...@gmail.com>
Subject Re: [Indexed HBase] Can I add index in an existing table?
Date Sat, 27 Feb 2010 03:53:35 GMT
Posting a reply to a question I got off list:

> Ram:
> How do I specify index in HColumnDescriptor that is passed to modifyColumn()
> ?
>
> Thanks
>


You will need to use an IdxColumnDescriptor.

Here's a code example for creating a table with a byte array index:

   HTableDescriptor tableDescriptor = new HTableDescriptor(TABLE_NAME);
   // An IdxColumnDescriptor is a column family descriptor that also carries index definitions.
   IdxColumnDescriptor idxColumnFamilyDescriptor = new IdxColumnDescriptor(FAMILY_NAME);
   try {
     // Declare a byte array index on QUALIFIER_NAME within this family.
     idxColumnFamilyDescriptor.addIndexDescriptor(
       new IdxIndexDescriptor(QUALIFIER_NAME, IdxQualifierType.BYTE_ARRAY)
     );
   } catch (IOException e) {
     throw new IllegalStateException(e);
   }
   tableDescriptor.addFamily(idxColumnFamilyDescriptor);


You can add several index descriptors to the same column family, and
you can put indexes on more than one column family. To query, use an
IdxScan with an org.apache.hadoop.hbase.client.idx.exp.Expression set
to match your query criteria. The expression may combine columns from
the same or different families using ANDs and ORs.

Note that several index types are supported. Current types include all
basic types and BigDecimals; char arrays are also supported. Typed indexes
allow for correct range checking (for example, you can quickly evaluate
a scan that gets all rows for which a given column has values between 42
and 314). Make sure that columns indexed with a given qualifier type are
actually populated with bytes matching that type. For example, if you use
IdxQualifierType.LONG, make sure that the values you put are 8-byte
arrays produced by a method similar to Bytes.toBytes(long).
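
To illustrate the point about matching bytes to the qualifier type, here is
a minimal, JDK-only sketch. It assumes that Bytes.toBytes(long) produces 8
big-endian bytes and uses java.nio.ByteBuffer as a stand-in, so it runs
without HBase on the classpath. The class LongIndexValues and its methods are
hypothetical names made up for this example; they are not part of HBase or
IHBase.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of HBase or IHBase.
final class LongIndexValues {
    private LongIndexValues() {}

    // Encodes a long as 8 big-endian bytes (assumed to match Bytes.toBytes(long)).
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Decodes an 8-byte value and rejects arrays of any other length.
    static long decode(byte[] bytes) {
        if (bytes.length != Long.BYTES) {
            throw new IllegalArgumentException("expected 8 bytes, got " + bytes.length);
        }
        return ByteBuffer.wrap(bytes).getLong();
    }

    // Inclusive numeric range check on the decoded value.
    static boolean inRange(byte[] bytes, long min, long max) {
        long v = decode(bytes);
        return v >= min && v <= max;
    }

    public static void main(String[] args) {
        byte[] good = encode(100L);
        System.out.println(good.length);              // 8
        System.out.println(inRange(good, 42L, 314L)); // true

        // A string-encoded number is not a valid LONG value: it is 3 bytes, not 8.
        byte[] bad = "100".getBytes(StandardCharsets.UTF_8);
        System.out.println(bad.length);               // 3
    }
}
```

A value written as the string "100" would be 3 bytes rather than 8, so it
could not be interpreted as a long for a range check such as 42 to 314.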

Yoram


2010/2/26 Ram Kulbak <ram.kulbak@gmail.com>:
> Hi Shen,
>
> The first thing you need to verify is that you can switch to the
> IdxRegion implementation without problems. I've just checked that the
> following steps work on the PerformanceEvaluation tables. I would
> suggest you back up your HBase production instance before attempting
> this (or create a sandbox instance and try it out there).
>
> * Stop hbase
> * Edit the conf/hbase-env.sh file and add IHBASE to your classpath.
> Here's an example which assumes you don't need to add anything else to
> your classpath. Make sure that HBASE_HOME is defined, or simply
> substitute it with the full path of the HBase installation directory:
>   export HBASE_CLASSPATH=(`find $HBASE_HOME/contrib/indexed -name '*jar' | tr -s "\n" ":"`)
>
> * Edit conf/hbase-site.xml and set IdxRegion to be the region implementation:
>
>  <property>
>     <name>hbase.hregion.impl</name>
>     <value>org.apache.hadoop.hbase.regionserver.IdxRegion</value>
>  </property>
>
> * Propagate the configuration to all slaves
> * Start HBase
>
>
> Next, modify the table you want to index using code similar to this:
>
>    HBaseConfiguration conf = new HBaseConfiguration();
>
>    HBaseAdmin admin = new HBaseAdmin(conf);
>    admin.disableTable(TABLE_NAME);
>    admin.modifyColumn(TABLE_NAME, FAMILY_NAME1, IDX_COLUMN_DESCRIPTOR1);
>          ...
>    admin.modifyColumn(TABLE_NAME, FAMILY_NAMEN, IDX_COLUMN_DESCRIPTORN);
>    admin.enableTable(TABLE_NAME);
>
> Wait for the table to get indexed. This may take a few minutes. Check
> the master web page and verify your index definitions appear correctly
> in the table description.
>
> This is it. Please let me know how it goes.
>
> Yoram
>
>
>
>
> 2010/2/26 ChingShen <chingshenchen@gmail.com>:
>> Thanks, but I think I need indexed HBase rather than transactional
>> HBase.
>>
>> Shen
>>
>> 2010/2/26 <y_823910@tsmc.com>
>>
>>> You can try my code to create an index in an existing table.
>>>
>>> public void AddIdx2ExistingTable(String tablename, String columnfamily,
>>>     String idx_column) throws IOException {
>>>   IndexedTableAdmin admin = new IndexedTableAdmin(config);
>>>   admin.addIndex(Bytes.toBytes(tablename),
>>>       new IndexSpecification(idx_column,
>>>           Bytes.toBytes(columnfamily + ":" + idx_column)));
>>> }
>>>
>>>
>>>
>>>
>>> Fleming Chiu(邱宏明)
>>> 707-6128
>>> y_823910@tsmc.com
>>> Meat-free Monday: eat vegetarian, save the Earth (Meat Free Monday Taiwan)
>>>
>>>
>>>
>>>
>>>
>>> From:    ChingShen <chingshenchen@gmail.com>
>>> To:      hbase-user <hbase-user@hadoop.apache.org>
>>> cc:      (bcc: Y_823910/TSMC)
>>> Subject: [Indexed HBase] Can I add index in an existing table?
>>> Date:    2010/02/26 10:18 AM
>>> Please respond to hbase-user
>>>
>>>
>>>
>>>
>>>
>>>
>>> Hi,
>>>
>>> I found http://issues.apache.org/jira/browse/HBASE-2037, which can create
>>> a new table with an index, but can I add an index to an existing table?
>>> Are there any code examples?
>>>
>>> Thanks.
>>>
>>> Shen
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> *****************************************************
>> Ching-Shen Chen
>> Advanced Technology Center,
>> Information & Communications Research Lab.
>> E-mail: chenchingshen@itri.org.tw
>> Tel:+886-3-5915542
>> *****************************************************
>>
>
