hbase-issues mailing list archives

From "chunhui shen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-8980) Assistant Store ----------- An Index Store of HRegion
Date Fri, 19 Jul 2013 03:14:49 GMT

     [ https://issues.apache.org/jira/browse/HBASE-8980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-8980:
--------------------------------

    Description: 
*Background*
a. In general, we want several organizations of the same data, e.g. a secondary index sorts
the data by a non-primary key.
b. Currently, scanning data in HBase with a condition, such as ValueFilter, appears to be
inefficient.
c. We could create an Assistant Store that stores the data of an HRegion in another
organization.

*Assistant Store*
a. It is a store of an HRegion, like HStore, and can be created by the user by adding a
ColumnFamily.

b. Data in the Assistant Store is a copy of the data in the HRegion, organized differently.
The exception is that its row may fall outside the range of the HRegion, and its value is
the row of the original KeyValue.
For example, the region (range: 'row001'~'row999') includes the following KVs in the store cf:
row001/cf:q1/val001
row002/cf:q1/val002
row003/cf:q1/val003
For this region, we could create an Assistant Store (named as) that includes the following KVs:
val001/cf:q1/row001
val002/cf:q1/row002
val003/cf:q1/row003
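
The inversion in the example above can be sketched in a few lines. This is a plain-Python model, not HBase code: the `KV` tuple and `to_assistant_kv` are hypothetical names used only for illustration, and timestamps are left out.

```python
from typing import NamedTuple


class KV(NamedTuple):
    """Simplified KeyValue: row / family:qualifier / value (no timestamp)."""
    row: str
    column: str
    value: str


def to_assistant_kv(kv: KV) -> KV:
    """Swap row and value and keep the column, as in the listing above."""
    return KV(row=kv.value, column=kv.column, value=kv.row)


primary = [
    KV("row001", "cf:q1", "val001"),
    KV("row002", "cf:q1", "val002"),
    KV("row003", "cf:q1", "val003"),
]

# The assistant store is sorted by its own row, i.e. by the original value.
assistant = sorted(to_assistant_kv(kv) for kv in primary)

for kv in assistant:
    print(f"{kv.row}/{kv.column}/{kv.value}")
```

Note that 'val001' sorts after 'row999', so the assistant rows fall outside the region's range, which is the exception described in point b.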

c. We could use a local region transaction to ensure atomicity and consistency.

d. The regionserver puts data into the Assistant Store automatically, but users must read
the data from the Assistant Store themselves.
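
The last two points can be pictured with a toy in-memory region. This is only a sketch under stated assumptions: `ToyRegion`, its methods, and the single lock standing in for a local region transaction are hypothetical and do not reflect the attached patch or any HBase API.

```python
import threading


class ToyRegion:
    """In-memory stand-in for a region with one primary and one assistant store."""

    def __init__(self):
        # Assumption: one lock models the local region transaction.
        self._lock = threading.Lock()
        self.primary = {}       # (row, column) -> value
        self.assistant = set()  # {(value, column, row)}

    def put(self, row, column, value):
        """Write the primary cell and its assistant entry as one unit."""
        with self._lock:
            self.primary[(row, column)] = value
            self.assistant.add((value, column, row))

    def get(self, row, column):
        """Ordinary reads touch only the primary store."""
        with self._lock:
            return self.primary.get((row, column))

    def rows_with_value(self, value):
        """The reader has to query the assistant store explicitly."""
        with self._lock:
            return sorted(r for v, _, r in self.assistant if v == value)


region = ToyRegion()
region.put("row001", "cf:q1", "val001")
region.put("row002", "cf:q1", "val002")
region.put("row003", "cf:q1", "val001")

print(region.get("row002", "cf:q1"))
print(region.rows_with_value("val001"))
```

The caller issues only ordinary puts; the assistant entry is written inside the same critical section, so a reader never sees one store updated without the other.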


*Example of Using Assistant Store*
a. Suppose there is an empty table named t1 with a column family named c1, and it has only
one region (the region's range is from EMPTY_START_ROW to EMPTY_END_ROW).

b. Add an Assistant Store for the table by adding a new column family named c2.

c. The user puts the following data into the table:
r1/c1:q1/v1
r2/c1:q1/v2
r3/c1:q1/v1
r4/c1:q1/v2

d. The region will then have the following data:
r1/c1:q1/v1
r2/c1:q1/v2
r3/c1:q1/v1
r4/c1:q1/v2

v1/c2:q1/r1
v1/c2:q1/r3
v2/c2:q1/r2 (Generated by Assistant, Stored in Assistant Store)
v2/c2:q1/r4

e. From the above, we can see that the Assistant Store would have the data:
v1/c2:q1/r1
v1/c2:q1/r3
v2/c2:q1/r2
v2/c2:q1/r4

And these data
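
Reading by value from the data in this example might look like the following sketch. It models both stores as sorted Python lists and compares a filter-style full scan of c1 with a range lookup in c2; `scan_with_value_filter` and `lookup_in_assistant` are hypothetical helper names, not HBase calls.

```python
from bisect import bisect_left

# Store c1, sorted by row: (row, column, value)
c1 = [
    ("r1", "c1:q1", "v1"),
    ("r2", "c1:q1", "v2"),
    ("r3", "c1:q1", "v1"),
    ("r4", "c1:q1", "v2"),
]

# Assistant Store c2, sorted by its row (the original value): (value, column, row)
c2 = sorted((value, "c2:q1", row) for row, _, value in c1)


def scan_with_value_filter(store, wanted):
    """Filter-style scan: every KV in the store is examined."""
    return [row for row, _, value in store if value == wanted]


def lookup_in_assistant(store, wanted):
    """Seek to the first entry whose key is `wanted` and read the matching run."""
    lo = bisect_left(store, (wanted,))
    # (wanted + "\x00",) sorts after every entry keyed by `wanted`
    # and before any entry with a larger key.
    hi = bisect_left(store, (wanted + "\x00",))
    return [row for _, _, row in store[lo:hi]]


print(scan_with_value_filter(c1, "v1"))
print(lookup_in_assistant(c2, "v1"))
```

Both calls return the same rows, but the second one touches only the entries keyed by the requested value instead of the whole store.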



*Implementation Dependency*
a. Split the StoreFile by value. (Currently, we split the file only by row.)
b. Support multi-row transactions in a region (already implemented).
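
One possible reading of dependency a is that, when a region splits at a row key, an assistant StoreFile cannot be divided by its own rows, because those rows are the original values; each entry would instead follow the daughter region that owns the original row, which is held in the entry's value. The sketch below illustrates that interpretation only. `split_assistant_by_value` is a hypothetical helper, and the split point is chosen arbitrarily.

```python
def split_assistant_by_value(entries, split_row):
    """Route each (key, column, orig_row) entry by orig_row, not by key."""
    lower = [e for e in entries if e[2] < split_row]
    upper = [e for e in entries if e[2] >= split_row]
    return lower, upper


assistant = [
    ("v1", "c2:q1", "r1"),
    ("v1", "c2:q1", "r3"),
    ("v2", "c2:q1", "r2"),
    ("v2", "c2:q1", "r4"),
]

# Split the region at row "r3": the lower daughter owns rows before "r3".
lower, upper = split_assistant_by_value(assistant, "r3")
print(lower)
print(upper)
```

Each daughter ends up with entries keyed by both v1 and v2, which is why a split on the assistant store's own row would not keep index entries next to their source rows.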

An initial patch for the 0.94 version is provided.
What do you think about such a Store?


    
> Assistant Store ----------- An Index Store of HRegion
> -----------------------------------------------------
>
>                 Key: HBASE-8980
>                 URL: https://issues.apache.org/jira/browse/HBASE-8980
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 8980-94.patch

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
