hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: How HBase perform per-column scan?
Date Sun, 10 Mar 2013 14:40:35 GMT
bq. physically column family should be able to perform efficiently (storage
layer, CF's are stored separately)

When you scan a row, data from every column family in the scan is brought
into memory, unless you use the on-demand loading added by HBASE-5416.
Take a look at:
https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541258&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13541258

which was based on the settings described in:

https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541191&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13541191

This boils down to your schema design. If possible, consider extracting
column C into its own column family.
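
To illustrate why a separate family helps, here is a toy in-memory model in
Java. It is not HBase code: EssentialFamilyDemo, FamilyStore, scanEager and
scanOnDemand are names made up for this sketch, and each "family" is just a
sorted map with a read counter. It assumes a table where every row has data in
one family and only one row has a value in family C. In a real client the
relevant knobs should be Scan.setLoadColumnFamiliesOnDemand(true) together with
a filter such as SingleColumnValueFilter with setFilterIfMissing(true), but
check the API docs for your release.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

public class EssentialFamilyDemo {

    /** Toy stand-in for one column family's storage: row key -> value, plus a read counter. */
    public static final class FamilyStore {
        private final TreeMap<String, String> cells = new TreeMap<>();
        public int cellsRead = 0;

        public void put(String row, String value) { cells.put(row, value); }

        /** Reads one cell; the counter goes up only when the cell exists. */
        public String read(String row) {
            String value = cells.get(row);
            if (value != null) cellsRead++;
            return value;
        }

        public Iterable<String> rowKeys() { return cells.keySet(); }
    }

    /** Hypothetical sample table: every row has a value in "other", one row has a value in C. */
    public static FamilyStore[] buildTable(int rowCount, int rowWithC) {
        FamilyStore c = new FamilyStore();
        FamilyStore other = new FamilyStore();
        for (int i = 0; i < rowCount; i++) {
            other.put(String.format("row%05d", i), "v" + i);
        }
        c.put(String.format("row%05d", rowWithC), "c-value");
        return new FamilyStore[] { c, other };
    }

    /** Eager scan: each row is assembled from both families before the check on C runs. */
    public static List<String> scanEager(FamilyStore c, FamilyStore other) {
        TreeSet<String> allRows = new TreeSet<>();
        for (String row : c.rowKeys()) allRows.add(row);
        for (String row : other.rowKeys()) allRows.add(row);
        List<String> matches = new ArrayList<>();
        for (String row : allRows) {
            String cValue = c.read(row);
            String otherValue = other.read(row); // read even if the row is then rejected
            if (cValue != null) matches.add(row + "=" + cValue + "," + otherValue);
        }
        return matches;
    }

    /** On-demand scan: walk only family C, and touch the other family for rows that have a C value. */
    public static List<String> scanOnDemand(FamilyStore c, FamilyStore other) {
        List<String> matches = new ArrayList<>();
        for (String row : c.rowKeys()) {
            String cValue = c.read(row);
            String otherValue = other.read(row);
            matches.add(row + "=" + cValue + "," + otherValue);
        }
        return matches;
    }

    public static void main(String[] args) {
        FamilyStore[] eagerTable = buildTable(1000, 42);
        FamilyStore[] lazyTable = buildTable(1000, 42);
        System.out.println("eager result:     " + scanEager(eagerTable[0], eagerTable[1]));
        System.out.println("on-demand result: " + scanOnDemand(lazyTable[0], lazyTable[1]));
        System.out.println("other-family cells read, eager:     " + eagerTable[1].cellsRead);
        System.out.println("other-family cells read, on-demand: " + lazyTable[1].cellsRead);
    }
}
```

Both scans return the same row, but in this model the on-demand scan reads the
large family only for rows that actually have a value in C. If C stays in the
same family as the other columns, there is no separate store to skip.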

Cheers

On Sun, Mar 10, 2013 at 7:14 AM, PG <pengyunmomo@gmail.com> wrote:

> Hi, Ted and Anoop, thanks for your notes.
> I am talking about a column rather than a column family, since physically
> a column family should perform efficiently (at the storage layer, CFs
> are stored separately). But columns of the same column family may be mixed
> physically, and that makes filtering on a column value hard... So I want to
> know if there is any mechanism in HBase that addresses this...
> Regards,
> Yun
>
> On Mar 10, 2013, at 10:01 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > Hi, Yun:
> > Take a look at HBASE-5416 (Improve performance of scans with some kind of
> > filters) which is in 0.94.5 release.
> >
> > In your case, you can use a filter which specifies column C as the
> > essential family.
> > Here I interpret column C as column family.
> >
> > Cheers
> >
> > On Sat, Mar 9, 2013 at 11:11 AM, yun peng <pengyunmomo@gmail.com> wrote:
> >
> >> Hi, All,
> >> I want to find all existing values for a given column in an HBase table.
> >> Would that result in a full-table scan? For example, given a column C,
> >> the table has a very large number of rows, of which only a few (say only
> >> 1 row) have non-empty values for column C. Would HBase still use a full
> >> table scan to find this row? Or does HBase have any optimization for this
> >> kind of query?
> >> Thanks...
> >> Regards
> >> Yun
> >>
>
