hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: ResultScanner performance
Date Thu, 28 Aug 2014 04:29:54 GMT
You can enhance ColumnRangeFilter to return the first column in the range.

In its filterKeyValue(Cell kv) method:

    int cmpMax = Bytes.compareTo(buffer, qualifierOffset, qualifierLength,
        this.maxColumn, 0, this.maxColumn.length);
    if (this.maxColumnInclusive && cmpMax <= 0 ||
        !this.maxColumnInclusive && cmpMax < 0) {
      return ReturnCode.INCLUDE;
    }

ReturnCode.NEXT_ROW should be returned (for subsequent columns) once
ReturnCode.INCLUDE is returned for the first column in range.
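
To make the suggestion above concrete, here is a minimal, self-contained sketch of the per-row state it implies. It is not the HBase source: the class name, the local ReturnCode enum, filterQualifier(), and scanRow() are hypothetical stand-ins so the logic can run without HBase on the classpath. It also assumes a plain SKIP for qualifiers below the range, where the real filter can seek forward instead. In an actual Filter subclass the flag would be cleared in reset(), which HBase calls between rows.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a modified ColumnRangeFilter; not HBase code.
class FirstColumnInRangeSketch {
    // Simplified stand-in for Filter.ReturnCode (only the values used here).
    enum ReturnCode { INCLUDE, SKIP, NEXT_ROW }

    private final byte[] minColumn;
    private final boolean minColumnInclusive;
    private final byte[] maxColumn;
    private final boolean maxColumnInclusive;
    // True once a column of the current row has been included.
    private boolean foundInRow = false;

    FirstColumnInRangeSketch(byte[] minColumn, boolean minColumnInclusive,
                             byte[] maxColumn, boolean maxColumnInclusive) {
        this.minColumn = minColumn;
        this.minColumnInclusive = minColumnInclusive;
        this.maxColumn = maxColumn;
        this.maxColumnInclusive = maxColumnInclusive;
    }

    // Must be called at every row boundary.
    void reset() {
        foundInRow = false;
    }

    // Decision for one qualifier; qualifiers arrive in sorted order within a row.
    ReturnCode filterQualifier(byte[] qualifier) {
        if (foundInRow) {
            return ReturnCode.NEXT_ROW; // first in-range column already returned
        }
        int cmpMin = compare(qualifier, minColumn);
        if (cmpMin < 0 || (!minColumnInclusive && cmpMin == 0)) {
            return ReturnCode.SKIP; // still below the range
        }
        int cmpMax = compare(qualifier, maxColumn);
        if ((maxColumnInclusive && cmpMax <= 0)
                || (!maxColumnInclusive && cmpMax < 0)) {
            foundInRow = true;
            return ReturnCode.INCLUDE;
        }
        return ReturnCode.NEXT_ROW; // past the range, nothing left in this row
    }

    // Unsigned lexicographic byte comparison.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Simulates scanning one row: returns the qualifiers the filter includes.
    static List<String> scanRow(FirstColumnInRangeSketch filter, String... sortedQualifiers) {
        filter.reset();
        List<String> included = new ArrayList<>();
        for (String q : sortedQualifiers) {
            ReturnCode rc = filter.filterQualifier(q.getBytes(StandardCharsets.UTF_8));
            if (rc == ReturnCode.INCLUDE) {
                included.add(q);
            } else if (rc == ReturnCode.NEXT_ROW) {
                break; // the scanner would move on to the next row here
            }
        }
        return included;
    }
}
```

With a range of c2 to c4 (both inclusive), a row holding c1, c3, c4, c5 yields only c3, and the scan leaves the row as soon as c4 is seen.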

Cheers


On Wed, Aug 27, 2014 at 9:05 PM, Jianshi Huang <jianshi.huang@gmail.com>
wrote:

> Very similar. We set up a column range (we're using ColumnRangeFilter right
> now), and we want the first column in the range.
>
> The problem is that we have a lot of rows.
>
> If there's no such capability, then we need to control the parallelism
> ourselves.
>
> Shall I sort the rows first before scanning? Will a random order be more
> efficient if we have many servers?
>
> Jianshi
>
>
> On Thu, Aug 28, 2014 at 1:44 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > So you want to specify several columns, e.g. c2, c3, and c4, and the GET is
> > supposed to return the first one of them (doesn't have to be c2, can be c3
> > if c2 is absent)?
> >
> > To my knowledge there is no such capability now.
> >
> > Cheers
> >
> >
> > On Wed, Aug 27, 2014 at 10:28 AM, Jianshi Huang <jianshi.huang@gmail.com>
> > wrote:
> >
> > > On Thu, Aug 28, 2014 at 1:20 AM, Jianshi Huang <jianshi.huang@gmail.com>
> > > wrote:
> > >
> > > >
> > > > There's a special but common case that for each row we only need the
> > > > first column. Is there a better way to do this than multiple scans +
> > > > take(1)?
> > > >
> > >
> > > We still need to set a column range. Is there a way to get the first
> > > column value of a range using GET?
> > >
> > >
> > > --
> > > Jianshi Huang
> > >
> > > LinkedIn: jianshi
> > > Twitter: @jshuang
> > > Github & Blog: http://huangjs.github.com/
> > >
> >
