hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: does Scan guarantee min to max rows?
Date Tue, 12 Apr 2011 16:12:34 GMT
Have you read the thread entitled 'min, max'?

On Tue, Apr 12, 2011 at 7:33 AM, Vishal Kapoor
<vishal.kapoor.in@gmail.com> wrote:

> Here is the problem.
>
> My row IDs start with a reversed timestamp, followed by "/" and
> some more values.
>
>
> 9223370735421724555/TimeStamp1/TimeStamp2/CustomerId/MacIdSystem1/MacIdSystem2/RowType
>
> The row ID is designed so that the latest row comes up first in the
> Scan.
>
> The reverse time is calculated as below:
>
> long reverseTimeStampForRIghtNow =
>         Long.MAX_VALUE - System.currentTimeMillis();
>
> I only need to process newly arrived rows, so I end up keeping a
> ScoreBoard table that records what I processed in each iteration.
>
> I pass a start row and a stop row to the Scan to define the scope.
>
> The start row is obtained as below:
>
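> Scan scan = new Scan();
> scan.setCacheBlocks(false);
> scan.setFilter(new FirstKeyOnlyFilter());
> ResultScanner rsc = table.getScanner(scan);
> try {
>     // With the reversed-timestamp key, the first row returned
>     // should be the newest one.
>     Result firstRow = rsc.next();
>     if (firstRow != null) {
>         startRow = firstRow.getRow();
>     }
> } finally {
>     rsc.close();
> }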
>
> The last row for the very first run is calculated like this:
>
> Result lastRow = table.getRowOrBefore(Bytes.toBytes("9999999999"),
>         someFamilyOfThisTableWhichAlwaysExist);
> if (lastRow != null) {
>     stopRow = lastRow.getRow();
> }
>
> Once processed, the first row from this run becomes the stop row
> for the next iteration. Since the stop row is excluded from the
> scan, this should work to my advantage.
>
> Conceptually, I assume this works as long as the processing code and
> the code writing new records do not step on each other.
> But there are instances when the Scan does not give me the topmost
> record from the table.
>
> I am clueless about where I am going wrong.
> Any pointers on improving it, or on switching to a design that is
> proven to work for this kind of problem, will help me.
>
> Thanks,
> Vishal Kapoor
>
> On Wed, Apr 6, 2011 at 12:56 PM, Stack <stack@duboce.net> wrote:
> > On Wed, Apr 6, 2011 at 5:12 AM, Vishal Kapoor
> > <vishal.kapoor.in@gmail.com> wrote:
> >> I am getting shuffled rows. Is there a problem at my end somewhere? We
> >> did some manual splitting of tables.
> >> I have scoreboard-style code for staged processing of a table based on
> >> it, which is going for a toss.
> >>
> >
> > Vishal, you'll have to do better than the above describing your
> > problem if you are looking for some help from the list.
> > St.Ack
> >
>
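For reference, the scheme described in the quoted message can be sketched
without a cluster. This is only an illustration under stated assumptions:
a TreeMap<String, String> stands in for the HBase table (String order
matches byte order for these ASCII keys), and the class name, the helper
names reverseKey and scan, and the sample timestamps are all hypothetical.
It shows that newer rows sort first and that a half-open window (start row
inclusive, stop row exclusive) picks up only rows newer than the previous
run's first row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for the table; not HBase client code.
final class ReverseKeyScanSketch {

    // Builds "<Long.MAX_VALUE - timestamp>/<suffix>" so newer rows sort first.
    // For realistic epoch-millisecond values the prefix always has 19 digits,
    // so lexicographic order agrees with numeric order.
    static String reverseKey(long timestampMillis, String suffix) {
        return (Long.MAX_VALUE - timestampMillis) + "/" + suffix;
    }

    // Mimics a scan over [startRow, stopRow): start inclusive, stop exclusive.
    static List<String> scan(TreeMap<String, String> table,
                             String startRow, String stopRow) {
        return new ArrayList<>(
                table.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put(reverseKey(1_302_000_000_000L, "custA"), "older row");
        table.put(reverseKey(1_302_000_001_000L, "custB"), "newer row");

        // First run: from the first (newest) key up to a sentinel that
        // sorts after every reversed-timestamp key.
        String firstRunStart = table.firstKey();
        System.out.println("first run:  "
                + scan(table, firstRunStart, "9999999999"));

        // A newer row arrives; its key sorts before everything seen so far.
        table.put(reverseKey(1_302_000_002_000L, "custC"), "newest row");

        // Second run: the previous start row becomes the exclusive stop row,
        // so only rows that sort before it are returned.
        System.out.println("second run: "
                + scan(table, table.firstKey(), firstRunStart));
    }
}
```

In this sketch, a row written with a timestamp older than the previous
run's first row would sort after the stop row and be skipped by the next
run. Whether that is what happens in the reported case is not established
here.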
