Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hbase.apache.org
Received-SPF: pass (nike.apache.org: domain of lhofhansl@yahoo.com designates
 98.139.213.164 as permitted sender)
References: 
 <CAE6kwsPHxQ1tsCZU0kW+0HCffF+ZJ-ihfN5XoboA9wT6NZBbEw@mail.gmail.com>
	<1410414612.16635.YahooMailNeo@web140603.mail.bf1.yahoo.com>
	<CAE6kwsPc7UAqNXU8UBaMzYXBJaUQa=UtGrMncQdSjV--5zgBmw@mail.gmail.com>
	<CAE6kwsO=h+ovKUbvQJY2t6QdmvO_jKqErfYZ7E5x1h9afrqNAg@mail.gmail.com>
	<BLU436-SMTP208D6E73AC40F3FF145C9288FCD0@phx.gbl>
	<CAE6kwsNgbvR_8ti4gxD=Tz5Q+-2RCAKcet7agr+hd_dx_cvGDQ@mail.gmail.com>
	<CAE6kwsMsRS0myNNDTYd6YKzC-B9FY-GUxgCAUMjygT=YEsHGEg@mail.gmail.com>
	<BLU436-SMTP237957AB38111BB13E5CD8D8FCD0@phx.gbl>
	<CAE6kwsPdOuuGgRZnietKFrzUCy8h2ZGx7JrjW0P7os7yGghT4A@mail.gmail.com>
	<BLU436-SMTP139A459353D7F634340F3E78FCD0@phx.gbl>
	<CAE6kwsO6Q_J_OXa6sPAhXHX_PxMobjuxrybc0L6yXcQCMjpB_g@mail.gmail.com>
	<BLU436-SMTP124C6400160EC11E10022EC8FCD0@phx.gbl>
	<CAE6kwsN5HvnKa8LZHurV+6yrvo++3QV2i=8R8wcZ=Tr2d84Egw@mail.gmail.com>
 <CAE6kwsOuVwH_ypSn=7HSRHdzR_dN+0=BxSU=05R8N8aY4nuNOQ@mail.gmail.com>
Message-ID: <1410673329.71674.YahooMailNeo@web140606.mail.bf1.yahoo.com>
Date: Sat, 13 Sep 2014 22:42:09 -0700
From: lars hofhansl <larsh@apache.org>
Reply-To: lars hofhansl <larsh@apache.org>
Subject: Re: Scan vs Parallel scan.
To: "user@hbase.apache.org" <user@hbase.apache.org>
In-Reply-To: 
 <CAE6kwsOuVwH_ypSn=7HSRHdzR_dN+0=BxSU=05R8N8aY4nuNOQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

What specific version of 0.94 are you using?=0A=0AIn general, if you have m=
ultiple spindles (disks) and/or multiple CPU cores at the region server you=
 should benefits from keeping multiple region server handler threads busy. =
I have experimented with this before and saw a close to linear speed up (up=
 to the point where all disks/core were busy). Obviously this also assuming=
 this is the only load you throw at the servers at this point.=0A=0ACan you=
 post your complete code to pastebin? Maybe even with some code to seed the=
 data?=0AHow do you run your callables? Did you configure the ExecuteServic=
e correctly (assuming you use one to run your callables)? =0A=0AThen we can=
 run it and have a look.=0A=0AThanks.=0A=0A-- Lars=0A=0A=0A----- Original M=
essage -----=0AFrom: Guillermo Ortiz <konstt2000@gmail.com>=0ATo: "user@hba=
se.apache.org" <user@hbase.apache.org>=0ACc: =0ASent: Saturday, September 1=
3, 2014 4:49 PM=0ASubject: Re: Scan vs Parallel scan.=0A=0AWhat am I missin=
g??=0A=0A=0A=0A=0A2014-09-12 16:05 GMT+02:00 Guillermo Ortiz <konstt2000@gm=
ail.com>:=0A=0A> For an partial scan, I guess that I call to the RS to get =
data, it starts=0A> looking in the store files and recollecting the data. (=
It doesn't write to=0A> the blockcache in both cases). It has ready the dat=
a and it gives to the=0A> client the data step by step, I mean,,, it depend=
s the caching and batching=0A> parameters.=0A>=0A> Big differences that I s=
ee...=0A> I'm opening more connections to the Table, one for Region.=0A>=0A=
> I should check the single table scan, it looks like it does partial scans=
=0A> sequentially. Since you can see on the HBase Master how the request=0A=
> increase one after another, not all in the same time.=0A>=0A> 2014-09-12 =
15:23 GMT+02:00 Michael Segel <michael_segel@hotmail.com>:=0A>=0A>> It does=
n=E2=80=99t matter which RS, but that you have 1 thread for each region.=0A=
>>=0A>> So for each thread, what=E2=80=99s happening.=0A>> Step by step, wh=
at is the code doing.=0A>>=0A>> Now you=E2=80=99re comparing this against a=
 single table scan, right?=0A>> What=E2=80=99s happening in the table scan=
=E2=80=A6?=0A>>=0A>>=0A>> On Sep 12, 2014, at 2:04 PM, Guillermo Ortiz <kon=
stt2000@gmail.com>=0A>> wrote:=0A>>=0A>> > Right, My table for example has =
keys between 0-9. in three regions=0A>> > 0-2,3-7,7-9=0A>> > I lauch three =
partial scans in parallel. The scans that I'm executing=0A>> are:=0A>> > sc=
an(0,2), scan(3,7), scan(7,9).=0A>> > Each region is if a different RS, so =
each thread goes to different RS.=0A>> It's=0A>> > not exactly like that, b=
ut on the benchmark case it's like it's working.=0A>> >=0A>> > Really the c=
ode will execute a thread for each Region not for each=0A>> > RegionServer.=
 But in the test I only have two regions for regionServer.=0A>> I=0A>> > do=
nt' think that's an important point, there're two threads for RS.=0A>> >=0A=
>> > 2014-09-12 14:48 GMT+02:00 Michael Segel <michael_segel@hotmail.com>:=
=0A>> >=0A>> >> Ok, lets again take a step back=E2=80=A6=0A>> >>=0A>> >> So=
 you are comparing your partial scan(s) against a full table scan?=0A>> >>=
=0A>> >> If I understood your question, you launch 3 partial scans where yo=
u set=0A>> >> the start row and then end row of each scan, right?=0A>> >>=
=0A>> >> On Sep 12, 2014, at 9:16 AM, Guillermo Ortiz <konstt2000@gmail.com=
>=0A>> wrote:=0A>> >>=0A>> >>> Okay, then, the partial scan doesn't work as=
 I think.=0A>> >>> How could it exceed the limit of a single region if I ca=
lculate the=0A>> >> limits?=0A>> >>>=0A>> >>>=0A>> >>> The only bad point t=
hat I see it's that If a region server has three=0A>> >>> regions of the sa=
me table,  I'm executing three partial scans about=0A>> this=0A>> >> RS=0A>=
> >>> and they could compete for resources (network, etc..) on this node.=
=0A>> It'd=0A>> >> be=0A>> >>> better to have one thread for RS. But, that =
doesn't answer your=0A>> >> questions.=0A>> >>>=0A>> >>> I keep thinking...=
=0A>> >>>=0A>> >>> 2014-09-12 9:40 GMT+02:00 Michael Segel <michael_segel@h=
otmail.com>:=0A>> >>>=0A>> >>>> Hi,=0A>> >>>>=0A>> >>>> I wanted to take a =
step back from the actual code and to stop and=0A>> think=0A>> >>>> about w=
hat you are doing and what HBase is doing under the covers.=0A>> >>>>=0A>> =
>>>> So in your code, you are asking HBase to do 3 separate scans and then=
=0A>> >> you=0A>> >>>> take the result set back and join it.=0A>> >>>>=0A>>=
 >>>> What does HBase do when it does a range scan?=0A>> >>>> What happens =
when that range scan exceeds a single region?=0A>> >>>>=0A>> >>>> If you an=
swer those questions=E2=80=A6 you=E2=80=99ll have your answer.=0A>> >>>>=0A=
>> >>>> HTH=0A>> >>>>=0A>> >>>> -Mike=0A>> >>>>=0A>> >>>> On Sep 12, 2014, =
at 8:34 AM, Guillermo Ortiz <konstt2000@gmail.com>=0A>> >> wrote:=0A>> >>>>=
=0A>> >>>>> It's not all the code, I set things like these as well:=0A>> >>=
>>> scan.setMaxVersions();=0A>> >>>>> scan.setCacheBlocks(false);=0A>> >>>>=
> ...=0A>> >>>>>=0A>> >>>>> 2014-09-12 9:33 GMT+02:00 Guillermo Ortiz <kons=
tt2000@gmail.com>:=0A>> >>>>>=0A>> >>>>>> yes, that is. I have changed the =
HBase version to 0.98=0A>> >>>>>>=0A>> >>>>>> I got the start and stop keys=
 with this method:=0A>> >>>>>> private List<RegionScanner> generatePartitio=
ns() {=0A>> >>>>>>      List<RegionScanner> regionScanners =3D new=0A>> >>>=
>>> ArrayList<RegionScanner>();=0A>> >>>>>>      byte[] startKey;=0A>> >>>>=
>>      byte[] stopKey;=0A>> >>>>>>      HConnection connection =3D null;=
=0A>> >>>>>>      HBaseAdmin hbaseAdmin =3D null;=0A>> >>>>>>      try {=0A=
>> >>>>>>          connection =3D HConnectionManager.=0A>> >>>>>> createCon=
nection(HBaseConfiguration.create());=0A>> >>>>>>          hbaseAdmin =3D n=
ew HBaseAdmin(connection);=0A>> >>>>>>          List<HRegionInfo> regions =
=3D=0A>> >>>>>> hbaseAdmin.getTableRegions(scanConfiguration.getTable());=
=0A>> >>>>>>          RegionScanner regionScanner =3D null;=0A>> >>>>>>    =
      for (HRegionInfo region : regions) {=0A>> >>>>>>=0A>> >>>>>>         =
     startKey =3D region.getStartKey();=0A>> >>>>>>              stopKey =
=3D region.getEndKey();=0A>> >>>>>>=0A>> >>>>>>              regionScanner =
=3D new RegionScanner(startKey, stopKey,=0A>> >>>>>> scanConfiguration);=0A=
>> >>>>>>              // regionScanner =3D createRegionScanner(startKey,=
=0A>> >>>> stopKey);=0A>> >>>>>>              if (regionScanner !=3D null) =
{=0A>> >>>>>>                  regionScanners.add(regionScanner);=0A>> >>>>=
>>              }=0A>> >>>>>>          }=0A>> >>>>>>=0A>> >>>>>> And I exec=
ute the RegionScanner with this:=0A>> >>>>>> public List<Result> call() thr=
ows Exception {=0A>> >>>>>>      HConnection connection =3D=0A>> >>>>>> HCo=
nnectionManager.createConnection(HBaseConfiguration.create());=0A>> >>>>>> =
     HTableInterface table =3D=0A>> >>>>>> connection.getTable(configuratio=
n.getTable());=0A>> >>>>>>=0A>> >>>>>>  Scan scan =3D new Scan(startKey, st=
opKey);=0A>> >>>>>>      scan.setBatch(configuration.getBatch());=0A>> >>>>=
>>      scan.setCaching(configuration.getCaching());=0A>> >>>>>>      Resul=
tScanner resultScanner =3D table.getScanner(scan);=0A>> >>>>>>=0A>> >>>>>> =
     List<Result> results =3D new ArrayList<Result>();=0A>> >>>>>>      for=
 (Result result : resultScanner) {=0A>> >>>>>>          results.add(result)=
;=0A>> >>>>>>      }=0A>> >>>>>>=0A>> >>>>>>      connection.close();=0A>> =
>>>>>>      table.close();=0A>> >>>>>>=0A>> >>>>>>      return results;=0A>=
> >>>>>>  }=0A>> >>>>>>=0A>> >>>>>> They implement Callable.=0A>> >>>>>>=0A=
>> >>>>>>=0A>> >>>>>> 2014-09-12 9:26 GMT+02:00 Michael Segel <michael_sege=
l@hotmail.com=0A>> >:=0A>> >>>>>>=0A>> >>>>>>> Lets take a step back=E2=80=
=A6.=0A>> >>>>>>>=0A>> >>>>>>> Your parallel scan is having the client crea=
te N threads where in=0A>> >> each=0A>> >>>>>>> thread, you=E2=80=99re doin=
g a partial scan of the table where each=0A>> partial=0A>> >>>> scan=0A>> >=
>>>>>> takes the first and last row of each region?=0A>> >>>>>>>=0A>> >>>>>=
>> Is that correct?=0A>> >>>>>>>=0A>> >>>>>>> On Sep 12, 2014, at 7:36 AM, =
Guillermo Ortiz <=0A>> konstt2000@gmail.com>=0A>> >>>>>>> wrote:=0A>> >>>>>=
>>=0A>> >>>>>>>> I was checking a little bit more about,, I checked the clu=
ster=0A>> and=0A>> >>>> data=0A>> >>>>>>> is=0A>> >>>>>>>> store in three d=
ifferent regions servers, each one in a=0A>> differente=0A>> >>>> node.=0A>=
> >>>>>>>> So, I guess the threads go to different hard-disks.=0A>> >>>>>>>=
>=0A>> >>>>>>>> If someone has an idea or suggestion.. why it's faster a si=
ngle=0A>> scan=0A>> >>>>>>> than=0A>> >>>>>>>> this implementation. I based=
 on this implementation=0A>> >>>>>>>> https://github.com/zygm0nt/hbase-dist=
ributed-search=0A>> >>>>>>>>=0A>> >>>>>>>> 2014-09-11 12:05 GMT+02:00 Guill=
ermo Ortiz <konstt2000@gmail.com=0A>> >:=0A>> >>>>>>>>=0A>> >>>>>>>>> I'm w=
orking with HBase 0.94 for this case,, I'll try with 0.98,=0A>> >>>>>>> alt=
hough=0A>> >>>>>>>>> there is not difference.=0A>> >>>>>>>>> I disabled the=
 table and disabled the blockcache for that family=0A>> >> and=0A>> >>>> I=
=0A>> >>>>>>> put=0A>> >>>>>>>>> scan.setBlockcache(false) as well for both=
 cases.=0A>> >>>>>>>>>=0A>> >>>>>>>>> I think that it's not possible that I=
 executing an complete scan=0A>> >> for=0A>> >>>>>>> each=0A>> >>>>>>>>> th=
read since my data are the type:=0A>> >>>>>>>>> 000001 f:q value=3D1=0A>> >=
>>>>>>>> 000002 f:q value=3D2=0A>> >>>>>>>>> 000003 f:q value=3D3=0A>> >>>>=
>>>>> ...=0A>> >>>>>>>>>=0A>> >>>>>>>>> I add all the values and get the sa=
me result on a single scan=0A>> than=0A>> >> a=0A>> >>>>>>>>> distributed, =
so, I guess that DistributedScan did well.=0A>> >>>>>>>>> The count from th=
e hbase shell takes about 10-15seconds, I don't=0A>> >>>>>>> remember,=0A>>=
 >>>>>>>>> but like 4x  of the scan time.=0A>> >>>>>>>>> I'm not using any =
filter for the scans.=0A>> >>>>>>>>>=0A>> >>>>>>>>> This is the way I calcu=
late number of regions/scans=0A>> >>>>>>>>> private List<RegionScanner> gen=
eratePartitions() {=0A>> >>>>>>>>>     List<RegionScanner> regionScanners =
=3D new=0A>> >>>>>>>>> ArrayList<RegionScanner>();=0A>> >>>>>>>>>     byte[=
] startKey;=0A>> >>>>>>>>>     byte[] stopKey;=0A>> >>>>>>>>>     HConnecti=
on connection =3D null;=0A>> >>>>>>>>>     HBaseAdmin hbaseAdmin =3D null;=
=0A>> >>>>>>>>>     try {=0A>> >>>>>>>>>         connection =3D=0A>> >>>>>>=
>>>=0A>> HConnectionManager.createConnection(HBaseConfiguration.create());=
=0A>> >>>>>>>>>         hbaseAdmin =3D new HBaseAdmin(connection);=0A>> >>>=
>>>>>>         List<HRegionInfo> regions =3D=0A>> >>>>>>>>> hbaseAdmin.getT=
ableRegions(scanConfiguration.getTable());=0A>> >>>>>>>>>         RegionSca=
nner regionScanner =3D null;=0A>> >>>>>>>>>         for (HRegionInfo region=
 : regions) {=0A>> >>>>>>>>>=0A>> >>>>>>>>>             startKey =3D region=
.getStartKey();=0A>> >>>>>>>>>             stopKey =3D region.getEndKey();=
=0A>> >>>>>>>>>=0A>> >>>>>>>>>             regionScanner =3D new RegionScan=
ner(startKey, stopKey,=0A>> >>>>>>>>> scanConfiguration);=0A>> >>>>>>>>>   =
          // regionScanner =3D createRegionScanner(startKey,=0A>> >>>>>>> s=
topKey);=0A>> >>>>>>>>>             if (regionScanner !=3D null) {=0A>> >>>=
>>>>>>                 regionScanners.add(regionScanner);=0A>> >>>>>>>>>   =
          }=0A>> >>>>>>>>>         }=0A>> >>>>>>>>>=0A>> >>>>>>>>> I did so=
me test for a tiny table and I think that the range for=0A>> >> each=0A>> >=
>>>>>> scan=0A>> >>>>>>>>> works fine. Although, I though that it was inter=
esting that the=0A>> >> time=0A>> >>>>>>> when I=0A>> >>>>>>>>> execute dis=
tributed scan is about 6x.=0A>> >>>>>>>>>=0A>> >>>>>>>>> I'm going to check=
 about the hard disks, but I think that ti's=0A>> >> right.=0A>> >>>>>>>>>=
=0A>> >>>>>>>>>=0A>> >>>>>>>>>=0A>> >>>>>>>>>=0A>> >>>>>>>>> 2014-09-11 7:5=
0 GMT+02:00 lars hofhansl <larsh@apache.org>:=0A>> >>>>>>>>>=0A>> >>>>>>>>>=
> Which version of HBase?=0A>> >>>>>>>>>> Can you show us the code?=0A>> >>=
>>>>>>>>=0A>> >>>>>>>>>>=0A>> >>>>>>>>>> Your parallel scan with caching 10=
0 takes about 6x as long as=0A>> the=0A>> >>>>>>> single=0A>> >>>>>>>>>> sc=
an, which is suspicious because you say you have 6 regions.=0A>> >>>>>>>>>>=
 Are you sure you're not accidentally scanning all the data in=0A>> each=0A=
>> >>>> of=0A>> >>>>>>>>>> your parallel scans?=0A>> >>>>>>>>>>=0A>> >>>>>>=
>>>> -- Lars=0A>> >>>>>>>>>>=0A>> >>>>>>>>>>=0A>> >>>>>>>>>>=0A>> >>>>>>>>>=
> ________________________________=0A>> >>>>>>>>>> From: Guillermo Ortiz <k=
onstt2000@gmail.com>=0A>> >>>>>>>>>> To: "user@hbase.apache.org" <user@hbas=
e.apache.org>=0A>> >>>>>>>>>> Sent: Wednesday, September 10, 2014 1:40 AM=
=0A>> >>>>>>>>>> Subject: Scan vs Parallel scan.=0A>> >>>>>>>>>>=0A>> >>>>>=
>>>>>=0A>> >>>>>>>>>> Hi,=0A>> >>>>>>>>>>=0A>> >>>>>>>>>> I developed an di=
stributed scan, I create an thread for each=0A>> >> region.=0A>> >>>>>>> Af=
ter=0A>> >>>>>>>>>> that, I've tried to get some times Scan vs DistributedS=
can.=0A>> >>>>>>>>>> I have disabled blockcache in my table. My cluster has=
 3 region=0A>> >>>>>>> servers=0A>> >>>>>>>>>> with 2 regions each one, in =
total there are 100.000 rows and=0A>> >>>> execute a=0A>> >>>>>>>>>> comple=
te scan.=0A>> >>>>>>>>>>=0A>> >>>>>>>>>> My partitions are=0A>> >>>>>>>>>> =
-01666 -> request 16665=0A>> >>>>>>>>>> 016666-033332 -> request 16666=0A>>=
 >>>>>>>>>> 033332-049998 -> request 16666=0A>> >>>>>>>>>> 049998-066664 ->=
 request 16666=0A>> >>>>>>>>>> 066664-083330 -> request 16666=0A>> >>>>>>>>=
>> 083330- -> request 16671=0A>> >>>>>>>>>>=0A>> >>>>>>>>>>=0A>> >>>>>>>>>>=
 14/09/10 09:15:47 INFO hbase.HbaseScanTest: NUM ROWS 100000=0A>> >>>>>>>>>=
> 14/09/10 09:15:47 INFO util.TimerUtil: SCAN=0A>> >>>>>>> PARALLEL:22089ms=
,Counter:2 ->=0A>> >>>>>>>>>> Caching 10=0A>> >>>>>>>>>>=0A>> >>>>>>>>>> 14=
/09/10 09:16:04 INFO hbase.HbaseScanTest: NUM ROWS 100000=0A>> >>>>>>>>>> 1=
4/09/10 09:16:04 INFO util.TimerUtil: SCAN=0A>> >>>>>>> PARALJEL:16598ms,Co=
unter:2 ->=0A>> >>>>>>>>>> Caching 100=0A>> >>>>>>>>>>=0A>> >>>>>>>>>> 14/0=
9/10 09:16:22 INFO hbase.HbaseScanTest: NUM ROWS 100000=0A>> >>>>>>>>>> 14/=
09/10 09:16:22 INFO util.TimerUtil: SCAN=0A>> >>>>>>> PARALLEL:16497ms,Coun=
ter:2 ->=0A>> >>>>>>>>>> Caching 1000=0A>> >>>>>>>>>>=0A>> >>>>>>>>>> 14/09=
/10 09:17:41 INFO hbase.HbaseScanTest: NUM ROWS 100000=0A>> >>>>>>>>>> 14/0=
9/10 09:17:41 INFO util.TimerUtil: SCAN=0A>> >> NORMAL:68288ms,Counter:2=0A=
>> >>>>>>> ->=0A>> >>>>>>>>>> Caching 1=0A>> >>>>>>>>>>=0A>> >>>>>>>>>> 14/=
09/10 09:17:48 INFO hbase.HbaseScanTest: NUM ROWS 100000=0A>> >>>>>>>>>> 14=
/09/10 09:17:48 INFO util.TimerUtil: SCAN=0A>> >> NORMAL:2646ms,Counter:2=
=0A>> >>>> ->=0A>> >>>>>>>>>> Caching 100=0A>> >>>>>>>>>>=0A>> >>>>>>>>>> 1=
4/09/10 09:17:58 INFO hbase.HbaseScanTest: NUM ROWS 100000=0A>> >>>>>>>>>> =
14/09/10 09:17:58 INFO util.TimerUtil: SCAN=0A>> >> NORMAL:3903ms,Counter:2=
=0A>> >>>> ->=0A>> >>>>>>>>>> Caching 1000=0A>> >>>>>>>>>>=0A>> >>>>>>>>>> =
Parallel scan works much worse than simple scan,, and I don't=0A>> know=0A>=
> >>>> why=0A>> >>>>>>>>>> it's=0A>> >>>>>>>>>> so fast,, it's really much =
faster than execute an "count" from=0A>> >> hbase=0A>> >>>>>>>>>> shell,=0A=
>> >>>>>>>>>> what it doesn't look pretty notmal. The only time that it wor=
ks=0A>> >>>> better=0A>> >>>>>>>>>> parallel is when I execute a normal sca=
n with caching 1.=0A>> >>>>>>>>>>=0A>> >>>>>>>>>> Any clue about it?=0A>> >=
>>>>>>>>>=0A>> >>>>>>>>>=0A>> >>>>>>>>>=0A>> >>>>>>>=0A>> >>>>>>>=0A>> >>>>=
>>=0A>> >>>>=0A>> >>>>=0A>> >>=0A>> >>=0A>>=0A>>=0A>