hbase-user mailing list archives

From Ted Dunning <tdunn...@maprtech.com>
Subject Re: 0.90 latency performance, cdh3b4
Date Wed, 20 Apr 2011 19:06:16 GMT
What is meant by 8% quartile?  75th %-ile?  98%-ile?  Should quartile have
been quantile?
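
For reference, the distinction matters when reading latency numbers. A minimal sketch of a nearest-rank percentile over latency samples follows; the function name and the sample values are made up for illustration and are not measurements from this thread.

```python
import math

def nearest_rank_percentile(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest rank covering at least p percent of the samples.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

# Hypothetical time-to-last-byte samples in milliseconds, including two outliers.
latencies_ms = [1, 2, 2, 3, 3, 3, 4, 5, 40, 120]

median_ms = nearest_rank_percentile(latencies_ms, 50)  # 50th percentile
q3_ms = nearest_rank_percentile(latencies_ms, 75)      # third quartile
p98_ms = nearest_rank_percentile(latencies_ms, 98)     # dominated by the outliers
```

A quartile is one of the 25th, 50th, or 75th percentiles, so a figure such as "98%-ile" is a quantile in general, not a quartile.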

On Wed, Apr 20, 2011 at 12:00 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> Ok actually we do have 1 region for these exact tables... so back to
> square one.
>
> FWIW i do get 8% quartile under 3ms TTLB. So it is algorithmically
> sound it seems. question is why outliers spread is so much longer than
> in tests on one machine. must be network. What else.
>
>
> On Wed, Apr 20, 2011 at 10:06 AM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> wrote:
> > Got it. This must be the reason. Cause it is a laugh check, and i do
> > see 6 regions for 40 rows so it can span them, although i can't
> > confirm it for sure. It may be due to how table was set up or due to
> > some time running them and rotating some data there. The uniformly
> > distributed hashes are used for the keys so that it is totally
> > plausible 40 rows will get into 6 different regions.
> >
> > Ok i'll take it for working theory for now.
> >
> > Is there a way to set max # of regions per table? I guess the method
> > in the manual is to set max region size. Which means i probably need
> > to re-create the table with one region to get back to 1 region? or
> > maybe there's a way to get it back to one region without recreating
> > it, such as major compaction?
> >
> > thanks.
> > -d
> >
> > On Wed, Apr 20, 2011 at 9:55 AM, Stack <stack@duboce.net> wrote:
> >> On Wed, Apr 20, 2011 at 9:49 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> >>> Ok. Let me ask a question.
> >>>
> >>> When a scan is performed and it obviously covers several regions, are
> >>> the per-region scan calls done in synchronous succession or are they
> >>> done in parallel?
> >>>
> >>
> >> The former.
> >>
> >>
> >>> Assuming scan is returning 40 results but for some weird reason it
> >>> goes to 6 regions and caching is set to 100 (so it can take all of
> >>> them) are individual region request latencies summed or it would be
> >>> max(region request latency)?
> >>>
> >>
> >> Summed.
> >>
> >> The 40 rows are not contiguous in the same region?  If not, the cost
> >> of client setting up new scanner against next region will be inline w/
> >> your read timing (at least an rpc per region).
> >>
> >> St.Ack
> >>
> >>> Thank you very much.
> >>> -D
> >>>
> >>> On Tue, Apr 19, 2011 at 6:28 PM, Ted Dunning <tdunning@maprtech.com> wrote:
> >>>> For a tiny test like this, everything should be in memory and latency
> >>>> should be very low.
> >>>>
> >>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> >>>>> PS so what should latency be for reads in 0.90, assuming moderate thruput?
> >>>>>
> >>>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> >>>>>> for this test, there's just no more than 40 rows in every given table.
> >>>>>> This is just a laugh check.
> >>>>>>
> >>>>>> so i think it's safe to assume it all goes to same region server.
> >>>>>>
> >>>>>> But latency would not depend on which server call is going to, would
> >>>>>> it? Only throughput would, assuming we are not overloading.
> >>>>>>
> >>>>>> And we clearly are not as my single-node local version runs quite ok
> >>>>>> response times with the same throughput.
> >>>>>>
> >>>>>> It's something with either client connections or network latency or
> >>>>>> ... i don't know what it is. I did not set up the cluster but i gotta
> >>>>>> troubleshoot it now :)
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <tdunning@maprtech.com> wrote:
> >>>>>>> How many regions?  How are they distributed?
> >>>>>>>
> >>>>>>> Typically it is good to fill the table somewhat and then drive some
> >>>>>>> splits and balance operations via the shell.  One more split to make
> >>>>>>> the regions be local and you should be good to go.  Make sure you have
> >>>>>>> enough keys in the table to support these splits, of course.
> >>>>>>>
> >>>>>>> Under load, you can look at the hbase home page to see how
> >>>>>>> transactions are spread around your cluster.  Without splits and local
> >>>>>>> region files, you aren't going to see what you want in terms of
> >>>>>>> performance.
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>
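
Stack's answers in the thread (region scans happen in succession, and their latencies are summed) can be illustrated with a small model. This is only a sketch under stated assumptions: the function names are hypothetical, the 3 ms per-region cost is an illustrative figure rather than a measurement, and the parallel case is shown purely for contrast, since the thread says the client does not do this.

```python
def sequential_scan_latency(region_latencies_ms):
    """Regions are visited one after another, so per-region costs add up."""
    return sum(region_latencies_ms)

def parallel_scan_latency(region_latencies_ms):
    """Hypothetical parallel fan-out, for comparison only: the slowest region dominates."""
    return max(region_latencies_ms)

# Assumption: 40 rows spread over 6 regions, each region costing roughly
# one RPC round trip of 3 ms (illustrative number, not from the thread).
per_region_ms = [3.0] * 6

sequential_ms = sequential_scan_latency(per_region_ms)  # 18.0
parallel_ms = parallel_scan_latency(per_region_ms)      # 3.0
```

Under this model, a scan that touches six regions pays about six times the single-region cost, which is why keeping a tiny table in one region matters for latency.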
