hbase-dev mailing list archives

From Vladimir Rodionov <vrodio...@carrieriq.com>
Subject Random I/O performance
Date Wed, 26 Oct 2011 19:51:39 GMT

We have a reporting tool that runs queries against an Oracle DB, collects fact ids, and then
queries HBase for these facts one by one. It is single-threaded and uses a simple Get op.

It is slow, of course: 5 hours to retrieve 1M facts from HBase storage, approx. 55 rows per sec.
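
As a quick check of the figures above, the arithmetic can be sketched as follows. The function names are illustrative only, not part of any tool mentioned in this message.

```python
def rows_per_second(rows: int, hours: float) -> float:
    """Average throughput for `rows` fetched over `hours` of wall-clock time."""
    return rows / (hours * 3600.0)


def ms_per_row(rows: int, hours: float) -> float:
    """Average wall-clock milliseconds spent per row."""
    return hours * 3600.0 * 1000.0 / rows


if __name__ == "__main__":
    # 1M facts in 5 hours, as reported in the message.
    print(round(rows_per_second(1_000_000, 5), 1))
    print(ms_per_row(1_000_000, 5))
```

With the reported numbers this works out to roughly 55.6 rows per second, or about 18 ms of wall-clock time per single Get.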

I know I can use batch get to increase the speed, but my question is: what else can we do to
make our ops team happier?
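
A minimal sketch of the batching idea mentioned above: group the fact ids into chunks and issue one request per chunk instead of one per id. This is not HBase client code. `fetch_batch` is a hypothetical callable standing in for whatever multi-get call the client library offers, and the default batch size of 1000 is an arbitrary assumption.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` ids."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(ids)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def fetch_all(
    ids: Iterable[str],
    fetch_batch: Callable[[List[str]], List[object]],  # hypothetical multi-get
    batch_size: int = 1000,  # assumed value, tune for the workload
) -> List[object]:
    """Fetch all ids using one `fetch_batch` call per chunk."""
    results: List[object] = []
    for batch in chunked(ids, batch_size):
        results.extend(fetch_batch(batch))
    return results
```

With 1M ids and a batch size of 1000, this issues 1,000 requests rather than 1,000,000, so the per-request round trip is paid far less often.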

How do we optimize random I/O performance in HBase? (Hi, Facebook, we have the same problem as
you guys :)

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________
From: Gary Helmling [ghelmling@gmail.com]
Sent: Wednesday, October 26, 2011 12:34 PM
To: dev@hbase.apache.org
Subject: Re: proposal for naming convention of patches for TRUNK

Also should be possible to use the file command?

$ file HBASE-4680.txt
HBASE-4680.txt: diff output text
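
The content-based check discussed in this thread (patch files contain lines starting with "---" and "+++") could be sketched roughly as below. Requiring the two header lines to be adjacent, and requiring a trailing space after the markers, are assumptions added here to cut down on false positives; they are not something the thread specifies.

```python
def looks_like_patch(text: str) -> bool:
    """Heuristic: True if a '--- ' line is immediately followed by a '+++ ' line.

    This mirrors the unified-diff file header. The adjacency requirement is an
    assumption of this sketch, not a rule stated in the thread.
    """
    lines = text.splitlines()
    return any(
        old.startswith("--- ") and new.startswith("+++ ")
        for old, new in zip(lines, lines[1:])
    )


if __name__ == "__main__":
    sample = "--- a/Foo.java\n+++ b/Foo.java\n@@ -1 +1 @@\n-old\n+new\n"
    print(looks_like_patch(sample))
    print(looks_like_patch("steps to reproduce the problem\n"))
```

A check like this inspects content rather than the filename, so it would also accept patches attached with an unexpected extension.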



On Wed, Oct 26, 2011 at 12:32 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> Looping in Giri.
>
> Giri:
> Do you think you have enough heuristics for the filter ?
>
> Thanks
>
> On Wed, Oct 26, 2011 at 12:29 PM, Todd Lipcon <todd@cloudera.com> wrote:
>
>> Should be pretty easy to use grep to determine if a file is a patch or
>> not. Patch files have lines starting with "---" and "+++".
>>
>>
>> On Wed, Oct 26, 2011 at 11:58 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > #1 is reasonable.
>> >
>> > For #2, the following would be included for test validation:
>> >
>> > how-to-reproduce-the-problem.txt
>> > script-I-used.txt
>> >
>> > Just a few examples.
>> >
>> > On Wed, Oct 26, 2011 at 11:52 AM, Jonathan Hsieh <jon@cloudera.com> wrote:
>> >
>> >> Suggestion:
>> >>
>> >> 1) Don't run check if the apache inclusion flag isn't checked?
>> >> 2) Require extension to be .diff, .patch, or .txt?
>> >>
>> >> Jon.
>> >>
>> >> On Wed, Oct 26, 2011 at 11:37 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >>
>> >> > How do we exclude non-patch attachments, such as
>> >> > EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo
>> >> > <http://issues.apache.org/jira/secure/attachment/12500832/EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo>?
>> >> >
>> >> > Thanks
>> >> >
>> >> > On Wed, Oct 26, 2011 at 11:32 AM, Todd Lipcon <todd@cloudera.com> wrote:
>> >> >
>> >> > > I prefer to default to trunk, and require a -0.90 or -0.92 to
>> >> > > delineate a different branch. Most patches should be against trunk,
>> >> > > so let's optimize for the common case.
>> >> > >
>> >> > > -Todd
>> >> > >
>> >> > > On Wed, Oct 26, 2011 at 11:04 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >> > > > Hi,
>> >> > > > I am working with Giri on a filter that should help us avoid the following
>> >> > > > (see HBASE-4377):
>> >> > > >
>> >> > > > -1 overall. Here are the results of testing the latest attachment
>> >> > > > http://issues.apache.org/jira/secure/attachment/12500832/EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo
>> >> > > > against trunk revision .
>> >> > > >
>> >> > > > I am proposing the following convention: TRUNK patch filename should contain
>> >> > > > the word 'trunk' in a prominent manner - surrounded by either dash or dot.
>> >> > > > Valid examples are:
>> >> > > >
>> >> > > > hbase-4377.trunk.v4.txt
>> >> > > > <https://issues.apache.org/jira/secure/attachment/12500830/hbase-4377.trunk.v4.txt>
>> >> > > >
>> >> > > > hbase-4377-trunk.v2.patch
>> >> > > > <https://issues.apache.org/jira/secure/attachment/12497503/hbase-4377-trunk.v2.patch>
>> >> > > >
>> >> > > > 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch
>> >> > > > <https://issues.apache.org/jira/secure/attachment/12499805/0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch>
>> >> > > >
>> >> > > > This would allow Giri to write filter that correctly uploads patch for TRUNK
>> >> > > > to Jenkins for test build.
>> >> > > >
>> >> > > > Please provide your comments.
>> >> > > >
>> >> > >
>> >> > >
>> >> > >
>> >> > > --
>> >> > > Todd Lipcon
>> >> > > Software Engineer, Cloudera
>> >> > >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> // Jonathan Hsieh (shay)
>> >> // Software Engineer, Cloudera
>> >> // jon@cloudera.com
>> >>
>> >
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>
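
The filename proposals quoted above can be sketched as simple checks: Ted's convention ('trunk' surrounded by dash or dot), Todd's alternative (default to trunk unless a -0.90 or -0.92 suffix is present), and Jon's extension whitelist (.diff, .patch, .txt). The regular expressions and function names below are assumptions made for illustration; the thread does not define exact patterns, and this is not the filter that was eventually written.

```python
import os
import re

# 'trunk' with a dash or dot on each side (Ted's proposal, as read here).
TRUNK_RE = re.compile(r"[-.]trunk[-.]", re.IGNORECASE)
# '-0.90' or '-0.92' followed by a dash, a dot, or end of name (Todd's proposal).
BRANCH_RE = re.compile(r"-(0\.9[02])(?=[-.]|$)")
# Extensions suggested by Jon.
PATCH_EXTENSIONS = {".diff", ".patch", ".txt"}


def is_trunk_patch_name(filename: str) -> bool:
    """True if the name contains 'trunk' delimited by dash or dot."""
    return TRUNK_RE.search(filename) is not None


def branch_for(filename: str) -> str:
    """Return '0.90' or '0.92' if the name carries that suffix, else 'trunk'."""
    match = BRANCH_RE.search(filename)
    return match.group(1) if match else "trunk"


def has_patch_extension(filename: str) -> bool:
    """True if the extension is one of .diff, .patch, or .txt."""
    return os.path.splitext(filename)[1].lower() in PATCH_EXTENSIONS
```

Under these assumed patterns, all three example names from the thread satisfy `is_trunk_patch_name`, while the `.regioninfo` attachment fails both the trunk check and the extension check.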
