hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
Date Mon, 01 Aug 2011 21:07:14 GMT
This is the first time I've ever seen a thread here go off into what the fuckville. Bookmarked.
 
    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


>________________________________
>From: Doug Meil <doug.meil@explorysmedical.com>
>To: "dev@hbase.apache.org" <dev@hbase.apache.org>
>Sent: Sunday, July 31, 2011 4:17 PM
>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>
>
>re:  "Whirr"
>
>I'm sure Cloudera would love to hear your feedback on Whirr, but please
>address it to them directly and not on the Hbase dist-list.
>
>re:  "Is it really-really supported by Microsoft employees?!"
>
>It is really, really not.
>
>
>As you pointed out, and as is cited in the Apache Hbase book, Hbase was
>inspired from BigTable.  And as for the refund that Ryan suggested I'm
>sure Google would be happy to make good on it.  Watch out for the exchange
>rate at the border.
>
>
>
>
>On 7/31/11 6:38 PM, "Fuad Efendi" <fuad@efendi.ca> wrote:
>
>>Great,
>>
>>
>>
>>My VP is former MS-98 :)))
>>
>>
>>
>>(BTW, many thanks to http://www.cloudera.com/ employees; please make Whirr
>>really "rrrrr..." "open source"? I don't see any meaning... reinventing a
>>bike is cheaper nowadays!!!)
>>
>>
>>
>>
>>
>>
>>And, SEO of course: (remember: Google is SMARTER!!! You are just "clone".)
>>
>>==========================================================================
>>
>>
>>-- 
>>Fuad Efendi
>>416-993-2060
>>Tokenizer Inc., Canada
>>Data Mining, Search Engines
>>http://www.tokenizer.ca
>>
>>
>>
>>
>>
>>
>>
>>
>>On 11-07-31 6:21 PM, "Ryan Rawson" <ryanobjc@gmail.com> wrote:
>>
>>>You should ask for your money back!!
>>>
>>>On Sun, Jul 31, 2011 at 3:10 PM, Fuad Efendi <fuad@efendi.ca> wrote:
>>>> What is it all about? HBase sucks. Too many problems to newcomers,
>>>> few-weeks-warm-up to begin with!!!!!!!!!!!!!!!! Is it really-really
>>>> supported by Microsoft employees?!
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> And, SEO of course:
>>>> ===================
>>>>
>>>>
>>>> --
>>>> Fuad Efendi
>>>> 416-993-2060
>>>> Tokenizer Inc., Canada
>>>> Data Mining, Search Engines
>>>> http://www.tokenizer.ca
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 11-07-29 7:49 PM, "Otis Gospodnetic" <otis_gospodnetic@yahoo.com> wrote:
>>>>
>>>>>Hi,
>>>>>
>>>>>I'm for publishing all performance metrics in JMX (in addition to
>>>>>exposing it wherever else you guys decide).  That's because JMX is
>>>>>probably the easiest for our SPM for HBase [1] to get to HBase
>>>>>performance metrics and I suspect we are not alone.
>>>>>
>>>>>Otis
>>>>>[1] http://sematext.com/spm/hbase-performance-monitoring/index.html
>>>>>----
>>>>>Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
>>>>>Hadoop ecosystem search :: http://search-hadoop.com/
>>>>>
>>>>>
>>>>>
>>>>>>________________________________
>>>>>>From: Andrew Purtell <apurtell@apache.org>
>>>>>>To: Doug Meil <doug.meil@explorysmedical.com>; "dev@hbase.apache.org"
>>>>>><dev@hbase.apache.org>
>>>>>>Sent: Friday, July 29, 2011 4:34 PM
>>>>>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>>>>>
>>>>>>> I'd rather see this output being able to be captured by something
>>>>>>>like the sink that Todd suggested, rather than focusing on shell access.
>>>>>>
>>>>>>
>>>>>>I don't agree.
>>>>>>
>>>>>>
>>>>>>Look at what we have existing and proposed:
>>>>>>
>>>>>>    - Java API access to server and region load information, that
>>>>>>the shell uses
>>>>>>
>>>>>>    - A proposal to dump some stats into log files, that then has to
>>>>>>be scraped
>>>>>>
>>>>>>    - A proposal (by the FB guys) to export some JSON via a HTTP
>>>>>>servlet
>>>>>>
>>>>>>This is not good design, this is a bunch of random shit stuck
>>>>>>together.
>>>>>>
>>>>>>Note that what Todd proposed does not preclude adding Java client API
>>>>>>support for retrieving it.
>>>>>>
>>>>>>At a minimum all of this information must be accessible via the Java
>>>>>>client API, to enable programmatic monitoring and analysis use cases.
>>>>>>I'll add the shell support if nobody else cares about it, that is a
>>>>>>relatively small detail, but one I think is important.
>>>>>>
>>>>>>Best regards,
>>>>>>
>>>>>>
>>>>>>    - Andy
>>>>>>
>>>>>>
>>>>>>Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>>>>(via Tom White)
>>>>>>
>>>>>>
>>>>>>>________________________________
>>>>>>>From: Doug Meil <doug.meil@explorysmedical.com>
>>>>>>>To: "dev@hbase.apache.org" <dev@hbase.apache.org>;
>>>>>>>"apurtell@apache.org" <apurtell@apache.org>
>>>>>>>Sent: Friday, July 29, 2011 11:39 AM
>>>>>>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>>>>>>
>>>>>>>
>>>>>>>I'd rather see this output being able to be captured by something like
>>>>>>>the sink that Todd suggested, rather than focusing on shell access.
>>>>>>>HServerLoad is super-summary at the RS level, and both the items in
>>>>>>>4089 and 4147 are proposed to be "summarized" but still have reasonable
>>>>>>>detail (e.g., even with a table/CF summary there could be dozens of
>>>>>>>entries given a reasonably complex system).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>On 7/29/11 1:15 PM, "Andrew Purtell" <apurtell@apache.org> wrote:
>>>>>>>
>>>>>>>>There is also the matter of HServerLoad and how that is used by the
>>>>>>>>shell and master UI to report on cluster status.
>>>>>>>>
>>>>>>>>I'd like the shell to be able to let the user explore all of these
>>>>>>>>different reports interactively.
>>>>>>>>
>>>>>>>>At the very least, they should all be handled the same way.
>>>>>>>>
>>>>>>>>And then there is Riley's work over at FB on a slow query log. How
>>>>>>>>does that fit in?
>>>>>>>>
>>>>>>>>Best regards,
>>>>>>>>
>>>>>>>>
>>>>>>>>   - Andy
>>>>>>>>
>>>>>>>>Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>>>>>>(via Tom White)
>>>>>>>>
>>>>>>>>
>>>>>>>>>________________________________
>>>>>>>>>From: Todd Lipcon <todd@cloudera.com>
>>>>>>>>>To: dev@hbase.apache.org
>>>>>>>>>Sent: Friday, July 29, 2011 9:58 AM
>>>>>>>>>Subject: Re: HBASE-4089 & HBASE-4147 - on the topic of ops output
>>>>>>>>>
>>>>>>>>>What I'd prefer is something like:
>>>>>>>>>
>>>>>>>>>interface BlockCacheReportSink {
>>>>>>>>>  public void reportStats(BlockCacheReport report);
>>>>>>>>>}
>>>>>>>>>
>>>>>>>>>class LoggingBlockCacheReportSink {
>>>>>>>>>  ... {
>>>>>>>>>    log it with whatever formatting you want
>>>>>>>>>  }
>>>>>>>>>}
>>>>>>>>>
>>>>>>>>>then a configuration which could default to the logging
>>>>>>>>>implementation, but orgs could easily substitute their own
>>>>>>>>>implementation. For example, I could see wanting to do an
>>>>>>>>>implementation where it keeps local RRD graphs of some stats, or
>>>>>>>>>pushes them to a central management server.
>>>>>>>>>
>>>>>>>>>The assumption is that BlockCacheReport is a fairly straightforward
>>>>>>>>>"struct" with the non-formatted information available.
>>>>>>>>>
>>>>>>>>>-Todd
>>>>>>>>>
>>>>>>>>>On Fri, Jul 29, 2011 at 4:15 AM, Doug Meil
>>>>>>>>><doug.meil@explorysmedical.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Folks-
>>>>>>>>>>
>>>>>>>>>> You probably already saw my email yesterday on this...
>>>>>>>>>>  https://issues.apache.org/jira/browse/HBASE-4089 (block cache report)
>>>>>>>>>>
>>>>>>>>>> ...and I just created this one...
>>>>>>>>>>  https://issues.apache.org/jira/browse/HBASE-4147 (StoreFile query report)
>>>>>>>>>>
>>>>>>>>>> What I'd like to run past the dev-list is this: if Hbase had periodic
>>>>>>>>>> summary usage statistics, where should they go? What I'd like to throw
>>>>>>>>>> out for discussion is that I'm suggesting that it should simply go to
>>>>>>>>>> the log files and users can slice and dice this on their own. No UI
>>>>>>>>>> (i.e., JSPs), no JMX, etc.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The summary of the output is this:
>>>>>>>>>> BlockCacheReport:  on configured interval, print out summary of
>>>>>>>>>> blockcache (at table/CF level) to log file. This one is
>>>>>>>>>> point-in-time, not delta.
>>>>>>>>>>
>>>>>>>>>> StoreFile read report:  on configured interval, print out summary of
>>>>>>>>>> StoreFile accesses and how much time was spent reading each StoreFile
>>>>>>>>>> to log file.
>>>>>>>>>>
>>>>>>>>>> Thoughts?
>>>>>>>>>>
>>>>>>>>>> Doug
>>>>>>>>>>
>>>>>>>>>> >
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>--
>>>>>>>>>Todd Lipcon
>>>>>>>>>Software Engineer, Cloudera
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>
>>
>
>
>
>
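
For readers following the thread, the pluggable sink that Todd outlines could
be filled in roughly as below. This is only a sketch under stated assumptions,
not code from HBASE-4089 or from HBase itself: the fields of BlockCacheReport,
the InMemoryBlockCacheReportSink class, and the SinkFactory helper are all
hypothetical names chosen for illustration. Only the BlockCacheReportSink
interface and the LoggingBlockCacheReportSink name come from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.logging.Logger;

// Plain "struct" holding unformatted values. The field names are assumptions.
final class BlockCacheReport {
    final String table;
    final String columnFamily;
    final long blockCount;
    final long sizeBytes;

    BlockCacheReport(String table, String columnFamily, long blockCount, long sizeBytes) {
        this.table = table;
        this.columnFamily = columnFamily;
        this.blockCount = blockCount;
        this.sizeBytes = sizeBytes;
    }
}

// The sink interface as proposed in the thread.
interface BlockCacheReportSink {
    void reportStats(BlockCacheReport report);
}

// Default sink: formats the report and writes it to a log.
final class LoggingBlockCacheReportSink implements BlockCacheReportSink {
    private static final Logger LOG = Logger.getLogger("BlockCacheReport");

    // Formatting lives in the sink, so the report itself stays unformatted.
    static String format(BlockCacheReport r) {
        return String.format(Locale.ROOT, "table=%s cf=%s blocks=%d bytes=%d",
                r.table, r.columnFamily, r.blockCount, r.sizeBytes);
    }

    @Override
    public void reportStats(BlockCacheReport report) {
        LOG.info(format(report));
    }
}

// Hypothetical alternative sink: keeps reports in memory so that a client API,
// a shell command, or a JMX bean could read them back later.
final class InMemoryBlockCacheReportSink implements BlockCacheReportSink {
    private final List<BlockCacheReport> reports = new ArrayList<>();

    @Override
    public synchronized void reportStats(BlockCacheReport report) {
        reports.add(report);
    }

    synchronized List<BlockCacheReport> snapshot() {
        return new ArrayList<>(reports);
    }
}

// Hypothetical helper: instantiates the sink named in configuration.
// How the class name is read from configuration is out of scope here.
final class SinkFactory {
    static BlockCacheReportSink create(String className) throws ReflectiveOperationException {
        return Class.forName(className)
                .asSubclass(BlockCacheReportSink.class)
                .getDeclaredConstructor()
                .newInstance();
    }
}

class SinkDemo {
    public static void main(String[] args) throws Exception {
        BlockCacheReportSink sink =
                SinkFactory.create(LoggingBlockCacheReportSink.class.getName());
        sink.reportStats(new BlockCacheReport("t1", "cf1", 10L, 2048L));
    }
}
```

Because the report carries raw values and each sink decides what to do with
them, the same data could feed a log file, a servlet, or the Java client API,
which is the consistency Andy asks for above.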