accumulo-dev mailing list archives

From Eric Newton <eric.new...@gmail.com>
Subject Re: GSOC: Monitor Improvements
Date Mon, 22 Apr 2013 15:03:43 GMT
Presently the information is stored in memory and it certainly could be
stored in tables.

This reminds me of an idea that I've been thinking about for a long time.
It's a little aggressive to do in a single summer.

----

RRDTool stores time series data in fixed-length files.  One important
feature is the ability to compress time-series data into less-fine-grained
results over time.
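
As a rough illustration of that consolidation property, here is a minimal Python sketch. It is not RRDTool's file format or API; the class name RoundRobinArchive and its parameters are made up for this example, and averaging is just one possible consolidation function.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size ring of consolidated values (hypothetical, not RRDTool's format)."""

    def __init__(self, steps_per_row, rows):
        self.steps_per_row = steps_per_row  # raw points averaged into one stored row
        self.rows = deque(maxlen=rows)      # the oldest row is dropped once full
        self._pending = []                  # raw points waiting to be consolidated

    def update(self, value):
        self._pending.append(value)
        if len(self._pending) == self.steps_per_row:
            self.rows.append(sum(self._pending) / self.steps_per_row)
            self._pending.clear()


fine = RoundRobinArchive(steps_per_row=1, rows=4)    # the 4 most recent raw points
coarse = RoundRobinArchive(steps_per_row=4, rows=2)  # averages of 4 points, 2 kept

for v in range(1, 13):  # feed 12 updates: 1.0 .. 12.0
    for archive in (fine, coarse):
        archive.update(float(v))

print(list(fine.rows))
print(list(coarse.rows))
```

Both archives stay the same size no matter how many updates arrive; older data survives only in the coarser archive.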

However, updating many RRD files, with periodic updates, requires making
lots of small seeks and updates to individual files.  It works well when
all the files fit in the disk cache.  It falls down hard when they don't.

My idea is to give each collected data point its own Accumulo row, holding
the raw updates alongside a recent version of the data in RRD format:

Key                   Value
row, cf:cq
--------------------------------------------------------
point rrd:            [RRDTool data]
point ts:timestamp    value
point ts:timestamp    value
point ts:timestamp    value
point ts:timestamp    value
point ts:timestamp    value

When the tablet compacts, you use a Combiner to push the updates into the
RRD data:

Key                   Value
row, cf:cq
--------------------------------------------------------
point rrd:            [Updated RRDTool data]
point ts:timestamp    value
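
The compaction-time fold can be simulated in plain Python without any Accumulo API. This is only a sketch under assumptions: the function compact_row, the dict-based row, and the count/sum summary standing in for real RRD data are all hypothetical, and keeping the newest raw point un-folded is just one reading of the table above.

```python
RRD_KEY = ("rrd", "")  # (column family, column qualifier) of the summary entry


def compact_row(entries, keep_recent=1):
    """Fold older ts:<timestamp> entries of one row into the rrd summary.

    entries maps (cf, cq) -> value. The summary here is a simple
    count/sum pair, a stand-in for real RRD data (assumption).
    """
    summary = dict(entries.get(RRD_KEY, {"count": 0, "sum": 0.0}))
    ts_keys = sorted(k for k in entries if k[0] == "ts")  # ordered by timestamp
    split = max(len(ts_keys) - keep_recent, 0)
    to_fold, to_keep = ts_keys[:split], ts_keys[split:]

    for key in to_fold:  # push each older update into the summary
        summary["count"] += 1
        summary["sum"] += entries[key]

    result = {RRD_KEY: summary}
    result.update((k, entries[k]) for k in to_keep)  # newest raw points survive
    return result


row = {
    ("rrd", ""): {"count": 2, "sum": 10.0},
    ("ts", 100): 4.0,
    ("ts", 110): 6.0,
    ("ts", 120): 8.0,
}
compacted = compact_row(row)
print(compacted)
```

Because the kept point is not folded until a later compaction, no update is counted twice.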

Further, when you scan the data, you could use an RRD iterator to perform
queries on the RRD format, extracting only the summary/graph/data you want.

This leverages the Accumulo write-ahead log and the efficiency of
log-structured merge trees to defer RRD updates to a point where they can
be done efficiently (with respect to disk seeks), and even the block cache
to access recently read information quickly.  And the data won't grow
indefinitely, due to the properties of the RRD storage format.

Sadly, RRDTool does not have a Java API.  But there appear to be Java-based
substitutes; I have no idea if they are license compatible.

OpenTSDB does something similar: they compress updates into blocks of
updates in hourly chunks, converting many small records into one larger
one.  Their scheme does not lose data, which was important to them.
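
A lossless hourly grouping along those lines might look like the following Python sketch. It does not reproduce OpenTSDB's actual row or column encoding; pack_hourly, unpack, and the (timestamp, value) tuples are assumptions made for illustration.

```python
from collections import defaultdict

HOUR = 3600  # seconds


def pack_hourly(points):
    """Group (timestamp, value) points into one block per hour.

    Each block stores (offset_within_hour, value) pairs, so nothing is lost.
    """
    blocks = defaultdict(list)
    for ts, value in sorted(points):
        base = ts - ts % HOUR  # start of the hour this point belongs to
        blocks[base].append((ts - base, value))
    return dict(blocks)


def unpack(blocks):
    """Rebuild the original points, in timestamp order, from the blocks."""
    return [(base + off, v)
            for base, pairs in sorted(blocks.items())
            for off, v in pairs]


points = [(7200, 1.5), (3600, 0.5), (3660, 0.7), (7260, 1.9)]
blocks = pack_hourly(points)
print(blocks)
```

Four small records become two larger ones, and unpack(blocks) returns every original point, unlike the averaging an RRD performs.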


-Eric



On Mon, Apr 22, 2013 at 10:33 AM, Supun Kamburugamuva <supun06@gmail.com> wrote:

> I can see how summaries are very helpful to a user. We can introduce new
> fields to the existing table/tablet summary tables that display problem
> information etc.
>
> To make the JMX polling time configurable we can introduce configuration
> parameters.
>
> For the JMX statistics we can keep data at the server for a constant time
> to avoid memory growth. I think the stats are stored in memory (please
> correct me if I'm wrong). If that is the case, is it possible to store them
> in accumulo tables?
>
> Thanks,
> Supun...
>
> On Mon, Apr 22, 2013 at 9:05 AM, Eric Newton <eric.newton@gmail.com>
> wrote:
>
> > Another thing to consider is scale.  On large clusters (many hundreds of
> > nodes), more data is not helpful for visualization.  Instead, summaries,
> > averages and outliers are important.
> >
> > For example, if one node is consistently slow, it is better to know that
> > than to see one graph with low numbers in a sea of graphs.
>
>
> > If the monitor collects information using JMX, collection time for each
> > node would be a good thing to know, too.
> >
>
>
>
>
> >
> > -Eric
> >
> >
> > On Sun, Apr 21, 2013 at 10:00 PM, Josh Elser <josh.elser@gmail.com>
> wrote:
> >
> > > Supun,
> > >
> > > Yup, very much so. Having a way to consume any and all metrics via JMX
> > > would simplify things for any consumers (internal or external).
> > >
> > >
> > >
> > > On 04/21/2013 02:15 PM, Supun Kamburugamuva wrote:
> > >
> > >> Hi Josh,
> > >>
> > >> Thanks for the suggestions. I'll incorporate these to the proposal.
> > >>
> > >> Another area I would like to work on is JMX. There is a Jira that says
> > >> to replace the Monitor calls from Thrift to JMX (Accumulo 694). Do you
> > >> think this is a good addition to the Monitor?
> > >>
> > >> Thanks,
> > >> Supun..
> > >>
> > >>
> > >> On Sun, Apr 21, 2013 at 1:45 PM, Josh Elser <josh.elser@gmail.com>
> > wrote:
> > >>
> > >>  Supun,
> > >>>
> > >>> Looks good! Can I make some suggestions/comments?
> > >>>
> > >>> For: "Per table plots: ACCUMULO-594", I'd also like to see minor
> > >>> compactions, major compactions, index cache hit rate, and data cache
> > >>> hit rate per table (the same graphs that are displayed system-wide
> > >>> when you visit http://${MONITOR_HOST}:50095/).
> > >>>
> > >>> For "Per tablet [server] plots", it would be neat if you could also
> > >>> extract some general statistics like top N least performing, top N
> > >>> highest performing, etc. tablet servers. Ideally, this could correlate
> > >>> with servers that may be having problems :).
> > >>>
> > >>> Do you see these proposed changes as being sufficient for 3-4 months
> > >>> of 40hrs/week work? If you plan to really dig into these changes
> > >>> (perhaps reworking components of the monitor itself), I could perhaps
> > >>> see this. Do you have any ideas for more lofty goals that you could
> > >>> pursue as well? I don't want you/us to get one month into things and
> > >>> see you complete everything we initially planned to accomplish :)
> > >>>
> > >>> - Josh
> > >>>
> > >>>
> > >>> On 04/21/2013 10:37 AM, Supun Kamburugamuva wrote:
> > >>>
> > >>>  Hi all,
> > >>>>
> > >>>> I would like to start writing the proposal for the GSoC. I've put
> > >>>> together some initial high-level goals of the project. Please let me
> > >>>> know what I can improve.
> > >>>>
> > >>>> Per table plots: Accumulo 594
> > >>>> ---------------------
> > >>>>
> > >>>> The goal of this is to display plots that explain the various
> > >>>> activities that happen per table. When we go to the tables page of the
> > >>>> monitor and go to a specific table, it displays some information in a
> > >>>> table format. We can augment this information by showing graphs for
> > >>>>
> > >>>> 1. Ingest entries
> > >>>> 2. Ingest data size
> > >>>> 3. Scan entries
> > >>>> 4. Scan data size
> > >>>>
> > >>>> Per tablet plots
> > >>>> ----------------------
> > >>>>
> > >>>> Same as in the table plots, we can display information regarding tablet
> > >>>> servers in the tablet server page. The plots will display the same
> > >>>> information as table plots considering data per tablet server.
> > >>>>
> > >>>> Trace Visualization: Accumulo 1198
> > >>>> ----------------------------
> > >>>>
> > >>>> Since we are displaying graphs about each tablet and each table, we can
> > >>>> add major and minor compaction graphs to each table and each tablet.
> > >>>>
> > >>>> Another option is to display this in a single graph in the overview
> > >>>> page, with different graph lines for different tables and tablets.
> > >>>>
> > >>>> Server type information : Accumulo 807
> > >>>> ---------------------------------
> > >>>>
> > >>>> For displaying this information we can add a new page and display the
> > >>>> information as a table. The table should specify the network address
> > >>>> of the server, the server type, whether it is active or inactive, etc.
> > >>>>
> > >>>> Thanks,
> > >>>> Supun...
> > >>>>
> > >>>>
> > >>>>
> > >>
> > >
> >
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>
