From: Bill Graham
Reply-To: billgraham@gmail.com
Date: Sun, 29 May 2011 15:25:46 -0700
Subject: Re: Using Chukwa for Web Analytics
To: chukwa-user@incubator.apache.org

We use Chukwa for near-real-time trending, which is conceptually similar to
near-real-time anomaly detection.

We use Chukwa agents, collectors and Demux to collect log data in 5-minute
increments, and we then run MapReduce (MR) jobs on that data, as Ari
describes. It works well for us.

On Sun, May 29, 2011 at 10:02 AM, Ariel Rabkin <asrabkin@gmail.com> wrote:
> My impression is that web log analysis is the main use that people are
> putting Chukwa to.
> The idea is that you scoop up web logs, throw them into HDFS, and then
> run Pig jobs.
>
> --Ari
>
> On Sun, May 29, 2011 at 4:39 AM, Amos Shapira <amos.shapira@gmail.com> wrote:
> > In case this interests anyone - I'm following Chukwa for such purposes too.
> > Not just Google Analytics-like but also hoping to use it for near real time
> > anomaly detection...
> >
> > On 29 May 2011 18:19, Nikola Veber <nikola.veber@gmail.com> wrote:
> >>
> >> Hello,
> >>
> >> I have just discovered Chukwa, and after the initial feeling that it
> >> would be a great tool to process large quantities of web-logs and
> >> generate statistics like google analytics and co, I started searching
> >> the web for hints - but I couldn't find any clue regarding this.
> >>
> >> Has anyone tried using Chukwa for Web-Analytics, or do you know any
> >> a-priori limitations which speak against using it in this manner?
> >>
> >>
> >> Thanks,
> >> NIkola
> >
> >
>
> --
> Ari Rabkin asrabkin@gmail.com
> UC Berkeley Computer Science Department