Date: Mon, 9 Aug 2010 17:59:51 -0700
Subject: Re: Improvement for Chukwa Agent and Collector
From: Ariel Rabkin
To: chukwa-dev@hadoop.apache.org

Proposal overall sounds useful. I like versioning.

--Ari

On Mon, Aug 9, 2010 at 5:03 PM, Eric Yang wrote:
> I'd like to have /v1/ at least, to identify the URL version, just to be
> safe in case we change the URL in the future. Having / and /tool point to
> an information UI makes sense.
>
> Regards,
> Eric
>
> On 8/9/10 2:59 PM, "Bill Graham" wrote:
>
>> I generally feel that all params should be able to be passed either entirely
>> in the body or entirely in the URI, regardless of which ones are
>> required/optional (with the exception of the asset id, which typically is in
>> the path regardless). I vote for passing them all in the body as a JSON blob
>> in this case (if Content-Type is set to application/json, that is).
>>
>> Thinking more about the base path to the API that I proposed, perhaps the
>> /v1.0 in the URL is overkill. I could go for removing that part. The /rest
>> path has value to me, though, because I could see keeping '/' or '/tool'
>> to potentially point to an HTML summary page or mini-UI at some point.
>>
>>
>>
>> On Mon, Aug 9, 2010 at 2:42 PM, Eric Yang wrote:
>>> Hi Bill,
>>>
>>> I like your design better. +1 on the revised version.
>>> RecordType and Adaptor are required parameters; would it make sense to put
>>> them in the path parameters for POST?
>>>
>>> Regards,
>>> Eric
>>>
>>> On 8/9/10 11:33 AM, "Bill Graham" wrote:
>>>
>>>> I agree that we should implement the features you suggest. I've been
>>>> thinking about a REST API for the agents lately, as I'd also like to be able
>>>> to expose statistics to help with monitoring. Something similar to what the
>>>> collector does, so you can attach monitoring to a URL and see if the average
>>>> data rate suddenly drops.
>>>>
>>>> Regarding the proposed API protocol, I think we should use POST, GET and
>>>> DELETE to create, fetch and remove adaptors, similar to how you propose, but
>>>> the identifier in the REST resource should be the adaptor id, not the
>>>> filename. This is more RESTful since the adaptor is the thing being
>>>> accessed, not the file. Also, you could have more than one adaptor on a
>>>> given file, and some adaptors (e.g., JMSAdaptor) don't have a file associated
>>>> with them.
>>>>
>>>> I propose something like this:
>>>>
>>>> - Add Adaptor:
>>>>
>>>> POST /rest/v1.0/adaptor HTTP/1.0
>>>> Accept: text/plain
>>>> Content-Type: application/json
>>>> { "RecordType" : "jvm", "Cluster": "demo", adaptor configs including offset,
>>>> other tags ... }
>>>>
>>>> Returns: adaptor metadata including id
>>>>
>>>> - Get Adaptor fcb0fe44e9dd6d2283962cb0e3b4ea0f:
>>>>
>>>> GET /rest/v1.0/adaptor/fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
>>>>
>>>> - Remove Adaptor fcb0fe44e9dd6d2283962cb0e3b4ea0f:
>>>>
>>>> DELETE /rest/v1.0/adaptor/fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
>>>>
>>>> - List all adaptors:
>>>> GET /rest/v1.0/adaptor HTTP/1.0
>>>>
>>>> - Help
>>>> GET /rest/v1.0/help HTTP/1.0
>>>>
>>>> - Statistics for all adaptors
>>>> GET /rest/v1.0/adaptorStats HTTP/1.0
>>>>
>>>> - Statistics for a single adaptor
>>>> GET /rest/v1.0/adaptorStats/fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
>>>>
>>>> Thoughts?
>>>>
>>>> thanks,
>>>> Bill
>>>>
>>>> On Mon, Aug 9, 2010 at 10:01 AM, Eric Yang wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Chukwa Agent has a custom command protocol (port 9093). The current
>>>>> protocol is not easy to modify to implement security-related features such
>>>>> as authentication and authorization. I would like to propose that we use
>>>>> a REST-like web service protocol to improve security and be more aligned
>>>>> with web standards. Let's go through the use cases of the Chukwa Agent
>>>>> command protocol:
>>>>>
>>>>> Start an adaptor:
>>>>>
>>>>> Current command: Add
>>>>> org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped
>>>>> /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log 0
>>>>>
>>>>> Proposed:
>>>>> POST /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
>>>>> Accept: chukwa/UTF8NewLineEscaped (optional)
>>>>> Offset: 0 (optional)
>>>>> Content-Type: application/json
>>>>> { "RecordType" : "jvm", "Cluster": "demo", other tags ... }
>>>>>
>>>>> List adaptors:
>>>>>
>>>>> Current command: List
>>>>>
>>>>> Proposed:
>>>>> GET / HTTP/1.0
>>>>> Accept: text/html
>>>>> Get list of information about all streaming adaptors
>>>>>
>>>>> HEAD /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
>>>>> or
>>>>> HEAD /adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
>>>>> Get information about the streaming adaptor only.
>>>>>
>>>>> Stop adaptors:
>>>>>
>>>>> Current command: Stop adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f
>>>>>
>>>>> Proposed:
>>>>> DELETE /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
>>>>> or
>>>>> DELETE /adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
>>>>> Delete the adaptor
>>>>>
>>>>> Help:
>>>>> Current command: Help
>>>>>
>>>>> Proposed:
>>>>> GET /help HTTP/1.0
>>>>> Accept: text/html
>>>>>
>>>>> With this modification, we can support encryption and Basic/Digest
>>>>> Authentication from existing libraries without reinventing the wheel. If
>>>>> the community is ok with this change, I would like to propose the next
>>>>> improvement:
>>>>>
>>>>> Chukwa Agent and collectors are two different feature sets, but there
>>>>> shouldn't be any roadblock to building a switch that toggles the machine
>>>>> to serve different responsibilities. For example, a Chukwa agent machine
>>>>> can flip a switch to join the collector pool and continue to stream data
>>>>> from itself. With this improvement, it is easier to dynamically create a
>>>>> bigger data collection pipeline on the fly. Both systems use the same
>>>>> communication protocol, hence it is easier to manage. In the future, we
>>>>> can add additional commands like TRACE /config/reload to reload
>>>>> configuration, and tap into ZooKeeper for managing data flow in
>>>>> centralized configuration management.
>>>>>
>>>>> Any thoughts?
>>>>>
>>>>> Regards,
>>>>> Eric
>>>>>
>>>>
>>>
>>
>

-- 
Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department
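
For readers following the thread, here is a rough client-side sketch of the adaptor-id-based REST layout Bill proposes above (POST/GET/DELETE under /rest/v1.0/adaptor, plus adaptorStats). It only builds the requests and sends nothing. The base URL, port and function names are assumptions made for illustration; the thread does not settle where the REST endpoint would listen or what any client library would look like.

```python
import json
import urllib.request

# Hypothetical base URL: the thread does not say which host or port the
# REST endpoint would use, so this value is purely illustrative.
BASE = "http://localhost:9090/rest/v1.0"


def add_adaptor(config, base=BASE):
    """Build the POST request that creates an adaptor from a JSON body."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        base + "/adaptor",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "Accept": "text/plain"},
    )


def get_adaptor(adaptor_id, base=BASE):
    """Build the GET request for a single adaptor, addressed by its id."""
    return urllib.request.Request(base + "/adaptor/" + adaptor_id, method="GET")


def remove_adaptor(adaptor_id, base=BASE):
    """Build the DELETE request that removes an adaptor by id."""
    return urllib.request.Request(base + "/adaptor/" + adaptor_id, method="DELETE")


def list_adaptors(base=BASE):
    """Build the GET request that lists all adaptors."""
    return urllib.request.Request(base + "/adaptor", method="GET")


def adaptor_stats(adaptor_id=None, base=BASE):
    """Build the GET request for statistics: all adaptors, or one if an id is given."""
    path = "/adaptorStats" if adaptor_id is None else "/adaptorStats/" + adaptor_id
    return urllib.request.Request(base + path, method="GET")
```

Passing any of these to urllib.request.urlopen would send the request. Because the resource is keyed on the adaptor id rather than a file path, the same calls would also cover adaptors with no file behind them, which is the point Bill raises about JMSAdaptor.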