Date: Tue, 5 Jan 2010 15:07:14 -0500
Subject: Re: who transfers the data?
From: Ariel Rabkin
To: chukwa-user@hadoop.apache.org

There is one connection from the agent to a collector, that carries
all the data coming from that agent.

A collector can saturate the single-writer bandwidth of HDFS -- call
it 20-40 MB/sec. If you have a single agent writing a large fraction
of that, say more than 4 MB/sec, then Chukwa is probably not solving
your problem anyway.

On Tue, Jan 5, 2010 at 2:51 PM, Corbin Hoenes wrote:
> It's a little unclear to me who is transferring the chunks to the
> collectors. Does each adaptor have a connection or does the agent have a
> single connection to the collector? For example if I have 10 log files
> that I am tailing (an adaptor for each) do they all go to the same collector
> or does it distribute those to any one of the collectors I have listed in my
> collectors file?
> http://hadoop.apache.org/chukwa/docs/current/design.html#Collectors
> "Rather than have each adaptor write directly to HDFS, data is sent across
> the network to a collector process, that does the HDFS writes. Each
> collector receives data from up to several hundred hosts, and writes all
> this data to a single sink file, which is a Hadoop sequence file of
> serialized Chunks. Periodically, collectors close their sink files, rename
> them to mark them available for processing, and resume writing a new file.
> Data is sent to collectors over HTTP."
>
> Corbin Hoenes
> corbin@tynt.com
> skype: choenes

-- 
Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department
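
For a rough sense of scale, the figures quoted in the reply (a collector topping out at roughly 20-40 MB/sec of HDFS write bandwidth, and 4 MB/sec being a heavy single agent) can be turned into a back-of-envelope estimate. This is only a sketch under those assumptions: the function names below are made up for illustration and are not part of Chukwa, and the rates are the approximate ones from the message, not measurements.

```python
import math


def agents_per_collector(collector_mb_per_sec: float, agent_mb_per_sec: float) -> int:
    """Estimate how many agents one collector can absorb.

    Assumes every agent sends at the same steady rate over its single
    connection, and that the collector's only limit is its HDFS write
    bandwidth (hypothetical simplification).
    """
    if collector_mb_per_sec <= 0 or agent_mb_per_sec <= 0:
        raise ValueError("rates must be positive")
    return math.floor(collector_mb_per_sec / agent_mb_per_sec)


def collectors_needed(num_agents: int, agent_mb_per_sec: float,
                      collector_mb_per_sec: float) -> int:
    """Estimate the minimum number of collectors for a fleet of agents."""
    if num_agents < 0:
        raise ValueError("num_agents must be non-negative")
    if collector_mb_per_sec <= 0 or agent_mb_per_sec <= 0:
        raise ValueError("rates must be positive")
    total_mb_per_sec = num_agents * agent_mb_per_sec
    # Always keep at least one collector, even for a nearly idle fleet.
    return max(1, math.ceil(total_mb_per_sec / collector_mb_per_sec))


if __name__ == "__main__":
    # A heavy 4 MB/sec agent against the low and high ends of the quoted range.
    print(agents_per_collector(20, 4))
    print(agents_per_collector(40, 4))
    # 100 agents at 1 MB/sec each, with collectors at the high end.
    print(collectors_needed(100, 1, 40))
```

With these numbers, a collector at the low end of the range would be filled by about five agents writing 4 MB/sec each, which is one way to read the remark that such a workload is probably not what Chukwa is meant for.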