Date: Mon, 27 Oct 2008 12:51:13 -0700
From: "Jeff Hammerbacher"
To: core-user@hadoop.apache.org
Subject: Re: LHadoop Server simple Hadoop input and output

It could, but we have been unable to get Chukwa to run outside of Yahoo.

On Fri, Oct 24, 2008 at 12:26 PM, Pete Wyckoff wrote:
>
> Chukwa also could be used here.
>
>
> On 10/24/08 11:47 AM, "Jeff Hammerbacher" wrote:
>
> Hey Edward,
>
> The application we used at Facebook to transmit new data is open
> source now and available at
> http://sourceforge.net/projects/scribeserver/.
>
> Later,
> Jeff
>
> On Fri, Oct 24, 2008 at 10:14 AM, Edward Capriolo wrote:
>> I came up with my line of thinking after reading this article:
>>
>> http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
>>
>> I am a guy who was intrigued by the Java coffee cup in '95 and who now
>> lives as a data center/NOC jock/Unix guy. Let's say I look at a log
>> management process from a data center perspective. I know:
>>
>> Syslog is a familiar model (human readable: UDP text)
>> INETD/XINETD is a familiar model (programs that do amazing things with
>> STDIN/STDOUT)
>> Variety of hardware and software
>>
>> I may be supporting an older Solaris 8, Windows, or FreeBSD 5, for example.
>>
>> I want to be able to pipe an Apache custom log to HDFS, or forward
>> syslog. That is where LHadoop (or something like it) would come into
>> play.
>>
>> I am even thinking of accepting raw streams and having the server side
>> use source-host/regex to determine what file the data should go to.
>>
>> I want to stay light on the client side. An application that tails log
>> files and transmits new data is another component to develop and
>> manage. Has anyone had experience with moving this type of data?
>>
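
For illustration only: the server-side idea in Edward's message (use the
source host plus a regex to decide which file an incoming line belongs to)
might be sketched as below. This is a minimal Python sketch, not code from
LHadoop, Scribe, or Chukwa. The rule table, host names, patterns, and target
paths are all hypothetical, and writing the routed line to HDFS is left out.

```python
import re
from typing import List, NamedTuple


class Rule(NamedTuple):
    """One routing rule: host regex + line regex -> target file (all hypothetical)."""
    host_pattern: re.Pattern
    line_pattern: re.Pattern
    target: str


# Hypothetical rule table; first matching rule wins.
RULES: List[Rule] = [
    # Apache access-log lines coming from web servers.
    Rule(re.compile(r"^web\d+\.example\.com$"),
         re.compile(r'"(GET|POST) '),
         "/logs/apache/access.log"),
    # sshd messages forwarded by syslog from any host.
    Rule(re.compile(r".*"),
         re.compile(r"sshd\["),
         "/logs/syslog/auth.log"),
]

# Hypothetical catch-all for lines that match no rule.
DEFAULT_TARGET = "/logs/unmatched.log"


def route(source_host: str, line: str, rules: List[Rule] = RULES) -> str:
    """Return the target file for a raw line received from source_host."""
    for rule in rules:
        if rule.host_pattern.search(source_host) and rule.line_pattern.search(line):
            return rule.target
    return DEFAULT_TARGET
```

With these assumed rules, an access-log line from web12.example.com is
routed to /logs/apache/access.log, and a line matching no rule falls through
to /logs/unmatched.log. Keeping the rules on the server is what lets the
client stay as light as a plain syslog forwarder or a pipe.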