hadoop-common-user mailing list archives

From "Jeff Hammerbacher" <jeff.hammerbac...@gmail.com>
Subject Re: LHadoop Server simple Hadoop input and output
Date Mon, 27 Oct 2008 19:51:13 GMT
It could, but we have been unable to get Chukwa to run outside of Yahoo.

On Fri, Oct 24, 2008 at 12:26 PM, Pete Wyckoff <pwyckoff@facebook.com> wrote:
>
> Chukwa also could be used here.
>
>
> On 10/24/08 11:47 AM, "Jeff Hammerbacher" <jeff.hammerbacher@gmail.com> wrote:
>
> Hey Edward,
>
> The application we used at Facebook to transmit new data is open
> source now and available at
> http://sourceforge.net/projects/scribeserver/.
>
> Later,
> Jeff
>
> On Fri, Oct 24, 2008 at 10:14 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>> I came up with my line of thinking after reading this article:
>>
>> http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
>>
>> As a guy who was intrigued by the Java coffee cup in '95 and who now
>> lives as a data center/NOC jock/Unix guy, let's say I look at a log
>> management process from a data center perspective. I know:
>>
>> Syslog is a familiar model (human-readable text over UDP)
>> inetd/xinetd is a familiar model (programs that do amazing things with
>> stdin/stdout)
>> There is a variety of hardware and software in the mix
>>
>> I may be supporting an older Solaris 8, Windows, or FreeBSD 5 box, for example.
>>
>> I want to be able to pipe an Apache custom log to HDFS, or forward
>> syslog to it. That is where LHadoop (or something like it) would come
>> into play.
>>
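A minimal sketch of what such a listener might look like, assuming only Hadoop's FileSystem API; the class name, port, and target path below are illustrative, not LHadoop's actual design:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.ServerSocket;
    import java.net.Socket;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Illustrative only: a tiny line-oriented listener that takes whatever is
    // piped at it (an Apache custom log, forwarded syslog) and writes it to HDFS.
    public class LogLineSink {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // picks up fs.default.name
            FileSystem fs = FileSystem.get(conf);
            Path target = new Path("/logs/raw/" + System.currentTimeMillis() + ".log");
            FSDataOutputStream out = fs.create(target);

            ServerSocket server = new ServerSocket(9999);   // arbitrary port
            Socket client = server.accept();                // one sender at a time, for brevity
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(client.getInputStream(), "UTF-8"));

            String line;
            while ((line = in.readLine()) != null) {
                out.write((line + "\n").getBytes("UTF-8"));  // one log line per record
            }
            out.close();    // the file becomes visible in HDFS once closed
            client.close();
            server.close();
        }
    }

On the sending side this keeps the client as thin as described above; something like tail -F on the access log piped into netcat pointed at the listener's port would be enough.
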
>> I am even thinking of accepting raw streams and having the server side
>> use source-host/regex rules to determine which file the data should go to.
>>
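A sketch of that routing idea, again only illustrative (the rule patterns and target paths are made up):

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.regex.Pattern;

    import org.apache.hadoop.fs.Path;

    // Illustrative only: choose a target HDFS file from the sender's hostname
    // using an ordered list of regex rules; the first match wins.
    public class SourceRouter {
        private final Map<Pattern, String> rules = new LinkedHashMap<Pattern, String>();

        public SourceRouter() {
            rules.put(Pattern.compile("^web\\d+\\..*"), "/logs/apache/");   // made-up rule
            rules.put(Pattern.compile("^db\\d+\\..*"), "/logs/mysql/");     // made-up rule
        }

        public Path route(String sourceHost) {
            for (Map.Entry<Pattern, String> rule : rules.entrySet()) {
                if (rule.getKey().matcher(sourceHost).matches()) {
                    return new Path(rule.getValue() + sourceHost + ".log");
                }
            }
            return new Path("/logs/unsorted/" + sourceHost + ".log");  // catch-all
        }
    }
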
>> I want to stay light on the client side. An application that tails log
>> files and transmits new data is another component to develop and
>> manage. Has anyone had experience with moving this type of data?
>>
>
>
>
