hadoop-common-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Feedback on real world production experience with Flume
Date Sun, 22 Apr 2012 14:14:47 GMT
I think this is valid to talk about. For example, one does not need a
decentralized collector if one can just write logs directly to
decentralized files in a decentralized file system. In any case, it was
not even a hard vendor pitch. It was someone describing how they
handle centralized logging. It stated facts and it was informative.

Let's face it: if fuse-mounting HDFS or directly soft-mounting NFS worked
in a way that performs well, many of the use cases for Flume- and
Scribe-like tools would be gone (not all, but many).
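
To make the idea concrete, here is a rough sketch of an application
logging straight to a locally mounted cluster file system with no
collector agent in between. It assumes the file system is already
mounted (via NFS or FUSE) at a directory you pass in; the function name
and the paths are hypothetical, and it does nothing about a stalled or
missing mount beyond refusing to start.

```python
import logging
import os


def make_direct_logger(mount_dir, app_name):
    """Return a logger that appends to <mount_dir>/<app_name>.log.

    mount_dir is assumed to be an NFS or FUSE mount of the cluster
    file system; nothing here is specific to any one product.
    """
    if not os.path.isdir(mount_dir):
        # Fail early rather than silently writing to a local directory
        # when the mount is not there.
        raise FileNotFoundError(f"mount point not available: {mount_dir}")

    logger = logging.getLogger(f"direct.{app_name}")
    logger.setLevel(logging.INFO)
    logger.propagate = False

    # Attach the file handler only once per logger name.
    if not logger.handlers:
        path = os.path.join(mount_dir, f"{app_name}.log")
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Usage would be something like make_direct_logger("/mnt/cluster/logs",
"web01").info("GET /index.html 200"), where /mnt/cluster/logs is a
made-up mount point.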

I never knew there was a rule against discussing alternative software on
a mailing list. It seems like a closed-minded thing. I also doubt the
ASF would back a rule like that. Are we not allowed to talk about EMR
or S3, or am I not even allowed to mention S3?

Can Flume run on EC2 and log to S3? (Oops, party foul. I guess I can't ask that.)


On Sun, Apr 22, 2012 at 12:59 AM, Alexander Lorenz
<wget.null@googlemail.com> wrote:
> No. That is the Flume open source mailing list, not a vendor list.
> NFS logging has nothing to do with decentralized collectors like Flume, JMS or Scribe.
> sent via my mobile device
> On Apr 22, 2012, at 12:23 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>> It seems pretty relevant. If you can directly log via NFS that is a
>> viable alternative.
>> On Sat, Apr 21, 2012 at 11:42 AM, alo alt <wget.null@googlemail.com> wrote:
>>> We decided NO product and vendor advertising on apache mailing lists!
>>> I do not understand why you bring that closed-source stuff from your employer
>>> into the discussion. It has nothing to do with Flume or the use cases!
>>> --
>>> Alexander Lorenz
>>> http://mapredit.blogspot.com
>>> On Apr 21, 2012, at 4:06 PM, M. C. Srivas wrote:
>>>> Karl,
>>>> since you did ask for alternatives, people using MapR prefer to use the
>>>> NFS access to directly deposit data (or access it). It works seamlessly
>>>> from all Linuxes, Solaris, Windows, AIX and a myriad of other legacy
>>>> systems without having to load any agents on those machines. And it is
>>>> fully automatic HA.
>>>> Since compression is built into MapR, the data gets compressed coming in
>>>> over NFS automatically without much fuss.
>>>> With respect to performance, you can get about 870 MB/s per node if you
>>>> have 10GigE attached (of course, with compression, the effective
>>>> throughput will surpass that, depending on how well the data can be
>>>> squeezed).
>>>> On Fri, Apr 20, 2012 at 3:14 PM, Karl Hennig <khennig@baynote.com> wrote:
>>>>> I am investigating automated methods of moving our data from the web
>>>>> into HDFS for processing, a process that's performed periodically.
>>>>> I am looking for feedback from anyone who has actually used Flume in
>>>>> a production setup (redundant, failover) successfully.  I understand it
>>>>> is now being largely rearchitected during its incubation as Apache
>>>>> Flume-NG, so I don't have full confidence in the old, stable releases.
>>>>> The other option would be to write our own tools.  What methods are
>>>>> you using for these kinds of tasks?  Did you write your own, or does
>>>>> Flume (or something else) work for you?
>>>>> I'm also on the Flume mailing list, but I wanted to ask these questions
>>>>> here because I'm interested in Flume _and_ alternatives.
>>>>> Thank you!
