flume-user mailing list archives

From Guyle Taber <gu...@gmtech.net>
Subject Re: flume sink and substitution variables
Date Fri, 29 Jul 2016 00:55:57 GMT
Thanks. Yeah, we're actually capturing JSON POST data in the Apache logs (not GET data), so
at this point there is no hostname in the payload. We'd have to figure out a way to derive
it from the virtual host.
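
One possible way to derive it, sketched under assumptions: if each virtual host writes to
its own log file and is read by its own Flume source, a static interceptor could stamp a
fixed header on every event from that source. The agent name (a1), the source and sink
names (r1, r2, k1), and the header name (vhost) are placeholders, not an existing setup.

```properties
# Assumption: r1 reads only vhost1's log and r2 reads only vhost2's log.
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = static
a1.sources.r1.interceptors.i1.key = vhost
a1.sources.r1.interceptors.i1.value = vhost1.example.com

a1.sources.r2.interceptors = i1
a1.sources.r2.interceptors.i1.type = static
a1.sources.r2.interceptors.i1.key = vhost
a1.sources.r2.interceptors.i1.value = vhost2.example.com

# The HDFS sink substitutes %{vhost} with the value of the "vhost" event header.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /hdfs/logdata/%{vhost}
```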

> On Jul 28, 2016, at 5:33 PM, Ahmed Vila <ahmed.vila@symphony.is> wrote:
> 
> I didn't quite get it. Are you ingesting the Apache access log, or something else?
> 
> Either way, there is a regex_extractor interceptor that you can configure to extract the
> hostname into an event header of your choice (e.g. ApacheVirtualHostname, referenced in
> the sink path as %{ApacheVirtualHostname}). Of course, your event payload has to contain
> the vhost FQDN.
> https://flume.apache.org/FlumeUserGuide.html#regex-extractor-interceptor
> 
> Then you can use that header in the HDFS sink path, as you described.
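> 
> A minimal sketch of that setup, assuming the Apache LogFormat is changed so each log line
> starts with %v (the virtual host name) followed by a space. The agent, source, and sink
> names (a1, r1, k1) and the header name (vhost) are placeholders, not from your config:
> 
> ```properties
> # Capture the first space-delimited token of each event body into the "vhost" header.
> a1.sources.r1.interceptors = i1
> a1.sources.r1.interceptors.i1.type = regex_extractor
> a1.sources.r1.interceptors.i1.regex = ^([^ ]+)
> a1.sources.r1.interceptors.i1.serializers = s1
> a1.sources.r1.interceptors.i1.serializers.s1.name = vhost
> 
> # %{vhost} is replaced with the value of the "vhost" event header.
> a1.sinks.k1.type = hdfs
> a1.sinks.k1.hdfs.path = /hdfs/logdata/%{vhost}
> ```
> 
> Events whose body does not match the regex get no vhost header, so it may be worth
> checking where those end up before relying on the path.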
> 
> 
>> On Fri, Jul 29, 2016 at 12:57 AM, Guyle M. Taber <guyle@gmtech.net> wrote:
>> I’m trying to determine if I can use a substitution variable in the hdfs file path
>> that is derived from the apache virtual host name that is called on a web server listening
>> as multiple vhost names. Where is the substitution variable %host deriving that value and
>> is there another var I can use? Or can I use an interceptor to somehow extract the apache
>> virtual hostname called?
>> 
>> For instance, a single web server is hosting 3 virtual hosts.
>> 
>> vhost1.example.com
>> vhost2.example.com
>> vhost3.example.com
>> 
>> Can a single sink hdfs path be customized based on the vhost (not the server’s
>> system hostname) called?
>> 
>> Something like   "/hdfs/logdata/%ApacheVirtualHostname"
> 
> 
> 
> -- 
> 
> Best regards,
> 
> Ahmed Vila | Senior software engineer
> 
> 
> Mobile | +387 62 139 348
> Web | www.symphony.is
> Skype | wylla_av
> 
> San Francisco | Sarajevo | Belgrade
> No one can whistle a symphony
