ambari-user mailing list archives

From Mahadev Konar <maha...@hortonworks.com>
Subject Re: Setting up and running Flume agents using Ambari
Date Tue, 06 Jan 2015 23:17:03 GMT
David,
 Does this look related? https://issues.apache.org/jira/browse/AMBARI-9009



On Wed, Dec 24, 2014 at 3:20 PM, David Novogrodsky <david.novogrodsky@gmail.com> wrote:

> All,
>
> I have run Flume agents on a pseudo-distributed VM from Cloudera,
> ingesting tweets from Twitter.  When I paste the same configurations
> into the Flume section of Ambari, I do not get any data from Twitter.
> The screen in Ambari says the agents are running, but when I list the
> target directory in HDFS, I see no files:
>
> [root@namenode PBX]# hadoop fs -ls  /user/flume/tweets
> [root@namenode PBX]# hadoop fs -ls  /user/flume/tweets
> [root@namenode PBX]# hadoop fs -ls  /user/flume/tweets/
> [root@namenode PBX]#
>
>
> I have attached the cluster parameters in a PDF.
>
> Here is the URL I am using to add the configuration to the Flume agents:
>      http://namenode.localdomain.com:8080/#/main/services/FLUME/configs
>
> Here is the configuration for the twitter agent:
> # defining the source for the agent for Twitter
> TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
> TwitterAgent.sources.Twitter.channels = MemoryChannel
> TwitterAgent.sources.Twitter.consumerKey = (just removing for security)
> TwitterAgent.sources.Twitter.accessToken = (removing)
> TwitterAgent.sources.Twitter.accessTokenSecret =(removing)
> TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientist, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
> TwitterAgent.sources.Twitter.maxBatchSize = 10
> TwitterAgent.sources.Twitter.maxBatchDurationMillis = 200
>
> # defining the interceptors
> TwitterAgent.sources.Twitter.interceptors = i1
> TwitterAgent.sources.Twitter.interceptors.i1.type = timestamp
>
>
> # defining the sink for the agent
> TwitterAgent.sinks.HDFS.channel = MemoryChannel
> TwitterAgent.sinks.HDFS.type = hdfs
> TwitterAgent.sinks.HDFS.hdfs.path = /user/flume/tweets/%Y/%m/%d
> TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
> TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
> TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
> TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
> TwitterAgent.sinks.HDFS.hdfs.rollCount = 100000
> TwitterAgent.sinks.HDFS.hdfs.rollInterval = 6000
> TwitterAgent.sinks.HDFS.hdfs.filePrefix = events-
>
> # defining the channel for the agent
> TwitterAgent.channels.MemoryChannel.type = memory
> TwitterAgent.channels.MemoryChannel.capacity = 10000
> TwitterAgent.channels.MemoryChannel.transactionCapacity = 10000
>
>
> David Novogrodsky
> david.novogrodsky@gmail.com
> http://www.linkedin.com/in/davidnovogrodsky
>
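
One thing that may be worth checking in the quoted configuration: as pasted, it sets properties on a source, a channel, and a sink, but it does not show the lines that list the agent's active components (`TwitterAgent.sources`, `TwitterAgent.channels`, `TwitterAgent.sinks`). Flume only starts components that are named in those lists. The lines may simply have been left out of the email, so treat this as a possibility rather than a diagnosis. A minimal sketch of a check, assuming the configuration is available as plain text; the function name `undeclared_components` and the trimmed `sample` are hypothetical and not part of Flume or Ambari:

```python
def undeclared_components(config_text, agent):
    """Return {kind: [names]} for components that have properties set but
    are missing from the agent's `sources`, `channels`, or `sinks` list."""
    declared = {"sources": set(), "channels": set(), "sinks": set()}
    configured = {"sources": set(), "channels": set(), "sinks": set()}
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        if parts[0] != agent or len(parts) < 2 or parts[1] not in declared:
            continue
        if len(parts) == 2:
            # e.g. "TwitterAgent.sources = Twitter" (space-separated list)
            declared[parts[1]].update(value.split())
        else:
            # e.g. "TwitterAgent.sources.Twitter.type = ..."
            configured[parts[1]].add(parts[2])
    return {
        kind: sorted(configured[kind] - declared[kind])
        for kind in declared
        if configured[kind] - declared[kind]
    }


# Trimmed copy of the quoted configuration (illustrative only).
sample = """
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemoryChannel
TwitterAgent.sinks.HDFS.channel = MemoryChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.channels.MemoryChannel.type = memory
"""

print(undeclared_components(sample, "TwitterAgent"))
```

If the check reports anything, adding the matching declarations (for example `TwitterAgent.sources = Twitter`) to the configuration in Ambari would be the first thing to try.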

