flume-user mailing list archives

From mardan Khan <mardan8...@gmail.com>
Subject Re: How to upload the SEQ data into hdfs
Date Tue, 24 Jul 2012 23:18:17 GMT
Hi,


Many thanks for the help. I have used exactly the same configuration, but now
I am getting the following error. My Hadoop is running properly and I can
write data to HDFS with the -put command. I am providing the host IP address,
but it still gives me a "hostname expected" error. Any suggestions, please?



12/07/25 00:04:16 INFO conf.FlumeConfiguration: Added sinks:
hdfs-Cluster1-sink Agent: agent1
12/07/25 00:04:17 WARN conf.FlumeConfiguration: Configuration empty for:
hdfs-Cluster1-sink.Removed.
12/07/25 00:04:17 INFO conf.FlumeConfiguration: Post-validation flume
configuration contains configuration  for agents: [agent, agent1]
12/07/25 00:04:17 INFO properties.PropertiesFileConfigurationProvider:
Creating channels
12/07/25 00:04:17 INFO properties.PropertiesFileConfigurationProvider:
created channel mem-channel-1
12/07/25 00:04:17 INFO sink.DefaultSinkFactory: Creating instance of sink
hdfs-Cluster1-sink typehdfs
12/07/25 00:04:17 INFO hdfs.HDFSEventSink: Hadoop Security enabled: false
12/07/25 00:04:17 INFO nodemanager.DefaultLogicalNodeManager: Starting new
configuration:{ sourceRunners:{avro-AppSrv-source=PollableSourceRunner: {
source:org.apache.flume.source.SequenceGeneratorSource@6659fb21 counterGroup:{
name:null counters:{} } }}
sinkRunners:{hdfs-Cluster1-sink=SinkRunner: {
policy:org.apache.flume.sink.DefaultSinkProcessor@1d766806 counterGroup:{
name:null counters:{} } }}
channels:{mem-channel-1=org.apache.flume.channel.MemoryChannel@48a77106} }
12/07/25 00:04:17 INFO nodemanager.DefaultLogicalNodeManager: Starting
Channel mem-channel-1
12/07/25 00:04:17 INFO nodemanager.DefaultLogicalNodeManager: Waiting for
channel: mem-channel-1 to start. Sleeping for 500 ms
12/07/25 00:04:17 INFO nodemanager.DefaultLogicalNodeManager: Starting Sink
hdfs-Cluster1-sink
12/07/25 00:04:17 INFO nodemanager.DefaultLogicalNodeManager: Starting
Source avro-AppSrv-source
12/07/25 00:04:17 INFO source.SequenceGeneratorSource: Sequence generator
source starting
12/07/25 00:04:18 INFO hdfs.BucketWriter: Creating hdfs://134.83.35.24:9000/user/mardan/flume/FlumeData.1343171057851.tmp
12/07/25 00:04:18 ERROR hdfs.HDFSEventSink: process failed
java.lang.IllegalArgumentException: java.net.URISyntaxException: Expected hostname at index 7: hdfs://:9000
    at org.apache.hadoop.net.NetUtils.getCanonicalUri(NetUtils.java:267)
    at org.apache.hadoop.fs.FileSystem.getCanonicalUri(FileSystem.java:214)
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:526)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:166)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:232)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:75)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:806)
    at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1059)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:269)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:368)
    at org.apache.flume.sink.hdfs.HDFSSequenceFile.open(HDFSSequenceFile.java:65)
    at org.apache.flume.sink.hdfs.HDFSSequenceFile.open(HDFSSequenceFile.java:49)
    at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:125)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:183)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:432)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:429)
    at org.apache.flume.sink.hdfs.HDFSEventSink$ProxyCallable.call(HDFSEventSink.java:164)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.URISyntaxException: Expected hostname at index 7: hdfs://:9000
    at java.net.URI$Parser.fail(URI.java:2810)
    at java.net.URI$Parser.failExpecting(URI.java:2816)
    at java.net.URI$Parser.parseHostname(URI.java:3352)
    at java.net.URI$Parser.parseServer(URI.java:3198)
    at java.net.URI$Parser.parseAuthority(URI.java:3117)
    at java.net.URI$Parser.parseHierarchical(URI.java:3059)
    at java.net.URI$Parser.parse(URI.java:3015)
    at java.net.URI.<init>(URI.java:662)
    at org.apache.hadoop.net.NetUtils.getCanonicalUri(NetUtils.java:263)


The configuration file is as below:



agent.sources = avro-AppSrv-source
agent.sinks = hdfs-Cluster1-sink
agent.channels = mem-channel-1
# set channel for sources, sinks
# properties of avro-AppSrv-source
agent.sources.avro-AppSrv-source.type = SEQ

agent.sources.avro-AppSrv-source.bind = localhost
agent.sources.avro-AppSrv-source.port = 10000

agent.sources.avro-AppSrv-source.channels = mem-channel-1

# properties of mem-channel-1
agent.channels.mem-channel-1.type = memory
agent.channels.mem-channel-1.capacity = 1000
agent.channels.mem-channel-1.transactionCapacity = 100
# properties of hdfs-Cluster1-sink
agent.sinks.hdfs-Cluster1-sink.type = hdfs

agent.sinks.hdfs-Cluster1-sink.channel = mem-channel-1
agent.sinks.hdfs-Cluster1-sink.hdfs.path = hdfs://134.83.35.24:9000/user/mardan/flume
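
(The stack trace above shows Hadoop receiving the URI hdfs://:9000, i.e. the
hostname never reaches the parser. That usually suggests the hdfs.path value
in the config file on disk is broken across lines or has whitespace after
hdfs://; the value has to be one unbroken token, for example:

agent.sinks.hdfs-Cluster1-sink.hdfs.path = hdfs://134.83.35.24:9000/user/mardan/flume

The host and port here are taken from the BucketWriter log line above.)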





Thanks

On Tue, Jul 24, 2012 at 1:23 PM, Brock Noland <brock@cloudera.com> wrote:

> Hi,
>
> Your channel is not hooked up to the source and sink. See the additions
> below.
>
> agent.sources = avro-AppSrv-source
> agent.sinks = hdfs-Cluster1-sink
> agent.channels = mem-channel-1
> # set channel for sources, sinks
> # properties of avro-AppSrv-source
> agent.sources.avro-AppSrv-source.type = SEQ
>
> agent.sources.avro-AppSrv-source.bind = localhost
> agent.sources.avro-AppSrv-source.port = 10000
> agent.sources.avro-AppSrv-source.channels = mem-channel-1
> # properties of mem-channel-1
> agent.channels.mem-channel-1.type = memory
> agent.channels.mem-channel-1.capacity = 1000
> agent.channels.mem-channel-1.transactionCapacity = 100
> # properties of hdfs-Cluster1-sink
> agent.sinks.hdfs-Cluster1-sink.type = hdfs
> agent.sinks.hdfs-Cluster1-sink.channel = mem-channel-1
> agent.sinks.hdfs-Cluster1-sink.hdfs.path =
> hdfs://134.83.35.24/user/mukhtaj/flume/
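>
> With that config saved, the agent can then be started with something like
> the following (the config file path here is just an example; the -n value
> needs to match the property prefix used in the file, "agent" in this case,
> or the agent starts with an empty configuration):
>
> $ flume-ng agent -n agent -c conf -f /usr/lib/flume-ng/conf/flume.conf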
>
>
> Also, it seems we should give a better error message here:
> https://issues.apache.org/jira/browse/FLUME-1271
>
>
> Brock
>
>
> On Tue, Jul 24, 2012 at 6:58 AM, mardan Khan <mardan8310@gmail.com> wrote:
> > Hi Will,
> >
> > I changed the configuration file as per your suggestion
> > (agent.sources.avro-AppSrv-source.type = SEQ), but I am still getting the
> > same error.
> >
> > The configuration file is:
> >
> >
> > agent.sources = avro-AppSrv-source
> > agent.sinks = hdfs-Cluster1-sink
> > agent.channels = mem-channel-1
> > # set channel for sources, sinks
> > # properties of avro-AppSrv-source
> > agent.sources.avro-AppSrv-source.type = SEQ
> >
> > agent.sources.avro-AppSrv-source.bind = localhost
> > agent.sources.avro-AppSrv-source.port = 10000
> > # properties of mem-channel-1
> > agent.channels.mem-channel-1.type = memory
> > agent.channels.mem-channel-1.capacity = 1000
> > agent.channels.mem-channel-1.transactionCapacity = 100
> > # properties of hdfs-Cluster1-sink
> > agent.sinks.hdfs-Cluster1-sink.type = hdfs
> > agent.sinks.hdfs-Cluster1-sink.hdfs.path =
> > hdfs://134.83.35.24/user/mukhtaj/flume/
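> >
> > (For reference, compared with Brock's version above, this config still
> > appears to be missing the two channel hookups, i.e.:
> >
> > agent.sources.avro-AppSrv-source.channels = mem-channel-1
> > agent.sinks.hdfs-Cluster1-sink.channel = mem-channel-1
> >
> > which may explain the NullPointerException below.)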
> >
> >
> >
> >
> > The error is:
> >
> >
> > 12/07/24 12:52:33 ERROR properties.PropertiesFileConfigurationProvider: Failed to load configuration data. Exception follows.
> >
> > java.lang.NullPointerException
> >  at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadSources(PropertiesFileConfigurationProvider.java:324)
> >  at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:222)
> >  at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:123)
> >  at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> >  at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:202)
> >  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> >  at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> >  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> >
> >
> > Why am I getting this error? I have been struggling with this problem for
> > a few days; running any command gives this error.
> >
> > Any suggestions, please.
> >
> >
> > Thanks
> >
> >
> >
> >
> >
> >
> >
> > On Tue, Jul 24, 2012 at 3:46 AM, Will McQueen <will@cloudera.com> wrote:
> >>
> >> Or, as Brock said, you can refer to the link he posted and use the
> >> example from the user guide instead; then you'll need to include this:
> >>
> >> agent.sources = avro-AppSrv-source
> >> agent.sinks = hdfs-Cluster1-sink
> >> agent.channels = mem-channel-1
> >>
> >> ... but that example uses an Avro source, so you'll likely need to start
> >> an avro-client to test (or use the Flume SDK). Or just change the source
> >> type to SEQ.
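> >>
> >> For example, with the Avro source listening on localhost:10000, you could
> >> send a test file with something like this (the file path below is a
> >> placeholder):
> >>
> >> $ flume-ng avro-client -H localhost -p 10000 -F /tmp/test-events.txt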
> >>
> >> Cheers,
> >> Will
> >>
> >>
> >> On Mon, Jul 23, 2012 at 6:07 PM, mardan Khan <mardan8310@gmail.com>
> wrote:
> >>>
> >>>
> >>>
> >>>
> >>> Thanks, Brock.
> >>>
> >>> I have just gone through the posted link, copy-pasted one of the
> >>> configuration files, and changed the HDFS path as below:
> >>>
> >>>
> >>>
> >>> # properties of avro-AppSrv-source
> >>> agent.sources.avro-AppSrv-source.type = avro
> >>> agent.sources.avro-AppSrv-source.bind = localhost
> >>> agent.sources.avro-AppSrv-source.port = 10000
> >>>
> >>> # properties of mem-channel-1
> >>> agent.channels.mem-channel-1.type = memory
> >>> agent.channels.mem-channel-1.capacity = 1000
> >>> agent.channels.mem-channel-1.transactionCapacity = 100
> >>>
> >>> # properties of hdfs-Cluster1-sink
> >>> agent.sinks.hdfs-Cluster1-sink.type = hdfs
> >>> agent.sinks.hdfs-Cluster1-sink.hdfs.path =
> >>> hdfs://134.83.35.24/user/mardan/flume/
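> >>>
> >>> (As Will points out above, this snippet is missing the three top-level
> >>> declarations, which appears to be what triggers the NullPointerException
> >>> in loadSources:
> >>>
> >>> agent.sources = avro-AppSrv-source
> >>> agent.sinks = hdfs-Cluster1-sink
> >>> agent.channels = mem-channel-1 )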
> >>>
> >>>
> >>> I applied the following command:
> >>>
> >>> $  /usr/bin/flume-ng agent -n agent -c conf -f
> >>> /usr/lib/flume-ng/conf/flume.conf
> >>>
> >>>
> >>> and got the following error (I get this error most of the time):
> >>>
> >>> 12/07/24 01:54:43 ERROR properties.PropertiesFileConfigurationProvider: Failed to load configuration data. Exception follows.
> >>> java.lang.NullPointerException
> >>>     at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.loadSources(PropertiesFileConfigurationProvider.java:324)
> >>>     at org.apache.flume.conf.properties.PropertiesFileConfigurationProvider.load(PropertiesFileConfigurationProvider.java:222)
> >>>     at org.apache.flume.conf.file.AbstractFileConfigurationProvider.doLoad(AbstractFileConfigurationProvider.java:123)
> >>>     at org.apache.flume.conf.file.AbstractFileConfigurationProvider.access$300(AbstractFileConfigurationProvider.java:38)
> >>>     at org.apache.flume.conf.file.AbstractFileConfigurationProvider$FileWatcherRunnable.run(AbstractFileConfigurationProvider.java:202)
> >>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> >>>     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> >>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> >>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> >>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
> >>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
> >>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >>>     at java.lang.Thread.run(Thread.java:662)
> >>>
> >>> I think something is wrong in the configuration file. I am using the
> >>> Flume 1.x version, installed in /usr/lib/flume-ng/.
> >>>
> >>> Could you please check the command and configuration file?
> >>>
> >>> Thanks
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Jul 24, 2012 at 1:33 AM, Brock Noland <brock@cloudera.com>
> wrote:
> >>>>
> >>>> Yes, you can do that. In fact that is the most common case. The
> >>>> documents which should help you do so are here:
> >>>>
> >>>>
> >>>>
> >>>> https://cwiki.apache.org/confluence/display/FLUME/Flume+1.x+Documentation
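> >>>>
> >>>> As a rough sketch (the NameNode host, port, and path below are
> >>>> placeholders), swapping your LOGGER sink for an HDFS sink would look
> >>>> like:
> >>>>
> >>>> agent2.sinks.k1.channel = c1
> >>>> agent2.sinks.k1.type = hdfs
> >>>> agent2.sinks.k1.hdfs.path = hdfs://namenode-host:9000/user/flume/seq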
> >>>>
> >>>> Brock
> >>>>
> >>>> On Mon, Jul 23, 2012 at 7:26 PM, mardan Khan <mardan8310@gmail.com>
> >>>> wrote:
> >>>> > Hi,
> >>>> >
> >>>> > I am just doing testing. I am generating the sequence and want to
> >>>> > upload it into HDFS. My configuration file is:
> >>>> >
> >>>> > agent2.channels = c1
> >>>> > agent2.sources = r1
> >>>> > agent2.sinks = k1
> >>>> >
> >>>> > agent2.channels.c1.type = MEMORY
> >>>> >
> >>>> > agent2.sources.r1.channels = c1
> >>>> > agent2.sources.r1.type = SEQ
> >>>> >
> >>>> > agent2.sinks.k1.channel = c1
> >>>> > agent2.sinks.k1.type = LOGGER
> >>>> >
> >>>> >
> >>>> > Is it possible to upload into HDFS? If so, how can I make the
> >>>> > changes in the configuration file?
> >>>> >
> >>>> >
> >>>> > Many thanks
> >>>> >
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Apache MRUnit - Unit testing MapReduce -
> >>>> http://incubator.apache.org/mrunit/
> >>>
> >>>
> >>
> >
>
>
>
> --
> Apache MRUnit - Unit testing MapReduce -
> http://incubator.apache.org/mrunit/
>
