flume-user mailing list archives

From Thomas Vachon <vac...@sessionm.com>
Subject Re: Flume 0.9.4 and AWS EMR 0.20.250
Date Wed, 14 Dec 2011 20:13:42 GMT
I was able to get Flume to write to EMR using 0.9.2, but I fear I have run into other bugs.

2011-12-14 20:01:05,177 ERROR com.cloudera.flume.handlers.rolling.RollSink: Failure when attempting to rotate and open new sink: java.io.IOException: File /rails/adstats/2011/12/09/19/adstatslog.00000019.20111214-195722621+0000.614859534929985.seq could only be replicated to 0 nodes, instead of 1
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1531)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:685)
	at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

Any idea whether this can be safely ignored? I checked HDFS and I see:

hadoop@domU-12-31-39-06-E6-CE:~$ hadoop fs -ls /rails/adstats/2011/12/14/19
Found 1 items
-rw-r--r--   3 hadoop supergroup          0 2011-12-14 19:57 /rails/adstats/2011/12/14/19/adstatslog.00000019.20111214-195722621+0000.614859534929985.seq

I also looked at the DFS admin pages, and what I see leads me to believe it replicated correctly:

Node   Last Contact  Admin State  Configured Capacity (GB)  Used (GB)  Non DFS Used (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks
HOST1  2             In Service   9.34                      0          2.4                6.94            0         74.31          1
HOST2  2             In Service   9.34                      0          2.4                6.94            0         74.3           0

The other question is: does s3n work in this version? We want to dual-stack (if you will) our collectorSinks.

On Dec 14, 2011, at 12:41 PM, Thomas Vachon wrote:

> AWS uses Hadoop 0.20.250 and Flume 0.9.4 seems to be using a newer version. This is causing
> delivery of logs into HDFS to fail with: Protocol org.apache.hadoop.hdfs.protocol.ClientProtocol
> version mismatch. (client = 63, server = 61).
> I tried replacing hadoop-core.jar with the Apache Hadoop 0.20.250 version, but that caused
> bigger problems (Flume was saying "not logged in" and throwing exceptions). What is the correct
> way to fix this problem? I obviously cannot change Amazon's version of Hadoop, so I need
> to find a compatible version of Flume.
