From: "Langston, Jim"
To: "user@flume.apache.org"
Subject: compression over-the-wire with 1.3.1 ?
Date: Sat, 23 Feb 2013 15:17:03 +0000

Hi all,

A question on sending compressed files from a remote source
to HDFS.

I have been working with .94 of Flume and gzip a file before sending
it from a remote location to an HDFS cluster. Works great.

Now, I'm looking to move to 1.2 or 1.3.1 (CDH4 installs 1.2 by
default through the management tool), but I don't see the
equivalent in 1.2 or 1.3.1. I found a reference to using the
new source in 1.3.1, spoolDir, but when I try to pick up a compressed
file in the spool directory I get an error:

13/02/22 19:32:41 ERROR source.SpoolDirectorySource: Uncaught exception in Runnable
org.apache.flume.ChannelException: Unable to put batch on required channel: org.apache.flume.channel.MemoryChannel{name: c1}
    at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
    at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:143)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.flume.ChannelException: Space for commit to queue couldn't be acquired Sinks are likely not keeping up with sources, or the buffer size is too tight
    at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doCommit(MemoryChannel.java:126)
    at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
    at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:192)
    ... 9 more


I have tried to increase the buffer size, but it did not change the error.
My current configuration file, which generated the error:

# Compressed Source
agent_compressed.sources = r1
agent_compressed.channels = c1
agent_compressed.channels.c1.type = memory
agent_compressed.sources.r1.type = spooldir
agent_compressed.sources.r1.bufferMaxLineLength = 50000
agent_compressed.sources.r1.spoolDir = /tmp/COMPRESS
agent_compressed.sources.r1.fileHeader = true
agent_compressed.sources.r1.channels = c1

# Sink for Avro
agent_compressed.sinks = avroSink-2
agent_compressed.sinks.avroSink-2.type = avro
agent_compressed.sinks.avroSink-2.channel = c1
agent_compressed.sinks.avroSink-2.hostname = xxx.xxx.xxx.xxx
agent_compressed.sinks.avroSink-2.port = xxxxx
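
For reference, the "buffer size" named in the exception presumably maps to the memory channel's own limits, which the configuration above does not set. A minimal sketch of setting them explicitly, assuming the memory channel's capacity and transactionCapacity properties; the numbers are illustrative placeholders, not tuned or tested values:

```properties
# Assumption: these two lines are additions to the configuration above.
# capacity = maximum number of events the channel can hold at once.
agent_compressed.channels.c1.capacity = 10000
# transactionCapacity = maximum number of events per put/take transaction.
agent_compressed.channels.c1.transactionCapacity = 1000
```

If the channel still fills up with limits like these, that would point at the sink side not draining events rather than at the channel size, which matches the first half of the error message.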


Thoughts? Hints?


Thanks,

Jim
