flume-user mailing list archives

From Justin Workman <justinjwork...@gmail.com>
Subject Re: hdfs.idleTime
Date Sat, 14 Jan 2017 01:44:49 GMT
Sorry for wasting anyone's time. On reviewing my configuration, I found a
typo in the hdfs.idleTimeout configuration.
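
Editor's note: the message does not say what the typo was. Reading the quoted config below, the likely candidate is that the hdfs.idleTimeout and hdfs.useLocalTimeStamp lines are keyed on a sink named hdfsSink, while the only declared sink is fpssHdfsSink. A minimal Python sketch of a check that would flag this kind of mismatch follows. The function name `undeclared_components` and the trimmed `SAMPLE` text are illustrative assumptions, not part of Flume.

```python
# Sketch: flag property keys that configure a source/channel/sink name
# which is never declared in "<agent>.<kind> = name1 name2 ...".
# SAMPLE is a trimmed copy of the sink section quoted in this thread.
SAMPLE = """
agent1.sinks = fpssHdfsSink
agent1.sinks.fpssHdfsSink.type = hdfs
agent1.sinks.fpssHdfsSink.hdfs.rollInterval = 0
# Close file if idle more than 300 seconds
agent1.sinks.hdfsSink.hdfs.idleTimeout = 300
agent1.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
"""

KINDS = ("sources", "channels", "sinks")


def undeclared_components(properties_text):
    """Return keys whose component name was never declared for its agent."""
    declared = {}  # (agent, kind) -> set of declared component names
    candidates = []  # (key, agent, kind, component name)
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        if len(parts) == 2 and parts[1] in KINDS:
            # Declaration line, e.g. "agent1.sinks = fpssHdfsSink"
            declared.setdefault((parts[0], parts[1]), set()).update(value.split())
        elif len(parts) >= 4 and parts[1] in KINDS:
            # Setting line, e.g. "agent1.sinks.<name>.hdfs.idleTimeout"
            candidates.append((key, parts[0], parts[1], parts[2]))
    return [
        key
        for key, agent, kind, name in candidates
        if name not in declared.get((agent, kind), set())
    ]


if __name__ == "__main__":
    for key in undeclared_components(SAMPLE):
        print("setting refers to an undeclared component:", key)
```

Run against the sample, this reports the two hdfsSink keys, which is consistent with the idle timeout never being applied to fpssHdfsSink.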

On Fri, Jan 13, 2017 at 2:14 PM, Justin Workman <justinjworkman@gmail.com>
wrote:

> I'll try the debug output again. The regex output seems to be fine, but I
> never see a call to close/rename the last file in each directory until
> Flume shuts down or restarts.
>
> I would expect to see this call when the idleTimeout value is reached.
>
> Sent from my iPhone
>
> On Jan 13, 2017, at 2:05 PM, iain wright <iainwrig@gmail.com> wrote:
>
> Might be worth trying the debug output (I forget exact sink name) to just
> log the headers being attached to events after the interceptor to validate
> the regex is working correctly, and for all events.
>
> I set up this exact config at a previous company, so I know it works.
>
> I also remember needing to escape the regex in an odd way due to how Java
> was loading/parsing the config.
>
> Best,
> Iain
>
> Sent from my iPhone
>
> On Jan 13, 2017, at 12:00 PM, Justin Workman <justinjworkman@gmail.com>
> wrote:
>
> Absolutely, see below. Just to reiterate, when using the timestamp
> interceptor values to build the output path based on the timestamp in the
> Flume header, things roll correctly. The files also roll just fine based on
> file size. However, when using the regex_extractor interceptor to get the
> actual event's timestamp to use in the output path, the last file in each
> directory is never renamed/closed until Flume is restarted.
>
>
> *flume-conf.properties*
> agent1.sources  = fpssKafkaTopic
> agent1.channels = fpssHdfsFileChannel
> agent1.sinks = fpssHdfsSink
>
> agent1.sources.fpssKafkaTopic.type = org.apache.flume.source.kafka.KafkaSource
> agent1.sources.fpssKafkaTopic.zookeeperConnect = zk-host:2181
> agent1.sources.fpssKafkaTopic.topic = first-pass-stream-sessionized
> agent1.sources.fpssKafkaTopic.groupId = flume-first-pass-stream-sessionized
> agent1.sources.fpssKafkaTopic.kafka.auto.offset.reset = smallest
> agent1.sources.fpssKafkaTopic.channels = fpssHdfsFileChannel
> agent1.sources.fpssKafkaTopic.interceptors = i1 i2 i3
> agent1.sources.fpssKafkaTopic.interceptors.i1.type = timestamp
> agent1.sources.fpssKafkaTopic.interceptors.i1.preserveExisting = false
> agent1.sources.fpssKafkaTopic.interceptors.i2.type = org.apache.flume.interceptor.HostInterceptor$Builder
> agent1.sources.fpssKafkaTopic.interceptors.i2.hostHeader = hostname
> agent1.sources.fpssKafkaTopic.interceptors.i2.useIP= false
> agent1.sources.fpssKafkaTopic.interceptors.i2.preserveExisting = true
> agent1.sources.fpssKafkaTopic.interceptors.i3.type = regex_extractor
> agent1.sources.fpssKafkaTopic.interceptors.i3.regex = ^.*\\"entryId\\":\\{\\"date\\":\\"(\\d\\d\\d\\d)-(\\d\\d)-(\\d\\d)T(\\d\\d):.*\\"\\}.*$
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers = s1 s2 s3 s4
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s1.name = year
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s2.name = month
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s3.name = day
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s4.name = hour
> agent1.sources.fpssKafkaTopic.kafka.consumer.timeout.ms = 100
>
> agent1.channels.fpssHdfsFileChannel.type = file
> agent1.channels.fpssHdfsFileChannel.checkpointDir = /opt/flume/file-channel/fpss/checkpoint
> agent1.channels.fpssHdfsFileChannel.dataDirs = /opt/flume/file-channel/fpss/data
>
> agent1.sinks.fpssHdfsSink.type = hdfs
> agent1.sinks.fpssHdfsSink.hdfs.filePrefix = %{hostname}-log
> agent1.sinks.fpssHdfsSink.hdfs.inUseSuffix = .tmp
> agent1.sinks.fpssHdfsSink.hdfs.path = hdfs://prodcluster/flumedata/processed/first-pass-stream/%{year}/%{month}/%{day}/%{hour}-00
> agent1.sinks.fpssHdfsSink.hdfs.kerberosPrincipal = runtime@EXAMPLE.COM
> agent1.sinks.fpssHdfsSink.hdfs.kerberosKeytab = <keytab path removed for privacy>
> agent1.sinks.fpssHdfsSink.hdfs.rollInterval = 0
> agent1.sinks.fpssHdfsSink.hdfs.rollCount = 0
> ## Account for compression. See flume-2128
> ## My calculation: 512 * 1024 * 1024 * 2.75
> agent1.sinks.fpssHdfsSink.hdfs.rollSize = 1476395008
> # Close file if idle more than 300 seconds
> agent1.sinks.hdfsSink.hdfs.idleTimeout = 300
> agent1.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
> agent1.sinks.fpssHdfsSink.hdfs.fileType = CompressedStream
> agent1.sinks.fpssHdfsSink.hdfs.codeC = snappy
> agent1.sinks.fpssHdfsSink.hdfs.writeFormat = Text
> agent1.sinks.fpssHdfsSink.channel = fpssHdfsFileChannel
> agent1.sinks.fpssHdfsSink.hdfs.batchSize = 10000
> agent1.sinks.fpssHdfsSink.hdfs.threadsPoolSize = 20
> agent1.sinks.fpssHdfsSink.hdfs.callTimeout = 20000
>
> *HDFS Output Since Midnight (Notice the last file is never closed/renamed)*
>  hdfs dfs -ls /flumedata/processed/first-pass-stream/2017/01/13/*/
> 17/01/13 12:38:52 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Found 7 items
> -rw-r--r--   3 b2c_runtime hadoop  513710580 2017-01-13 00:09
> /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.
> 1484290815397.snappy
> -rw-r--r--   3 b2c_runtime hadoop  514439844 2017-01-13 00:18
> /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.
> 1484290815398.snappy
> -rw-r--r--   3 b2c_runtime hadoop  515125962 2017-01-13 00:28
> /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.
> 1484290815399.snappy
> -rw-r--r--   3 b2c_runtime hadoop  513010837 2017-01-13 00:38
> /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.
> 1484290815400.snappy
> -rw-r--r--   3 b2c_runtime hadoop  511315467 2017-01-13 00:49
> /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.
> 1484290815401.snappy
> -rw-r--r--   3 b2c_runtime hadoop  508420966 2017-01-13 00:59
> /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.
> 1484290815402.snappy
> -rw-r--r--   3 b2c_runtime hadoop    2503353 2017-01-13 00:59
> /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.
> 1484290815403.snappy.tmp
> Found 6 items
> -rw-r--r--   3 b2c_runtime hadoop  509116221 2017-01-13 01:10
> /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.
> 1484294415705.snappy
> -rw-r--r--   3 b2c_runtime hadoop  507800675 2017-01-13 01:21
> /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.
> 1484294415706.snappy
> -rw-r--r--   3 b2c_runtime hadoop  504432110 2017-01-13 01:32
> /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.
> 1484294415707.snappy
> -rw-r--r--   3 b2c_runtime hadoop  501932914 2017-01-13 01:42
> /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.
> 1484294415708.snappy
> -rw-r--r--   3 b2c_runtime hadoop  498136257 2017-01-13 01:50
> /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.
> 1484294415709.snappy
> -rw-r--r--   3 b2c_runtime hadoop      60539 2017-01-13 01:50
> /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.
> 1484294415710.snappy.tmp
> Found 6 items
> -rw-r--r--   3 b2c_runtime hadoop  500879399 2017-01-13 02:11
> /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.
> 1484298016017.snappy
> -rw-r--r--   3 b2c_runtime hadoop  501827071 2017-01-13 02:21
> /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.
> 1484298016018.snappy
> -rw-r--r--   3 b2c_runtime hadoop  501489101 2017-01-13 02:32
> /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.
> 1484298016019.snappy
> -rw-r--r--   3 b2c_runtime hadoop  501527838 2017-01-13 02:43
> /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.
> 1484298016020.snappy
> -rw-r--r--   3 b2c_runtime hadoop  499393977 2017-01-13 02:54
> /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.
> 1484298016021.snappy
> -rw-r--r--   3 b2c_runtime hadoop    1282327 2017-01-13 02:54
> /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.
> 1484298016022.snappy.tmp
> Found 6 items
> -rw-r--r--   3 b2c_runtime hadoop  501033294 2017-01-13 03:10
> /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.
> 1484301615579.snappy
> -rw-r--r--   3 b2c_runtime hadoop  500933906 2017-01-13 03:20
> /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.
> 1484301615580.snappy
> -rw-r--r--   3 b2c_runtime hadoop  505869233 2017-01-13 03:31
> /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.
> 1484301615581.snappy
> -rw-r--r--   3 b2c_runtime hadoop  502910608 2017-01-13 03:41
> /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.
> 1484301615582.snappy
> -rw-r--r--   3 b2c_runtime hadoop  499561080 2017-01-13 03:52
> /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.
> 1484301615583.snappy
> -rw-r--r--   3 b2c_runtime hadoop    3616826 2017-01-13 03:52
> /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.
> 1484301615584.snappy.tmp
> Found 6 items
> -rw-r--r--   3 b2c_runtime hadoop  502243204 2017-01-13 04:11
> /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.
> 1484305215893.snappy
> -rw-r--r--   3 b2c_runtime hadoop  508966498 2017-01-13 04:22
> /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.
> 1484305215894.snappy
> -rw-r--r--   3 b2c_runtime hadoop  510972236 2017-01-13 04:34
> /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.
> 1484305215895.snappy
> -rw-r--r--   3 b2c_runtime hadoop  513225577 2017-01-13 04:46
> /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.
> 1484305215896.snappy
> -rw-r--r--   3 b2c_runtime hadoop  512743679 2017-01-13 04:57
> /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.
> 1484305215897.snappy
> -rw-r--r--   3 b2c_runtime hadoop    3888775 2017-01-13 04:57
> /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.
> 1484305215898.snappy.tmp
> Found 7 items
> -rw-r--r--   3 b2c_runtime hadoop  515832251 2017-01-13 05:11
> /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.
> 1484308811983.snappy
> -rw-r--r--   3 b2c_runtime hadoop  518077964 2017-01-13 05:20
> /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.
> 1484308811984.snappy
> -rw-r--r--   3 b2c_runtime hadoop  519490676 2017-01-13 05:29
> /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.
> 1484308811985.snappy
> -rw-r--r--   3 b2c_runtime hadoop  519105563 2017-01-13 05:37
> /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.
> 1484308811986.snappy
> -rw-r--r--   3 b2c_runtime hadoop  518672209 2017-01-13 05:46
> /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.
> 1484308811987.snappy
> -rw-r--r--   3 b2c_runtime hadoop  520019853 2017-01-13 05:53
> /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.
> 1484308811988.snappy
> -rw-r--r--   3 b2c_runtime hadoop    1574211 2017-01-13 05:53
> /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.
> 1484308811989.snappy.tmp
> Found 9 items
> -rw-r--r--   3 b2c_runtime hadoop  521428204 2017-01-13 06:07
> /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.
> 1484312413743.snappy
> -rw-r--r--   3 b2c_runtime hadoop  519885769 2017-01-13 06:15
> /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.
> 1484312413744.snappy
> -rw-r--r--   3 b2c_runtime hadoop  519050891 2017-01-13 06:21
> /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.
> 1484312413745.snappy
> -rw-r--r--   3 b2c_runtime hadoop  520691322 2017-01-13 06:29
> /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.
> 1484312413746.snappy
> -rw-r--r--   3 b2c_runtime hadoop  520902319 2017-01-13 06:36
> /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.
> 1484312413747.snappy
> -rw-r--r--   3 b2c_runtime hadoop  520831873 2017-01-13 06:42
> /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.
> 1484312413748.snappy
> -rw-r--r--   3 b2c_runtime hadoop  519785647 2017-01-13 06:49
> /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.
> 1484312413749.snappy
> -rw-r--r--   3 b2c_runtime hadoop  520590143 2017-01-13 06:55
> /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.
> 1484312413750.snappy
> -rw-r--r--   3 b2c_runtime hadoop    4621367 2017-01-13 06:55
> /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.
> 1484312413751.snappy.tmp
> Found 11 items
> -rw-r--r--   3 b2c_runtime hadoop  522623760 2017-01-13 07:06
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015214.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523065112 2017-01-13 07:12
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015215.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523445533 2017-01-13 07:18
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015216.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523084945 2017-01-13 07:24
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015217.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524283976 2017-01-13 07:30
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015218.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523923379 2017-01-13 07:36
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015219.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523910723 2017-01-13 07:42
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015220.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524266095 2017-01-13 07:47
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015221.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523002505 2017-01-13 07:53
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015222.snappy
> -rw-r--r--   3 b2c_runtime hadoop  520706211 2017-01-13 07:58
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015223.snappy
> -rw-r--r--   3 b2c_runtime hadoop    8051588 2017-01-13 07:58
> /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.
> 1484316015224.snappy.tmp
> Found 11 items
> -rw-r--r--   3 b2c_runtime hadoop  520528155 2017-01-13 08:05
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618433.snappy
> -rw-r--r--   3 b2c_runtime hadoop  521761390 2017-01-13 08:11
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618434.snappy
> -rw-r--r--   3 b2c_runtime hadoop  522548272 2017-01-13 08:16
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618435.snappy
> -rw-r--r--   3 b2c_runtime hadoop  522616117 2017-01-13 08:22
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618436.snappy
> -rw-r--r--   3 b2c_runtime hadoop  525953759 2017-01-13 08:28
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618437.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524475009 2017-01-13 08:34
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618438.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523995339 2017-01-13 08:40
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618439.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524188832 2017-01-13 08:47
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618440.snappy
> -rw-r--r--   3 b2c_runtime hadoop  525303001 2017-01-13 08:53
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618441.snappy
> -rw-r--r--   3 b2c_runtime hadoop  525606532 2017-01-13 08:59
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618442.snappy
> -rw-r--r--   3 b2c_runtime hadoop    4486982 2017-01-13 08:59
> /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.
> 1484319618443.snappy.tmp
> Found 11 items
> -rw-r--r--   3 b2c_runtime hadoop  525207364 2017-01-13 09:06
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216987.snappy
> -rw-r--r--   3 b2c_runtime hadoop  526105891 2017-01-13 09:12
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216988.snappy
> -rw-r--r--   3 b2c_runtime hadoop  526426735 2017-01-13 09:18
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216989.snappy
> -rw-r--r--   3 b2c_runtime hadoop  525298099 2017-01-13 09:24
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216990.snappy
> -rw-r--r--   3 b2c_runtime hadoop  525282945 2017-01-13 09:30
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216991.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523921005 2017-01-13 09:36
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216992.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524827705 2017-01-13 09:42
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216993.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524203463 2017-01-13 09:47
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216994.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524678485 2017-01-13 09:53
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216995.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524598220 2017-01-13 09:59
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216996.snappy
> -rw-r--r--   3 b2c_runtime hadoop    3877959 2017-01-13 09:59
> /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.
> 1484323216997.snappy.tmp
> Found 10 items
> -rw-r--r--   3 b2c_runtime hadoop  523000460 2017-01-13 10:06
> /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.
> 1484326813831.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523455154 2017-01-13 10:12
> /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.
> 1484326813832.snappy
> -rw-r--r--   3 b2c_runtime hadoop  525465618 2017-01-13 10:18
> /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.
> 1484326813833.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524630955 2017-01-13 10:24
> /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.
> 1484326813834.snappy
> -rw-r--r--   3 b2c_runtime hadoop  527780298 2017-01-13 10:30
> /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.
> 1484326813835.snappy
> -rw-r--r--   3 b2c_runtime hadoop  526565562 2017-01-13 10:37
> /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.
> 1484326813836.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524936336 2017-01-13 10:43
> /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.
> 1484326813837.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524565610 2017-01-13 10:49
> /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.
> 1484326813838.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524276950 2017-01-13 10:55
> /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.
> 1484326813839.snappy
> -rw-r--r--   3 b2c_runtime hadoop     654810 2017-01-13 10:55
> /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.
> 1484326813840.snappy.tmp
> Found 11 items
> -rw-r--r--   3 b2c_runtime hadoop  524174553 2017-01-13 11:06
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415712.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524127864 2017-01-13 11:12
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415713.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524778919 2017-01-13 11:18
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415714.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524851182 2017-01-13 11:24
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415715.snappy
> -rw-r--r--   3 b2c_runtime hadoop  525156750 2017-01-13 11:30
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415716.snappy
> -rw-r--r--   3 b2c_runtime hadoop  525334538 2017-01-13 11:35
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415717.snappy
> -rw-r--r--   3 b2c_runtime hadoop  527346578 2017-01-13 11:41
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415718.snappy
> -rw-r--r--   3 b2c_runtime hadoop  525592734 2017-01-13 11:47
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415719.snappy
> -rw-r--r--   3 b2c_runtime hadoop  525502291 2017-01-13 11:53
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415720.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523135186 2017-01-13 11:58
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415721.snappy
> -rw-r--r--   3 b2c_runtime hadoop    9967141 2017-01-13 11:58
> /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.
> 1484330415722.snappy.tmp
> Found 7 items
> -rw-r--r--   3 b2c_runtime hadoop  520881970 2017-01-13 12:05
> /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.
> 1484334016849.snappy
> -rw-r--r--   3 b2c_runtime hadoop  522340745 2017-01-13 12:11
> /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.
> 1484334016850.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524156495 2017-01-13 12:17
> /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.
> 1484334016851.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523482390 2017-01-13 12:23
> /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.
> 1484334016852.snappy
> -rw-r--r--   3 b2c_runtime hadoop  524096591 2017-01-13 12:29
> /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.
> 1484334016853.snappy
> -rw-r--r--   3 b2c_runtime hadoop  523184628 2017-01-13 12:35
> /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.
> 1484334016854.snappy
> -rw-r--r--   3 b2c_runtime hadoop   10981218 2017-01-13 12:35
> /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.
> 1484334016855.snappy.tmp
>
> *HDFS Stat On One Of The Files (Keep in mind the output bucket is based on
> event time, which is MDT/MST, vs. the stat date, which is GMT)*
>  hadoop fs -stat "%y %n" /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813840.snappy.tmp
> 17/01/13 12:57:07 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 2017-01-13 17:55:35 flumeload100-log.1484326813840.snappy.tmp
>
> Thanks
> Justin
>
> On Thu, Jan 12, 2017 at 11:56 PM, Denes Arvay <denes@cloudera.com> wrote:
>
>> Hi Justin,
>>
>> Could you please share your config file with us?
>>
>> Thanks,
>> Denes
>>
>>
>> On Thu, Jan 12, 2017, 20:20 Justin Workman <justinjworkman@gmail.com>
>> wrote:
>>
>>> Sorry for cross-posting to user and dev. I have recently set up a Flume
>>> configuration where we are using the regex_extractor interceptor to parse
>>> the actual event date from the record flowing through the Flume source,
>>> then using that date to build the HDFS sink bucket path. However, it
>>> appears that the hdfs.idleTimeout value is not honored in this
>>> configuration. It does work when using the timestamp interceptor to build
>>> the output path.
>>>
>>> I have set the hdfs.idleTimeout value for the HDFS sink, but the files
>>> are never closed or renamed until I restart or shut down Flume. Our Flume
>>> is configured to roll based on size or output path, and the files
>>> rename/close/roll fine based on size. However, the last file in each
>>> output path is always left with the .tmp extension until we restart Flume.
>>> I would expect the file to be renamed and closed if no records are
>>> written to it before the idleTimeout is reached.
>>>
>>> Could I be missing something, or is this a known bug with the
>>> regex_extractor interceptor?
>>>
>>> Thanks
>>> Justin
>>>
>>
>
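
Editor's note: to make the regex_extractor setup quoted above easier to follow, here is a minimal Python sketch of what the i3 pattern appears to capture and how the four headers would fill the hdfs.path template. It assumes the doubled backslashes in the properties file collapse to single ones when Java loads the file, and the sample event body is hypothetical; the names `extract_headers` and `bucket_path` are illustrative, not Flume APIs.

```python
import re

# The i3 regex after properties-file unescaping (assumption: "\\d" -> "\d",
# '\\"' -> '"'). Four capture groups map to serializers s1..s4.
PATTERN = re.compile(
    r'^.*"entryId":\{"date":"(\d\d\d\d)-(\d\d)-(\d\d)T(\d\d):.*"\}.*$'
)
HEADER_NAMES = ("year", "month", "day", "hour")


def extract_headers(body):
    """Return the year/month/day/hour headers, or {} if the body does not match."""
    match = PATTERN.match(body)
    if match is None:
        return {}
    return dict(zip(HEADER_NAMES, match.groups()))


def bucket_path(headers):
    """Fill the %{year}/%{month}/%{day}/%{hour}-00 part of hdfs.path."""
    template = "/flumedata/processed/first-pass-stream/{year}/{month}/{day}/{hour}-00"
    return template.format(**headers)


if __name__ == "__main__":
    # Hypothetical event body; the real record layout is not shown in the thread.
    sample = '{"entryId":{"date":"2017-01-13T10:55:35Z"},"payload":"hypothetical"}'
    headers = extract_headers(sample)
    print(headers)
    print(bucket_path(headers))
```

Because the bucket is derived from the event's own date rather than the arrival time, each hourly directory stops receiving events once the stream moves on, which is why its last .tmp file depends on an idle timeout (or a restart) to be closed.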
