flume-dev mailing list archives

From "qihuagao (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLUME-3108) Can not roll logs for hdfs sink based on timestamp of log content.
Date Wed, 14 Jun 2017 06:28:00 GMT

     [ https://issues.apache.org/jira/browse/FLUME-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

qihuagao updated FLUME-3108:
----------------------------
    Priority: Blocker  (was: Major)

> Can not roll logs for hdfs sink based on timestamp of log content.
> ------------------------------------------------------------------
>
>                 Key: FLUME-3108
>                 URL: https://issues.apache.org/jira/browse/FLUME-3108
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.7.0
>            Reporter: qihuagao
>            Priority: Blocker
>
> I use regex_extractor to extract the timestamp from my log files.
> With a1.sinks.k1.serializer = header_and_text, I verified that the extracted timestamps are saved in the HDFS files.
> However, HDFS rolling does not work as I expect: I expect it to roll logs by the timestamp inside the log content instead of the current timestamp.
> So is this workable, or did I do something wrong? Thanks for the help.
> The following is my configuration:
> {quote}a1.sources = s1
> a1.channels = c1
> a1.sinks = k1
> a1.sources.s1.type = org.apache.flume.source.kafka.KafkaSource
> a1.sources.s1.channels = c1
> a1.sources.s1.batchSize = 50
> a1.sources.s1.batchDurationMillis = 2000
> a1.sources.s1.kafka.bootstrap.servers =*
> a1.sources.s1.kafka.topics = LOG
> a1.sources.s1.useFlumeEventFormat=true
> a1.sources.s1.kafka.consumer.group.id = custom.g.id
> a1.sources.s1.interceptors = i1
> a1.sources.s1.interceptors.i1.type = regex_extractor
> a1.sources.s1.interceptors.i1.regex = [(\\d\\d\\d\\d-\\d\\d-\\d\\d\\s\\d\\d:\\d\\d:\\d\\d)]
> a1.sources.s1.interceptors.i1.serializers = s1
> a1.sources.s1.interceptors.i1.serializers.s1.type = org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer
> a1.sources.s1.interceptors.i1.serializers.s1.name = timestamp
> a1.sources.s1.interceptors.i1.serializers.s1.pattern = yyyy-MM-dd HH:mm
> a1.channels.c1.type = memory
> a1.channels.c1.capacity = 1000
> a1.channels.c1.transactionCapacity = 1000
> a1.channels.c1.byteCapacityBufferPercentage = 20
> a1.channels.c1.byteCapacity = 128000000
> #a1.sinks.k1.type = logger
> a1.sinks.k1.channel = c1
> a1.sinks.k1.type = hdfs
> a1.sinks.k1.hdfs.path = hdfs://192.168.1.247:9000/logs/%Y-%m-%d/%H
> a1.sinks.k1.hdfs.filePrefix = logs
> a1.sinks.k1.hdfs.fileType = DataStream
> a1.sinks.k1.hdfs.round = true
> a1.sinks.k1.hdfs.roundValue = 1
> a1.sinks.k1.hdfs.roundUnit = hour
> a1.sinks.k1.hdfs.rollSize = 0
> a1.sinks.k1.hdfs.rollCount = 0
> a1.sinks.k1.hdfs.rollInterval=0
> a1.sinks.k1.hdfs.batchSize = 120
> a1.sinks.k1.hdfs.idleTimeout=120
> a1.sinks.k1.serializer = header_and_text
> {quote}
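The interceptor/serializer pair in the config above can be sketched as follows. This is a purely illustrative Python sketch of the behavior, not Flume's actual Java implementation; the sample log line and the exact header handling are assumptions. Note the regex here escapes the literal brackets (`\[` … `\]`), whereas the config above uses a bare character class, which may not capture the timestamp as intended.

```python
# Illustrative sketch: how a regex extractor plus a millis serializer
# turns a log line's embedded timestamp into the epoch-millis
# "timestamp" header that the HDFS sink's %Y-%m-%d/%H path escapes
# are resolved against.
import re
from datetime import datetime

# Assumed sample log line carrying a bracketed timestamp.
line = "[2017-06-14 06:28:00] sample kafka message body"

# Literal brackets escaped; the capture group is the timestamp text.
match = re.search(r"\[(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})\]", line)
assert match is not None

# Mirrors serializers.s1.pattern = yyyy-MM-dd HH:mm (seconds dropped),
# then converts to epoch milliseconds as the millis serializer would.
ts = datetime.strptime(match.group(1)[:16], "%Y-%m-%d %H:%M")
headers = {"timestamp": str(int(ts.timestamp() * 1000))}

# The sink resolves hdfs.path escapes from this header, so events
# should land in the directory for their log time, not arrival time.
print(headers["timestamp"])
```

If the header is populated correctly, bucketing into per-hour directories (via round/roundValue/roundUnit) follows the event timestamp; the roll* settings of 0 disable size-, count-, and interval-based rolling, leaving only idleTimeout and path bucketing to close files.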



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
