flume-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "winway (Jira)" <j...@apache.org>
Subject [jira] [Commented] (FLUME-3320) Taildir source reset position bug
Date Thu, 22 Aug 2019 03:26:00 GMT

    [ https://issues.apache.org/jira/browse/FLUME-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912899#comment-16912899
] 

winway commented on FLUME-3320:
-------------------------------

Any patch for this issue?

> Taildir source reset position bug
> ---------------------------------
>
>                 Key: FLUME-3320
>                 URL: https://issues.apache.org/jira/browse/FLUME-3320
>             Project: Flume
>          Issue Type: Bug
>          Components: Master, Sinks+Sources
>    Affects Versions: 1.7.0, 1.8.0
>            Reporter: zzf
>            Priority: Major
>         Attachments: error.log
>
>
> 1) background
> we use flume to transport logs from local file to kafka
> every night at 0 p.m , local file will be moved to another directory to be compressed
as backup , after that , old file's inode will be released. The monitor directory will generate
new log file with the same filename.Since flume monitor files with inode , never seen any
conflict before.
> 2 ) problem 
> 2019-02-23 23:59:34,157 (Log-BackgroundWorker-c1) [INFO - org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint(EventQueueBackingStoreFile.java:230)]
Start checkpoint for /data/flume/filechannel/checkpoint/checkpoint, elements to sync = 3406
> 2019-02-23 23:59:34,167 (Log-BackgroundWorker-c1) [INFO - org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint(EventQueueBackingStoreFile.java:255)]
Updating checkpoint metadata: logWriteOrderID: 1550778307223, queueSize: 0, queueHead: 1754854
> 2019-02-23 23:59:34,176 (Log-BackgroundWorker-c1) [INFO - org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:1065)]
Updated checkpoint for file: /data/flume/filechannel/data/log-68 position: 149727224 logWriteOrderID:
1550778307223
> *2019-02-24 00:00:01,550 (PollableSourceRunner-TaildirSource-r1) [INFO - org.apache.flume.source.taildir.ReliableTaildirEventReader.updateTailFiles(ReliableTaildirEventReader.java:257)]
Pos 15927858851 is larger than file size! Restarting from pos 0, file: /data/logs/nginx/stm-collect.cn.miaozhen.com.log,
inode: 3932190*
> *2019-02-24 00:00:01,550 (PollableSourceRunner-TaildirSource-r1) [INFO - org.apache.flume.source.taildir.TailFile.updatePos(TailFile.java:126)]
Updated position, file: /data/logs/nginx/stm-collect.cn.miaozhen.com.log, inode: 3932190,
pos: 0*
> 2019-02-24 00:00:04,177 (Log-BackgroundWorker-c1) [INFO - org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint(EventQueueBackingStoreFile.java:230)]
Start checkpoint for /data/flume/filechannel/checkpoint/checkpoint, elements to sync = 33326
> 2019-02-24 00:00:04,189 (Log-BackgroundWorker-c1) [INFO - org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint(EventQueueBackingStoreFile.java:255)]
Updating checkpoint metadata: logWriteOrderID: 1550778350086, queueSize: 24452, queueHead:
1763726
> 2019-02-24 00:00:04,213 (Log-BackgroundWorker-c1) [INFO - org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:1065)]
Updated checkpoint for file: /data/flume/filechannel/data/log-68 position: 190948775 logWriteOrderID:
1550778350086
>  normally,flume will open the new file and get it's inode. if  inode in Map<inode,taildir>
,and taildir position larger than file size. the position will be reset to 0.The system went
well for a long time.
> but in this time,flume failed to get the new file's inode.In the log above ,you can see
there was no 'Opening file ...'  . And somehow the old log file's position was reset and all
the data in 23,Feb sent double.
> 3)question
> Logically , flume relates the real file and object tailfile with inode ,even though I
get the old file's inode,I still can not figure out why the tailfile.getPos() will be larger
than f.length().
> has this happened before? should i  blame this to java NIO? 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@flume.apache.org
For additional commands, e-mail: issues-help@flume.apache.org


Mime
View raw message