flume-dev mailing list archives

From "Jason Kushmaul (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLUME-2994) flume-taildir-source: support for windows
Date Wed, 05 Oct 2016 12:55:20 GMT

    [ https://issues.apache.org/jira/browse/FLUME-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15548622#comment-15548622 ]

Jason Kushmaul commented on FLUME-2994:
---------------------------------------

Uniqueness of FileKey.hashCode:
Can you be more specific about why it might not be as unique?  My hope was that with this
hashCode override: 
http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/windows/classes/sun/nio/ch/FileKey.java
{noformat}
public int hashCode() {
    return (int)(dwVolumeSerialNumber ^ (dwVolumeSerialNumber >>> 32)) +
           (int)(nFileIndexHigh ^ (nFileIndexHigh >>> 32)) +
           (int)(nFileIndexLow ^ (nFileIndexHigh >>> 32));
}
{noformat}
I'm not defending it; it's more that I can't tell you how unique that will be, so I was hoping
you could do the opposite and tell me how unique it will not be.  What I can tell you is that
from run to run, the same file produced the same value, and the values were distinct across
the very small number of files I tested.
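
To make the question concrete, here is a minimal sketch that copies the formula above into a standalone method. The class and method names are made up for illustration and are not part of the JDK. It also assumes each field holds only a 32-bit value widened into a long, which is a guess about the native side rather than something stated in this thread. Under that assumption every ">>> 32" term is zero and the hash is just the sum of the three fields, so two keys on the same volume collide whenever their high and low index parts add up to the same number.

```java
// Hypothetical standalone copy of the hashCode formula quoted above.
// FileKeyHashSketch and hash(...) are illustrative names, not JDK API.
class FileKeyHashSketch {

    // Same arithmetic as the quoted hashCode, including the use of
    // nFileIndexHigh in the shift on the last line.
    static int hash(long volumeSerial, long indexHigh, long indexLow) {
        return (int) (volumeSerial ^ (volumeSerial >>> 32))
             + (int) (indexHigh ^ (indexHigh >>> 32))
             + (int) (indexLow ^ (indexHigh >>> 32));
    }

    public static void main(String[] args) {
        // Assumption: the fields only ever carry 32-bit values, so the
        // shifted terms vanish and the three fields are simply added.
        int a = hash(0x1234L, 0L, 5L);
        int b = hash(0x1234L, 5L, 0L);
        System.out.println(a == b); // prints "true": two different keys, one hash
    }
}
```

Whether such pairs show up among real files on one volume is exactly what a measurement would have to answer.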

I think this is warranted now - I will provide some data on this and how unique it is.  If
you have any suggestions on that please let me know and I'll be sure to include them; otherwise,
I'll just get started with what I am thinking of right now, which is to generate a configurable
number of files and then check the fileKey.hashCode on them for uniqueness.  Crude, but I think
it will show whether the hash is good enough (or not).
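
A rough sketch of that check could look like the following. It is not the code from the patch: the class and method names are hypothetical, and the default key lookup uses the public BasicFileAttributes.fileKey() as a stand-in, which is documented to return null when no file key is available. The lookup is passed in as a function so that the Windows-specific FileKey call could be plugged in instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not part of Flume or of the attached patch.
class FileKeyUniquenessCheck {

    // Stand-in lookup: hash of the public file key, or null if the
    // platform does not expose one.
    static Integer publicFileKeyHash(Path file) {
        try {
            Object key = Files.readAttributes(file, BasicFileAttributes.class).fileKey();
            if (key == null) {
                return null;
            }
            return key.hashCode();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates fileCount empty files in a fresh temp directory and counts
    // how many of them produced a hash that an earlier file already had.
    static int countCollisions(int fileCount, Function<Path, Integer> hashOf) {
        List<Path> created = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();
        int collisions = 0;
        try {
            Path dir = Files.createTempDirectory("filekey-check");
            created.add(dir);
            try {
                for (int i = 0; i < fileCount; i++) {
                    Path file = Files.createFile(dir.resolve("file-" + i + ".log"));
                    created.add(file);
                    Integer hash = hashOf.apply(file);
                    if (hash == null) {
                        throw new IllegalStateException("no file key available for " + file);
                    }
                    if (!seen.add(hash)) {
                        collisions++;
                    }
                }
            } finally {
                // Files are kept until the end so that file ids are not reused
                // mid-run; delete in reverse order so the directory goes last.
                for (int i = created.size() - 1; i >= 0; i--) {
                    Files.deleteIfExists(created.get(i));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collisions;
    }

    public static void main(String[] args) {
        int fileCount = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        int collisions = countCollisions(fileCount, FileKeyUniquenessCheck::publicFileKeyHash);
        System.out.println(fileCount + " files, " + collisions + " hash collisions");
    }
}
```

All files sit in one directory on one volume here, so the result says nothing about collisions across volumes.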

tailFiles Map:
The only place FileKey is used is to get an "inode"-like value on Windows, so I don't think
we should use that in the tailFiles map, as it would spread a Windows workaround object to the
rest of the code rather than keeping it contained in that single function.  (Did I misread
what you were asking?)
I would continue to use Long in the tailFiles map, because on unix that is the primary way to
identify the files other than the path (and the path can change if a file is "mv"d).
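
As an illustration of keeping the workaround inside one function, a sketch along these lines would hand the rest of the code a plain long on either system. The unix branch is the Files.getAttribute(..., "unix:ino") call from the issue description. The Windows branch is only a placeholder built on the public BasicFileAttributes.fileKey(), not the patch's actual code, and FileIdSketch, fileId and windowsFileId are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Locale;

// Hypothetical helper, not the code from the attached patch.
class FileIdSketch {

    // Returns a value that identifies the file independent of its name.
    // Callers only ever see a long, whatever the operating system.
    static long fileId(Path file) {
        try {
            String os = System.getProperty("os.name").toLowerCase(Locale.ROOT);
            if (os.startsWith("windows")) {
                return windowsFileId(file);
            }
            // unix: the inode number, as in the existing taildir source.
            return (Long) Files.getAttribute(file, "unix:ino");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Placeholder for the Windows lookup. Assumption: some file key object
    // is available and its hashCode is used as the id; fileKey() may return
    // null, in which case this sketch gives up.
    private static long windowsFileId(Path file) throws IOException {
        Object key = Files.readAttributes(file, BasicFileAttributes.class).fileKey();
        if (key == null) {
            throw new UnsupportedOperationException("no file key available for " + file);
        }
        return key.hashCode();
    }
}
```

With this shape the tailFiles map can stay keyed by Long, and a renamed file still maps to the same entry as long as the id survives the rename.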
  

> flume-taildir-source: support for windows
> -----------------------------------------
>
>                 Key: FLUME-2994
>                 URL: https://issues.apache.org/jira/browse/FLUME-2994
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources, Windows
>    Affects Versions: v1.7.0
>            Reporter: Jason Kushmaul
>            Assignee: Jason Kushmaul
>            Priority: Trivial
>             Fix For: v1.7.0
>
>         Attachments: FLUME-2994-2.patch, taildir-mac.conf, taildir-win8.1.conf
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> The current implementation of flume-taildir-source does not support windows.
> The only reason for this from what I can see is a simple call to Files.getAttribute(file.toPath(), "unix:ino");
> I've tested an equivalent for windows (which of course does not work on non-windows).
> With an OS switch we should be able to identify a file independent of file name on either system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
