From "Daniel F. Savarese" <>
Subject Re: [net] Solved VMS duplicates problem and simplified system
Date Sat, 10 Jan 2004 21:28:15 GMT

In message <>, steve cohen writes:
>2. In order to implement one, a removeDuplicates() method was added to 
>FTPFileEntryParser.  This is a no-op in the normal case but implemented in 

I don't think I like this idea because it is application-specific
(i.e., concerned only with the VMS case).  If we're going to put
hooks like that into FTPFileEntryParser, then a generic approach
would be to have pre-parse and post-parse methods.  That said,
my createFTPFileList addition doesn't belong in FTPFileEntryParser.
Still, I think FTPFileList is overloaded and the parsing driver
(FTPFileList.readStream()) needs to be extracted into a separate
interface/class in a way that allows the API user to make
customizations without us having to address application-specific
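
To make the pre-parse/post-parse idea concrete, here is a minimal sketch
of a parsing driver with generic hooks.  All names (HookedEntryParser,
HeaderSkippingParser, parseAll) are hypothetical and entries are reduced
to plain strings; this is not the existing Commons Net API, just one way
the hooks could look.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of this is existing Commons Net API.
abstract class HookedEntryParser {
    // Hook run once on the raw listing before any line is parsed. Default: no-op.
    protected List<String> preParse(List<String> rawLines) {
        return rawLines;
    }

    // Hook run once on the parsed entries. Default: no-op.
    protected List<String> postParse(List<String> entries) {
        return entries;
    }

    // Parses one raw line into an entry (here just a file name), or null to skip it.
    protected abstract String parseEntry(String rawLine);

    // The driver: pre-parse, parse each line, post-parse.
    public final List<String> parseAll(List<String> rawLines) {
        List<String> entries = new ArrayList<>();
        for (String line : preParse(rawLines)) {
            String entry = parseEntry(line);
            if (entry != null) {
                entries.add(entry);
            }
        }
        return postParse(entries);
    }
}

// Example customization: drop the "total N" header that some listings begin with.
class HeaderSkippingParser extends HookedEntryParser {
    @Override
    protected List<String> preParse(List<String> rawLines) {
        List<String> kept = new ArrayList<>();
        for (String line : rawLines) {
            if (!line.startsWith("total ")) {
                kept.add(line);
            }
        }
        return kept;
    }

    @Override
    protected String parseEntry(String rawLine) {
        // Simplification: treat the last whitespace-separated token as the file name.
        String[] tokens = rawLine.trim().split("\\s+");
        return tokens[tokens.length - 1];
    }
}
```

A system-specific parser overrides only the hook it needs, and the
normal case pays nothing because both hooks default to no-ops.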

>4.  I renamed FTPFileListParserImpl to FTPFileEntryParserImpl to reduce the 
>confusion level since FTPFileListParser is now deprecated and going away in 
>2.0.  For the time being, though, for backward compatibility it still 
>implements FTPFileListParser.

Good change, but we need to keep FTPFileListParserImpl for backward
compatibility according to Commons release/versioning rules.  We
can enhance the API, but we can't remove APIs until a major release
(i.e., 2.0).  So deprecating FTPFileListParserImpl (and changing it
to extend FTPFileEntryParserImpl) should suffice.
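
As a rough illustration of the deprecate-and-extend approach, the old
name can survive as an empty subclass of the new one.  The class bodies
below are simplified stand-ins (the parseEntry signature returning a
String and the LegacyUserParser class are assumptions for the example),
not the real Commons Net sources.

```java
// Simplified stand-in for the renamed class; the signature is an assumption.
abstract class FTPFileEntryParserImpl {
    public abstract String parseEntry(String line);
}

/**
 * Old name kept so existing subclasses keep compiling.
 *
 * @deprecated use FTPFileEntryParserImpl instead; to be removed in 2.0.
 */
@Deprecated
abstract class FTPFileListParserImpl extends FTPFileEntryParserImpl {
    // Intentionally empty: all behavior is inherited from the new class.
}

// Hypothetical user code written against the old name; it still works.
class LegacyUserParser extends FTPFileListParserImpl {
    @Override
    public String parseEntry(String line) {
        return line.trim();
    }
}
```

Code that extends the old class gets a deprecation warning rather than a
compile error, which is the behavior the versioning rules ask for.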

>Probably FTPFileList and FTPFileListIterator are misnamed.  FTPFileList 
>IS-NOT-A List of FTPFiles.  It HAS a Vector of raw input lines from the 
>listing.  FTPFileListIterator IS-NOT-A simple iterator over FTPFileList.  It 
>DOES iterate over this Vector and DOES return FTPFile objects, so it is more 
>than a simple iterator, it is an iterator on steroids.  

Sure, but the API user doesn't know how they are implemented.  To the user,
they are a list and an iterator (albeit specifically for FTPFile instances).
Right now they are concrete classes, so should there be a case where a
listing format cannot be parsed a line at a time, then there's no way to
extend the implementation to meet the need.  That's where I argue for
making the classes abstract (ideally interfaces, but it may be too late
for that given API migration rules; making them abstract, however, breaks
nothing because users cannot instantiate them directly as it is).  It's
one path toward implementing Jeffrey's suggestion, which I favor, of
allowing the iterator behavior to be altered either through delegation
or subclassing.
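
A minimal sketch of the abstract-iterator idea follows.  The names
(EntryIterator, LineIterator, JoiningIterator) and the "indented line
continues the previous entry" rule are assumptions made up for the
example; entries are plain strings rather than FTPFile objects.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical abstract type: the user only ever sees an iterator over entries.
abstract class EntryIterator implements Iterator<String> {
    // Factories hide which concrete implementation is in use.
    static EntryIterator lineAtATime(List<String> rawLines) {
        return new LineIterator(rawLines);
    }

    static EntryIterator joiningContinuations(List<String> rawLines) {
        return new JoiningIterator(rawLines);
    }
}

// The common case: one raw line yields one entry, built only when requested.
class LineIterator extends EntryIterator {
    private final Iterator<String> lines;

    LineIterator(List<String> rawLines) {
        this.lines = rawLines.iterator();
    }

    public boolean hasNext() {
        return lines.hasNext();
    }

    public String next() {
        return lines.next().trim();
    }
}

// A format that cannot be parsed a line at a time: an entry may wrap onto
// following lines that begin with whitespace (assumed rule for this sketch).
class JoiningIterator extends EntryIterator {
    private final List<String> rawLines;
    private int pos = 0;

    JoiningIterator(List<String> rawLines) {
        this.rawLines = rawLines;
    }

    public boolean hasNext() {
        return pos < rawLines.size();
    }

    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        StringBuilder entry = new StringBuilder(rawLines.get(pos++).trim());
        while (pos < rawLines.size() && startsWithWhitespace(rawLines.get(pos))) {
            entry.append(' ').append(rawLines.get(pos++).trim());
        }
        return entry.toString();
    }

    private static boolean startsWithWhitespace(String s) {
        return !s.isEmpty() && Character.isWhitespace(s.charAt(0));
    }
}
```

Both variants still defer building each entry until next() is called, so
the deferred-creation scenario is preserved while the line-at-a-time
assumption becomes something a subclass can replace.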

>Remember that there are two scenarios supported here:
>a) read the whole list at once as in FTPClient.listFiles() and 
>b) read the list in but defer creation of more expensive FTPFile objects until
>needed.  This scenario was broken in the VMS case.

Yes, and it may break again with a new unforeseen case.  How to handle
those unforeseen cases is what I think we need to solve (solving the
VMS case as a byproduct).  I'm not convinced of the best way.  My
changes were just a step in one possible direction.

>The one thing left to do is decide which VMS scenario, versioning or 
>non-versioning is the default, i.e. the one to be supported by autodetection. 
>I asked for someone to chime in on this last week and no one has.  I still 
>don't know.

I don't have enough experience with VMS to know what the default should
be.  The default in the implementation was to filter out duplicates,
so I would suggest sticking with that until we hear otherwise.
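
For reference, a small sketch of what "filter out duplicates by default"
could mean.  VmsVersionFilter is a hypothetical helper, and it assumes
names of the form NAME.EXT;VERSION and that the highest version is the
one to keep; neither assumption comes from the actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; assumes names look like NAME.EXT;VERSION.
class VmsVersionFilter {
    private final boolean versioning;

    // Default: non-versioning, i.e. duplicates are filtered out.
    VmsVersionFilter() {
        this(false);
    }

    VmsVersionFilter(boolean versioning) {
        this.versioning = versioning;
    }

    List<String> apply(List<String> names) {
        if (versioning) {
            return new ArrayList<>(names); // keep every version
        }
        Map<String, String> kept = new LinkedHashMap<>(); // base name -> newest full name
        Map<String, Integer> newest = new HashMap<>();    // base name -> newest version
        for (String name : names) {
            int semi = name.lastIndexOf(';');
            String base = semi < 0 ? name : name.substring(0, semi);
            int version = semi < 0 ? 0 : Integer.parseInt(name.substring(semi + 1));
            Integer seen = newest.get(base);
            if (seen == null || version > seen) {
                newest.put(base, version);
                kept.put(base, name);
            }
        }
        return new ArrayList<>(kept.values());
    }
}
```

With the no-argument constructor only the newest version of each file
survives; passing true keeps the listing as the server sent it.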


