commons-dev mailing list archives

From steve cohen <>
Subject Re: [net] Solved VMS duplicates problem and simplified system
Date Sun, 11 Jan 2004 23:15:13 GMT
On Sunday 11 January 2004 12:51 pm, Daniel F. Savarese wrote:
> In message <>, steve cohen writes:
> >Oops!  CVS doesn't have "undelete", does it?  :-( Sorry for not
> >knowing/understanding this rule.  Do we add it back in (losing history) or
> > is there some way to put that back as well?  What is best way to proceed?
> It looks like you already took care of it.  But it should extend
> FTPFileEntryParserImpl so we don't have duplicate code.  I don't know
> if you're still making changes, so I don't want to mess with anything.

I'm not doing any more for now, so if you want to get in there and change 
this, feel free.

> >But I think these names confused you.  At least that's the way it looked
> > to me.  The iteration that was needed was over the Vector of raw input in
> > FTPFileList.  When I saw you specializing FTPFileIterator, which walks
> > raw input but returns "cooked" equivalents, I knew you had gone down a
> > wrong
> No, I just didn't explain myself clearly enough.  I was concentrating
> on teasing out the interfaces.  The implementation didn't matter as long
> as the proper behavior was preserved.  That's why I just shuffled code
> around without actually reimplementing any of it.  A subsequent step
> would be to optimize the specialized iterator so it would incrementally
> work off of the raw input lines.  

I think this is a problem.  Rather than just creating abstract versions of the 
current classes, we may need to redesign FTPFileList/FTPFileIterator to fit 
what we're trying to accomplish.  You'll notice that my solution to the VMS 
problem has nothing to do with FTPFileIterator, which up till now has had a 
completely different purpose.  Remember, the current FTPFileIterator iterates 
over raw input from the FTPFileList but returns FTPFiles.  Also remember that 
there is not necessarily a 1-to-1 relationship between the items in the raw 
list that FTPFileList maintains and the results that the iterator returns.  
See the method FTPFileEntryParser.readNextEntry().  All of which bolsters my 
contention that redesigning FTPFileList/FTPFileIterator is going to be 
essential to the process of rationalizing this.

> However, the approach would be through
> subclassing and making some members of what I called DefaultFTPFileList
> protected instead of package scoped or private.  I still think the changes
> you made are too application-specific.  FTPFileList preParse(FTPFileList)
> is entirely motivated by the VMS problem and I don't see it holding up
> to software evolution demands.  The need to continue exposing
> implementation details through methods like
> getInternalIteratorForFtpFileList (should be
> getInternalIteratorForFTPFileList to maintain consistent naming) is
> evidence of an architectural flaw. If we continue to violate data
> encapsulation, favoring customization through exposure of private data
> instead of through inheritance, we're going to make the software more
> difficult to maintain.

Let's take off from one of your best ideas.  You said a while back that 
FTPFileList is not just a holding area for the raw input (as I had contended) 
but really a Parser Engine managing the whole process.  That isn't how I 
conceived it, but your point about the socket read being in there is very 
telling!

Maybe, instead of FTPFileList, we need FTPFileParserEngine.  It would do the 
read from the socket and create the raw input vector.  It would manage any 
preparsing activity like duplicate removal.  It would also manage things like 
merging of multiline input into single entries (which should be thought of as 
a preparse activity as well).  Only when we have an iterable collection of 
raw entries would we let the FTPFileEntryParser parse them into FTPFile 
objects.
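
A minimal sketch of the "merge multiline input" preparse step, written in 
present-day Java.  The class name PreparseSketch and the rule that a 
continuation line starts with whitespace are assumptions made for 
illustration; neither comes from the existing code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: shows only the preparse phase of the proposed engine.
final class PreparseSketch {

    // Folds continuation lines into the entry that precedes them, so that
    // each element of the result is one complete raw entry.
    // Assumption: a continuation line begins with whitespace.
    static List<String> mergeContinuationLines(List<String> rawLines) {
        List<String> entries = new ArrayList<>();
        StringBuilder current = null;
        for (String line : rawLines) {
            boolean continuation =
                !line.isEmpty() && Character.isWhitespace(line.charAt(0));
            if (continuation && current != null) {
                // Same entry: append the trimmed remainder.
                current.append(' ').append(line.trim());
            } else {
                // New entry: flush the previous one, if any.
                if (current != null) {
                    entries.add(current.toString());
                }
                current = new StringBuilder(line);
            }
        }
        if (current != null) {
            entries.add(current.toString());
        }
        return entries;
    }
}
```

After this step the raw list and the parsed output would line up one entry 
to one FTPFile, which is what lets a plain iterator do the rest.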

Doing it this way would get rid of the ugly hack of 
getInternalIteratorForFTPFileList() since it would happen in the same class. 
You can think of the preparse phase as "getting the raw input ready to be 
walked".  Then the need for FTPFileIterator goes away entirely!  You just 
need simple iterators and a loop like this:

for (Iterator iter = rawEntries.iterator(); iter.hasNext();) {
   String rawEntry = (String) iter.next();
   FTPFile f = parser.parseFTPEntry(rawEntry);
   // append to output container
}

However, this would still require hooks in the FTPFileEntryParsers.  I don't 
think that's such a bad thing.  Each parser class remains the locus of all 
system-specific code, whether in the preparse phase or in the parse phase.
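
One way such a hook might look, again as a sketch in present-day Java.  The 
interface PreParsingHook, the class VmsDuplicateFilter, and the assumption 
that each raw entry starts with a "NAME;VERSION" token are all hypothetical; 
this is not the actual FTPFileEntryParser API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical hook: parsers that need no preparsing inherit the default.
interface PreParsingHook {
    default List<String> preParse(List<String> rawEntries) {
        return rawEntries;
    }
}

// Illustrative VMS-style filter: keeps only the highest version of each file.
// Assumption: every versioned entry begins with a "NAME;VERSION" token.
class VmsDuplicateFilter implements PreParsingHook {
    @Override
    public List<String> preParse(List<String> rawEntries) {
        Map<String, String> newestEntry = new LinkedHashMap<>();
        Map<String, Integer> newestVersion = new LinkedHashMap<>();
        for (String entry : rawEntries) {
            String token = entry.trim().split("\\s+")[0];
            if (!token.matches(".+;\\d+")) {
                // No version marker: keep the entry, keyed by its first token.
                newestEntry.put(token, entry);
                continue;
            }
            int semi = token.lastIndexOf(';');
            String name = token.substring(0, semi);
            int version = Integer.parseInt(token.substring(semi + 1));
            Integer seen = newestVersion.get(name);
            if (seen == null || version > seen) {
                newestVersion.put(name, version);
                newestEntry.put(name, entry);
            }
        }
        // A name keeps the position of its first appearance in the listing.
        return new ArrayList<>(newestEntry.values());
    }
}
```

The engine would call preParse() once on the raw entries and never need to 
know which system it is dealing with.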

> >path.  I'm not against abstracting some of these classes.  It's probably a
> >good idea and once we have it, further beneficial refactorings will no
> > doubt suggest themselves, but I think that is for 2.0.
> I disagree.  I think the time to fix this is now.  Otherwise, we're just
> adding more methods that we're going to have to deprecate.  That said,
> this affects us more right now than it affects users.  It's not like
> there's a huge amount of user code dependent on the list parsing
> classes.  Most users use what we already provide.  Anyway, I know I
> still haven't expressed myself as clearly as I could (time constraints),
> but I'd like to fix up my suggested changes to something closer to
> what I'm ultimately trying to propose in order to make it more clear.
> Rather than work on what's in CVS, I'll just post the changes for
> download later this week after I free up some time.  Some things are
> just more clear when looking at examples.

I don't disagree violently with doing it now.  But abstracting out of the 
current division of labor between the classes isn't going to get the job 
done.  That division of labor is flawed, and I think we need to rethink it.  
If we both have the time we could get it done in a week.

