commons-user mailing list archives

From "luan xl" <helix_...@hotmail.com>
Subject [Net] a problem in UnixFTPEntryParser.java
Date Wed, 07 Dec 2005 15:59:03 GMT
Please look at this URL:

ftp://ftp.ebi.ac.uk/pub/databases/rcsb/pdb/data/structures/divided/

The raw listing is:

total 185
drwxr-xr-x   9 321      57          8192 Jun  5  2005 .
drwxr-xr-x   6 321      57          8192 Dec  7 08:55 ..
-rw-r--r--   1 321      57           919 Sep 17  2002 README
drwxr-xr-x1062 321      57         24576 Nov 16 08:38 XML
drwxr-xr-x1062 321      57         24576 Nov 16 08:38 XML-extatom
drwxr-xr-x1062 321      57         24576 Nov 16 08:38 XML-noatom
drwxr-xr-x1062 321      57         24576 Nov 16 08:38 mmCIF
drwxr-xr-x 843 321      57         24576 Dec  7 08:47 nmr_restraints
drwxr-xr-x1063 321      57         24576 Nov 16 08:38 pdb
drwxr-xr-x1047 321      57         24576 Jun 29 07:22 structure_factors
^^^^^^^^^^^^^^

The problem is that in some entries the second column (the hard link count)
is not separated by a space from the type and permission string that precedes
it. Yet in UnixFTPEntryParser.java the regex used for matching is:
    private static final String REGEX =
        "([bcdlfmpSs-])"
        + "(((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-])))\\+?\\s+"
                                                                                      ^^^^
        + "(\\d+)\\s+"
        ......

I suppose the \s+ marked above rejects entries that have no space there, such
as the lines flagged in the listing. Is it acceptable to simply change \s+ to
\s*, or is there a more robust way to handle this? Or should such lines just
be treated as a mismatch?
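
To make the question concrete, here is a minimal sketch that reproduces the behaviour. It uses only the leading part of the REGEX quoted above (type, permissions, link count), since the rest is elided; the class name LinkCountRegexDemo, the helper linkCount, and the named group "links" are my own, not part of Commons Net. I have not checked how \s* interacts with the remainder of the real expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Commons Net.
final class LinkCountRegexDemo {
    // Leading part of the parser's REGEX as quoted in this message.
    static final String PREFIX =
        "([bcdlfmpSs-])"
        + "(((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-]))((r|-)(w|-)([xsStTL-])))\\+?";

    // Current behaviour: at least one whitespace character before the link count.
    static final Pattern STRICT = Pattern.compile(PREFIX + "\\s+(?<links>\\d+)\\s+");

    // Proposed change: the whitespace before the link count becomes optional.
    static final Pattern RELAXED = Pattern.compile(PREFIX + "\\s*(?<links>\\d+)\\s+");

    /** Returns the hard link count, or null if the start of the line does not match. */
    static String linkCount(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        return m.lookingAt() ? m.group("links") : null;
    }

    public static void main(String[] args) {
        String fused  = "drwxr-xr-x1062 321      57         24576 Nov 16 08:38 XML";
        String spaced = "drwxr-xr-x 843 321      57         24576 Dec  7 08:47 nmr_restraints";

        System.out.println(linkCount(STRICT, fused));   // null
        System.out.println(linkCount(RELAXED, fused));  // 1062
        System.out.println(linkCount(STRICT, spaced));  // 843
        System.out.println(linkCount(RELAXED, spaced)); // 843
    }
}
```

Because the type and permission string always has a fixed width of ten characters, making the separator optional does not seem to introduce any ambiguity about where the link count starts, at least for this part of the expression.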

thanks.


