commons-issues mailing list archives

From "Stefan Bodewig (Commented) (JIRA)" <>
Subject [jira] [Commented] (COMPRESS-176) ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts
Date Sun, 26 Feb 2012 07:05:49 GMT


Stefan Bodewig commented on COMPRESS-176:

In extract.c of unzip60 (line 1310ff) there is code that replaces backslashes with slashes.
It only replaces them in names that don't contain any forward slash (MBSCHR looks up a character
in a string) and only if "hostnum" indicates a FAT system.

            /* for files from DOS FAT, check for use of backslash instead
             *  of slash as directory separator (bug in some zipper(s); so
             *  far, not a problem in HPFS, NTFS or VFAT systems)
             */
#ifndef SFX
            if (G.pInfo->hostnum == FS_FAT_ && !MBSCHR(G.filename, '/')) {
                char *p=G.filename;

                if (*p) do {
                    if (*p == '\\') {
                        if (!G.reported_backslash) {
                            Info(slide, 0x21, ((char *)slide,
                              LoadFarString(BackslashPathSep), G.zipfn));
                            G.reported_backslash = TRUE;
                            if (!error_in_archive)
                                error_in_archive = PK_WARN;
                        }
                        *p = '/';
                    }
                } while (*PREINCSTR(p));
            }
#endif /* !SFX */

"hostnum" is the upper byte of "version made by" inside the central directory header - this
is ZipArchiveEntry's get/setPlatform - and FS_FAT_ is 0 (ZipArchiveEntry#PLATFORM_FAT).  We'd
have all pieces together to emulate this.
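
A minimal sketch of what such an emulation might look like on the Java side. This is not
existing Commons Compress code: the class FatBackslashFix and its methods are hypothetical
names made up for illustration, and the constant only mirrors the value of
ZipArchiveEntry#PLATFORM_FAT so that the snippet compiles on its own. Unlike unzip, it does
not report a warning.

```java
/** Hypothetical helper, for illustration only; not part of Commons Compress. */
final class FatBackslashFix {

    /** Same value as ZipArchiveEntry#PLATFORM_FAT (FS_FAT_ in unzip). */
    static final int PLATFORM_FAT = 0;

    private FatBackslashFix() {
    }

    /** "hostnum": the upper byte of the 16-bit "version made by" field. */
    static int platformOf(int versionMadeBy) {
        return (versionMadeBy >> 8) & 0xFF;
    }

    /**
     * Replaces backslashes with slashes, but only for entries written on a
     * FAT host whose name contains no forward slash at all (the same two
     * conditions unzip checks).
     */
    static String normalize(String name, int platform) {
        if (platform == PLATFORM_FAT && name.indexOf('/') < 0) {
            return name.replace('\\', '/');
        }
        return name;
    }
}
```

With a real entry this would presumably be called as
FatBackslashFix.normalize(entry.getName(), entry.getPlatform()), and the directory check
would then have to look at the normalized name.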
> ArchiveInputStream#getNextEntry(): Problems with WinZip directories with Umlauts
> --------------------------------------------------------------------------------
>                 Key: COMPRESS-176
>                 URL:
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Archivers
>    Affects Versions: 1.3
>         Environment: Windows 7
>            Reporter: Wurstbrot mit Senf
>         Attachments:,,,
> There is a problem when handling a WinZip-created zip with Umlauts in directories.
> I'm accessing a zip file created with WinZip containing a directory with an umlaut ("ä")
with ArchiveInputStream. When creating the zip file the unicode-flag of winzip had been active.
> The following problem occurs when accessing the entries of the zip:
> the ArchiveEntry for a directory containing an umlaut is not marked as a directory and
the file names for the directory and all files contained in that directory contain backslashes
instead of slashes (i.e. completely different to all other files in directories with no umlaut
in their path).
> There is no difference when letting the ArchiveStreamFactory decide which ArchiveInputStream
to create or when using the ZipArchiveInputStream constructor with the correct encoding (I've
tried different encodings CP437, CP850, ISO-8859-15, but still the problem persisted).
> This problem does not occur when using the very same zip file but compressed by 7zip
or the built-in Windows 7 zip functionality.


