commons-issues mailing list archives

From "Stefan Bodewig (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-471) Zipped files names having non UTF-8 encoding are being replaced with '?' while previewing file.
Date Sun, 18 Nov 2018 13:46:00 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690912#comment-16690912
] 

Stefan Bodewig commented on COMPRESS-471:
-----------------------------------------

I'm sorry, can you please explain what kind of "side effects" you see?
When I iterate over the names after using {{ZipFile(new File(...), "Cp850")}} I see names
that don't contain any question marks.
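
For readers following the thread, here is a minimal JDK-only sketch of why the charset argument matters. It does not call Commons Compress, and the library's internal handling of undecodable bytes may differ. The class name, the helper method and the sample file name are hypothetical, and the sketch assumes the running JDK ships the Cp850 (IBM850) charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NameDecodingSketch {

    // Decode the raw bytes of a stored entry name with the given charset.
    // new String(byte[], Charset) substitutes the charset's replacement
    // character for byte sequences it cannot decode.
    static String decodeName(byte[] rawName, Charset charset) {
        return new String(rawName, charset);
    }

    public static void main(String[] args) {
        // Assumption: the JDK provides Cp850 (canonical name IBM850).
        Charset cp850 = Charset.forName("Cp850");

        // Hypothetical entry name with two accented letters, stored as Cp850 bytes.
        byte[] rawName = "R\u00e9sum\u00e9.txt".getBytes(cp850);

        // Decoding with the charset the name was written in restores it.
        String correct = decodeName(rawName, cp850);

        // Decoding the same bytes as UTF-8 hits malformed sequences, so the
        // accented letters are lost and replacement characters appear instead.
        String wrong = decodeName(rawName, StandardCharsets.UTF_8);

        System.out.println("decoded as Cp850: " + correct);
        System.out.println("decoded as UTF-8: " + wrong);
    }
}
```

The same bytes give two different names depending only on the charset used to decode them, which is the difference between the two {{ZipFile}} constructor calls discussed in the issue.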

> Zipped files names having non UTF-8 encoding are being replaced with '?' while previewing file.
> -----------------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-471
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-471
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.18
>            Reporter: Gaurav Mittal
>            Priority: Major
>         Attachments: Document(▒Γ║╗)_20150226_11.zip, Incorrect.JPG, correct.JPG
>
>
> | * All file names that cannot be decoded as UTF-8 are replaced by the '?' symbol.
> In the issue scenario the charset is 'Cp850'. Since the Commons Compress library
> cannot identify the 'Cp850' charset, it falls back to the default charset 'UTF-8',
> and therefore we see the '?' symbol.
> In our code:
> ZipFile ret = new ZipFile(path);
> However, if we pass the encoding explicitly as shown below, it works fine:
> ZipFile ret = new ZipFile(new File(path), "Cp850", false);
> But this second approach, where we force the encoding to 'Cp850', may cause side effects in some cases.
>  --------------------------------------------------------------------------
> The code below does not seem to resolve UTF-8 conflicts and does not convert the file names
> into the correct form:
>  
> try {
>  final Map<ZipArchiveEntry, NameAndComment> entriesWithoutUTF8Flag =
>  populateFromCentralDirectory();
>  resolveLocalFileHeaderData(entriesWithoutUTF8Flag); 
>  success = true;
> } finally {
>  closed = !success;
>  if (!success && closeOnError) {
>  IOUtils.closeQuietly(archive);
>  }
> }|
> | |



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
