infra-issues mailing list archives

From "Chris Lambertus (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (INFRA-15019) lists.a.o: fails to parse mails with non-standard text content-type
Date Wed, 06 Mar 2019 01:23:00 GMT

     [ https://issues.apache.org/jira/browse/INFRA-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Lambertus resolved INFRA-15019.
-------------------------------------
    Resolution: Information Provided

From PonEE:



You'll be happy to hear that we only found two messages with that header.

When the PonEE importer sees an unknown Content-Type, it treats the
body as unreadable. On the presentation side we store this as an
"empty body", so the content is not shown when browsing the archives,
because it could be a binary blob, mojibake, etc. We do, however,
still store the original message source, which can be browsed in the
archives using the view source option.
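
The behaviour described above can be sketched roughly as follows. This
is a minimal illustration using Python's standard email package, not
the actual PonEE importer code; the function name, the record layout
and the set of "known" types are all assumptions made for the example.

```python
from email import message_from_bytes

# Assumption: the content types the archiver is willing to render.
KNOWN_TEXT_TYPES = {"text/plain", "text/html"}


def to_archive_record(raw: bytes) -> dict:
    """Build a (hypothetical) archive record from a raw message."""
    msg = message_from_bytes(raw)
    ctype = msg.get_content_type()

    if ctype in KNOWN_TEXT_TYPES:
        payload = msg.get_payload(decode=True) or b""
        charset = msg.get_content_charset() or "utf-8"
        try:
            body = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name: fall back rather than fail.
            body = payload.decode("utf-8", errors="replace")
    else:
        # Unknown type: could be a binary blob or mojibake, so show nothing.
        body = ""

    # The original source is always kept, whatever happened to the body.
    return {"content_type": ctype, "body": body, "source": raw}
```

With the header from this issue, text/$email_type is not in the known
set, so the record gets an empty body while "source" still holds the
full original message. Multipart handling is left out of the sketch.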

In general, if there are any archiving failures we send those to a
special directory for later manual evaluation.
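
As an illustration of that pattern only (the real directory layout and
file naming are not stated here, so failed_dir, the .eml suffix and the
archive callable below are hypothetical), a failed message could be set
aside like this:

```python
import hashlib
from pathlib import Path
from typing import Callable


def archive_or_quarantine(
    raw: bytes,
    archive: Callable[[bytes], None],
    failed_dir: Path,
) -> bool:
    """Try to archive a message; on failure keep the raw bytes for review.

    Returns True if archiving succeeded, False if the message was
    written to failed_dir instead.
    """
    try:
        archive(raw)
        return True
    except Exception:
        failed_dir.mkdir(parents=True, exist_ok=True)
        # Name the file after a hash of its content, so retrying the
        # same message does not create duplicates.
        name = hashlib.sha256(raw).hexdigest() + ".eml"
        (failed_dir / name).write_bytes(raw)
        return False
```

Keeping the untouched bytes means the message can be re-imported later
without loss once the cause of the failure is understood.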

> lists.a.o: fails to parse mails with non-standard text content-type
> -------------------------------------------------------------------
>
>                 Key: INFRA-15019
>                 URL: https://issues.apache.org/jira/browse/INFRA-15019
>             Project: Infrastructure
>          Issue Type: Bug
>          Components: Mail Archives
>            Reporter: Sebb
>            Assignee: Chris Lambertus
>            Priority: Major
>
> Lists.a.o does not include the following emails [1,2] from general@jakarta.
> They have the following header:
> Content-Type: text/$email_type; charset=iso-8859-1
> Whilst this is not strictly valid, it's clearly a text message.
> There may be other such mails in the archives; I haven't checked.
> I discovered the issue when I tried to load the ASF mbox file and saw the error messages.
> If lists.a.o keeps track of archiver and import errors then it should be possible to find other failed parses.
> [1] http://mail-archives.apache.org/mod_mbox/jakarta-general/200301.mbox/%3C1631.192.193.196.3.1043161136.squirrel@duckmail.d2g.com%3E
> [2] http://mail-archives.apache.org/mod_mbox/jakarta-general/200301.mbox/%3C7703.192.193.196.9.1043165745.squirrel@duckmail.d2g.com%3E
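
For anyone wanting to check an mbox file for other mails of this kind,
a scan along the following lines might help. It is a sketch using
Python's standard mailbox module; the function name and the set of
expected types are assumptions. Note that "$email_type" is a
syntactically legal MIME token, so the check compares against a list of
expected types rather than looking for illegal characters.

```python
import mailbox

# Assumption: text types considered normal for list traffic.
KNOWN_TEXT_TYPES = {"text/plain", "text/html"}


def find_unknown_text_types(mbox_path: str) -> list:
    """Return (Message-ID, content type) pairs for text/* messages
    whose content type is not one of the expected types."""
    hits = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            ctype = msg.get_content_type()
            if msg.get_content_maintype() == "text" and ctype not in KNOWN_TEXT_TYPES:
                hits.append((msg.get("Message-ID", ""), ctype))
    finally:
        box.close()
    return hits
```

Only the top-level Content-Type header is inspected; parts nested
inside multipart messages are not examined.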



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
