infra-issues mailing list archives

From "Sebb (JIRA)" <>
Subject [jira] [Commented] (INFRA-16162) lists.a.o: import missing HTML-only mails
Date Fri, 09 Mar 2018 01:40:00 GMT


Sebb commented on INFRA-16162:

If a mail is imported, the id that is generated depends on the exact PonyMail code, the configuration
and the Python library code. In some cases the id may depend on the current time or the TZ.
These variables have changed over the lifetime of the system. Thus in general it is not possible
to ensure that duplicates don't occur. There is some code to check that a mail does not already
exist in the database, but the checks may generate false positives. The id generation code
has been improved over time, but to my knowledge none of the existing methods are guaranteed
to produce a unique and stable id.

Even if such a generator were developed, it would only apply to mails archived or imported
since the generator was enabled.
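
For illustration only, a generator whose output does not depend on the current time, the TZ or
any header parsing could hash the raw bytes of the mail together with the list it belongs to.
This is a hypothetical sketch, not PonyMail's code; the function name `stable_id` and the
`list_id` parameter are assumptions made for the example, and it only yields the same id if the
raw bytes in the mbox are byte-for-byte identical to what was originally archived.

```python
import hashlib


def stable_id(raw_message: bytes, list_id: str) -> str:
    """Derive an id from the raw mail bytes and the list id only.

    Hypothetical sketch: nothing here reads the clock, the TZ or
    parsed headers, so the same input always gives the same id.
    """
    digest = hashlib.sha256()
    # Length-prefix the list id so (list_id, raw_message) pairs cannot collide
    # by shifting bytes from one field into the other.
    list_bytes = list_id.encode("utf-8")
    digest.update(len(list_bytes).to_bytes(4, "big"))
    digest.update(list_bytes)
    digest.update(raw_message)
    return digest.hexdigest()
```

As noted above, ids produced this way would still differ from the ids already stored for mails
archived before such a generator was enabled.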

AFAIK, the only way to guarantee that a mail is not duplicated when re-importing from an mbox
file is to import only mails that have not already been imported; i.e. to import only mails
that are HTML-only.
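
A minimal sketch of such a filter, using only the Python standard library, is shown here. It is
not the actual import tooling: the function names and the definition of "HTML-only" (a text/html
body part and no non-attachment text/plain part) are assumptions and may not match how the
archiver originally decided to skip a mail.

```python
import mailbox
from email.message import Message


def is_html_only(msg: Message) -> bool:
    """Return True if the mail has a text/html part and no text/plain body part."""
    has_html = False
    for part in msg.walk():
        # Attachments are not body text, so they do not count either way.
        if part.get_content_disposition() == "attachment":
            continue
        content_type = part.get_content_type()
        if content_type == "text/plain":
            return False
        if content_type == "text/html":
            has_html = True
    return has_html


def html_only_messages(mbox_path: str):
    """Yield (Message-ID, message) for each HTML-only mail in an mbox file."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            if is_html_only(msg):
                yield msg.get("Message-ID"), msg
    finally:
        box.close()
```

Only the mails yielded by `html_only_messages` would then be passed to the importer, leaving
mails that were already archived untouched.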

> lists.a.o: import missing HTML-only mails
> -----------------------------------------
>                 Key: INFRA-16162
>                 URL:
>             Project: Infrastructure
>          Issue Type: Task
>          Components: Mail Archives
>            Reporter: Sebb
>            Priority: Major
> Further to INFRA-12085, HTML-only mails are missing from the lists.a.o archives.
> This ticket is about ensuring that the missing mails are made available on lists.a.o

This message was sent by Atlassian JIRA
