Missing 'linear-white-space' between encoded-word in RFC2047 header #144156

@guewen

Bug report

Bug description:

import email.policy
from email.message import EmailMessage

message = EmailMessage(policy=email.policy.SMTP)
subject = "Re: [SOS-1495488] Commande et livraison - Demande de retour - bibijolie - 251210-AABBCC - Abo actualités digitales 20 semaines d’abonnement à 24 heures, Bilan, Tribune de Genève et tous les titres Tamedia"
message["Subject"] = subject
print(message.as_bytes())

Result:

Subject: Re: [SOS-1495488] Commande et livraison - Demande de retour -\r\n bibijolie - 251210-AABBCC - Abo =?utf-8?q?actualit=C3=A9s?= digitales 20\r\n semaines =?utf-8?q?d=E2=80=99abonnement_=C3=A0?= 24 heures, Bilan, Tribune de\r\n =?utf-8?q?_?==?utf-8?q?Gen=C3=A8ve?= et tous les titres Tamedia\r\n\r\n

When an email is sent with this subject, Yahoo rejects it with a soft bounce:

<xxxxx@yahoo.com>: host mta7.am0.yahoodns.net[67.195.228.111] said: 554
    Message not accepted. Invalid Subject header. See
    https://senders.yahooinc.com/smtp-error-codes#other-failures (in reply to
    end of DATA command)

In the last line of the folded subject, two encoded-words are concatenated with nothing between them (?==?), which does not comply with this part of RFC 2047:

Ordinary ASCII text and 'encoded-word's may appear together in the
same header field. However, an 'encoded-word' that appears in a
header field defined as '*text' MUST be separated from any adjacent
'encoded-word' or 'text' by 'linear-white-space'.
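The rule quoted above can be checked mechanically on a folded header line. The following is a minimal sketch, not a full RFC 2047 parser: the regular expression is a loose approximation of the encoded-word grammar, the helper name find_adjacent_encoded_words is made up for this illustration, and it only looks for two encoded-words touching each other (it does not check encoded-words that touch plain text).

```python
import re

# Loose approximation of an RFC 2047 encoded-word: =?charset?encoding?text?=
ENCODED_WORD = r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?="

# An encoded-word immediately followed by another one, with no whitespace between.
ADJACENT = re.compile(rf"{ENCODED_WORD}(?={ENCODED_WORD})")


def find_adjacent_encoded_words(header_value: str) -> list[str]:
    """Return each encoded-word that is directly followed by another encoded-word."""
    return [match.group(0) for match in ADJACENT.finditer(header_value)]


# Last folded line of the rejected subject (with "Re: ").
bad = " =?utf-8?q?_?==?utf-8?q?Gen=C3=A8ve?= et tous les titres Tamedia"
# Last folded line of the accepted subject (without "Re: ").
good = " =?utf-8?q?=C3=A8ve?= et tous les titres Tamedia"

print(find_adjacent_encoded_words(bad))   # ['=?utf-8?q?_?=']
print(find_adjacent_encoded_words(good))  # []
```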

Also, I'm not sure an encoded-word has to be created at all for =?utf-8?q?_?=, which encodes nothing but a space, but I'm a bit lost in the specs.
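To see what that short encoded-word carries, it can be decoded with email.header.decode_header from the standard library. This sketch only shows the decoded payload; it does not settle whether the folding code is required to emit the word.

```python
from email.header import decode_header

# decode_header returns a list of (payload, charset) pairs.
decoded = decode_header("=?utf-8?q?_?=")
payload, charset = decoded[0]

# In the "Q" encoding, an underscore stands for a space.
text = payload.decode(charset)
print(repr(text))  # ' '
```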

The behavior depends heavily on the length of the subject and on where the characters that need encoding fall: the same subject without the leading "Re: " is folded correctly:

Subject: [SOS-1495488] Commande et livraison - Demande de retour - bibijolie -\r\n 251210-AABBCC - Abo =?utf-8?q?actualit=C3=A9s_digitales_20_semaines_d?=\r\n =?utf-8?q?=E2=80=99abonnement_=C3=A0_24_heures=2C_Bilan=2C_Tribune_de_Gen?=\r\n =?utf-8?q?=C3=A8ve?= et tous les titres Tamedia\r\n\r\n
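The comparison between the two subjects can be scripted so that it is easy to rerun on other interpreter versions. This is a sketch under assumptions: the helper name has_adjacent_encoded_words is made up here, the search for the byte sequence ?==? is a heuristic rather than a real parse, and the printed values depend on the Python version in use, so no expected output is given.

```python
import email.policy
from email.message import EmailMessage

BASE = (
    "[SOS-1495488] Commande et livraison - Demande de retour - bibijolie - "
    "251210-AABBCC - Abo actualités digitales 20 semaines d’abonnement à "
    "24 heures, Bilan, Tribune de Genève et tous les titres Tamedia"
)


def has_adjacent_encoded_words(subject: str) -> bool:
    """Serialize a message with this subject and look for '?==?' in the bytes."""
    message = EmailMessage(policy=email.policy.SMTP)
    message["Subject"] = subject
    return b"?==?" in message.as_bytes()


# Compare the subject with and without the "Re: " prefix.
for prefix in ("", "Re: "):
    print(repr(prefix), has_adjacent_encoded_words(prefix + BASE))
```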

I could reproduce this behavior on 3.13.x, 3.14.2, 3.15-dev

Apologies if there already is an issue for this, I found several issues on the topic, but could not find a matching one.

CPython versions tested on:

CPython main branch, 3.15

Operating systems tested on:

Linux

Labels: stdlib, topic-email, type-bug