The RFCs are very clear that in DATA contents:
> CR and LF MUST only occur together as CRLF; they MUST NOT appear
> independently in the body.
https://www.rfc-editor.org/rfc/rfc5322#section-2.3
https://www.rfc-editor.org/rfc/rfc5321#section-2.3.8
Allowing "independent" CR and LF can cause a number of problems.
In particular, there is a new "SMTP smuggling attack" published recently
that involves the server incorrectly parsing the end of DATA marker
`\r\n.\r\n`, which an attacker can exploit to impersonate a server when
email is transmitted server-to-server.
https://www.postfix.org/smtp-smuggling.html
https://sec-consult.com/blog/detail/smtp-smuggling-spoofing-e-mails-worldwide/
Currently, chasquid is vulnerable to this attack, because Go's standard
libraries net/textproto and net/mail do not enforce CRLF strictly.
This patch fixes the problem by introducing a new "dot reader" function
that strictly enforces CRLF when reading dot-terminated data, used in
the DATA input processing.
When an invalid newline terminator is found, the connection is aborted
immediately because we cannot safely recover from that state.
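
To illustrate the idea, here is a minimal sketch of a strict dot reader. It
is not the code from this patch: the names readUntilDot, parseData and
errInvalidLineEnding are hypothetical, and the sketch does not bound the size
of the data it accumulates.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"io"
	"strings"
)

// errInvalidLineEnding reports a CR or LF that is not part of a CRLF pair.
// (Hypothetical name, for illustration only.)
var errInvalidLineEnding = errors.New("invalid line ending")

// readUntilDot reads dot-terminated data from r, accepting only CRLF as the
// line terminator. It returns the content with dot-stuffing undone and each
// line terminated by a single LF.
// This sketch does not limit the size of the result.
func readUntilDot(r *bufio.Reader) ([]byte, error) {
	var out bytes.Buffer
	for {
		line, err := r.ReadBytes('\n')
		if err == io.EOF {
			// The input ended before the ".\r\n" terminator.
			return nil, io.ErrUnexpectedEOF
		}
		if err != nil {
			return nil, err
		}

		// The LF must be immediately preceded by a CR.
		n := len(line)
		if n < 2 || line[n-2] != '\r' {
			return nil, errInvalidLineEnding
		}

		// No other CR may appear in the line.
		body := line[:n-2]
		if bytes.IndexByte(body, '\r') >= 0 {
			return nil, errInvalidLineEnding
		}

		// A line holding a single dot ends the data.
		if len(body) == 1 && body[0] == '.' {
			return out.Bytes(), nil
		}

		// Undo dot-stuffing: drop one leading dot.
		if len(body) > 0 && body[0] == '.' {
			body = body[1:]
		}
		out.Write(body)
		out.WriteByte('\n')
	}
}

// parseData is a convenience wrapper to try the reader on a string.
func parseData(s string) (string, error) {
	b, err := readUntilDot(bufio.NewReader(strings.NewReader(s)))
	return string(b), err
}
```

With this sketch, parseData("hello\r\n.\r\n") returns "hello\n", while an
input such as "hello\n.\r\n" is rejected with errInvalidLineEnding instead of
being treated as the end of DATA. A caller would close the connection on
that error.
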
We still keep the internal representation as LF-terminated for
convenience and simplicity.
However, the MDA courier is changed to pass CRLF-terminated lines, since
that is an external program which could be strict when receiving email
messages.
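
The conversion on the way out to the MDA can be very small. A sketch,
assuming the internal representation contains bare LF only (toCRLF is a
hypothetical name, not necessarily what the courier uses):

```go
package main

import "bytes"

// toCRLF converts LF-terminated lines to CRLF-terminated ones.
// It assumes the input uses bare LF only (the internal representation), so
// it does not try to detect CRLF pairs that are already present.
func toCRLF(in []byte) []byte {
	return bytes.ReplaceAll(in, []byte("\n"), []byte("\r\n"))
}
```
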
See https://github.com/albertito/chasquid/issues/47 for more details and
discussion.
Some utilities might want to access the EHLO/HELO domain in the
post-data hook (for example, to do additional SPF validations).
This patch implements that support, including sanitizing the EHLO domain
in the environment variable to reduce the risk of problems.
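
As an illustration of the kind of sanitizing meant here, the sketch below
keeps only a conservative set of characters. The allow-list, the function
names and the EHLO_DOMAIN variable name are assumptions made for this
example, not necessarily what the patch uses.

```go
package main

import "strings"

// sanitizeEHLODomain drops every character outside a conservative
// allow-list (ASCII letters, digits and "-_.:[]"), so the value cannot carry
// newlines, spaces or shell metacharacters into the hook's environment.
func sanitizeEHLODomain(s string) string {
	return strings.Map(func(c rune) rune {
		switch {
		case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9':
			return c
		case strings.ContainsRune("-_.:[]", c):
			return c
		}
		return -1 // A negative value tells strings.Map to drop the rune.
	}, s)
}

// hookEnv builds the environment entry passed to the hook.
// The variable name EHLO_DOMAIN is an assumption made for this example.
func hookEnv(ehloDomain string) []string {
	return []string{"EHLO_DOMAIN=" + sanitizeEHLODomain(ehloDomain)}
}
```

Dropping characters (rather than rejecting the connection) keeps the hook
usable even when a client sends an odd EHLO argument.
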
When the DATA input is too large, we should keep on reading through it
until we reach the end marker, otherwise there is a security problem:
the remaining data will be interpreted as SMTP commands, so for example
a forwarded message that is too long might end up executing SMTP
commands under an authenticated user.
This patch implements this behaviour, while being careful not to consume
extra memory to avoid opening up the possibility of a DoS.
Note the equivalent logic for single long lines is already implemented.
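
A rough sketch of the approach, assuming a bufio.Reader on the connection:
data is read in buffer-sized chunks and dropped, so memory use stays bounded
by the buffer size. The function names are hypothetical, and line-ending
validation is left out for brevity.

```go
package main

import (
	"bufio"
	"io"
	"strings"
)

// discardUntilDot reads and drops data from r until it has consumed the
// ".\r\n" end-of-data line. Memory use is bounded by r's buffer size, even
// for very long lines, because nothing is accumulated.
func discardUntilDot(r *bufio.Reader) error {
	atLineStart := true
	for {
		chunk, err := r.ReadSlice('\n')
		if err == bufio.ErrBufferFull {
			// The line is longer than the buffer: drop this fragment and
			// remember that the next chunk continues the same line.
			atLineStart = false
			continue
		}
		if err == io.EOF {
			// The input ended before the end-of-data line.
			return io.ErrUnexpectedEOF
		}
		if err != nil {
			return err
		}
		if atLineStart && string(chunk) == ".\r\n" {
			return nil
		}
		// The chunk ended with LF, so the next one starts a new line.
		atLineStart = true
	}
}

// discardAndRest is a helper to try the function on a string: it discards
// up to the end marker and returns whatever follows it.
func discardAndRest(s string, bufSize int) (string, error) {
	r := bufio.NewReaderSize(strings.NewReader(s), bufSize)
	if err := discardUntilDot(r); err != nil {
		return "", err
	}
	rest, err := io.ReadAll(r)
	return string(rest), err
}
```

After discardUntilDot returns, the reader is positioned right after the end
marker, so the next bytes are parsed as SMTP commands only once the oversized
message is fully consumed.
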
Despite its loose appearance, the "Received" header has a reasonably
standardized format.
We were not following the standard format as closely as we should; this
rarely causes problems in this particular case, but there's no need to
deviate from it.
This patch changes the Received header generation as follows:
- The "from" section now uses the remote address as canonical (for
non-authenticated users) which provides more valuable information
than the user-supplied EHLO address (which is also included).
- The remote authenticated user is now hidden, for additional privacy.
- Use the "with" optional clause.
- Use the standard way of printing TLS cipher suite.
- Use the standard way of printing address literals.
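
For reference, the last two items could look roughly like the sketch below.
It assumes that "standard" means the RFC 5321 (section 4.1.3) address literal
syntax and the registered cipher suite name as returned by Go's crypto/tls;
the function names are hypothetical.

```go
package main

import (
	"crypto/tls"
	"net"
)

// addressLiteral formats ip as an RFC 5321 address literal:
// "[192.0.2.1]" for IPv4 and "[IPv6:2001:db8::1]" for IPv6.
// IPv4-mapped IPv6 addresses are printed in their IPv4 form.
func addressLiteral(ip net.IP) string {
	if ip4 := ip.To4(); ip4 != nil {
		return "[" + ip4.String() + "]"
	}
	return "[IPv6:" + ip.String() + "]"
}

// literalFromString parses s as an IP address and formats it as an address
// literal. It returns "" if s is not a valid IP address.
func literalFromString(s string) string {
	ip := net.ParseIP(s)
	if ip == nil {
		return ""
	}
	return addressLiteral(ip)
}

// cipherSuiteName returns the registered name for a TLS cipher suite ID,
// for example the value found in tls.ConnectionState.CipherSuite.
func cipherSuiteName(id uint16) string {
	return tls.CipherSuiteName(id)
}
```
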
We have many places in our tests where we create temporary directories,
which we later remove (most of the time). We have at least 3 helpers to
do this, and various places where it's done ad-hoc (and the cleanup is
not always present).
To try to reduce the clutter, and make the tests more uniform and
readable, this patch introduces two helpers in a new "testutil" package:
one for creating and one for removing temporary directories.
These new functions are safer, better tested, and make the tests more
consistent. All the tests are updated to use them.
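
A sketch of what such a pair of helpers might look like. The names, the
directory prefix, and the specific safety measures (only removing directories
with the expected prefix, and keeping the directory when the test failed) are
assumptions made for this example.

```go
package main

import (
	"errors"
	"os"
	"path/filepath"
	"strings"
	"testing"
)

// dirPrefix marks directories created by these helpers (hypothetical value).
const dirPrefix = "testutil_"

// MustTempDir creates a temporary directory, failing the test on error.
func MustTempDir(t testing.TB) string {
	t.Helper()
	dir, err := newTempDir()
	if err != nil {
		t.Fatalf("creating temporary directory: %v", err)
	}
	return dir
}

// RemoveIfOk removes dir unless the test has failed, so the files of a
// failing test stay around for debugging.
func RemoveIfOk(t testing.TB, dir string) {
	t.Helper()
	if err := removeIfOk(t.Failed(), dir); err != nil {
		t.Fatalf("removing %q: %v", dir, err)
	}
}

func newTempDir() (string, error) {
	return os.MkdirTemp("", dirPrefix)
}

func removeIfOk(failed bool, dir string) error {
	// Safety net: only remove directories that look like ours.
	if !strings.HasPrefix(filepath.Base(dir), dirPrefix) {
		return errors.New("refusing to remove directory without the expected prefix")
	}
	if failed {
		return nil
	}
	return os.RemoveAll(dir)
}
```

A test would then typically do `dir := MustTempDir(t)` followed by
`defer RemoveIfOk(t, dir)`.
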
This patch implements a post-DATA hook, which is run after receiving the
data but before sending a reply.
It can be used to implement content filtering when receiving email, for
example by passing the email through an anti-spam or anti-virus filter.
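
A minimal sketch of how such a hook could be run, assuming it is an external
program that receives the message on standard input. The function name, the
timeout value and the way the result is interpreted are assumptions made for
this example, not the actual interface.

```go
package main

import (
	"bytes"
	"context"
	"os/exec"
	"time"
)

// hookTimeout bounds how long the hook may run (arbitrary value for this
// sketch).
const hookTimeout = 30 * time.Second

// runPostDataHook runs the program at path with the message on its standard
// input, and returns what it wrote to standard output. A non-nil error
// (non-zero exit status, timeout, or failure to start) means the hook did
// not approve the message.
func runPostDataHook(path string, data []byte, env []string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), hookTimeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, path)
	cmd.Stdin = bytes.NewReader(data)
	cmd.Env = env // nil makes the hook inherit the server's environment.
	return cmd.Output()
}
```

The caller would run this after the DATA contents are read and before
replying to the client, and pick the SMTP reply based on the returned error.
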
This patch moves chasquid's Server and Conn structures to their own
smtpsrv package, to make chasquid.go a bit more readable. It also helps
clarify the relation between Server and Conn.
There are no functional changes.
Note that git can still track the history across this commit (e.g. git
gui blame shows the right data).