This patch fixes some minor typos in comments and strings found by
codespell.
While at it, also expand some variable names that were not typos, but
caused false positives, and end up being more readable anyway.
When the SMTP courier gets an error on STARTTLS (either because the
command itself failed, or because there was a low-level TLS negotiation
error), today we just fail that attempt.
This can cause messages to never be delivered if the underlying reason
is a server misconfiguration (e.g. a server certificate that Go cannot
parse). This is quite rare in practice, but it can happen.
To prevent this situation, this patch adds logic so that the SMTP
courier retries over plaintext when STARTTLS fails.
This is still subject to security level checks, so this type of failures
cannot be used to downgrade connections to domains we successfully
established a TLS connection previously.
Note that certificate validation issues are NOT included in this
type of failure, so they will not trigger a retry. The certificate
validation handling is unchanged by this patch.
This patch adds tests for STS policy checks in combination with TLS
security levels.
This helps ensure we're detecting mis-matches of TLS status
(plain/insecure/secure) and STS policy enforcement.
When using STARTTLS, the SMTP courier needs to determine whether the
server certificates are valid or not.
Today, that's implemented via connecting once with full certificate
verification, and if that fails, reconnecting with verification
disabled.
This works okay in practice, but it is slower on insecure servers (due
to the reconnection), and some of them even complain because we connect
too frequently, causing delivery problems. The latter has only been
observed once, on the drv-berlin-brandenburg.de MX servers.
To improve on that situation, this patch makes the courier do the TLS
connection only once, and uses the verification results directly.
The behaviour of the server is otherwise unchanged. The only difference
is that when delivering mail to servers that have invalid certificates,
we now connect once instead of twice.
The tests are expanded to increase coverage for this particular case.
When resolving MX records, we need to distinguish between "no such
domain" and other kinds of errors. Before Go 1.13, this was not
possible, so we had a workaround that assumed any permanent error was a
"no such domain", which is not great, but functional.
Now that our minimum supported version is Go 1.15, we can remove the
workaround.
This patch replaces the workaround with proper logic using
DNSError.IsNotFound to identify NXDOMAIN results when resolving MX
records.
This requires to adjust a few tests, that used to work on environments
where resolving unknown domains (used for testing) returned a permanent
error, and now they no longer do so. Instead of relying on this
environmental property, we make the affected tests use our own DNS
server, which should make them more hermetic and reproducible.
The SMTP courier, which handles outgoing connections, uses the domain of
the envelope's from as the domain in the HELO/EHLO greeting.
This works fine in practice, but ideally the domain used in the greeting
should match the reverse DNS record. This used to be more relevant but
nowadays it is not really enforced; however, it sometimes comes up in
self checks, and might cause some confusion when troubleshooting.
So this patch makes it use the configured hostname instead, which is
under the users' control and more likely to be compliant. It also
simplifies the code.
The documentation of the hostname configuration option is also updated
to mention this behaviour.
Thanks to Jonas Seydel (thor77) for bringing this up.
In the SMTP courier, when the MX lookup fails with a DNS temporary
error, we should propagate that up and cause delivery to fail with a
temporary error too.
That way the delivery can be attempted later by the queue.
However, the code today doesn't make the distinction and will treat any
temporary DNS error as a permanent delivery failure, which is a bug.
The bug was found by ludusrusso and reported in
https://github.com/albertito/chasquid/issues/4
This patch fixes the bug by propagating the error properly.
This patch contains some minor code style improvements, to leave the
linter happier and generally follow best practices in some areas where
things snuck through.
In the upcoming Go release, logging from a finished testing.T triggers a
panic. In the courier tests, this is possible because we don't wait for
completion of fakeServer before ending the test.
This patch makes the tests wait for fakeServer to finish before exiting,
removing the race.
This commit brings back the experimental MTA-STS (Strict Transport
Security) implementation, removed in commit
7f5bedf4aa.
We will continue development in the "sts" branch, subject to rebase,
until it is ready to be integrated into "next" again.
One of the SMTP tests was doing an external DNS lookup for a
non-existing host, which is reasonably harmless but makes the test less
hermetic.
This patch changes the non-existing host for an invalid address, which
has the same effect but avoids the network lookup.
This is a hack; ideally we would be able to override the resolver, but
Go does not implement that yet, and changing the code is not worth the
additional complexity.
We have many places in our tests where we create temporary directories,
which we later remove (most of the time). We have at least 3 helpers to
do this, and various places where it's done ad-hoc (and the cleanup is
not always present).
To try to reduce the clutter, and make the tests more uniform and
readable, this patch introduces two helpers in a new "testutil" package:
one for creating and one for removing temporary directories.
These new functions are safer, better tested, and make the tests more
consistent. All the tests are updated to use them.
This commit removes the experimental MTA-STS (Strict Transport Security)
implementation for now, as it's not up to date with the latest draft.
Development will continue on the "sts" branch, but this way it won't
block releases until it is ready.
Commits reverted:
- cb6500b993
- 0eeb964534
- e66288e4b4
- 216cf47ffa
- d66b06de51
- fe00750e39
- 933ab54cd8
Currently, if we can't find a mail server for a domain, we consider that
a transient failure (semi-accidentally, as we iterate over the (empty)
list of MXs and fall through the list).
It should be treated as a permanent error, in line with other DNS
issues (which is not ideal but seems to be generally accepted
behaviour).
This patch extends the SMTP courier to (optionally) do STS policy
checking when delivering mail.
As STS support is currently experimental, we gate this behind a flag and
is disabled by default.
Currently, we pick the first host in the MX list, and attempt delivery
there. If it fails, we just report the failure to the queue, which will
wait for some time and then try again.
This is not ideal: we should fall back to the other MXs in the list, as
the first host could be having issues for a long time, and not
attempting with the rest just delays delivery.
This patch implements the fallback, so we try all MXs before deciding to
report a failed delivery (unless, of course, an MX returned a permanent
failure).
This patch introduces a new "domaininfo" package, which implements a
database with information about domains. In particular, it tracks
incoming and outgoing security levels.
That information is used in incoming and outgoing SMTP to prevent
downgrades.
Some servers, like postfix, will pay close attention to the domain we say as a
part of the EHLO.
By default, Go's smtp package will use "localhost", causing it to complain.
This patch fixes that by using the envelope-from's domain.
It's not clear if that's better than using what we are resolving to, but
that's much more involved so we're going to do this for now.
This patch makes the queue and couriers distinguish between permanent and
transient errors when delivering mail to individual recipients.
Pipe delivery errors are always permanent.
Procmail delivery errors are almost always permanent, except if the command
exited with code 75, which is an indication of transient.
SMTP delivery errors are almost always transient, except if the DNS resolution
for the domain failed.
This patch adds a new test which makes chasquid send and receive email to/from
Exim.
To make it work we need to add two testing flags to the SMTP courier, which is
not ideal but doesn't muddle the code much.
The test is not very portable, and assumes an exim binary is available, but
does not have to be installed in the system. It includes a helper script to
fetch one from the Debian archives.
This patch does various minor style and simplification cleanups, fixing things
detected by tools such as go vet, gofmt -s, and golint.
There are no functional changes, this change is purely cosmetic, but will
enable us to run those tools more regularly now that their output is clean.
This patch introduces the couriers, which the queue uses to deliver mail.
We have a local courier (using procmail), a remote courier (uses SMTP), and a
router courier that decides which of the two to use based on a list of local
domains.
There are still a few things pending, but they all have their basic
functionality working and tested.