On NixOS, `/bin` is basically empty and this causes the courier tests
(which invoke `/bin/sleep`) to fail.
This patch fixes the tests by removing the hardcoded path.
https://github.com/albertito/chasquid/pull/73
Amended-by: Alberto Bertogli <albertito@blitiri.com.ar>
Adjusted commit message.
This patch implements "via" aliases, which let us explicitly select a
server to use for delivery.
This feature is useful in different scenarios, such as a secondary MX
server that forwards all incoming email to a primary.
For now, it is experimental and the syntax and semantics are subject to
change.
This patch fixes some minor typos in comments and strings found by
codespell.
While at it, also expand some variable names that were not typos, but
caused false positives, and end up being more readable anyway.
The RFCs are very clear that in DATA contents:
> CR and LF MUST only occur together as CRLF; they MUST NOT appear
> independently in the body.
https://www.rfc-editor.org/rfc/rfc5322#section-2.3https://www.rfc-editor.org/rfc/rfc5321#section-2.3.8
Allowing "independent" CR and LF can cause a number of problems.
In particular, there is a new "SMTP smuggling attack" published recently
that involves the server incorrectly parsing the end of DATA marker
`\r\n.\r\n`, which an attacker can exploit to impersonate a server when
email is transmitted server-to-server.
https://www.postfix.org/smtp-smuggling.htmlhttps://sec-consult.com/blog/detail/smtp-smuggling-spoofing-e-mails-worldwide/
Currently, chasquid is vulnerable to this attack, because Go's standard
libraries net/textproto and net/mail do not enforce CRLF strictly.
This patch fixes the problem by introducing a new "dot reader" function
that strictly enforces CRLF when reading dot-terminated data, used in
the DATA input processing.
When an invalid newline terminator is found, the connection is aborted
immediately because we cannot safely recover from that state.
We still keep the internal representation as LF-terminated for
convenience and simplicity.
However, the MDA courier is changed to pass CRLF-terminated lines, since
that is an external program which could be strict when receiving email
messages.
See https://github.com/albertito/chasquid/issues/47 for more details and
discussion.
When the SMTP courier gets an error on STARTTLS (either because the
command itself failed, or because there was a low-level TLS negotiation
error), today we just fail that attempt.
This can cause messages to never be delivered if the underlying reason
is a server misconfiguration (e.g. a server certificate that Go cannot
parse). This is quite rare in practice, but it can happen.
To prevent this situation, this patch adds logic so that the SMTP
courier retries over plaintext when STARTTLS fails.
This is still subject to security level checks, so this type of failures
cannot be used to downgrade connections to domains we successfully
established a TLS connection previously.
Note that certificate validation issues are NOT included in this
type of failure, so they will not trigger a retry. The certificate
validation handling is unchanged by this patch.
This patch changes several internal packages to receive and pass tracing
annotations, making use of the new tracing library, so we can have
better debugging information.
ioutil package was deprecated in Go 1.16, replace all uses with their
respective replacements.
This patch was generated with a combination of `gofmt -r`, `eg`, and
manually (for `ioutil.ReadDir`).
This patch is the result of running Go 1.19's `gofmt` on the codebase,
which automatically updates all Go doc comments to the new format.
https://tip.golang.org/doc/go1.19#go-doc
We've accumulated a few linter issues around comments and a couple of
variable names.
While none of them is major, this patch cleans them up so it's easier to
go through the linter output, and we can start being more strict about
it.
This patch adds tests for STS policy checks in combination with TLS
security levels.
This helps ensure we're detecting mis-matches of TLS status
(plain/insecure/secure) and STS policy enforcement.
When using STARTTLS, the SMTP courier needs to determine whether the
server certificates are valid or not.
Today, that's implemented via connecting once with full certificate
verification, and if that fails, reconnecting with verification
disabled.
This works okay in practice, but it is slower on insecure servers (due
to the reconnection), and some of them even complain because we connect
too frequently, causing delivery problems. The latter has only been
observed once, on the drv-berlin-brandenburg.de MX servers.
To improve on that situation, this patch makes the courier do the TLS
connection only once, and uses the verification results directly.
The behaviour of the server is otherwise unchanged. The only difference
is that when delivering mail to servers that have invalid certificates,
we now connect once instead of twice.
The tests are expanded to increase coverage for this particular case.
When resolving MX records, we need to distinguish between "no such
domain" and other kinds of errors. Before Go 1.13, this was not
possible, so we had a workaround that assumed any permanent error was a
"no such domain", which is not great, but functional.
Now that our minimum supported version is Go 1.15, we can remove the
workaround.
This patch replaces the workaround with proper logic using
DNSError.IsNotFound to identify NXDOMAIN results when resolving MX
records.
This requires to adjust a few tests, that used to work on environments
where resolving unknown domains (used for testing) returned a permanent
error, and now they no longer do so. Instead of relying on this
environmental property, we make the affected tests use our own DNS
server, which should make them more hermetic and reproducible.
This patch renames courier.Procmail to courier.MDA, to make it more
obvious that the functionality is not tied to that particular MDA.
It's just for readability, there are no functional changes.
This patch makes chasquid's monitoring server expose an OpenMetrics
metrics endpoint.
It adds a new package "expvarom" which implements an HTTP handler that
exports expvar variables in the OpenMetrics text format.
Then, the handler is registered by the monitoring server at /metrics
(where most things expect it to be).
The existing exported variables are also extended with descriptions,
which is optional, but improves the readability of the metrics.
The SMTP courier, which handles outgoing connections, uses the domain of
the envelope's from as the domain in the HELO/EHLO greeting.
This works fine in practice, but ideally the domain used in the greeting
should match the reverse DNS record. This used to be more relevant but
nowadays it is not really enforced; however, it sometimes comes up in
self checks, and might cause some confusion when troubleshooting.
So this patch makes it use the configured hostname instead, which is
under the users' control and more likely to be compliant. It also
simplifies the code.
The documentation of the hostname configuration option is also updated
to mention this behaviour.
Thanks to Jonas Seydel (thor77) for bringing this up.
The linter complains that we're not checking for errors, but on some
cases it's on code paths were it is reasonable to do so (e.g. we're
closing the connection and it's a best-effort write).
This patch adjusts the code to make those cases explicit.
In the SMTP courier, when the MX lookup fails with a DNS temporary
error, we should propagate that up and cause delivery to fail with a
temporary error too.
That way the delivery can be attempted later by the queue.
However, the code today doesn't make the distinction and will treat any
temporary DNS error as a permanent delivery failure, which is a bug.
The bug was found by ludusrusso and reported in
https://github.com/albertito/chasquid/issues/4
This patch fixes the bug by propagating the error properly.
In https://github.com/albertito/chasquid/issues/4, ludusrusso comments
that the DNS lookup error handling in the SMTP courier could be improved
by using DNSError.IsNotFound.
That is true, but unfortunately it was only added in Go 1.13, and we are
currently trying to support Go 1.11 since that's what in Debian stable.
So this patch updates the TODO to note this, so that when we can use a
newer Go version, we improve this code.
There are a few context.WithDeadline calls that can be simplified by
using context.WithTimeout.
At the time they were added, WithTimeout was too new so we didn't want
to depend on it. But now that the minimum Go version has been raised to
1.9, we can simplify the calls.
This patch does that simplification, which is purely mechanical, and
does not change the logic itself.
This patch contains some minor code style improvements, to leave the
linter happier and generally follow best practices in some areas where
things snuck through.
In the upcoming Go release, logging from a finished testing.T triggers a
panic. In the courier tests, this is possible because we don't wait for
completion of fakeServer before ending the test.
This patch makes the tests wait for fakeServer to finish before exiting,
removing the race.
MTA-STS has been published as RFC 8461, with no major changes since the
last draft we updated (-18).
This patch updates the documentation accordingly (no code changes).
Instead of pre-filtering the MX list based on STS policy, just check
if it's allowed before each attempt, and skip it if not.
This simplifies the code.
This patch updates the STS implementation from draft version 02 to 18.
The main changes are:
- Policy is now in an ad-hoc format instead of JSON (😒).
- Minor policy well-known URL change (now ends in ".txt").
- Enforce HTTP media type == text/plain, as with the ad-hoc format this
becomes much more important.
- Simplify wildcard mx matching (same algorithm), extend test cases.
- Valid modes are "enforce" (as before), "testing" (replaces "report"),
and "none" (new).
This commit brings back the experimental MTA-STS (Strict Transport
Security) implementation, removed in commit
7f5bedf4aa.
We will continue development in the "sts" branch, subject to rebase,
until it is ready to be integrated into "next" again.
This patch adds a missing docstrings for exported identifiers, and
adjust some of the existing ones to match the standard style.
In some cases, the identifiers were un-exported after noticing they had
no external users.
Besides improving documentation, it also reduces the linter noise
significantly.
One of the SMTP tests was doing an external DNS lookup for a
non-existing host, which is reasonably harmless but makes the test less
hermetic.
This patch changes the non-existing host for an invalid address, which
has the same effect but avoids the network lookup.
This is a hack; ideally we would be able to override the resolver, but
Go does not implement that yet, and changing the code is not worth the
additional complexity.
We have many places in our tests where we create temporary directories,
which we later remove (most of the time). We have at least 3 helpers to
do this, and various places where it's done ad-hoc (and the cleanup is
not always present).
To try to reduce the clutter, and make the tests more uniform and
readable, this patch introduces two helpers in a new "testutil" package:
one for creating and one for removing temporary directories.
These new functions are safer, better tested, and make the tests more
consistent. All the tests are updated to use them.
The outgoing security level checks are not being performed, because of a
bug: the courier thinks the "to"'s domain is always empty.
This patch fixes the bug by simplifying the logic, as there's no need
for the conditional (there is always a domain in the "to" address if it
got to the SMTP courier).
This commit removes the experimental MTA-STS (Strict Transport Security)
implementation for now, as it's not up to date with the latest draft.
Development will continue on the "sts" branch, but this way it won't
block releases until it is ready.
Commits reverted:
- cb6500b993
- 0eeb964534
- e66288e4b4
- 216cf47ffa
- d66b06de51
- fe00750e39
- 933ab54cd8
The current default is "procmail" for historical reasons, but the
program has been unmaintained for years and its use is not generally
advisable.
This patch changes the default MDA binary to "maildrop", which is a more
modern and reasonable MDA to use.
We keep the courier.Procmail name for now, as that's internal, but it
may be changed later. Its documentation is updated to note that the
name is just for reference but it actually works with almost any binary.
Currently, if we can't find a mail server for a domain, we consider that
a transient failure (semi-accidentally, as we iterate over the (empty)
list of MXs and fall through the list).
It should be treated as a permanent error, in line with other DNS
issues (which is not ideal but seems to be generally accepted
behaviour).
This patch extends the SMTP courier to (optionally) do STS policy
checking when delivering mail.
As STS support is currently experimental, we gate this behind a flag and
is disabled by default.
Currently, we pick the first host in the MX list, and attempt delivery
there. If it fails, we just report the failure to the queue, which will
wait for some time and then try again.
This is not ideal: we should fall back to the other MXs in the list, as
the first host could be having issues for a long time, and not
attempting with the rest just delays delivery.
This patch implements the fallback, so we try all MXs before deciding to
report a failed delivery (unless, of course, an MX returned a permanent
failure).
When building the Debian package, the tests are run in such a way that
test/util/exitcode is not found, which causes them to fail.
This patch is a workaround which skips the test if test/util/exitcode is
not found; for now we should consider it temporary, until either the
Debian packaging is fixed, or we decide that its environment is
reasonably enough to make it permanent.
:
glog works fine and has great features, but it does not play along well
with systemd or standard log rotators (as it does the rotation itself).
So this patch replaces glog with a new logging module "log", which by
default logs to stderr, in a systemd-friendly manner.
Logging to files or syslog is still supported.
This patch introduces a new "domaininfo" package, which implements a
database with information about domains. In particular, it tracks
incoming and outgoing security levels.
That information is used in incoming and outgoing SMTP to prevent
downgrades.
The logic that picks the domain we use for EHLO does not need to live
within the TLS retry cycle, and makes it harder to understand.
This patch moves it out of the way, to improve readability.
This patch introduces expvar counters to chasquid and the queue
packages.
For now there's only a handful of counters, but they will be expanded in
future patches.
The SMTP courier was not properly closing the connection, and chasquid's
closing of incoming connections was not ideal (it was closing the
underlying one, not necessarily the active one, like in the case of a jump
to TLS).
This patch fixes both by adding the missing calls to Close.