Design for Success, Not Failure: Error Handling
When writing code, optimize readability for the success case rather than the rare failure case. You might have seen this before: code littered with layers and layers of exception handling or defensive if statements, all in the name of resilience and fault tolerance. In my experience, this approach can make systems unpleasant to work in, since the core logic is buried under exception handling blocks and if guards. Too much exception handling can also insidiously hide real issues when they do occur: the original error may be so obscured that you lose visibility into what actually happened, and in a huge, complicated codebase your layers of exception handling may even start to depend on each other implicitly.
Only Handle Known Errors
Exception handling should be treated like performance optimization. When writing code, the common advice is to delay performance optimization until it is needed and to optimize for readability first. When optimization is needed, you take measurements to make sure it is worth doing. This is because optimizing for performance from the beginning can make your code hard to understand, which is often the biggest bottleneck for productivity, and you need to look at the system holistically to make sure you are optimizing the highest-impact thing. Similarly, you should delay exception handling until you absolutely need it, such as when you want to implement some graceful failure behaviour, and you should make sure you are only handling known errors, not speculatively handling unknown ones.
One of the pitfalls of handling unknown errors is that you can end up hiding stupid bugs like this:

    try:
        arary[0]  # typo: the variable is called array
    except Exception:
        print("Failed to index array")

The code that is trying to index an array is surrounded with an exception handler that catches everything. However, there is a typo in the variable name. Python will raise a NameError at runtime, which is caught by the exception handler, and the problem may never be known until it manifests indirectly in another part of the codebase. The poor developer trying to debug this may spend hours trying to figure out the source of the issue if the typo is deeply nested somewhere in the call stack. This has actually happened to me many times at work, where overzealous exception handling like this cost me several hours of my life.

Instead, what the author may have intended was to catch errors where an index might not have existed:

    try:
        arary[0]  # typo: the variable is called array
    except IndexError:
        print("Failed to index array")

This would have let the NameError propagate, which would have allowed me to identify the root cause a lot faster.
You want to be selective about which errors you catch. The only reason to try-catch a piece of code is if you are trying to catch a specific error that could potentially be thrown. This could be a request failure error that might occur when you're making a network request, or a parse error when you're parsing user input. You never want to be try-catching something that you don't understand or are not aware of since it will obfuscate the deeper issue at hand if those types of errors occur. Your users will suffer for it as the error may manifest itself indirectly in ways that are hard to understand and debug.
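Here is a minimal sketch of what being selective can look like in TypeScript. The names ParseError, parsePort and portOrDefault are made up for illustration and don't come from any library; the point is only that the catch block handles the one failure it knows about and rethrows everything else.

```typescript
// Hypothetical error type for the one failure we know how to handle.
class ParseError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ParseError";
  }
}

// Hypothetical parser: throws ParseError for input it knows is invalid.
function parsePort(input: string): number {
  const port = Number(input.trim());
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new ParseError(`Invalid port: ${input}`);
  }
  return port;
}

// Handles only the known ParseError; anything else propagates to the caller.
function portOrDefault(input: string, fallback: number): number {
  try {
    return parsePort(input);
  } catch (e) {
    if (e instanceof ParseError) {
      return fallback; // Known, expected failure: recover gracefully.
    }
    throw e; // Unknown error (likely a bug): do not hide it.
  }
}
```

If a caller accidentally passes null here, the resulting TypeError is not swallowed by the fallback logic, so the bug surfaces right where it happened.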
Use Root Handlers to Catch Unknowns or Knowns That You've Missed
Unknown errors can still crash your program, and you may have missed adding exception handling for some known errors. This is not ideal for software that needs to self-recover, such as a UI or a web server. What do you do with errors that are not covered by your existing error handlers? You should define a root error handler.
The root error handler gives you a centralized place to handle the error. Its scope is usually broad, as it's defined somewhere high up in the call stack where it can catch most errors, if not all of them. The root error handler can stop errors from propagating further if you don't want to completely crash your program, or if there is a way to gracefully recover from the error, such as showing a generic "Something went wrong" error message to the user in lieu of an intimidating stack trace.
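As a rough, framework-free sketch of the idea: the helper below wraps an entry point, reports whatever escapes it, and returns a generic fallback. The names withRootHandler and Reporter are hypothetical, and a real application would normally rely on the hook its framework provides rather than a hand-written wrapper like this.

```typescript
// Hypothetical callback that forwards errors to logging or monitoring.
type Reporter = (error: unknown) => void;

// Runs a task near the top of the call stack and catches whatever escapes it.
function withRootHandler<T>(task: () => T, fallback: T, report: Reporter): T {
  try {
    return task();
  } catch (e) {
    report(e); // Keep the original error visible to developers.
    return fallback; // Show the user something generic instead of a stack trace.
  }
}

// Usage: the task fails with an error nobody anticipated.
const reported: unknown[] = [];
const page = withRootHandler<string>(
  () => {
    throw new Error("database unreachable");
  },
  "Something went wrong",
  (e) => {
    reported.push(e);
  }
);
```

The important design choice is that the handler reports before it recovers, so the generic message never becomes the only trace of the failure.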
Root error handling manifests itself in many ways. For example, in React, you can define an ErrorBoundary component to handle errors thrown at render time by child components. This is pretty important to avoid the blank white screen of death that was pretty common in a lot of React apps before TypeScript was a thing, since it was pretty easy to crash the UI because of 'cannot read property of null/undefined' errors. You can place an ErrorBoundary component at the root of your React application to handle all the unknown errors and notify the user that something unexpected has occurred. In this component, you can also add instrumentation to log the errors to your monitoring system.
Another example of root error handling is in Node HTTP request handlers. Because of the asynchronous design of Node request handlers, you need to make sure you are explicitly closing the request even when an error is thrown, otherwise the HTTP client could be stuck hanging forever. For example, if one of the functions below throws an error before .end() is called, the HTTP request could hang indefinitely:
    http.createServer(async (req, res) => {
      const data = await getData();
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ data }));
    });
Most Node web frameworks wrap the request handler with a root error handler that either stops the server entirely (to avoid memory leaks) or responds with a 500:
    http.createServer(
      withErrorHandler(async (req, res) => {
        const data = await getData();
        res.writeHead(200, { "Content-Type": "application/json" });
        res.end(JSON.stringify({ data }));
      })
    );

    function withErrorHandler(requestHandler) {
      return async (req, res) => {
        try {
          // The handler is async, so it must be awaited. Without the await,
          // a rejected promise would bypass the catch block entirely.
          await requestHandler(req, res);
        } catch (e) {
          res.writeHead(500, { "Content-Type": "application/json" });
          res.end(JSON.stringify({ error: "Internal server error" }));
        }
      };
    }
So there are many types of root error handlers and software can have multiple root error handlers, usually at network and process boundaries.
Consider Modeling Error as Data
One way to handle errors is to stop treating them as exceptional. There are scenarios where certain errors are actually expected and are part of the happy path of your code. Parsing and validating user input is perhaps the classic example. Consider this function that parses user input:
    function parseUserInput(input: string): string;
When this function fails, the only way it can signal failure is by throwing an exception, since its signature has no room for an error. However, writing a try-catch every time you call it is arduous, and because nothing in the signature documents that it can fail, you can forget to do it: the compiler won't remind you. We also don't want parse failures propagating to the root error handler, since this error is non-exceptional and an expected part of handling user input.
Instead, we can just model error as data:
    type Ok<T> = [error: false, data: T];
    type Err = [error: true, data: undefined];
    type Result<T> = Err | Ok<T>;

    function parseUserInput(input: string): Result<string> {
      let result;
      try {
        result = parse(input);
      } catch (e) {
        if (e instanceof ParseError) {
          return [true, undefined];
        }
        throw e;
      }
      return [false, result];
    }
This is inspired by Go's error handling pattern of returning a tuple in which one field represents the error and the other the result. It is a tagged union: the caller has to check the first value of the tuple before TypeScript's type system can discriminate between the success and failure tuples.
    const [error, parsedInput] = parseUserInput("example input");
    if (error) {
      const x: undefined = parsedInput;
    } else {
      const x: string = parsedInput;
    }
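To make this more concrete, here is a small self-contained sketch that uses the same tuple shape for input validation with no exceptions involved at all. The functions parseAge and describeAge, and the 0 to 150 range, are assumptions I made up for the example.

```typescript
type Ok<T> = [error: false, data: T];
type Err = [error: true, data: undefined];
type Result<T> = Err | Ok<T>;

// Hypothetical validator: invalid input is an expected outcome, so it is
// returned as data instead of being thrown.
function parseAge(input: string): Result<number> {
  const trimmed = input.trim();
  const age = Number(trimmed);
  if (trimmed === "" || !Number.isInteger(age) || age < 0 || age > 150) {
    return [true, undefined];
  }
  return [false, age];
}

// The caller checks the error flag before it uses the value.
function describeAge(input: string): string {
  const [error, age] = parseAge(input);
  if (error) {
    return "Please enter a valid age.";
  }
  return `You are ${age} years old.`;
}
```

Both outcomes show up in the return type of parseAge, so a caller reading the signature knows there is a failure case to deal with.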
Treating error as data is especially valuable when a type system is involved, since it forces you to handle the error scenario: you can't forget to handle error cases without the compiler complaining about it. This is a common technique that comes from functional programming, as seen in languages like Haskell with its Either type and Scala with its Try class. In TypeScript, I've been using the neverthrow library, a package that provides the Result type. You can wrap functions that throw and turn them into functions that return error as data:
    const safeParse = Result.fromThrowable(parse, () => "Parse failed");
    const result = safeParse("example input");
    if (result.isErr()) {
      console.error(result.error);
    } else {
      console.log(result.value);
    }
It pretty much uses the tagged union I showed earlier with the tuples, but with better ergonomics and useful utility functions, such as the ok and err data constructors for the Result type and the ability to map the error in a Result so that you can transform it to a different format.
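To give a feel for what those utilities do, here is a tiny hand-rolled sketch of ok, err and an error-mapping helper. This is not neverthrow's implementation or its exact API (the library exposes these as methods on its own Result type); the object shape, the standalone mapErr function and parseQuantity are all assumptions for illustration.

```typescript
// Hand-rolled Result type for illustration only.
type Result<T, E> =
  | { kind: "ok"; value: T }
  | { kind: "err"; error: E };

// Data constructors for the two cases.
function ok<T>(value: T): Result<T, never> {
  return { kind: "ok", value };
}

function err<E>(error: E): Result<never, E> {
  return { kind: "err", error };
}

// Transforms the error into a different format and leaves successes untouched.
function mapErr<T, E, F>(result: Result<T, E>, fn: (error: E) => F): Result<T, F> {
  return result.kind === "ok" ? result : err(fn(result.error));
}

// Hypothetical parser that reports failure as data.
function parseQuantity(input: string): Result<number, string> {
  const n = Number(input);
  return Number.isInteger(n) && n > 0 ? ok(n) : err("not a positive integer");
}

// Usage: turn a plain message into a structured error for an HTTP layer.
const failed = mapErr(parseQuantity("abc"), (message) => ({ code: 400, message }));
const succeeded = mapErr(parseQuantity("3"), (message) => ({ code: 400, message }));
```

Mapping the error at the boundary lets the parser stay ignorant of HTTP status codes while the caller still gets an error in the format it needs.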
Takeaways
- Don't be overzealous with error handling - it hurts readability and makes the codebase a slog to work in.
- Only handle known errors and let unknown errors propagate.
- Root error handlers are useful for logging or gracefully recovering from unknown errors or known errors that you've missed adding error handling for.
- Consider treating non-exceptional errors as data using functional programming patterns such as Either, Try or Result.
- For most cases, just let the errors flow to the root handler.