"There is no failure except in no longer trying." – Elbert Hubbard
In modern web applications, handling network requests, database operations, and other async tasks reliably is crucial. Yet these operations often fail due to temporary issues like network hiccups or service timeouts. While we could manually wrap each operation in try-catch blocks with custom retry logic, this quickly becomes repetitive and error-prone. That's where a robust retry mechanism comes in handy.
In this article, we'll build a library that makes handling retries simple yet flexible. We want a simple API that wraps async functions and handles retries automatically. The library should support configurable retry attempts, delays, and timeouts while implementing exponential backoff with jitter to prevent overwhelming services.
API Design
The library exposes a single withRetry function. It takes an asynchronous function as its first argument and an optional configuration object as its second.
const result = await withRetry(asyncFn, { ... });
The withRetry function is defined as follows:
async function withRetry<T>(
  fn: () => Promise<T>,
  config: Partial<RetryConfig> = {},
): Promise<T> { ... }
The configuration for the withRetry function is defined as follows:
type RetryConfig = {
  maxAttempts: number;
  delay: number;
  maxDelay: number;
  backoffStrategy: RetryConfigBackoffStrategy;
  jitter: RetryConfigJitter;
  retryCondition: (error: Error) => boolean;
  onRetry: (error: Error, attempt: number) => void;
  onExhausted: (error: Error) => void;
  timeout: number;
};
The different options are:
- maxAttempts: the maximum number of attempts before we return a failure.
- delay: the duration (in ms) to wait between subsequent retries.
- maxDelay: a cap on how long to delay retries. This can be useful when using some backoff strategies to avoid delays growing past a reasonable point.
- backoffStrategy: the strategy used to calculate the delay between retries.
- jitter: a rate (between 0 and 1) that controls how much randomness is added to the delay. This is particularly useful to avoid "thundering herd" problems, where many callers wake up and retry at the same time (in this case, after the same delay).
- retryCondition: a function that, given the last error, decides whether we should continue retrying (up to maxAttempts).
- onRetry: a callback that runs on every retry.
- onExhausted: a callback that runs once maxAttempts is reached or retryCondition returns false.
- timeout: the duration (in ms) to wait for each attempt before marking it as failed. A value of 0 disables the timeout.
Backoff strategy
The backoff strategy is further defined as:
type RetryConfigBackoffStrategy =
  | "constant"
  | "linear"
  | "exponential"
  | { type: "linear"; factor: number }
  | { type: "exponential"; factor: number }
  | ((attempt: number, delay: number) => number);
A constant backoff strategy means that the delay between retries is always the same.
A linear backoff strategy multiplies the delay by the attempt number, so the wait grows linearly with each retry. If a factor is given, the delay is additionally multiplied by that factor, which makes it grow faster.
An exponential backoff strategy multiplies the delay by 2 to the power of the attempt number. If a factor is given, the implementation later in this article uses it as the base of the exponent in place of 2, so a larger factor makes the delay grow faster.
Finally, the backoff strategy also allows for custom implementations by accepting a function that takes the attempt number and the base delay and returns the delay to use.
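To make the three built-in strategies concrete, here is a small sketch that prints the delays each one would produce for the first four attempts with a base delay of 100 ms. The `backoffDelay` and `schedule` helpers are hypothetical and exist only for this illustration. Attempts are assumed to be numbered from 1, and `maxDelay` and jitter are ignored.

```typescript
type Strategy = "constant" | "linear" | "exponential";

// Hypothetical helper: delay (in ms) to wait after attempt number `attempt`
// (1-based), without any cap or jitter applied.
function backoffDelay(strategy: Strategy, attempt: number, delay: number): number {
  switch (strategy) {
    case "constant":
      return delay;
    case "linear":
      return delay * attempt;
    case "exponential":
      return delay * 2 ** attempt;
  }
}

// Delays for attempts 1..attempts.
function schedule(strategy: Strategy, attempts: number, delay: number): number[] {
  return Array.from({ length: attempts }, (_, i) => backoffDelay(strategy, i + 1, delay));
}

console.log(schedule("constant", 4, 100).join(", "));    // 100, 100, 100, 100
console.log(schedule("linear", 4, 100).join(", "));      // 100, 200, 300, 400
console.log(schedule("exponential", 4, 100).join(", ")); // 200, 400, 800, 1600
```

Note that with 1-based attempts the exponential strategy already doubles the base delay on the first retry, and it passes a 1000 ms cap by the fourth one.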
Jitter
The jitter is further defined as:
type RetryConfigJitter =
  | boolean
  | number
  | ((attempt: number, delay: number) => number);
Setting jitter to true applies a random rate (between 0 and 1) on every retry, whereas setting it to a number between 0 and 1 uses that fixed value as the rate applied to each delay. As with the backoff strategy, it can also be set to a function for custom implementations; the function takes the attempt number and the delay and returns the rate.
Default values
By default, we set the configuration to the following values:
const DEFAULT_CONFIG: RetryConfig = {
  maxAttempts: 3,
  delay: 100,
  maxDelay: 1000,
  backoffStrategy: "constant",
  jitter: true,
  retryCondition: () => true,
  onRetry: () => {},
  onExhausted: () => {},
  timeout: 0,
};
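Since callers pass a `Partial<RetryConfig>`, the defaults have to be merged with the user-supplied options before the retry loop can rely on every field being present. One way to do that is an object spread, sketched here with a trimmed-down config type and a hypothetical `resolveConfig` helper (neither is taken from the published library):

```typescript
// Trimmed-down config for illustration; the article's RetryConfig has more fields.
type MiniRetryConfig = {
  maxAttempts: number;
  delay: number;
  maxDelay: number;
  timeout: number;
};

const MINI_DEFAULTS: MiniRetryConfig = {
  maxAttempts: 3,
  delay: 100,
  maxDelay: 1000,
  timeout: 0,
};

// Hypothetical helper: later spreads win, so user options override defaults.
function resolveConfig(config: Partial<MiniRetryConfig> = {}): MiniRetryConfig {
  return { ...MINI_DEFAULTS, ...config };
}

// maxAttempts becomes 5; the other fields keep their default values.
console.log(resolveConfig({ maxAttempts: 5 }));
```

One caveat with this approach: a key that is explicitly set to `undefined` still overrides the default at runtime, so callers should omit options they do not want to set.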
Implementation
Let's dive into the implementation of this library!
At a high level, we'd need:
- a main retry loop that calls the asynchronous function passed as argument, and keeps retrying based on configuration
- a delay calculation function that returns how long the library should wait between subsequent retries
- some logic to also handle timeouts (if configured)
Main retry loop
The main retry loop is defined as an infinite while loop. This gives us more flexibility over the loop's stop conditions.
async function withRetry<T>(
  fn: () => Promise<T>,
  config: Partial<RetryConfig> = {},
): Promise<T> {
  ...
  while (true) {
    try {
      return await fn();
    } catch (error) {
      ...
    }
  }
}
Here, we return whatever the function returns if it's successful. If it fails, the while loop will continue retrying that function.
The stopping conditions for our retry loop are:
- maxAttempts has been reached, or
- retryCondition returns false
We can check for those conditions in the catch block as follows:
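async function withRetry<T>(
  fn: () => Promise<T>,
  config: Partial<RetryConfig> = {},
): Promise<T> {
  // Fill in any option the caller did not provide.
  const { maxAttempts, retryCondition, onExhausted } = {
    ...DEFAULT_CONFIG,
    ...config,
  };
  let attempt = 0;
  while (true) {
    attempt += 1;
    try {
      return await fn();
    } catch (error) {
      // Stop when the attempts are used up or the error is not retryable.
      if (attempt >= maxAttempts || !retryCondition(error as Error)) {
        onExhausted(error as Error);
        throw error;
      }
    }
  }
}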
We keep the number of attempts in a local attempt variable that is incremented at the start of every iteration. If we have reached the maximum number of attempts (maxAttempts), or if the retryCondition function returns false, we call the onExhausted callback and rethrow the last error.
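To see the two stopping conditions in action, the following self-contained sketch pairs a stripped-down version of the loop with a function that fails twice before succeeding. The names `retryLoop`, `flaky`, and `alwaysFails` are made up for this example, and delays, timeouts, and callbacks are left out.

```typescript
type LoopConfig = {
  maxAttempts: number;
  retryCondition: (error: Error) => boolean;
};

// Stripped-down retry loop: no delay, jitter, timeout, or callbacks.
async function retryLoop<T>(fn: () => Promise<T>, config: LoopConfig): Promise<T> {
  let attempt = 0;
  while (true) {
    attempt += 1;
    try {
      return await fn();
    } catch (error) {
      if (attempt >= config.maxAttempts || !config.retryCondition(error as Error)) {
        throw error;
      }
    }
  }
}

// Hypothetical operation that fails on its first two calls.
let calls = 0;
async function flaky(): Promise<string> {
  calls += 1;
  if (calls < 3) throw new Error(`temporary failure #${calls}`);
  return "ok";
}

async function demo(): Promise<void> {
  // Succeeds on the third attempt, which is still within maxAttempts.
  console.log(await retryLoop(flaky, { maxAttempts: 3, retryCondition: () => true }));

  // A retryCondition that returns false stops after the first failure,
  // even though maxAttempts would allow more tries.
  let tries = 0;
  const alwaysFails = async (): Promise<never> => {
    tries += 1;
    throw new Error("fatal");
  };
  await retryLoop(alwaysFails, { maxAttempts: 5, retryCondition: () => false }).catch(() => {});
  console.log(tries); // 1
}

demo();
```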
Delay calculation
The code we currently have will retry a function, but without any delays. Let's fix that!
We're going to add a delay in the catch block of the while loop and use setTimeout to wait for that amount of time.
async function withRetry<T>(
  fn: () => Promise<T>,
  config: Partial<RetryConfig> = {},
): Promise<T> {
  ...
  while (true) {
    ...
    try {
      return await fn();
    } catch (error) {
      ...
      const delay = calculateDelay(attempt, config);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
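In practice the setTimeout wrapper is often pulled out into a small helper, and the pause before each retry is also a natural place to invoke `onRetry`. The sketch below assumes both. `sleep` and `retryWithDelay` are names invented for this example, the delay is fixed rather than computed, and calling `onRetry` just before waiting is our guess at where that callback belongs rather than something the article's snippets show.

```typescript
// Resolves after roughly `ms` milliseconds; timers are not exact.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

type DelayConfig = {
  maxAttempts: number;
  delay: number;
  onRetry: (error: Error, attempt: number) => void;
};

// Simplified loop: fixed delay, no backoff strategy, jitter, or timeout.
async function retryWithDelay<T>(fn: () => Promise<T>, config: DelayConfig): Promise<T> {
  let attempt = 0;
  while (true) {
    attempt += 1;
    try {
      return await fn();
    } catch (error) {
      if (attempt >= config.maxAttempts) throw error;
      config.onRetry(error as Error, attempt); // assumption: notify, then wait
      await sleep(config.delay);
    }
  }
}

// Usage: fails twice, then succeeds on the third attempt.
let calls = 0;
retryWithDelay(
  async () => {
    calls += 1;
    if (calls < 3) throw new Error(`failure #${calls}`);
    return "ok";
  },
  {
    maxAttempts: 3,
    delay: 20,
    onRetry: (error, attempt) => console.log(`attempt ${attempt} failed: ${error.message}`),
  },
).then((result) => console.log(result));
```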
The calculateDelay function depends on three configuration options:
- backoffStrategy
- maxDelay
- jitter
The backoffStrategy can be a string ("constant" | "linear" | "exponential"), an object (with type and factor fields), or a function that returns a number. Depending on which form is used, we calculate the new delay from the base delay and the attempt number.
function calculateDelay(attempt: number, config: RetryConfig): number {
  let delay: number;
  if (config.backoffStrategy === "constant") {
    delay = config.delay;
  } else if (config.backoffStrategy === "linear") {
    delay = config.delay * attempt;
  } else if (config.backoffStrategy === "exponential") {
    delay = config.delay * 2 ** attempt;
  } else if (typeof config.backoffStrategy === "object") {
    // Object form: the factor scales the linear delay or replaces the base 2.
    if (config.backoffStrategy.type === "linear") {
      delay = config.delay * config.backoffStrategy.factor * attempt;
    } else if (config.backoffStrategy.type === "exponential") {
      delay = config.delay * config.backoffStrategy.factor ** attempt;
    } else {
      throw new Error("Invalid backoff strategy");
    }
  } else {
    // Function form: fully custom calculation.
    delay = config.backoffStrategy(attempt, config.delay);
  }
  return delay;
}
To put a cap on the delay, in particular for "exponential" strategies, we then take the lower of the newly calculated delay and maxDelay.
function calculateDelay(attempt: number, config: RetryConfig): number {
  let delay: number;
  ...
  delay = Math.min(delay, config.maxDelay);
  return delay;
}
Finally, we need to implement the jitter based on its configuration. The jitter value is a rate between 0 and 1, and can be thought of as adding that fraction of the delay to itself. Depending on the configuration, the rate comes from a function, a fixed number, or Math.random().
function calculateDelay(attempt: number, config: RetryConfig): number {
  let delay: number;
  ...
  if (config.jitter) {
    // Resolve the jitter rate from a function, a fixed number, or Math.random().
    const jitter =
      typeof config.jitter === "function"
        ? config.jitter(attempt, delay)
        : typeof config.jitter === "number"
          ? config.jitter
          : Math.random();
    delay = delay + delay * jitter;
  }
  return delay;
}
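As a worked example of the jitter arithmetic: with a capped delay of 1000 ms and a jitter of 0.25, the final delay is 1000 + 1000 × 0.25 = 1250 ms. If the jitter step runs after the maxDelay cap, as the order of the snippets above suggests, the result can therefore exceed maxDelay. The sketch below reproduces that arithmetic with a hypothetical `applyJitter` helper. It takes the random source as a parameter, an assumption made purely so the example is deterministic.

```typescript
type Jitter = boolean | number | ((attempt: number, delay: number) => number);

// Hypothetical helper mirroring the jitter step; `random` is injectable
// so the example can be run deterministically.
function applyJitter(
  delay: number,
  attempt: number,
  jitter: Jitter,
  random: () => number = Math.random,
): number {
  if (!jitter) return delay; // false or 0: no jitter
  const rate =
    typeof jitter === "function"
      ? jitter(attempt, delay)
      : typeof jitter === "number"
        ? jitter
        : random();
  return delay + delay * rate;
}

console.log(applyJitter(1000, 1, 0.25));                        // 1250
console.log(applyJitter(1000, 1, false));                       // 1000
console.log(applyJitter(1000, 1, true, () => 0.5));             // 1500
console.log(applyJitter(1000, 2, (attempt) => attempt * 0.25)); // 1500
```

If the cap is meant to be a hard upper bound, one option would be to apply `Math.min(delay, maxDelay)` after the jitter instead of before it. Whether that is desirable depends on the use case.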
Timeout
Now that we have our main retry loop and delay calculation settled, the last piece of the puzzle is to implement a timeout mechanism!
We leverage Promise.race([...]) and setTimeout to implement the timeout:
async function withRetry<T>(
  fn: () => Promise<T>,
  config: Partial<RetryConfig> = {},
): Promise<T> {
  ...
  while (true) {
    ...
    try {
      if (config.timeout > 0) {
        let timer: ReturnType<typeof setTimeout> | undefined;
        const timeoutPromise = new Promise<never>((_, reject) => {
          timer = setTimeout(
            () => reject(new Error("Operation timed out")),
            config.timeout,
          );
        });
        try {
          return await Promise.race([fn(), timeoutPromise]);
        } finally {
          // Clear the timer so it does not linger after fn() settles.
          clearTimeout(timer);
        }
      }
      return await fn();
    } catch (error) {
      ...
    }
  }
}
This lets us control how long we are willing to wait for the asynchronous function to settle on each attempt, which is particularly useful for network requests!
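The same race can be packaged as a standalone helper, which makes it easy to try out in isolation. The `withTimeout` and `slow` functions below are assumptions made for illustration, not the library's API. The sketch clears the timer once the race settles so that no timer is left pending. Keep in mind that `Promise.race` only stops waiting for the slow operation; it does not cancel it.

```typescript
// Hypothetical helper: rejects if `promise` does not settle within `ms`.
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeoutPromise = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("Operation timed out")), ms);
  });
  try {
    return await Promise.race([promise, timeoutPromise]);
  } finally {
    // Always clear the timer, whichever promise won the race.
    clearTimeout(timer);
  }
}

// Simulated slow operation that resolves after `ms` milliseconds.
function slow(ms: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve("done"), ms));
}

async function main(): Promise<void> {
  // The operation finishes well before the deadline.
  console.log(await withTimeout(slow(10), 200)); // prints: done

  // The deadline passes first, so the race rejects.
  try {
    await withTimeout(slow(200), 10);
  } catch (error) {
    console.log((error as Error).message); // prints: Operation timed out
  }
}

main();
```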
Code
There we have it, a retry library that provides flexible configuration!
This library has been implemented and published on npm: