async and Concurrent Patterns

forkIO is the low-level primitive. In real code you rarely call it directly; you use the async library, written by Simon Marlow. It handles exception propagation, cancellation, and result retrieval, all of which forkIO leaves to you.

The mental model: an Async a is a handle to a thread that will eventually produce an a (or throw). It's structurally similar to a Future or Promise in other ecosystems, with one important addition: an exception raised in the thread is stored in the handle and rethrown in whoever waits on it.

Basic usage

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async

main :: IO ()
main = do
  a <- async $ do
    threadDelay 1000000
    return "result"
  putStrLn "doing other things"
  result <- wait a
  putStrLn result

async spawns a thread and returns immediately with a handle. wait blocks until the thread finishes and returns its value, or rethrows any exception the thread raised. That last part is the killer feature. With forkIO, an uncaught exception in a child thread kills only that thread: the runtime prints it to stderr, and the parent is never notified unless you go out of your way to catch and forward it.
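The rethrow behaviour can be observed directly. The following is a minimal sketch, not code from the library's documentation; failingJob and demoWait are hypothetical names introduced for illustration.

```haskell
import Control.Concurrent.Async (async, wait)
import Control.Exception (ErrorCall (..), throwIO, try)

-- Hypothetical worker that always fails.
failingJob :: IO Int
failingJob = throwIO (ErrorCall "boom")

-- Spawn the worker, then wait on it. The exception raised in the
-- child thread resurfaces at the wait call, where try captures it.
demoWait :: IO (Either ErrorCall Int)
demoWait = do
  a <- async failingJob
  try (wait a)
```

Here demoWait yields a Left holding the child's exception. Had the worker been started with forkIO, the parent would have had no handle to ask.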

concurrently and race

For the common case of running two things in parallel:

import Control.Concurrent.Async

fetchUser :: UserId -> IO User
fetchPosts :: UserId -> IO [Post]

userPage :: UserId -> IO (User, [Post])
userPage uid = concurrently (fetchUser uid) (fetchPosts uid)

Both run concurrently. If either throws, the other is cancelled and the exception propagates to the caller. These are the right semantics: you don't want a half-finished operation lingering after its sibling has failed.
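A small sketch of the cancel-on-failure behaviour, under the assumption that a ten-second sleep stands in for slow work. slowTask, failFast, and demoConcurrently are hypothetical names, not part of the async API.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (concurrently)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (ErrorCall (..), finally, throwIO, try)

-- Hypothetical slow task. The finally handler signals the MVar when the
-- task ends, whether it finished normally or was cancelled.
slowTask :: MVar () -> IO ()
slowTask stopped = threadDelay 10000000 `finally` putMVar stopped ()

-- Hypothetical task that fails after a short delay.
failFast :: IO ()
failFast = threadDelay 10000 >> throwIO (ErrorCall "fetch failed")

-- Run both. The failure in failFast propagates to the caller, and the
-- slow task is cancelled rather than left sleeping for ten seconds.
demoConcurrently :: IO Bool
demoConcurrently = do
  stopped <- newEmptyMVar
  r <- try (concurrently failFast (slowTask stopped)) :: IO (Either ErrorCall ((), ()))
  takeMVar stopped -- blocks until the slow task's cleanup has run
  pure (either (const True) (const False) r)
```

demoConcurrently returns True, and it does so promptly because the sibling is cancelled instead of being allowed to run to completion.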

race is similar but takes the first to complete:

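import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (race)

-- Left means the timer won, Right means the action finished first.
withTimeout :: Int -> IO a -> IO (Maybe a)
withTimeout micros action =
  either (const Nothing) Just <$> race (threadDelay micros) action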

Whichever finishes first wins; the other is cancelled. (System.Timeout already provides timeout, so this is just an illustration.)
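As a usage sketch, the same race-based helper can be exercised with one action that is too slow and one that finishes at once. demoTimeout and the delay values are hypothetical, chosen only to make the outcome unambiguous.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (race)

-- Same helper as in the text: race the action against a timer.
withTimeout :: Int -> IO a -> IO (Maybe a)
withTimeout micros action =
  either (const Nothing) Just <$> race (threadDelay micros) action

-- One action that overruns its budget and one that beats it.
demoTimeout :: IO (Maybe String, Maybe String)
demoTimeout = do
  slow <- withTimeout 50000 (threadDelay 2000000 >> pure "too late")
  fast <- withTimeout 2000000 (pure "in time")
  pure (slow, fast)
```

The first component is Nothing because the timer wins and the slow action is cancelled; the second is Just "in time".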

forConcurrently and mapConcurrently

For a list of independent operations:

import Control.Concurrent.Async

fetchAll :: [UserId] -> IO [User]
fetchAll uids = forConcurrently uids fetchUser

This spawns one thread per element. If any throws, all the others are cancelled and the exception is rethrown. mapConcurrently is the same function with the arguments flipped. For a million elements you don't want a million threads: even though threads are cheap, contention on shared resources (DB connections, HTTP clients) will kill you. Use a pool:

import Control.Concurrent.Async.Pool

fetchAll :: [UserId] -> IO [User]
fetchAll uids = withTaskGroup 16 $ \g ->
  mapTasks g (map fetchUser uids)

Sixteen workers process the queue. The async-pool package gives you bounded concurrency.
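If adding a dependency is not an option, a rough alternative is to gate mapConcurrently with a semaphore from base. This is a sketch under stated assumptions, not the async-pool implementation: mapConcurrentlyBounded and demoBounded are hypothetical names, and unlike a real pool this still creates one thread per element. It only limits how many run the action at once.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (mapConcurrently)
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)
import Control.Exception (bracket_)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- Each element's thread must take a semaphore unit before running f,
-- and bracket_ returns the unit even if f throws or is cancelled.
mapConcurrentlyBounded :: Int -> (a -> IO b) -> [a] -> IO [b]
mapConcurrentlyBounded limit f xs = do
  sem <- newQSem limit
  mapConcurrently (\x -> bracket_ (waitQSem sem) (signalQSem sem) (f x)) xs

-- Run twenty jobs with a limit of three, recording the peak number of
-- jobs that were inside the guarded section at the same time.
demoBounded :: IO (Int, [Int])
demoBounded = do
  running <- newIORef (0 :: Int)
  peak <- newIORef (0 :: Int)
  let job x = do
        n <- atomicModifyIORef' running (\r -> (r + 1, r + 1))
        atomicModifyIORef' peak (\p -> (max p n, ()))
        threadDelay 5000
        atomicModifyIORef' running (\r -> (r - 1, ()))
        pure (x * 2)
  ys <- mapConcurrentlyBounded 3 job [1 .. 20]
  p <- readIORef peak
  pure (p, ys)
```

The recorded peak never exceeds the limit, and the results come back in input order.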

Exception handling

The semantics are precise: if a child thread throws and you wait on it, the exception is rethrown in the waiter. If you don't wait, the exception is silently ignored (which is rarely what you want).

waitCatch returns Either SomeException a instead of rethrowing:

import Control.Exception
import Control.Concurrent.Async

safeFetch :: IO (Either SomeException User)
safeFetch = do
  a <- async $ fetchUser someId
  waitCatch a

withAsync is the resource-safe variant. It guarantees the async is cancelled when the block exits, whether normally or by exception:

processWithMonitor :: IO ()
processWithMonitor =
  withAsync (monitor metrics) $ \_ -> do
    runMainProcess
    -- monitor is cancelled here, even if runMainProcess throws

Using async without withAsync is a very common source of bugs: you spawn an async, an exception fires before you wait on it, and the async is now orphaned. Nothing cancels it, so it can keep running and holding resources until the process exits. Always prefer withAsync over bare async when you have a clear scope.
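The scoping guarantee can be checked with a small experiment. This is an illustrative sketch only; monitorLoop and demoScoped are hypothetical names, and the body of the block fails on purpose.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (withAsync)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (ErrorCall (..), finally, throwIO, try)
import Control.Monad (forever)

-- Hypothetical background monitor: loops forever and signals the MVar
-- when it is stopped.
monitorLoop :: MVar () -> IO ()
monitorLoop stopped = forever (threadDelay 1000) `finally` putMVar stopped ()

-- The body of withAsync throws. The exception still reaches the caller,
-- and the monitor is cancelled on the way out instead of being orphaned.
demoScoped :: IO Bool
demoScoped = do
  stopped <- newEmptyMVar
  let body = withAsync (monitorLoop stopped) (\_ -> throwIO (ErrorCall "main failed")) :: IO ()
  r <- try body :: IO (Either ErrorCall ())
  takeMVar stopped -- returns only once the monitor's cleanup has run
  pure (either (const True) (const False) r)
```

With bare async in place of withAsync, the final takeMVar would block, because nothing would ever stop the monitor.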

unliftio: doing this in monad transformer stacks

The async API lives in IO. If you're working in ReaderT AppEnv IO, the standard async functions won't work directly, because they expect IO actions as arguments and you would have to unlift yours first. MonadIO gives you one direction (lifting IO into your monad) but not the other.

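The unliftio package solves this. It provides MonadUnliftIO for monads whose actions can be run in plain IO given some captured context. ReaderT over IO qualifies, because the environment is just an argument that can be handed to each thread. StateT over IO does not: each thread would end up with its own copy of the state, and updates would be lost.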

import Control.Monad.IO.Class (liftIO)
import Control.Monad.Reader (ReaderT)
import UnliftIO.Async

myApp :: ReaderT AppEnv IO ()
myApp = do
  result <- concurrently (queryDB "users") (queryDB "posts")
  liftIO $ print result

This works because ReaderT AppEnv IO is MonadUnliftIO. The state-like effects (logging, request context) are passed through the reader, which is shared safely across threads.
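To make the mechanism concrete, here is roughly what unlifting amounts to for ReaderT, written by hand against the plain async API. This is a sketch of the idea, not the unliftio implementation; AppEnv's field, queryDB's body, concurrentlyR, and demoReader are all hypothetical stand-ins.

```haskell
import Control.Concurrent.Async (concurrently)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, ask, asks, runReaderT)

-- Hypothetical environment with a single field.
data AppEnv = AppEnv { envPrefix :: String }

-- Hypothetical query that only reads from the environment.
queryDB :: String -> ReaderT AppEnv IO String
queryDB table = do
  prefix <- asks envPrefix
  pure (prefix ++ "." ++ table)

-- Capture the environment once, run both actions in plain IO with it,
-- then lift the combined result back into ReaderT.
concurrentlyR :: ReaderT env IO a -> ReaderT env IO b -> ReaderT env IO (a, b)
concurrentlyR left right = do
  env <- ask
  liftIO (concurrently (runReaderT left env) (runReaderT right env))

demoReader :: IO (String, String)
demoReader =
  runReaderT (concurrentlyR (queryDB "users") (queryDB "posts")) (AppEnv "prod")
```

demoReader produces ("prod.users", "prod.posts"). The same trick has no sensible analogue for StateT, since there is no single state value to hand back after both threads finish.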

Mercury's banking platform is largely built on RIO (a wrapper around ReaderT with MonadUnliftIO), and the codebase uses unliftio-core extensively for safe concurrency in their effect stack.

Comparison with Go and Erlang

Go's goroutines are similar in spirit: lightweight threads scheduled onto OS threads. The differences:

Go has channels as the primary communication primitive. Haskell has MVar, STM, and channels, with STM being the more powerful option.
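Go's panic doesn't propagate across goroutines: recover only works inside the goroutine that panicked, and an unrecovered panic crashes the whole program. forkIO fails the other way (the child dies and the parent is never notified). Haskell's async rethrows the child's exception in the waiter as an ordinary exception, which catches more bugs.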
Go has no equivalent to STM. Coordinating multiple shared values atomically requires explicit locks or rolling your own.

Erlang processes are isolated, with no shared state and message-passing only. This is a different model: you can't have unintended sharing because there's nothing to share. Haskell can do this style with Chan/TChan, but the language doesn't enforce isolation. Where Erlang shines is fault tolerance: you supervise processes and restart them on crash. Haskell can do supervision (async plus exception handlers, or a dedicated library), but it's less idiomatic.

In practice, Haskell sits between the two. Like Go, you can share state, with the addition of STM for safe composition. Like Erlang, you can build message-passing systems if you want isolation. The choice depends on what you're building.
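The point about STM coordinating several shared values atomically can be sketched with the classic two-account transfer. The names transfer and demoTransfer, and the balances used, are hypothetical.

```haskell
import Control.Concurrent.Async (mapConcurrently_)
import Control.Concurrent.STM
  (STM, TVar, atomically, check, modifyTVar', newTVarIO, readTVar, readTVarIO)

-- Move money between two accounts. Both updates commit together or not
-- at all, and check makes the transaction retry until funds suffice.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  balance <- readTVar from
  check (balance >= amount)
  modifyTVar' from (subtract amount)
  modifyTVar' to (+ amount)

-- A hundred concurrent one-unit transfers with no explicit locks.
demoTransfer :: IO (Int, Int)
demoTransfer = do
  a <- newTVarIO 1000
  b <- newTVarIO 0
  mapConcurrently_ (\_ -> atomically (transfer a b 1)) [1 .. 100 :: Int]
  (,) <$> readTVarIO a <*> readTVarIO b
```

The final balances are (900, 100) regardless of how the threads interleave, and no observer can see money missing from both accounts mid-transfer.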

Practical patterns

Pipeline of stages:

import Control.Concurrent.STM
import Control.Concurrent.Async

pipeline :: IO ()
pipeline = do
  q1 <- newTBQueueIO 100
  q2 <- newTBQueueIO 100
  withAsync (stage1 q1) $ \_ ->
    withAsync (stage2 q1 q2) $ \_ ->
      stage3 q2

Each stage runs concurrently, communicating via bounded queues. Backpressure is automatic: if stage2 is slow, q1 fills up and stage1 blocks on its next write.
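For a version that can be run end to end, the stages need bodies and a way to signal the end of input. The sketch below assumes a Maybe-wrapped element type with Nothing as the end-of-stream marker; that convention, the stage bodies, and runPipeline are illustrative choices, not something the pattern requires.

```haskell
import Control.Concurrent.Async (withAsync)
import Control.Concurrent.STM
  (TBQueue, atomically, newTBQueueIO, readTBQueue, writeTBQueue)

-- Producer: push every input, then the end-of-stream marker.
stage1 :: [Int] -> TBQueue (Maybe Int) -> IO ()
stage1 xs q = do
  mapM_ (atomically . writeTBQueue q . Just) xs
  atomically (writeTBQueue q Nothing)

-- Transformer: double each element, forward the marker, then stop.
stage2 :: TBQueue (Maybe Int) -> TBQueue (Maybe Int) -> IO ()
stage2 inQ outQ = loop
  where
    loop = do
      m <- atomically (readTBQueue inQ)
      atomically (writeTBQueue outQ (fmap (* 2) m))
      case m of
        Nothing -> pure ()
        Just _ -> loop

-- Consumer: collect elements in order until the marker arrives.
stage3 :: TBQueue (Maybe Int) -> IO [Int]
stage3 q = go []
  where
    go acc = do
      m <- atomically (readTBQueue q)
      case m of
        Nothing -> pure (reverse acc)
        Just x -> go (x : acc)

-- Small queue bounds so the producer is actually forced to block.
runPipeline :: [Int] -> IO [Int]
runPipeline xs = do
  q1 <- newTBQueueIO 4
  q2 <- newTBQueueIO 4
  withAsync (stage1 xs q1) $ \_ ->
    withAsync (stage2 q1 q2) $ \_ ->
      stage3 q2
```

runPipeline [1 .. 10] yields the doubled inputs in order. One caveat of this shape: the final stage never waits on the earlier ones, so an exception in stage1 or stage2 would leave stage3 blocked rather than rethrown; linking or waiting on the handles is one way to address that.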

Worker pool with results:

import Control.Concurrent.Async.Pool  -- from the async-pool package

processItems :: Int -> [Item] -> IO [Result]
processItems concurrency items =
  withTaskGroup concurrency $ \g ->
    mapTasks g (map processOne items)

Cancellable long-running task:

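import Control.Concurrent.Async (cancel, waitEither, withAsync)
import Control.Concurrent.STM (TVar, atomically, check, readTVar)

runWithCancel :: TVar Bool -> IO a -> IO (Maybe a)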
runWithCancel cancelFlag action =
  withAsync action $ \a ->
    withAsync (atomically $ readTVar cancelFlag >>= check) $ \c -> do
      result <- waitEither a c
      case result of
        Left v  -> return (Just v)
        Right _ -> cancel a >> return Nothing

waitEither returns whichever finishes first. If the cancel flag is set, we cancel the action.

Common Pitfalls

Forgetting to wait. async returns immediately and the work happens in the background. If you never wait, the result is discarded and exceptions are silently dropped. Always wait, or use withAsync for scoped lifetimes.

Using forConcurrently for huge lists. A million asyncs aren't free even though each thread is cheap: you'll exhaust file descriptors, DB connections, or whatever bounded resource the work touches. Use a pool.

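Mixing Control.Concurrent.Async and UnliftIO.Async in the same stack. UnliftIO.Async (from the unliftio package) wraps the same underlying Async type, so handles can cross between the two, but switching APIs means scattering liftIO and manual unlifting through the code. Pick one for your codebase.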

Async exceptions and cleanup. If you hold a resource and an async exception arrives mid-operation, you need bracket (or mask) to clean up. withAsync does this for you for the spawned thread, but you still need it for any resources you hold.

Long-running asyncs and memory. An Async keeps a reference to its result. If you spawn a million long-running asyncs and never wait on them, you keep references to all their result thunks. This is a leak. Either bound the count or wait/discard them.

Key Takeaways

The async library is the standard interface for concurrency. It handles exception propagation, cancellation, and result retrieval correctly, things forkIO leaves to you.

Use withAsync for scoped concurrency, concurrently/race for the common two-task patterns, and forConcurrently (or a pool) for parallel maps.

unliftio lets you use these patterns in monad transformer stacks safely. MonadUnliftIO is the abstraction.

Compared to Go and Erlang, Haskell's concurrency is more flexible (neither has a built-in equivalent of STM) but less opinionated. The discipline of async plus STM gives you most of what you want from either, with well-defined exception propagation.