Lazy-ing in My Pool

Posted on October 6, 2024

I encountered an interesting problem recently, where I needed to initialize a pool of resources. The caveat here is though, for performance reasons, I only wanted to initialize resources on demand. Mainly to avoid paying the price of initializing the entire pool, when I might only need to do a singular action using one resource.

Everyone likes a nice usecase, so let’s set the scene. Imagine you have a pool of files you need to read over the network, and some work you have to do on each file’s contents. This FileHandle is the resource we’ll manage manually, ensuring the connections is properly both, initialized and finalized.
data FileHandle = FileHandle {...}

doWork :: FileHandle -> IO Int
doWork fileHandle = readArguments fileHandle >>= process

Explicit Lazy Resources

Let’s start with initializing a single one of these FileHandle resource lazily, and only afterwards create a pool from there. Doing anything with resources implies using something like bracket to ensure finalization in the presence of exceptions. Though bracket isn’t enough by itself in this case, as bracket implies strict usage of the resource. What we can do, however, is create an explicitly lazy wrapper around our initialization and use that as the resource within bracket instead.
data LazyResource m a
    = Uninitialized (m a)
    | Initialized a
    | Finalized

We’ll need something extra to maintain what state the resource currently exists in, however. We could do something State-style and do LazyResource a -> (LazyResource a, r), but while concrete, that’s not very user friendly, and also does nothing for thread-safety, if we want to use this resource concurrently. A simple alternatively is to encase the lazy wrapper and persist the resource’s state in an MVar cell.
-- | Anytime we use the resource, we trigger the deferred the initialization.
withLazyResource :: MVar (LazyResource a) -> (a -> IO r) -> IO r
withLazyResource lazyResourceRef continuation = mask $ \restore -> do
  lazyResource <- takeMVar lazyResourceRef
  -- We always have to ensure we put back the resource into the cell, to ensure
  -- any waiting, are unblocked
  let putBackVerbatim = putMVar lazyResourceRef lazyResource
  resource <- case lazyResource of
    Uninitialized initializer ->
      restore initializer `onException` putBackVerbatim
    Initialized resource -> pure resource
    Finalized -> do
      putBackVerbatim
      error "Attempted to use a resource after it has been finalized"

  -- Same as the above, but we've now initialized the resource.
  let putInitialized = putMVar lazyResourceRef (Initialized resource)
  result <- restore (continuation resource) `onException` putInitialized
  putInitialized
  pure result

Some care has to be taken when exceptions are involved, and to ensure that the MVar is never left empty, even while initializing the resource in the first place! It is preferable that every thread depending on the resource fails, instead of being deadlocked on the thread that held the resource and died.

Creating one of these MVar (LazyResource FileHandle) is left as an exercise to the reader. It should take care of deferring both the initialization as well as finalization of the underlying resource, if needed.

Lazy Pools

Creating pools of these lazy resources is not much different than pools of any manually managed resources, ensuring that each initialization is paired with a corresponding finalization. In actuality, you’ll likely want something that already exists like resource-pool. But I’ll show a small simple alternative.

Keeping it simple, I’ll keep track of all resources in the pool in a simple list, along with a quantity semaphore that ensures we don’t exceed the intended number of resources.
data Pool a = Pool
  { resources :: IORef [a]
  , occupancy :: QSem
  }

For nesting consecutive with-like functions, there’s a neat trick you can do with kan-extensionsCodensity type.
withPool
  :: (forall r. (a -> IO r) -> IO r)
  -> Int
  -> (Pool a -> IO b)
  -> IO b
withPool withResource poolSize action = do
  let withPoolResources =
        runCodensity $ replicateM poolSize $ Codensity withResource

  withPoolResources $ \resources -> do
    poolResources <- newIORef resources
    qSem <- newQSem poolSize
    action $ Pool poolResources qSem

That’s it! Using the pool is should be straightforward, keeping the resources in the pool and the semaphore in sync, while masking exceptions via bracket. And it can be combined with withLazyResource to handle lazy resources transparently.