Roly, Oly, …, Polyvariadic

Posted on July 6, 2019

During my undergraduates, I followed a course called Functional Programming. That course mainly dealt with getting your feet wet with Haskell and some light fp theory. At that point, most students followed the course on imperative programming just half a year ago and then get blasted with the whole higher order functions deal. I definitely doubt I understood what monads were at the end of it.

Now during my master’s studies, I followed a course on “Advanced Functional Programming”. One of the topics that came up during the explanation of monad transformers were functional dependencies and type families – these looked a lot like magic.

One of the many things that is possible with these extensions, is polyvariadic functions, functions which take an arbitrary number of parameters. One example included in ghc is Text.PrintF, which mimics the C-inspired function printf with its, at first glance, weird type signature PrintfType r => String -> r.

The main idea behind this, is to have a function that can result in either another function or a final value. In the printf example, this is accomplished using the PrintfType type class.

As an example, let’s define a function which sums a variable number of integers. The result of the function is either an int or a function which takes an int, and gives the same choice. We can keep this choice of possible types as the type variable of the class.
class AddP r where
    addP :: Int -> r

Defining the instance which will lead to the final value should be easy. The type variable r just becomes an Int.
instance AddP Int where
    addP :: Int -> Int
    addP = id

The case where the function needs to return a function is only slightly more complex. The first instinct is to create an instance for Int -> r. This could look like:
instance AddP r => AddP (Int -> r) where
    addP :: Int -> Int -> r
    addP x y = addP (x + y)

This is, however, too limiting. Trying to execute addP 1 2 gives an error message which, while kind of vague, gives an idea as to why.
<interactive>:1:1: error:
     No instance for (AddP (Integer -> Integer -> Int))
        arising from a use of add
        (maybe you haven't applied a function to enough arguments?)
     In the expression: add 1 2 :: Int
      In an equation for it: it = add 1 2 :: Int

The best effort hint ghc gives here is pretty useless in our case, as we clearly give 2 arguments to the function. The real problem is our usage of the plus operator and our arguments. The plus operator is defined over all instances of the Num type class, and integers in haskell can be anything from Int, Integer, Natural, or even Float. So using the function needs a lot more type annotations to work. This is evident by the fact that addP (1 :: Int) (2 :: Int) :: Int does indeed typecheck.

We need to expand the cases this instance captures. One way to do this, is to use equality constraints, aka. the type level ~ operator. Writing a ~ b means that we additionally require that a and b should be nominally equal. This is different from using, for example just a in every position where b would appear.

When ghc tries to look up an instance, it tries to find any matching instances. If we only write an instance for Int, when it tries to match an instance for Integer, it can’t find anything. A different situation pops up, however, if we write an instance for a and then constrain that a to be Ints. This instance will always be matched due to the very generic type a -> r, while still maintaining the Int constraint. Note that this will still crash if no appropriate instance can be found; trying to match String with an instance written with an Int constraint, for example, will still fail. The following works when executed with addP 1 2 3 :: Int.
instance (AddP r, a ~ Int) => AddP (a -> r) where
    addP :: Int -> a -> r
    addP x y = addP (x + y)

add :: AddP a => a
add = addP 0

We can reuse the previous flow to create polyvariadic variations of unwords and (:). The lists function is not an exact mirror, because of (:) requiring a list to append to. The last function, lists, just builds a list from the given arguments
class WordP r where
    unwordP :: String -> r

instance WordP String where
    unwordP :: String -> String
    unwordP = drop 1

instance (WordP res, a ~ String) => WordP (a -> res) where
    unwordP :: String -> a -> res
    unwordP xs x = unwordP (unwords [xs, x])

unword :: WordP a => a
unword = unwordP []

class ListP r where
    type LArg r
    listsP :: [LArg r] -> r

instance ListP [a] where
    type LArg [a] = a
    listsP = reverse

instance (ListP r, a ~ LArg r) => ListP (a -> r) where
    type LArg (a -> r) = a
    listsP xs x = listsP (x : xs)

lists :: ListP r => r
lists = listsP []

The above add function and AddP type class can be greatly improved. It currently only accepts arguments of type Int, while the only operator it makes use of, (+), is defined over all instances of the Num type class. Ideally, we would like our function to also be defined over those same instances. Examples of what we would have liked to see are:
> add 1 2 3 :: Int
6
> add 1 2.4 3 :: Float
6.4

We can use the Num constraint in the type class definition and instance declarations to achieve this, in most cases this just means switching out the Ints with the constrained variable. We also add an extra type variable which will contain the information over which type we’re currently operating.
class Num a => Add a r where
    addN :: a -> r

One problem is the instance we write for values. We can’t really write an instance for (Num a, a ~ r) => Add a r, because the r type variable is too generic and captures too much.

One solution to this problem is the OverlappingInstances extension. This extension allows multiple instance definitions to be written where the applicable types overlap. In our case the Add a r instance would overlap with any other instance for Add, like Add a (a -> r).

We can, in a limited capacity, annotate specific instances to be preferred over others. So in situations where both Add a r and Add a (a -> r) matched, we would prefer that Add a (a -> r) be picked. The resulting definitions are:
class Num a => Add a r where
    addN :: a -> r

instance {-# OVERLAPPABLE #-} (Num a, a ~ r) => Add a r where
    addN :: a -> r
    addN = id

instance {-# OVERLAPPING #-} (Add a r, a ~ a') => Add a (a' -> r) where
    addN :: a -> a -> r
    addN x y = addN (x + y)

One big annoyance is the fact that we have to create a type class per polyvariadic function. Next time we’ll try and look at approaches to generalizing this process.