Whilst reading Structure and Interpretation of Computer Programs, also known as the SICP book, I discovered the concept of Sequences as Conventional Interfaces. Even though it is an idea that I was somewhat familiar with, it was the first time I encountered a more formal definition for it. Reading about it has helped me to better understand its full power.
Most of the ideas and some of the code samples contained in this post originally come from the SICP book. Although it is written using Lisp the ideas and concepts are applicable to most languages. We will be following them in Haskell as it is the language I am currently learning and enjoying, its terse syntax also makes it a great fit for a blog post.
SICP describes conventional interfaces as a design principle for working with data structures. It is composed of a set of standard operators or combinators that connect the different steps required to implement computations in computer programs.
The combinators in question may be familiar to you as they will probably be provided in some form in your preferred language, e.g. map
, filter
, flatMap
(also known as bind
), reduce
(also known as fold
). They let us capture common patterns in the implementation of programs that are a priori structurally different, enabling us to think and reason about different programs in the same way. This is of great importance as it enables us to intuitively express completely different and unrelated programs applying function composition to this set of combinators.
SCIP also introduces the concept of signal-processing as a metaphor to reason about this programming style.
A signal-processing engineer would find it natural to conceptualize processes in terms of signals flowing through a cascade of stages, each of which implements part of the program plan.
We will expand more on this metaphor later on, but for now, it is important to emphasize that making the signal-flow structure evident in the design of our programs increases the modularity and readability of the resulting code. It is also important to emphasize that the use of these combinators increase the level of abstraction at which we can write code by abstracting the implementation of low-level operations such as iteration, recursion or conditional statements.
To demonstrate the importance of the principle, let's consider the two following functions taken from SICP, which I have translated from Lisp to Haskell.
A simple type representing a binary tree
data BinaryTree a =
Leaf a
| Node (BinaryTree a) (BinaryTree a) deriving (Foldable)
Sum squares of the leaves of a tree that are odd
sumOddSquares :: (Integral a) => BinaryTree a -> a
sumOddSquares (Node l r) = sumOddSquares l + sumOddSquares r
sumOddSquares (Leaf x) | odd x = x ^ 2
| otherwise = 0
All the even Fibonacci numbers up to a given number
evenFibs :: Integer -> [Integer]
evenFibs n = go 0
where go k | k > n = []
| otherwise = let f = fib k
next = k + 1
in if even f then f : go next else go next
Fibonacci sequence
fib :: Integer -> Integer
fib 1 = 1
fib 0 = 1
fib n = fib (n - 1) + fib (n - 2)
These two functions solve completely different and unrelated problems and they also are completely different in terms of the structure of their implementations. However, we are going to step back and look at them from a different perspective, so that we can discover the hidden similarities between them and expose the signal-flow structure in the implementation.
Let's break down at a high level what the functions are really doing.
sumOddSquares
:
+
, starting at 0.evenFibs
:
:
(List constructor), starting with []
(empty list).Note how both implementations are now based on a similar composition of stages. It is also worth nothing how each number contained in the list provided by each enumeration emulates a signal going through a serie of stages, filter followed by map, map followed by filter, respectively. Both implementations finish up by folding or reducing the signals using +
and :
respectively.
It was a truly insightful moment when I realized that indeed both functions are almost identical in structure as opposed to how it seemed at first. The reason why the given implementations are so different is the mixing of concerns that hides the signal-flow structure. To prove this, let's take a deeper look at the implementations to analyze where the mixing of concerns happens.
sumOddSquares
:
Leaf
pattern match, and partially by the double recursion, tree-like, in the Node
pattern match.odd
case in the Leaf
pattern match.+
that joins the double recursion in the Node
pattern match, and partially by the otherwise
case in the Leaf
pattern match.Each pattern matching is mixing several concerns, increasing the complexity of the implementation and hiding the signal-flow structure of the computation.
evenFibs
:
k > n
case and party by the next
expression.f
expression, is mixed with the even
filtering in the otherwise
case.:
, the list constructor, and partially by the [ ]
, empty list, in the k > n
case.As in sumOddSquares
, each pattern matching in evenFibs
is mixing different concerns, failing to exhibit the signal-flow structure of the computation.
Let's refactor both functions following the steps defined previously, so that we can expose the hidden similiarities and the signal-flow structure.
It is worth noting that .
is the operator for function composition in Haskell.
sumOddSquares
:
+
, starting at 0.sumOddSquares :: (Integral a) => BinaryTree a -> a
sumOddSquares tree = foldr (+) 0 . fmap (^2) . filter odd . enumerateLeaves $ tree
enumerateLeaves :: BinaryTree a -> [a]
enumerateLeaves tree = foldr (:) [] tree
The original implementation
sumOddSquares :: (Integral a) => BinaryTree a -> a
sumOddSquares (Node l r) = sumOddSquares l + sumOddSquares r
sumOddSquares (Leaf x) | odd x = x ^ 2
| otherwise = 0
evenFibs
:
:
(List constructor), starting with []
(empty list).evenFibs :: Integer -> [Integer]
evenFibs to = foldr (:) [] . filter even . fmap fib . enumerateInterval 0 $ to
enumerateInterval :: Integer -> Integer -> [Integer]
enumerateInterval from to = [from..to]
The original implementation
evenFibs :: Integer -> [Integer]
evenFibs n = go 0
where go k | k > n = []
| otherwise = let f = fib k
next = k + 1
in if even f then f : go next else go next
Now that we have seen how different computations can be shaped similarly by using the same set of combinators composed in different ways, let's expand on the signal processing metaphor.
As mentioned in SICP, the key to organizing programs so as to more clearly reflect the signal-flow structure is to concentrate on the "signals" that flow from one stage in the process to the next. SCIP uses Lisp, where everything is a list, therefore the signal processing metaphor fits very nicely with lists. If we move away from Lisp to Haskell for example, the metaphor may not be as evident, but as we just showed it is equally applicable. In most of the cases, Haskell abstracts the different combinators from the concrete data structure using type classes, let's break down a few of them.
Enumerate generates the signals that initiate the computation. Haskell names the concept of enumeration unfolding and provides the Data.Unfoldable
type class. List comprehension is also available as a convenience to generate lists of values.
Filter discards unwanted signals by only keeping the ones that satisfy a given predicate. As far as I know, there is not a concrete type class in Haskell that abstracts the concept of a filterable data structure from the concrete implementation. However, there is a filter combinator defined in each of them, e.g. List
, Map
, Set
, Seq
.
Map converts each signal to a different type. Haskell defines the concept of data structures that can be mapped in the typeclassData.Functor
. In Haskell the map
function is called fmap
because it is considered as a function that maps functions, hence the f prepending map.
Bind nests signals so that we can use each of them and do some extra computation based on their values. bind
is provided by type class Control.Monad
and it can also be applied as an index operator with >>=
. Bind is also known as flatMap
in other languages.
Zip joins two sequences of signals as a sequence of pairs. As far as I know, there is not a concrete type class in Haskell that abstract the concept of a zippable data structure from the concrete implementation. However, it is defined in some of the most usual ones, such as List
or Seq
.
Fold combines the signals to create a summary value. In the case of data structures that can be folded Haskell defines the type class Data.Foldable
. In other languages fold
is know as reduce
.
To demonstrate that the idea of Sequences as Conventional Interfaces is not limited to lists, let's show a few examples of data structures that can be thought of as sequences:
Map
data structure can be thought of as a generator of sequences, e.g. It can generate sequences with its keys, with its values or it can pair them together as a sequence of pairs where each element is composed of a key and a value.Maybe
, also know as Option
, could be considered as a sequence that can contain zero or one signal.Either
as a sequence that contains a single signal which can be of one of two possible types.(,)
in particular could be considered as a sequence that contains exactly two signals.Different sequences have different constraints, structures and semantics, but they are equally valid from the combinators point of view. These sequences are just concrete examples of abstract mathematical structures.
By now you may have realized that both examples described in this post have a structure with the following shape, enumeration followed by a composition of standard combinators, terminating with a fold or reduction to a summary value, in other words:
I see the pattern Enumerate -> combinators -> Fold as a very intuitive programming style that comes as a consequence of using Conventional Interfaces and it can be applied to most, if not all, computations out there.
To close up the post, here is a FizzBuzz implementation that follows the signal-flow structure, as an extra demonstration of this pattern.
FizzBuzz
fizzBuzz :: Integer -> [String]
fizzBuzz to = foldr (:) [] . fmap fizzBuzzOrNumber . enumerateGame $ to
fizzBuzzOrNumber :: (String, Integer) -> String
fizzBuzzOrNumber (fizzBuzz, index) = fizzBuzz `orElse` (show index)
enumerateGame :: Integer -> [(String, Integer)]
enumerateGame to = zip fizzesBuzzes $ enumerateInterval 1 to
fizzesBuzzes :: [String]
fizzesBuzzes = zipWith (++) fizzes buzzes
fizzes :: [String]
fizzes = cycle ["", "", "Fizz"]
buzzes :: [String]
buzzes = cycle ["", "", "", "", "Buzz"]
orElse :: [a] -> [a] -> [a]
orElse [] ys = ys
orElse xs _ = xs
Software is our passion.
We are software craftspeople. We build well-crafted software for our clients, we help developers to get better at their craft through training, coaching and mentoring, and we help companies get better at delivering software.