Functional programming is taking off, but there appears to be some misunderstanding about the theoretical background of its types. Many programmers incorrectly name types and create confusion around simple ideas. To clarify, let's look at the differences between union, tagged unions, and algebraic data types.

Algebraic data types (ADTs) are defined in terms of product and sum types. Sum types (also called tagged unions) are a *refinement* of union types. Let's explore their differences:

*Union types* encode values under the same allocated memory. The most basic example I can think of is the **union** keyword in C. The example below allocates the biggest type contained in the union. This makes it possible that both values could be there—but not at the same time. If you need to know which value is available at any time, you need to use a field in a **struct** to keep track of the type at runtime.

union {

int i;

void *p;

}

A *tagged union* saves some runtime information to determine which element from the union exists. Tagged union/sum types are a refinement of union types; the constructor adds some extra information in the tag so that you can easily pattern match on it. Haskell does this really elegantly:

data TrafficLight = Red | Yellow | Green

In this example, **TrafficLight** represents the type constructor, and only one of the three data values can exist at any given point. The symbol **|** expresses the idea of a logical **or** operator. The following code shows how to create and pattern match on the **TrafficLight** type:

msgTrafficLight colour =

case colour of

Red -> "you should stop"

Yellow -> "rush before it turns red"

Green -> "you can pass"

Before we get into algebraic data types, let's take a look at *product types*, which are necessary for understanding ADTs.

Product types are your normal pairs and tuples, e.g., **(1, "Hello")** in Haskell with type **(Int, String)**, where both types, **Int** and **String**, are expected. Another way to define your product types is using **record** types, which basically are tuples with named labels. For instance, if we define the previous tuple as:

`data Person = {age :: Int, name :: String}`

creating a person is as simple as **Person 31 "Kiko Fernandez"** or using record syntax **Person {name = "Kiko Fernandez", age = 31}**.

If we generalize the idea of products inside tagged unions, then we can represent data values with a disjointed number of arguments and/or types in each branch of the sum type. The example below redefines **TrafficLight** with additional information, where two data values use the same type, a **String**, and the remaining data value uses the type constructor **Options**.

data Options = Rush | SlowDown

data TrafficLight = Red "Stop" | Green "Drive" | Yellow Options

If we further expand this generalization by adding type variables to type constructors, then we have defined the idea of algebraic data types! ADTs are important because they allow programmers to define composite types based on products and sum types.

One more final example: Our friend the *Option* type (known as *Maybe* type for Haskellers) can be encoded using plain ADTs:

data Maybe t = Nothing | Just t

**Maybe t** is a type constructor with a type variable, **t**, while **Nothing** and **Just t** are a tagged union where **Just t** uses the type variable from the type constructor.

I hope that this clarifies the definition and the usage of union, tagged unions, products, and algebraic data types. If you have questions, please post them in the comments.

## Comments are closed.