Basics 04 - Functions (Part 1)

Functions - The heart and soul of any functional programming language.

What's so special about functions? That's the question I asked myself when I started exploring functional programming languages. I came from a C/C++/C# background. For me, functions were not so special. Functions were where I placed the logic, passed some parameters, and got some output. Functions helped me break down bigger "things" into smaller "things." Some extreme cases involved function pointers in C, or delegates in C# for callbacks/hooks. That is it...

Real or Impure World and Pure World

Before we begin discussing functions and what's so special about functions and functional programming languages, let's establish a contract, a thought process to break down applications (full-blown projects or modules or anything sizeable) into two parts: Real or Impure World and Pure World.

Real/Impure World deals with:

  1. Databases and file system

  2. Queues and event stream

  3. User input/output

  4. Network, integrations and APIs etc.

  5. Essentially anything which has "uncertainty"

Pure world deals with only one thing:

  1. The business logic, the core purpose of the application/module

  2. Essentially anything which has "certainty"

Let's take a simple example. You need to write an application that does the following:

  1. Start

  2. Read a JSON file from the filesystem

  3. Parse JSON

  4. Perform validations

  5. Apply transformations

  6. Create new JSON

  7. Write new JSON to the filesystem

  8. Log (to the filesystem) success/failure

  9. Stop

Now let's break down the application scope into two worlds:

  1. Real/Impure World: Reading/Writing JSON, Logging

  2. Pure World: JSON parsing (deserialization), JSON creation (serialization), Validations, and Transformations

Once you have the breakdown:

  1. Implement the real/impure world

  2. Implement the pure world

  3. Figure out a way to stitch the two together
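The breakdown above can be sketched in F# roughly as follows. This is only an illustration under stated assumptions: the names processJson and run are hypothetical, and the body of processJson is a deliberately trivial stand-in for real parsing, validation, and transformation. What matters is the shape: the pure part takes text and returns a result, and only the outer function touches the filesystem.

```fsharp
open System
open System.IO

// Pure world (hypothetical stand-in): text in, Result out.
// No files, no clock, no logging. A real implementation would parse,
// validate, transform, and serialize here.
let processJson (input: string) : Result<string, string> =
    let trimmed = input.Trim()
    if trimmed.StartsWith("{", StringComparison.Ordinal)
       && trimmed.EndsWith("}", StringComparison.Ordinal) then
        Ok trimmed
    else
        Error "input does not look like a JSON object"

// Real/impure world: reading, writing, and logging all live here.
let run (inputPath: string) (outputPath: string) (logPath: string) : unit =
    let input = File.ReadAllText inputPath
    match processJson input with
    | Ok output ->
        File.WriteAllText(outputPath, output)
        File.AppendAllText(logPath, "success\n")
    | Error reason ->
        File.AppendAllText(logPath, sprintf "failure: %s\n" reason)
```

Because processJson never touches a file, it can be exercised with plain strings, while run stays a thin layer that stitches the two worlds together.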

For the remainder of this section, we will focus on the pure world only.

The Pure World - The World of Pure Functions

Let's take some examples of functions written in C#:

public static string MyToUpper(string s)
{
    return s.ToUpper();
}

public static string CombineStringAndCurrentDate(string s)
{
    Log($"CombineStringAndCurrentDate called with {s}.");

    return $"{s}-{DateTime.Now}";
}

Problems with MyToUpper function from the caller's perspective:

  1. What happens if null is passed? Does it crash/throw or return an empty string?

  2. If it returns an empty string in case of failure/crash, how does the caller differentiate between empty input vs. crash case?

  3. Is this function internally calling ToUpper or ToUpperInvariant?

Problems with CombineStringAndCurrentDate function from the caller's perspective:

  1. Nearly all the problems with MyToUpper function, the uncertainty/ambiguity

  2. Is the current date prefixed or postfixed?

  3. What format of DateTime (to string) is used?

  4. Why is this function logging? Who asked for it?

Now let's define what a pure function is. Remember that we are in the pure world, where we implement the business logic, the real purpose of the application. Some of the ideas presented here may seem unrealistic to you, as they did to me. Be patient and keep reading; slowly, everything should make sense.

  1. A pure function should not deal with null or void; you should never think of these terms in the pure world; absence of data has a better representation

  2. A pure function should not deal with exception handling; there are no thrown/caught exceptions; similar to null/void for the absence of data, failures have a better representation

  3. A pure function should not deal with Global State; it operates purely locally; that is, it operates only on the parameters that are supplied to it (see below for more details on Global State)

  4. Similar to point no. 3, a pure function has no Side Effects (see below for more details on Side Effects)

  5. If you combine points no. 3 and 4, a pure function does exactly what its scope/purpose calls for, nothing more and nothing less

  6. In continuation to point no. 5, a pure function, ideally, does just one thing

  7. And lastly, given the same input, a pure function should ALWAYS produce the same output
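To make points 1 and 2 a little more concrete, here is a minimal F# sketch. The functions tryLongest and parseAge are hypothetical examples, not part of any library. It assumes that "a better representation" for absence of data is the Option type and for failures is the Result type, which is the common choice in F#.

```fsharp
open System
open System.Globalization

// Absence of data: Option instead of null.
// An empty list has no longest word, so the answer is None.
let tryLongest (words: string list) : string option =
    match words with
    | [] -> None
    | _ -> Some (List.maxBy String.length words)

// Failure: Result instead of a thrown exception.
// The invariant culture keeps the output independent of machine settings.
let parseAge (text: string) : Result<int, string> =
    match Int32.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture) with
    | true, age when age >= 0 -> Ok age
    | true, _ -> Error "age cannot be negative"
    | false, _ -> Error "age is not a number"
```

The caller can see every possible outcome in the signature: tryLongest [] is None, parseAge "42" is Ok 42, and parseAge "abc" is Error "age is not a number". Nothing is thrown and nothing is null.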

At this point, you may wonder what meaningful piece of code can be written if these guidelines are followed. Yes, I wondered the same. I couldn't wrap my imperative and OOP mind around these ideas. As I said earlier, be patient and keep reading; slowly, everything should make sense.

Let's discuss a few more areas and then we should get back to pure functions.

Global State and Side Effects

These are two terms you will often encounter in the world of functional programming, or wherever functional ideas are borrowed in the imperative world (React, for example). There are all sorts of twisted explanations available for global state and side effects. Let's take a simple view. Simple views are helpful.

The Global State is anything that's outside the scope of a function.

Typical examples are:

  1. Some global (application/module level) look-up table (dictionary/map etc.) or list or queue is read/written from functions

  2. Reading system date/time

  3. Filesystem and console I/O (reading/writing files, console printing, reading user input from the console, logging etc.)

  4. The list goes on...

All of these are examples of a function going (for reading or writing) outside the boundary of its parameters. This behavior causes many problems. The biggest challenge is synchronization. If a function is reading the global state, you have to ensure it's not reading stale or out-of-sync values. If a function is writing/changing the global state, you have to ensure that the change is broadcast to others. If broadcasting is not possible (it's not the easiest thing to implement), you are forced to use locks for synchronization.

Side effects, simply put, are what functions cause when they write/change the global state. If a function is not writing/changing the global state, it cannot have a side effect. As simple as that.

Rule to remember: a function should perform only within the boundary of its parameters.
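Here is a small F# sketch of that rule. All names (taxRate, priceWithTaxImpure, raiseTaxRate, priceWithTax) are made up for illustration. The first two functions reach outside their parameters, one to read the global state and one to change it; the last one stays within its boundary.

```fsharp
// Global state: a mutable cell living outside every function.
let taxRate = ref 0.20m

// Reads the global state: the result depends on something that is not a parameter.
let priceWithTaxImpure (price: decimal) : decimal =
    price * (1.0m + taxRate.Value)

// Writes the global state: this is a side effect.
let raiseTaxRate (delta: decimal) : unit =
    taxRate.Value <- taxRate.Value + delta

// Pure: everything the function needs arrives through its parameters.
let priceWithTax (rate: decimal) (price: decimal) : decimal =
    price * (1.0m + rate)

let before = priceWithTaxImpure 100m   // 120.00
raiseTaxRate 0.05m
let after = priceWithTaxImpure 100m    // 125.00: same input, different output
```

The call priceWithTax 0.20m 100m gives 120.00 no matter how many times raiseTaxRate has been called, because the rate is the caller's problem.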

Inversion of Control (IoC)

This is a very old idea, sadly, now marketed as "Dependency Injection." Buried under marketing, a plethora of libraries, and fancy documentation, the real philosophy is lost.

Let's understand what inversion of control means with the help of this example written in C#:

static void PrintTodayDateTime()
{
    Console.WriteLine($"Current date and time is {DateTime.Now}.");
}

Here's a function that prints the current date and time. Everything works fine. Time passes by, more and more requirements come and you realize the following:

  1. This function prints the message in a fixed format, there is no way to change it. If you change the implementation of this function, what about all the existing functionality that depends on the current implementation of it?

  2. This function converts DateTime to string in one way only (DateTime.Now.ToString() actually). If we add a parameter to control the conversion of DateTime to string, the existing code will break.

  3. This function only prints the current date and time. Now there is a new demand to print past/present/future DateTime values. You will have to create new functions for newly discovered use cases or change the existing function and break existing code.

The answer to all these problems is to let a function assume as little as possible, also known as inversion of control. Here is the same function written with all previously mentioned problems fixed:

static void PrintDateTime(string format, DateTime dateTime, string dtFormat)
{
    Console.WriteLine(string.Format(format, dateTime.ToString(dtFormat)));
}

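// Different callers
// "F" is the full date/time pattern (long date plus long time)
PrintDateTime("Current date and time {0}.", DateTime.Now, "F");
// "U" is the universal full date/time pattern (the value is converted to UTC)
PrintDateTime("{0} is the current date and time.", DateTime.Now, "U");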

Inversion of control simply means that the caller of a function has control. The called function simply operates within the boundary set by the caller. Inversion of control allows you to write reusable, generalized functions, which are called with the right parameters, and those parameters are the problem of the caller.
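The same idea can be pushed one step further in F#. The sketch below is one possible variation, not the definitive way to do it: describeDateTime and launch are hypothetical names, and the function returns the text instead of printing it, so even the decision about where the output goes is left to the caller. The wording is supplied as a function, and the invariant culture is an assumption made to keep the output independent of machine settings.

```fsharp
open System
open System.Globalization

// The caller decides the wording (render), the date format (dtFormat),
// and the moment in time (moment). Nothing is read from the clock or
// written to the console inside the function.
let describeDateTime (render: string -> string) (dtFormat: string) (moment: DateTime) : string =
    render (moment.ToString(dtFormat, CultureInfo.InvariantCulture))

// Different callers, each bringing its own assumptions
let launch = DateTime(2024, 1, 15, 10, 30, 0)
let a = describeDateTime (sprintf "Launch is scheduled for %s.") "yyyy-MM-dd HH:mm" launch
let b = describeDateTime (sprintf "%s is launch day.") "dd MMM yyyy" launch
// a = "Launch is scheduled for 2024-01-15 10:30."
// b = "15 Jan 2024 is launch day."
```

A caller that does want the current time on the console can still get it by passing DateTime.Now and printing the returned string, but that choice now lives in the real/impure world.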

Pure Functions are Look-up Tables

Imagine a pure function written in F#:

let add2Nums x y = x + y

Here are the results of invoking add2Nums with various values of x and y:

X      Y      Result
1      1      2
1      2      3
2      4      6
8      10     18
100    1000   1100

Now imagine a Dictionary declared in F#:

open System.Collections.Generic

let dictionary = Dictionary<(int * int), int>()

The key in this dictionary is a tuple (int * int), and the value is an int. Imagine you have populated the dictionary like this:

dictionary.Add((1, 0), 1)

dictionary.Add((1, 1), 2)

dictionary.Add((1, 2), 3)

For the key, you pass 2 numbers, and for the value, you pass the sum of those 2 numbers. Now imagine this dictionary populated with every possible pair of numbers (infinitely many entries): you could look up the sum of any 2 numbers.

Now what's the difference between invoking add2Nums for x and y, and looking up the dictionary for (x, y)? In theory, there is no difference. You should get the same result.

While creating such lookup tables makes no sense, this example should help you understand a key idea: calling a pure function is the same as performing a lookup.
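An infinite table cannot be built, but a finite slice of it can. The following sketch assumes a small range of 0 to 10 for both numbers and uses an immutable Map instead of a Dictionary; sumTable, viaLookup, viaFunction, and allAgree are illustrative names.

```fsharp
let add2Nums x y = x + y

// A finite slice of the "infinite" table: every pair from (0, 0) to (10, 10).
let sumTable : Map<int * int, int> =
    [ for x in 0 .. 10 do
        for y in 0 .. 10 do
            yield ((x, y), x + y) ]
    |> Map.ofList

let viaLookup = Map.find (2, 4) sumTable   // 6
let viaFunction = add2Nums 2 4             // 6

// For every key in the table, the lookup and the call agree.
let allAgree =
    sumTable |> Map.forall (fun (x, y) sum -> add2Nums x y = sum)
```

This equivalence only holds because add2Nums is pure. A function that reads the clock or a global value could not be replaced by a table, since the same key would need different values at different times.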

The Beauty of Pure Functions

Pure functions, when combined with techniques like IoC, partial application, chaining, and pipelining, allow you to:

  1. Write reusable, generalized functions

  2. Write functions that have no ambiguity, no uncertainty

  3. Write composable functions, that is, compose bigger functionality by stacking smaller pieces of functions
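As a small preview of point 3, here is a sketch of stacking tiny pure functions into a bigger one using F#'s pipeline (|>) and composition (>>) operators. The functions trim, toUpper, exclaim, and shout are made-up examples.

```fsharp
// Three tiny pure functions, each doing just one thing.
let trim (s: string) : string = s.Trim()
let toUpper (s: string) : string = s.ToUpperInvariant()
let exclaim (s: string) : string = s + "!"

// Composition: build a bigger function out of the smaller ones.
let shout : string -> string = trim >> toUpper >> exclaim

// Pipelining: push a value through the same steps one by one.
let viaPipeline = "  hello " |> trim |> toUpper |> exclaim   // "HELLO!"
let viaComposition = shout "  hello "                        // "HELLO!"
```

Because none of the pieces touches anything outside its parameters, the combined function is just as predictable as each piece on its own.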

Conclusion

This was just an introduction to pure functions. Now we need to focus on the techniques: IoC, partial application, chaining, pipelining etc. Once we discuss these topics, we will revisit the Pure World and how to compose functions.

If you have reached this far, congratulations.

Keep reading!