What is functional programming?

By Vladimir Khorikov

The topic of functional programming described here is a part of my Applying Functional Principles in C# Pluralsight course.

In this article, I’ll try to answer the question: what is functional programming?

Functional programming

So, what is functional programming? This term arises quite often and every author writing about it gives their own explanation. I’m no exception. In my opinion, the simplest and at the same time precise definition is the following: functional programming is programming with mathematical functions.

Mathematical functions are not methods in a programming sense. Although we sometimes use the words “method” and “function” as synonyms, they aren’t the same concepts from the functional programming point of view. The best way to think of a mathematical function is as of a pipe that transforms any value we pass it into another value:

Functional programming: Mathematical function

Mathematical function

And that’s basically it. A mathematical function doesn’t leave any marks in the outside world about its existence. It does only one thing: finds a corresponding object for each object we feed it.

In order for a method to become a mathematical function, it must fit two requirements. First of all, it should be referentially transparent. A referentially transparent function always produces the same result as long as you supply it the same arguments. It means that such function should operate only the values we pass in, it shouldn’t refer to the global state.

Here’s an example:

public long TicksElapsedFrom(int year)

{

    DateTime now = DateTime.Now;

    DateTime then = new DateTime(year, 1, 1);

 

    return (now – then).Ticks;

}

This method is not referentially transparent because it returns different results even if we pass it the same year. The reason here is that it refers to the global DatetTime.Now property.

A referentially transparent alternative for this method would be:

public long TicksElapsedFrom(int year, DateTime now)

{

    DateTime then = new DateTime(year, 1, 1);

 

    return (now then).Ticks;

}

This version operates only the parameters passed in.

Secondly, the signature of a mathematical function must convey all information about the possible input values it accepts and the possible outcome it may produce. I call this trait method signature honesty.

Look at this code sample:

public int Divide(int x, int y)

{

    return x / y;

}

The Divide method, despite being referentially transparent, is not a mathematical function. Its signature states that it accepts any two integers and returns another integer. But what happens if we pass it 1 and 0 as input parameters?

Functional programming: Dishonest method

Dishonest method

Instead of returning an integer as we’d expect, it will throw a Divide By Zero exception. It means that the method’s signature doesn’t convey enough information about the result of the operation. It deceives the caller by pretending that it can handle any two parameters of the integer type whereas in practice it has a special case which cannot be processed.

In order to convert the method into a mathematical function, we need to change the type of the “y” parameter, like this:

public static int Divide(int x, NonZeroInteger y)

{

    return x / y.Value;

}

Here, NonZeroInteger is a custom type which can contain any integer except zero. This way, we’ve made the method honest as it now doesn’t behave unexpectedly for any values from its input range. Another option is to change its returning type:

public static int? Divide(int x, int y)

{

    if (y == 0)

        return null;

 

    return x / y;

}

This version is also honest as it now doesn’t guarantee that it will return an integer for any possible combination of input values.

Despite the simplicity of the functional programming definition, it entails a lot of practices that might seem novel to many programmers. Let’s see what they are.

Side effects

The first such practice is avoiding side effects as much as possible by employing immutability all over your code base. This technique is important because the act of mutating state contradicts the functional principles.

The signature of a method with a side effect doesn’t communicate enough information about what the actual result of the operation is, and that hinders our ability to reason about it. In order to validate your assumptions regarding the code you are writing, not only do you have to look at the method’s signature itself, but you also need to fall down to its implementation details and see if this method leaves any side effects that you didn’t expect:

Functional programming: Side effects

Side effects

Overall, code with data structures that change over time is harder to debug and is more error-prone. It brings even more problems in multi-threaded applications where you can have all sorts of nasty race conditions.

When you operate immutable data only, you force yourself to reveal hidden side effects by stating them in the method’s signature and thus making it honest. This makes the code more readable because you don’t have to fall down to the methods’ implementation details in order to reason about the program flow. With immutable classes, you can just look at the signature of a method and get a good understanding of what is going on right away, without too much effort.

Exceptions

Exceptions is another source of dishonesty for your code base. Methods that employ exceptions to control the program flow are not mathematical functions because, just like side effects, exceptions hide the actual outcome of the operation the method performs.

Moreover, exceptions have a goto semantics meaning that they allow you to easily jump from any point of your program to a catch block. In fact, exceptions perform even worse because the goto statement doesn’t allow for jumping outside a particular method, whereas with exceptions you can cross multiple layers in your code base with ease.

Primitive Obsession

While side effects and exceptions make your methods dishonest about their outcomes, primitive obsession deceives the reader about the methods’ input values. Here’s an example:

public class User

{

    public string Email { get; private set; }

 

    public User(string email)

    {

        if (email.Length > 256)

            throw new ArgumentException(“Email is too long”);

 

        if (!email.Contains(“@”))

            throw new ArgumentException(“Email is invalid”);

 

        Email = email;

    }

}

 

public class UserFactory

{

    public User CreateUser(string email)

    {

        return new User(email);

    }

}

What the signature of the CreateUser method tells us? It says that for any input string, it returns a User instance. In practice, however, it accepts only strings that are formatted in a particular way and throws exceptions if they are not. Therefore, this method is dishonest as it doesn’t convey enough information about the kinds of strings it works with.

This is essentially the same problem you saw with the Divide method:

public int Divide(int x, int y)

{

    return x / y;

}

The type of the parameter for email, as well as the type of the parameter for “y”, is more coarse-grained than the actual concept they represent. The number of states an instance of the string type can be in exceeds the number of valid states for a properly formatted email. This disparity results in deception for the developer who uses such a method. It makes the programmer think that the method works with primitive strings whereas, in the reality, this string represents a domain concept with its own invariants.

Just as with the Divide method, the dishonesty can be fixed by introducing a separate Email class and using it in place of the string.

Nulls

Another practice in this list is avoiding nulls. It turns out that the use of nulls makes your code dishonest as the signature of the methods using them doesn’t tell all information about the possible outcome the corresponding operation may have.

In fact, in C#, all reference types act as a container for two kinds of values. One of them is an instance of the declared type, and the other one is null. And there’s no way to distinguish between the two as this functionality is baked into the language itself. You have to always keep in mind that when you declare a variable of a reference type, what you actually do is you essentially declare a variable of a custom two-fold type which can contain either a null reference or an actual instance:

Functional programming: Nulls

Nulls

In some cases, it’s exactly the thing you need but sometimes, you want to just return MyClass, without a possibility for it to turn into null. The problem here is that there’s no way to do that in C#. There’s no way to distinguish between nullable and non-nullable reference types. It means that methods with reference types in their signature are inherently dishonest.

This issue can be fixed by introducing the Maybe type and an agreement across the team that whenever you define a nullable variable, you use the Maybe type to do that.

Why functional programming?

An important question that comes to mind when you read about functional programming is: why bother about it in the first place?

To answer it, we need to refer to the pain points that we as an industry have at the moment. One of the biggest problems that come into play in enterprise software development is complexity. The complexity of the code base we are working on is the single most important factor that affects such things as speed of development, the number of bugs and the ability to quickly adjust to ever-changing market needs.

There is only so much of complexity we can deal with at a time. If the project’s code base overwhelms this limit, it becomes really hard, and at some point even impossible, to change anything in the software without introducing some unexpected side effects.

Applying functional programming principles helps reduce code complexity. It turns out that programming with mathematical functions makes our job a lot less complicated. Because of the two traits they possess – method signature honesty and referential transparency – we are able to understand and reason about such code much easier.

Every method in our code base – if written as a mathematical function – can be considered in isolation from others. When we are sure that our methods don’t affect the global state or blow up with an exception, we can treat them as building blocks and compose in any way we want. This, in turn, unlocks great opportunities for building complex functionality which itself is not much harder to create than the parts it consists of.

With honest method signatures, we don’t have to fall down to the method’s implementation details or refer to the documentation to see if there’s something else we need to consider before using it. The signature itself tells us what can happen after we call such method. Unit testing also becomes much easier. It comes down to a couple of lines where you just provide an input value and check that the result is the one you excepted. No need to create complex test doubles, such as mocks and maintain them later on.

Summary

Functional programming is programming with mathematical functions. Most methods in the C# language can potentially be converted into mathematical functions. To do this, we need to make their signatures honest in a sense that they should fully convey all possible inputs and outcomes and we also need to make sure the method operates only the values we pass in, nothing else.

Practices that help convert C# methods into mathematical functions are:

  • Immutability
  • Avoiding exceptions to control the program flow
  • Getting rid of primitive obsession
  • Making nulls explicit

I go into the topic of functional programming in much more detail in my Pluralsight course Applying Functional Principles in C#, so be sure to check it out.

Related articles:





  • http://sidburn.github.io/ David Raab

    It is a very nice overview!

    The only thing that isn’t quite correct is your definition about “referential transparency”. What you describe is mathematical functions. But “referential transparency” is that you always can replace an expression with its value without that a programs behaviour change. And because of this, nearly all things you mention derive from it.

    For example immutability. If you write “x = 5” it means x is 5 and you can replace all “x” in your program with “5” without that the behaviour of your program changes. But if “x” is mutable then this is not possible at all, because “x” at some place could be something different as “5”.

    Throwing exceptions is the same. If you have a function call “let x = DivideByInt(1, 0)” it means once again you can calculate DivideByInt(1,0) only once, save it in x and also replace all “x” with the result of it. With exceptions it doesn’t work at all because an exception is not of the same type of an int is. “x” cannot be of two different types.

    In the same sense that is also the problem of “null”. “null” itself is also not a concrete type, it is just compatible with every type.

    The same is true for side-effects. If you always can replace a function call with is return value, then side-effects breaks referential transparency. if you have a logging statement in a function and it returns a value. You cannot just take the return value and replace every function call with the same arguments with it. Because doing that means your behaviour of your program changes (you see less logging statements).

    Referential transparency leads to mathematical functions, if you always can replace the output of a function with the function call, then those functions also must be mathematical function.

    And all of it leads to laziness. If everything must be immutable, everything can be replaced by its output, nothing can contain side-effects, then it also means you can defer everything to the point it is needed, and only calculate everything once.

    Things like avoiding exceptions, avoiding side-effects, methematical functions or embracing immutability are not “best practices” that are done because they make something better (well it makes it better, but this is not the point). All of those things derives from the restriction of “referential transparency”.

    • http://enterprisecraftsmanship.com/ Vladimir Khorikov

      Referential transparency in on itself doesn’t lead to all this practices. For example, a referentially transparent function can still return a null value or work with primitive types instead of Value Objects. Mathematical functions can’t be defined by this attribute along.

      • http://sidburn.github.io/ David Raab

        Theoretically i don’t see a difference. “null” is a valid value for every type. That is why returning “null” is also valid in a mathematical function. “null” sure has a lot of problems, but it is not “null” itself that creates the problem, it is more that “null” is implicitly a valid value for every reference type and you are not forced to check for it. If you eliminate those both things, then null would also not be a problem at all.

        And by that, an option type is basically the same. An option type like “option” means, it can return “int” or “None”. It is the same thing with “null”. An Option type is only better because it makes the “null”, “None” part explicit and forces you to check for it. But a mathematical function returning an “Option” still basically can return two different things, like a function that can return additional a “null”.

        As for the second, i don’t know what you mean by “Primitive Type”. I know “Primitive Type” from Java and they mean, int, float, byte and so on. In .NET the closest is a “Value Type”. But all Value Types are basically “Value Objects” by default. So probably you mean something different here?