status

Why did IOG choose Haskell to build Cardano?

Published 25.1.2023

The IOG team carefully decided which programming language and tools to use for Cardano development. In the end, they chose Haskell. The team must be able to formally verify that everything written in the white papers will correspond correctly with the source code. Come read about how the team transforms the mathematical proofs from the white papers into Haskell source code.

TLDR

  • Before Cardano, no project from the top 10 used formal methods.
  • The white paper is the basis for the formal specification. It is the basis for the source code written in Haskell.
  • The formal specification and Haskell source code have very similar notations.

From white paper to source code

Many teams in the blockchain industry have a "move fast and break things" startup mentality. This mentality is driven by youth, greed, and passion. The source code of many protocols is often written in haste and without prior research. In some cases, we can talk about the irresponsible approach of teams toward users. The antithesis is a slow, methodical, and academic-oriented approach motivated by a desire to consolidate innovation in our space. At the time the team began developing Cardano none of the top ten cryptocurrencies by market capitalization were based on a peer-reviewed protocol and were implemented based on a formal specification. The IOG team decided to use formal methods to develop Cardano.

The IOG team defines and mathematically proves the correctness of the behavior of the protocol and other components in white papers. These are then used to create a formal specification. Formal specification is a precise, mathematical description of the system's behavior and properties.

In white papers, mathematical notation is used to express abstract concepts and mathematical proofs. It is precise but can be difficult for non-experts to understand. Formal specification notation is used to express the behavior and properties of a software system in a precise, mathematical way. It is less abstract than mathematical notation but more precise than source code. In other words, the programmer may not understand the mathematical notation used in white papers but is able to understand the formal specification.

After the formal specification is created, the next step is the implementation of the source code in the programming language Haskell. It is expected that the implementation follows the specifications and the properties described in the formal specification. The implementation should be done in such a way that it's possible to prove mathematically that the implementation meets the requirements and properties described in the formal specification. In other words, the source code can be formally verified based on the specification.

Haskell source code is used to express the implementation of Cardano. It is the most concrete form of expected behavior. It is the form in which the system is executed. Haskell source code is less precise than formal specification notation, but it is more easily understood by Haskell developers.

While there is a relationship between mathematical notation, formal specification notation, and Haskell source code, it is not always easy to switch between them. The level of abstraction and expressiveness of each is different since different notations are used. The goal is to maximize the transfer of information between different notations.

Scientists and mathematicians don't have to be Haskell programmers. Their job is to write a white paper. Haskell programmers don't have to be top mathematicians. The formal specification is a kind of mediator between the two groups. There must be someone who is able to create a formal specification from a white paper. However, it's possible to use tools that can automatically translate between mathematical notation, formal specification notation, and source code, such as dependently typed programming languages like Coq or Agda.

Cog and Agda are both formal verification tools that can be used in the context of functional programming languages like Haskell. They are both proof assistants, which are computer programs that help users write formal mathematical proofs.

Both Cog and Agda are based on the idea of using formal methods to enhance the reliability and correctness of software systems. While Cog is more specific to concurrent algorithms, Agda is more general and can be used for any kind of functional programming. They are both powerful tools for ensuring the correctness of software systems and can be used together to provide a comprehensive verification solution.

Why Haskell

Haskell is a functional programming language that is well-suited for the use of formal methods during software development.

One of the main advantages of Haskell is its strong type system. Haskell has a static type system which means that types are checked at compile-time, providing a high level of safety and preventing many types of bugs before the program is even run. Additionally, Haskell supports type inference, which means that the programmer does not need to explicitly specify the types of variables and functions, making the code more concise and readable.

In Haskell, variables are immutable by default, which means that once a variable is assigned a value, it cannot be changed. This makes it easier to reason about the program's behavior and can simplify concurrent and parallel programming. Immutability also helps to prevent bugs caused by unexpected changes to variables, as well as making code more testable.

Haskell also has a strong focus on functional programming, which encourages a declarative and expressive style of programming. In functional programming, the focus is on describing what the program should do, rather than how it should do it. This makes the code more readable, and maintainable and can simplify debugging and testing.

Haskell's strong type system and mathematical style of programming make it a natural fit for formal methods. The type system in Haskell is very expressive, allowing for a high level of static checking of the code. This means that many common programming errors can be caught at compile-time, rather than at runtime. Additionally, the functional programming paradigm encourages a mathematical style of thinking, which helps to make the code more predictable and easier to reason about. It can be especially beneficial for complex and large Cardano code bases.

Haskell's garbage collection and strict evaluation ensure that the program is protected from runtime errors such as null pointer dereferences, buffer overflows, and data race. This makes it more suitable for safety-critical systems.

Haskell's concurrency and parallelism model is based on lightweight threads and message passing makes it easy to write code that can take advantage of multiple cores and processors. This is important for the Cardano project, which is a blockchain platform that needs to handle a large number of transactions simultaneously.

Haskell's high-level abstractions, such as monads, make it easier to write code that is modular, composable, and reusable. This can help to reduce the complexity of the codebase and make it easier to reason about and prove the properties of the code.

Factorial function

Let's look at a simple example of how a mathematical notation can be transformed into source code.

A complete mathematical proof of the factorial function would look something like this:

See the formal specification in Agda for the factorial function:

Look at the Coq notation of the factorial function:

See how the Haskell source code of the factorial function resembles the Agda and Coq notation:

For comparison, see how the implementation of the factorial function looks like in the traditional C++ language:

The programmer needs to understand the formal specification to be able to write the corresponding source code in Haskell. Of course, a programmer can also write a factorial function in C++, but notice that the notation is less similar to the formal specification. The code in both cases does exactly what the specification says it should do, which is the most important thing of all.

However, it is more natural for a programmer to compare Haskell notation and formal specifications than if he had to do the same with C++ (or another similar language). The difference would be more noticeable for more complex functions.

Conclusion

The IOG team had many reasons for choosing Haskell. It is not only the language itself that is important but also its suitability for solving a specific problem, the number of tools available, and the number of quality developers. It is not very common for a regular programmer to have experience with formal methods. In the case of Haskell programmers, the frequency is higher. Moreover, it is uncommon to find experienced Haskell developers without detailed knowledge of computer science.

In the context of the need to use formal methods for Cardano development, Haskell was one of the natural choices. Cardano is being built as a mission-critical project, and Haskell has in its DNA features that help achieve this goal.

Featured:

Related articles

Did you enjoy this article? Other great articles by the same author