Karol Kuczmarski’s Blog – Rust as a gateway drug to Haskell

0
2


For work-related reasons,
I had to recently get up to speed on programming in Haskell.

Before that, I had very little actual experience with the language,
clocking probably at less than a thousand lines of working code over a couple of years.
Nothing impressive either:
some wrapper script here,
some experimental rewrite there…

These days, I heard, there are a few resources for learning Haskell
that don’t require having a PhD in category theory.
They may be quite helpful when your exposure to the functional programming is limited.
In my case, however, the one thing that really enabled me to become (somewhat) productive
was not even related to Haskell at all.

It was Rust.

In theory, this shouldn’t really make much of a sense.
If you compare both languages by putting checkmarks in a feature chart,
you won’t find them to have much in common.

Some of the obvious differences include:

  • predominantly functional vs. mostly imperative
  • garbage collection vs. explicit memory management
  • lazy vs. eager evaluation
  • rich runtime vs. almost no runtime
  • global vs. localized type inference
  • indentation vs. braces
  • two decades (!) vs. barely two years since release

Setting aside syntax, most of those differences are pretty significant.

You probably wouldn’t use Haskell for embedded programming, for instance,
both for performance (GC) and memory usage reasons (laziness).
Similarly, Rust’s ownership system can be too much of a hassle for high level code
that isn’t subject to real time requirements.

But if you look a little deeper,
beyond just the surface descriptions of both languages,
you can find plenty of concepts they share.

Traits: they are typeclasses, essentially

Take Haskell’s typeclasses, for example —
the cornerstone of its rich and expressive type system.

A typeclass is, simply speaking,
a list of capabilities:
it defines what a type can do.
There exist analogs of typeclasses in most programming languages,
but they are normally called interfaces or protocols,
and remain closely tied to the object-oriented paradigm.

Not so in Haskell.

Or in Rust for that matter, where the equivalent concept exists under the name of traits.
What typeclasses and traits have in common is that
they’re used for all kinds of polymorphism in their respective languages.

Generics

For example, let’s consider parametrized types,
sometimes also referred to as templates (C++) or generics (C#).

In many cases, a generic function or type requires its type arguments
to exhibit certain characteristics.
In some languages (like the legacy C++), this is checked only implicitly:
as long as the template type-checks after its expansion, everything is okay:

template <typename T> T min(T a, T b) {
    return a > b ? b : a;
}

struct Foo {};

int main() {
    min(1, 2);  // OK
    min(Foo(), Foo());  // ERROR, no operator `>`
}

More advanced type systems, however, allow to specify the generic constraints explicitly.
This is the case in Rust:

fn min<T: Ord>(a: T, b: T) -> T {
    if a > b { b } else { a }
}

as well as in Haskell:

min :: (Ord a) => a -> a -> a
min a b = if a > b then b else a

In both languages, the notion of a type supporting certain operations (like comparison/ordering)
is represented as its own, first-class concept:
a trait (Rust) or a typeclass (Haskell).
Since the compiler is aware of those constraints,
it can verify that the min function is used correctly even before
it tries to generate code for a specific substitution of T.

Dynamic dispatch

On the other hand, let’s look at runtime polymorphism:
the one that OO languages implement
through abstract base classes and virtual methods.
It’s the tool of choice if you need a container of objects of different types,
which nevertheless all expose the same interface.

To offer it, Rust has trait objects,
and they work pretty much exactly like base class pointers/references from Java, C++, or C#.

// Trait definition
trait Draw {
    fn draw(&self);
}

// Data type implementing the trait
struct Circle { radius: i32 }
impl Draw for Circle {
    fn draw(&self) { /* omitted */ }
}

// Usage
fn draw_all(objects: &Vec<Box<Draw>>) {
    for &obj in objects {
        obj.draw();
    }
}

The Haskell analogue is, in turn, based on typeclasses,
though the specifics can be a little bit trickier:

{-# LANGUAGE ExistentialQuantification #-}

-- Typeclass definition
class Draw a where
    draw :: a -> IO ()

-- Polymorphic wrapper type
data Draw' = forall a. Draw a => Draw' a
instance Draw Draw' where
    draw (Draw' d) = draw d

-- Data types instantiating ("implementing") the typeclass
data Circle = Circle ()
instance Draw Circle where draw = undefined -- omitted
data Square = Square ()
instance Draw Square where draw = undefined -- omitted

-- Usage
drawAll :: (Draw a) => [a] -> IO ()
drawAll ds = mapM_ draw ds

main = do
    let shapes = [Draw' Circle (), Draw' Square ()]
    drawAll shapes

Here, the generic function can use typeclass constraints directly ((Draw a) => ...),
but creating a container of different object types requires a polymorphic wrapper.

Differences

All those similarities do not mean that
Rust traits and Haskell typeclasses are one and the same.
There are, in fact, quite a few differences, owing mostly to the fact that
Haskell’s type system is more expressive:

  • Rust lacks higher kinded types,
    making certain abstractions impossible to encode as traits.
    It is possible, however, to implement a trait for infinitely many types at once
    if the implementation itself is generic
    (like here).

  • When defining a trait in Rust, you can ask implementors to provide some auxiliary,
    associated types
    in addition to just methods.
    A similar mechanism in Haskell is expanded into type families,
    and requires enabling a GHC extension.

  • While typeclasses in Haskell can be implemented for multiple types simultaneously
    via a GHC extension,
    Rust’s take on this feature is to make traits themselves generic (e.g. trait Foo<T>).
    The end result is roughly similar;
    however, the “main implementing type” (one after for in impl ... for ...)
    is still a method receiver (self), just like in OO languages.

  • Rust enforces coherence rules on trait implementations.
    The topic is actually
    rather complicated,
    but the gist is about local (current package) vs. remote (other packages / standard library)
    traits and types.
    Without too much detail, coherence demands that there be a local type or trait
    somewhere in the impl ... for ... construct.
    Haskell doesn’t have this limitation,
    although it is recommended not to take advantage of this.

The M-word

Another area of overlap between Haskell and Rust exists
in the data model utilized by those languages.
Both are taking heavy advantage of algebraic data types (ADT),
including the ability to define both product types (“regular” structs and records)
as well as sum types (tagged unions).

Maybe you’d like Some(T)?

Even more interestingly,
code in both languages makes extensive use of the two most basic ADTs:

  • Option (Rust) or Maybe (Haskell) —
    for denoting a presence or absence of a value
  • Result (Rust) or Either (Haskell) —
    for representing the alternative of “correct” and “erroneous” value

These aren’t just simple datatypes.
They are deeply interwoven into the basic semantics of both languages,
not to mention their standard libraries and community-provided packages.

The Option/Maybe type, for example,
is the alternative to nullable references:
something that’s been
heavily criticized
for making programs prone to unexpected NullReferenceExceptions.
The idea behind both of those types is to make actual values impossible to confuse with nulls
by encoding the potential nullability into the type system:

enum Option<T> { Some(T), None }

data Maybe a = Just a | Nothing

Result and Either, on the other hand,
can be thought as an extension of this idea.
They also represent two possibilities,
but the “wrong” one isn’t just None or Nothing
— it has some more information associated with it:

enum Result<T, E> { Ok(T), Err(E) }

data Either e a = Left e | Right a

This dichotomy between the Ok (or Right) value and the Error value (or the Left one)
makes it a great vehicle for carrying results of functions that can fail.

In Rust, this replaces the traditional error handling mechanisms based on exceptions.
In Haskell, the exceptions are present and sometimes necessary,
but Either is nevertheless the preferred approach to dealing with errors.

What to do?

One thing that Haskell does better is composing those fallible functions
into bigger chunks of logic.

Relatively recently, Rust has added the ? operator
as a replacement for the try! macro.
This is now the preferred way of error propagation,
allowing for a more concise composition of functions that return Results:

/// Read an integer from given file.
fn int_from_file(path: &Path) -> io::Result<i32> {
    let mut file = fs::File::open(path)?;
    let mut s = String::new();
    file.read_to_string(&mut s)?;
    let result = s.parse().map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
    Ok(result)
}

But Haskell had it for much longer,
and it’s something of a hallmark of the language and functional programming in general
— even though it looks thoroughly imperative:

intFromFile :: FilePath -> IO Int
intFromFile path = do
    s <- readFile path
    i <- readIO s
    return i

If you haven’t seen it before, this is of course a monad — the IO monad, to be precise.
While discussing monads in detail is way outside of the scope of this article,
we can definitely notice some analogies with Rust.
The do notation with <- arrows is evidently similar to
how in Rust you’d assign the result of a fallible operation after “unpacking” it with ?.

But of course,
there’s plenty of different monads in Haskell: not just IO,
but also Either, Maybe, Reader, Writer, Cont, STM, and many others.
In Rust (at least as of 1.19), the ? operator only works for Result types,
although there is some talk
about extending it to Option as well.

Eventually, we may see the language adopt some variant of the do notation,
though the motivation for this will most likely come from
asynchronous programming with Futures
rather than plain Results.
General monads, however, require support for higher kinded types
which isn’t coming anytime soon.

A path through Rust?

Now that we’ve discussed those similarities,
the obvious question arises.

Is learning Rust worthwhile
if your ultimate goal is getting proficient at functional programming in general,
or Haskell in particular?

My answer to that is actually pretty straightforward.

If “getting to FP” is your main goal, then Rust will not help you very much.
Functional paradigm isn’t the main idea behind the language —
its shtick is mostly memory safety, and zero-cost abstractions.
While it succeeds somewhat at being “Haskell Lite”,
it really strives to be safer C++.

But if, on the other hand, you regard FP mostly as a curiosity
that seems to be seeping into your favorite imperative language at an increasing rate,
Rust can be a good way to gain familiarity with this peculiar beast.

At the very least, you will learn the functional way of modeling programs,
with lots of smart enums/unions and structs but without inheritance.

And the best part is: you will be so busy
fighting the borrow checker
you won’t even notice when it happens 😉

Kommentieren Sie den Artikel

Please enter your comment!
Please enter your name here