Monday, October 5, 2015

What the Heck Is a Monad

Soroush Khanlou:

This is the first important part of a monad. You have to have a way to create one. In this case, the constructor, Maybe.Something, fills that role. In other languages, this is known as unit or the inconveniently-named function return. It’s a function that takes one parameter, and returns a monad that wraps that parameter.


It’s important that the block returns an already-wrapped monad, so that we can chain these calls. This is a big part of why monads are useful.


Functional programmers took a great name like ifSomething and made it totally inscrutable by calling it flatMap. (In some of the literature, it’s also known as bind. In Haskell, aka peak inscrutability, it’s invoked with the operator >>=.)


To build map, we wrap the result of the map block with the constructor and send that to flatMap:


For something to be a monad, in addition to implementing bind and unit, it has to follow some special rules.
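Putting the quoted pieces together, a minimal Swift sketch of such a Maybe type might look like this (illustrative only, using modern lowercase case names rather than the article's Swift 2–era `Maybe.Something`):

```swift
enum Maybe<T> {
    case something(T)   // "unit": the constructor that wraps a plain value
    case nothing

    // "bind" / flatMap: unwrap the value, apply f, and return the
    // already-wrapped result f produced, so calls can be chained.
    func flatMap<U>(_ f: (T) -> Maybe<U>) -> Maybe<U> {
        switch self {
        case .something(let value): return f(value)
        case .nothing: return .nothing
        }
    }

    // map is derived from flatMap: wrap f's plain result with the
    // constructor, then hand it to flatMap.
    func map<U>(_ f: (T) -> U) -> Maybe<U> {
        return flatMap { .something(f($0)) }
    }
}

// The "special rules" are the monad laws, stated with these names:
// 1. Left identity:  Maybe.something(x).flatMap(f)  behaves like  f(x)
// 2. Right identity: m.flatMap(Maybe.something)     behaves like  m
// 3. Associativity:  m.flatMap(f).flatMap(g)        behaves like
//                    m.flatMap { f($0).flatMap(g) }
```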

Previously: Functor and Monad in Swift, Higher Order Functions in Swift 2.

Update (2015-10-07): Jeremy W. Sherman:

Re: monads, I point people at this article when they want to tackle the hilariously compact yet accurate “monoid in the category of endofunctors” definition. Unpacks the jargon, but does use Haskell syntax.


"I'm probably going to fail"

Well, he got that bit right.

"[Monads] wrap a value, like a shell."

No, monads wrap behaviors. They were originally added to Haskell in order to encapsulate IO (so that it wouldn't pollute the user's entire program with side-effects, as those destroy a functional system's ability to decide the order of operations for itself), and subsequently found to be good for encapsulating and composing a wide range of other non-functional behaviors as well.
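The "wraps behaviors" point can be sketched in Swift with a hypothetical `Effect` type (all names here are invented for illustration, not anyone's real API): the wrapper holds a deferred action, flatMap composes descriptions of work, and no side effect actually fires until the whole thing is run at the edge.

```swift
// A toy IO-style wrapper: the monad wraps a *behavior* (a deferred
// action), not merely a value.
struct Effect<A> {
    let run: () -> A

    // unit: lift a plain value into an effect that does nothing but return it
    static func unit(_ a: A) -> Effect<A> {
        return Effect { a }
    }

    // bind: build a bigger description of work; nothing executes yet
    func flatMap<B>(_ f: @escaping (A) -> Effect<B>) -> Effect<B> {
        return Effect<B> { f(self.run()).run() }
    }
}

var sideEffects: [String] = []

let readName = Effect<String> {
    sideEffects.append("performed IO")   // the contained impure behavior
    return "world"
}

// Composing effects stays pure: sideEffects is still empty here.
let program = readName.flatMap { name in Effect.unit("hello, \(name)") }
```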

Haskell's Maybe monad is merely the simplest, and hardly the most powerful or interesting, example of how Haskell makes use of monads. Its main value in such discussions is as a demonstration of how a good, reusable, general-purpose idea can generate convenient solutions to a wide range of different problems - in this case, the ability to chain the application of operations upon a value that may or may not exist, avoiding the need for multiply-nested case blocks. But this only makes sense in a language which doesn't already have a simple, established (typically built-in) way to achieve the same result. Haskell is such a language. Swift, however, is not.

The problem with the linked article is that the author immediately sets up the entire discussion with a massive misconception: that Swift Optionals are monads. They're not; they're just a simple sum type (aka tagged union), backed by some special-case behaviors hacked into the Swift compiler to make them behave a bit differently from every other sum type in Swift. Thus the discussion fails before it's even started.

Even without such a fundamental boob, using an imperative language like Swift as the foundation upon which to explain what monads are, why on Earth you'd want to use them, and how they actually work, is just a rotten approach, period:

1. Swift is an imperative language, so imposes none of the functional restrictions that require the invention of monads in the first place. All that FP-izing Swift does is "solve" a problem that doesn't actually exist, while creating even more complexity for Swift users to wade through as a side effect.

2. No matter how much FP-izing of Swift is done, users will never gain any of the significant benefits that FP automatically provides, such as fewer bugs thanks to the compiler's ability to reason mathematically about code, and free runtime performance improvements due to implicit laziness/memoization/parallelization. FP's restrictions on state/side-effects/sequential operations are an all-or-nothing deal: as soon as you introduce explicit state or ordering, you lose the automatic benefits.

Thus, for all the effort expended, the user is left with only one significant question: So what was the point of all that? And the only answer, as far as Swift is concerned, is: Absolutely none.

I can understand not wanting to use Haskell for worked examples as the syntax is obtuse and its size intimidating, but in that case why not just invent a simple pseudo-language specifically for the purpose? At least then the problem and solution are a clear fit - something they certainly aren't after being bludgeoned unrecognizably into a Swift-shaped mold.


In other words, monads will only make sense to folk coming to them from the Haskell/ML side, not from C/Java/Python/Swift/etc, because the problems they solve and the benefits they preserve simply don't exist in the latter. A programmer who only knows imperative languages has absolutely no idea what they're missing, nor even that they're missing anything. It's like explaining the color green to a perfectly blind man.

Frankly, the only way in which "applying" fundamental FP concepts to an imperative language will succeed is in generating massively broken misconceptions amongst imperative programmers as to what functional/declarative programming is all about. (Protip: programming with first-class functions != Functional Programming.) Those imperative programmers then merrily propagate it to others until everyone's understanding is completely wrong and no-one even realizes it - making it even worse than useless[1].


Furthermore, Swift already has its own clearly-defined, well-established idioms; for example, when working with Optionals, you typically use guard statements instead of nested if...else statements. Or, you define your behaviors as methods on your object's type, then chain your method calls using ? to skip over the remainder should one of them return nil. Or you just have your functions throw an error to indicate the 'no result available' condition and use a do...catch block to pick up at the end.
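For illustration, the idioms listed above might look something like this, using a hypothetical `User` type (all names invented for the example):

```swift
final class User {
    let name: String
    let manager: User?
    init(name: String, manager: User? = nil) {
        self.name = name
        self.manager = manager
    }
}

enum LookupError: Error { case notFound }

// guard instead of nested if...else: bail out early on nil
func greet(_ user: User?) -> String {
    guard let user = user else { return "nobody here" }
    return "hello, \(user.name)"
}

// optional chaining: ? skips the rest of the chain if anything is nil
func managersName(of user: User?) -> String? {
    return user?.manager?.name
}

// throwing instead of returning nil for the 'no result' condition,
// handled by the caller with do...catch (or try?)
func findUser(named name: String, in users: [User]) throws -> User {
    guard let match = users.first(where: { $0.name == name }) else {
        throw LookupError.notFound
    }
    return match
}
```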

This is not just a technical issue, it's about basic effective human-to-human communication too. Swift code that follows standard Swift idioms is instantly familiar to - and readable by - any Swift developer, whereas "clever" code that invents its own ingeniously non-standard NotReallyAMonadOptional constructs is going to be a giant PITA for whatever poor devil eventually ends up maintaining it[2]. Not to mention that interoperability between that code and everyone else's code that follows the standard idioms (or other equally "clever" non-standards) will be lousy.

The interop issue alone is sufficient grounds to take all these "FP in Swift" blogs, articles, and even books, and utterly bury them (preferably in the heart of the furthermost star), while the rest is just downright Lovecraftian in its unspeakableness.


That all said, I do believe there is value to be had in trying to cultivate mainstream programmers' interest in non-mainstream tools and idioms.

Something I really wish the FPers would do is create a really clean, simple, minimalist FP language that can be quickly taught or learned by programmers before they disappear completely down the OO rabbit hole. Not so they can notch up another "mastered X/Y/Z features today" on their toolbelt, but so they can learn to think and reason about describing and solving problems in more than just one rote "hardcoded" (imperative procedural OO) way. FP is about saying what you want, not how to do it.

Ditto the Lispers, who really need to get over their parentheses fetish, which only serves to obscure the real take-home message that algebraic computation (C/Java/Swift/Python/etc) is only one possible way in which to derive an answer, and that symbolic computation opens up whole new possibilities in problem-solving (not to mention meta-problem-solving, and meta-meta-problem-solving... and so on:).

In a way, the Haskellites and Lispers are even more culpable than the mainstream programmers. While the latter can at least fall back on the defence that they don't know what they don't know, the FPers and MPers have absolutely no such excuse. It's just plain old miserable communication skills on their part, and given how much they love those tools it's astonishing how little they do to ensure others can understand and employ them effectively too.


[1] This blind-leading-the-blind phenomenon has already occurred with the WWW, for example - look at the popularity of the oxymoronic phrase "REST API", and realize that every single web programmer who uses it has an understanding so utterly broken that it's not even wrong. It is frightening just how utterly blind, and even blissfully unaware, the entire programming culture and industry is to its own misunderstandings and Dunning-Krugerism. And then we wonder why so much software today fails to deliver on its promises or just stinks outright.

[2] This includes the original author if/when he[3] finally grows out of his smartass phase. All male programmers go through this phase after they've mastered the three Vs (Variables, Values, and eValuation), though it's scary how many seem perfectly happy to remain in extended programmer adolescence for years or even decades, building ever more convoluted and worse-than-useless testaments to their own ingenuity (e.g. see every other article ever posted).

[3] Dunno about female programmers (and don't wish to add mansplaining to my already long list of commenting sins), though hopefully they've a bit more freaking sense, not to mention a basic desire to get the job done and go home rather than play with themselves all day.


I am wondering if you have any opinions about the article, Swift, or Functional Programming and if you'd be willing to share any of them?


> FP's restrictions on state/side-effects/sequential operations are an all-or-nothing deal: as soon as you introduce explicit state or ordering, you lose the automatic benefits.

I think I disagree with "all-or-nothing". Fewer side effects are better.

@charles: It's "all or nothing" in that a functional compiler/runtime cannot reason mathematically about a program unless referential transparency is absolutely guaranteed. A functional program does not describe a sequence of operations to perform (as an imperative program does) but rather the relationships between that program's inputs and its output.

A functional machine can then make very powerful decisions on how to evaluate that program. It can defer evaluation of a function unless/until that function's output is actually needed. It can cache (memoize) the result of a time-consuming function so that the next time the same inputs are given it does not need to recalculate that result. It can choose to divide and distribute intensive calculations across multiple CPU cores (or even entire networks) secure in the knowledge that race/deadlock conditions cannot ever occur.

Oh, and even if v1 of a functional compiler/runtime doesn't provide these powerful capabilities, v2 can add them at a future date and your programs are guaranteed to produce exactly the same results regardless of which they run on. As programs become ever more distributed and non-deterministic, and thus harder for humans to reason about efficiently and correctly, automating away such reasoning to the machine is definitely not to be sniffed at.
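Memoization is the easiest of these capabilities to sketch by hand: because a pure function always returns the same output for the same input, a cached result can safely stand in for a recomputation. A minimal hand-rolled version in Swift (a functional runtime could apply the same trick automatically, and invisibly):

```swift
// Wrap a pure function so repeated calls with the same argument
// return the cached result instead of recomputing it.
func memoize<A: Hashable, B>(_ f: @escaping (A) -> B) -> (A) -> B {
    var cache: [A: B] = [:]
    return { a in
        if let cached = cache[a] { return cached }
        let result = f(a)
        cache[a] = result
        return result
    }
}

// Track how often the underlying function actually runs.
var calls = 0
let square = memoize { (n: Int) -> Int in
    calls += 1
    return n * n
}
```

Note this is only safe because the wrapped function is referentially transparent; memoizing a function that reads mutable state would silently serve stale answers, which is exactly why the guarantee has to be absolute before a machine can do this for you.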


However, while math is fantastic for modelling and analysing fixed relationships, it is totally hopeless at modelling time. As soon as you introduce even the possibility that any function within that program could return different outputs for the same input you void that guarantee, and the ability for the machine to reason completely reliably about any part of that program goes down like a domino run.

This is why Haskell has monads: they allow individual, non-functional operations to be performed inside of a much larger, functional program, without allowing those operations' imperative poison to spread beyond the tight confines of the monad wrapper. "What happens in Vegas, stays in Vegas" is the monad's motto (unofficial).


Obviously, Haskell can't reason functionally about the code within monads, and its ability to reason about functions through which monads pass may also be somewhat degraded. Plus you lose some polymorphism, since a function that takes a monadic value as input can't be applied to a non-monadic value (or vice-versa).

Still, there are plenty of use-cases where such losses may nevertheless be outweighed by other benefits: tasks where side-effects are unavoidable (e.g. IO), or where the human still understands her own needs and intents better than the machine can deduce (e.g. don't start calculating B unless and until the result of A is such-and-such), or where you gain more in expressibility than you lose in machine reasoning (e.g. compare the relative sizes of the bothGrandfathers functions here).


Monads are a compromise, but one whose boundaries are absolutely defined and absolutely enforced by Haskell itself. That's not something a language like Swift can ever do, because no matter how much you might informally tighten your code (e.g. using lets instead of vars), there will always be some opening, some hole, some trapdoor, some way in which imperative behavior can get through. Not because Swift is deficient, but because that is exactly what it's designed to do.

The key problem, in other words, does not exist in the language but in its users. Mainstream programmers desperately need to learn how to educate and adapt themselves to the right tool for a job, instead of endlessly trying to make their existing tool into a Swiss army hammer. For a bunch of exceedingly smart, highly motivated people, they are often the dumbest sacks of rocks too. :)

@Jason: Damn, you are really inviting trouble. :p FWIW, I have seriously considered doing my own blog, but I get little enough real work done as it is.

That said, if my rambling comments contain one key take-home message, it's that imperative programmers cannot learn non-imperative concepts simply by adding them on top of their existing imperative mental model. They should not even try doing it themselves, never mind encourage others to do so too.

Look, adopting new knowledge by mapping it onto our existing knowledge is one of those ubiquitous learning tricks we all use more or less automatically. And very often it is a highly efficient time- and labor-saving technique; and programmers do need to absorb a heckuva lot of information, so need all the help they can get. However, such learning shortcuts are only reliable when the new knowledge is just extending what we already know; attempting to apply it to a totally foreign, fundamentally incompatible knowledgebase is a recipe for misconception disaster.

Instead, you have to force yourself to "forget" about everything you already know, and start constructing a brand new mental model completely from scratch. You cannot make any assumptions while doing this, because you cannot know if the results of such shortcuts are appropriate (or even remotely correct) until after you've already learned the new material correctly.

To borrow a phrase: "If you mix cow pie with apple pie, it does not make the cow pie taste better; it makes the apple pie worse." The only way to do a taste test is by eating each one completely separately first, then draw comparisons between them. By which time, of course, you've already realized there are no meaningful comparisons to be made, so will know not to even try. :)

> It's "all or nothing" in that a functional compiler/runtime cannot reason mathematically about a program unless referential transparency is absolutely guaranteed.

OK, fair enough, I fully understand that. So, my question should then be: forgetting about the compiler-side benefits, aren't there benefits for the programmer and the quality of the produced code in adopting some of the design patterns of FP? Couldn't the "monad trick" (whatever that would be!) still be useful even in Swift? Rewriting optionals is probably not the best example in practice, but surely, some resulting code, in some situations, may end up more concise and easier to read? I am genuinely curious about your opinion (I have my own too, but it can be bent :-)

@charles Yes, my opinion is that there are huge benefits to reducing state, even if it can’t be eliminated entirely.

@Michael: Of course there are benefits in reducing statefulness in imperative programs, since it reduces the amount of moving parts and interactions the programmer has to reason about.

This has zero to do with functional programming though, as its purpose is to take the whole job of reasoning about behavior away from the programmer completely, turning it over to the machine to perform automatically. One of the prerequisites for this is that the machine is able to reason mathematically about the code, which in turn is dependent on referential transparency being enforced throughout.

FP automates the often complex and challenging business of deciding exactly what to do and when, just as GC automates the fiddly, tedious business of allocating and freeing memory correctly. You give up low-level control, but gain high-level conciseness, expressiveness, and power.


@Charles: Adding monads to Swift is a complete waste of time because the purpose of monads isn't to reduce statefulness or sequentialness but to add it!

Haskell needs to do this because Haskell itself (i.e. the functional language) doesn't support these things, but without them certain tasks (e.g. IO, exceptions) become impossible or at least really horrible to do. Monads are the anti-Haskell, the gross pus-filled foreign body cysts floating within the pure healthy FP body. They're solely there to keep that disgusting grossness forever contained so that it never spills out to poison everything else.

Swift, being imperative, already has statefulness and sequentialness built into its core and instantly accessible from any and every part of your program. Your entire Swift program is already "inside the monad", so to speak. The only thing adding monad-like structures to Swift does is create a whole new layer of complexity which enables you to do absolutely nothing that you couldn't already do, while increasing the Swift compiler/runtime's ability to reason about your program not one whit.

All it does is add cost for no benefit (other than to the developers' egos, natch; and those things are already far too big and bloated for their own damn good).

@Charles: Regarding adopting "design patterns of FP", I recommend reading physicist Richard Feynman's wonderful "Surely You're Joking, Mr. Feynman!", especially the bit where he coins the phrase "Cargo Cult Science":

In the South Seas there is a cargo cult of people. During the war they saw airplanes land with lots of good materials, and they want the same thing to happen now. So they've arranged to imitate things like runways, to put fires along the sides of the runways, to make a wooden hut for a man to sit in, with two wooden pieces on his head like headphones and bars of bamboo sticking out like antennas—he's the controller—and they wait for the airplanes to land. They're doing everything right. The form is perfect. It looks exactly the way it looked before. But it doesn't work. No airplanes land. So I call these things cargo cult science, because they follow all the apparent precepts and forms of scientific investigation, but they're missing something essential, because the planes don't land.

Mimicking the forms of FP will not improve the way your imperative code works, because those forms are merely the consequence of FP's particular strengths, not the creator of them.

The real problem we have is that programmers whose only experience is in imperative programming have absolutely no idea what they don't know - or even know that they don't know it. But finding the solution to that one is left to the reader...;)
