What is FP?
Unfortunately, Functional Programming (FP) isn’t really a single thing. It’s a bunch of different programming constructs that get put together for traditional, convenience, implementational, or mathematical reasons. Some elements of FP, like higher order and anonymous functions, have slowly become ubiquitous, which tends to confuse the dialog. I’ll try to keep this short, but in order to say something that isn’t trite, useless, or wrong I still have to say rather a lot.
Is FP useful?
Honestly, a section like this should go at the end of the blog post. And considering how FP is a hodgepodge of other concepts, this should probably go at the end of a book. Or maybe a series of books. But I suspect enough people have been burned by snake-oil sales men bearing silver bullets that a bit of reassurance is in order. I’ll do my best to limit the triteness, but considering the placement of what is obviously a conclusion there is only so much I can do.
Yes, FP is useful. Depending on the domain you are working in, some parts will be more useful than others. But because of the large number of concepts that fall under the purview of FP (and the impressive capability of each concept when considered alone), regardless of the domain, your software engineering techniques will benefit from some part of FP.
Now. Let’s take a look at some of the main concepts in FP.
I would have a hard time saying enough good things about lambda calculus . If you’re writing generic software and you could only know one type of math, then I think your best choice is lambda calculus. You can read more about it in the early chapters of  or . Even in imperative settings it will help you write code that’s better, navigate through code that’s bad, and refactor pretty much anything.
Functional programming languages have sort of an odd relationship with lambda calculus. Languages in the lisp  or ML  families kind of allude to lambda calculus while languages like Haskell  go completely off the deep end and are more or less lambda calculus facades.
The important thing is that the techniques that you learn from lambda calculus either directly or indirectly apply and are leverageable with little or no effort in FP languages.
Even if you want to ignore FP, you’re still going to have to deal with lambda calculus. Closure  is pretty close to being universal in programming languages and higher order and anonymous functions are not far behind. Of course my personal preference is to use lambda calculus in languages that are friendly to its benefits, where you won’t get into trouble because you’re meshing two paradigms together in strange ways. And that typically means using FP languages.
If you could only know two types of math and the first was lambda calculus, then the second would be type theory . Unfortunately, things can get a little overwhelming when you start to look into different areas of type theory . However, much like lambda calculus, type theory can help you write better code and refactor bad code.
The word “type” is actually rather nebulous. The typical dialog is static vs dynamic, however I believe that this is misguided. The meaning of the word “type” changes when you move from C to Ruby to C# to Haskell. I’m going to try to write a blog post about types later, so for now try and ignore the typical discourse. With that disclaimer in mind, let’s talk about the benefits of types you normally get with FP.
There’s two major benefits you see with types. The first is the concept of proof carrying code. This is your traditional “static types mean the program ‘just works’ when it compiles successfully” sentiment. Although, there’s actually a lot more going on behind the scenes. If you are working with something like the Hindley-Milner type system , then what this means is that containers (for example: lists) can be written such that the type of the contained item is generic. When you compile your code all of the concrete types get propagated wherever you were using the container. This works identically with arbitrary structures and functions. The benefit being that when your code successfully compiles it means you have all the right types in the right places (without writing a bunch of duplicate structures and functions for every possible type that would go into them).
Proof carrying code can also be taken a lot further with dependent types . With the proper dependent type system, you are able to fit arbitrary proofs into types, which then all get resolved at compile time. At this point you aren’t just sure that compilation means that the correct type goes to the correct function, but that compilation means the every object has exactly the properties that you successfully proved in the types. The upside is that you have even more compile-time assurances than you do with something like Hindley-Milner, although the downside is that you have to do a bunch of work to actually prove the properties you want proven.
The second benefit of types is that they provide helpful mental structures. If you have the type of a function (or really any arbitrary structure), what you actually have is a very concise description of everything that matters for that function (assuming you are able to encode what matters into the type system). This allows you to ignore irrelevant details and focus on what you need right now.
Additionally, I have, anecdotally, found that code I write that is well-typed is cleaner, less coupled, and more modular. This seems to hold even when I’m working in languages that do not have a static type system. For this reason, I like to think of FP types as a useful discipline for writing good code more than I like to think of it as making the compiler happy.
Immutability and Referential Transparency
Most languages have facilities to handle constant values. However, few languages take the concept as far as (some) FP languages do with immutable data structures. That dictionary or list doesn’t get any new items added to it, that object doesn’t have any methods that changes internal state, whatever the object looks like when it’s first create is how it always looks.
Referential transparency  doesn’t show up so much in programming languages, but we’re quite familiar with it from high school mathematics. Imagine a function from algebra that looks like “f(x) = x * 5”. Whenever you call that function with a given input you will always get the same output. The idea is roughly analogous to reentrancy .
Now, personally, I think these two concepts are going to be key in writing good software in the future. The products that organization want are becoming more and more complex and the traditional methods of implementing them just aren’t able to handle the increase in complexity. The worst coding boondoggles I’ve had experience with were all due to either abuse of mutable data or lack of referential transparency (typically both).
If you took immutable data structures and referential transparency to traditional programming languages, the result would be difficult to maintain, hard to implement, slow and space inefficient code. However, FP contains constructs that actually makes this type of programming easy and convenient. For example, pattern matching  and algebraic data types  provide some very natural ways to implement a wide variety of algorithms that are immutable and referentially transparent. Additionally, persistent data structures  provide immutable data structures that are both fast and space efficient. Not to mention there are some research efforts that are making interesting headway into making FP very fast .
FP languages themselves are all over the map when it comes to immutability and referential transparency. Haskell is completely pure (meaning there are no mutable data structures whatsoever and every function is referentially transparent) while Ocaml allows mutable state pretty much anywhere you want it. There’s also a lot of options in between as with languages like Clojure that allow mutable state, but severely restrict how you can use it.
I’m not sure anyone is currently handling these concepts correctly. It seems like either there isn’t enough restriction and that leads to losing many benefits, or there’s too much restriction and it’s not easy to mutate objects when you really need to (or you have to learn a little bit of category theory first). It could also be that there is no right answer and that how you handle these issues depends on the domain you’re working in. However, with complex software this seems to cause a lot of problems, so you should definitely keep your eye on this issue because if someone does figure out the right way to handle mutation and transparency it might just facilitate moving software engineering to the next level of reliability and robustness.
One of the major selling points of FP that gets pulled out often is that it’s going to make it easy to do parallel programming. CPUs aren’t getting faster like they used to and FP is here to save us by making parallel programming easy. Although this isn’t actually true. While there are some features in FP that make parallel programming look easier, no one has really found a way to massively cash in on the promise of parallel gains in FP. Simeon Peyton Jones  and Rich Hickey  both have some very interesting talks about this. It looks like they’re making good progress, but there’s still a lot of work to do in order to live up the expectations that have been made concerning FP and parallel programming.
If you had to get started in FP today, I would say the language to use would be F# . I haven’t personally used F# myself, but I have looked through the features and used Ocaml (the language F# is based off of). F# has all of the features that make FP powerful, it has object oriented and mutable features so you don’t have to make the transition to FP in one step, and it has interoperability with C# so if you get into trouble you can always handle your pain points in a more familiar environment.
In the long term though, I would keep an eye on Haskell and any variants that split off of it. I don’t think Haskell itself will ever be the language to use, but there’s a lot of research and experimentation going on in and around that language. If any groundbreaking developments occur in Haskell, step two is going to be cleaning them up and placing them in a new language that looks a lot like Haskell. At that point it’s going to pay off to already be up to speed with how things work in that sort of environment.
 – http://en.wikipedia.org/wiki/Lambda_calculus
 – http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/
 – http://www.cis.upenn.edu/~bcpierce/tapl/
 – http://en.wikipedia.org/wiki/Lisp_(programming_language)
 – http://en.wikipedia.org/wiki/ML_(programming_language)
 – http://en.wikipedia.org/wiki/Haskell_(programming_language)
 – http://en.wikipedia.org/wiki/Closure_(computer_programming)
 – http://en.wikipedia.org/wiki/Intuitionistic_type_theory
 – http://en.wikipedia.org/wiki/Lambda_cube
 – http://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system
 – http://en.wikipedia.org/wiki/Dependent_type
 – http://en.wikipedia.org/wiki/Referential_transparency_(computer_science)
 – http://en.wikipedia.org/wiki/Reentrancy_(computing)
 – http://en.wikipedia.org/wiki/Pattern_matching
 – http://en.wikipedia.org/wiki/Algebraic_data_type
 – http://en.wikipedia.org/wiki/Persistent_data_structure
 – http://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/haskell-beats-C.pdf
 – http://www.youtube.com/watch?v=NWSZ4c9yqW8
 – http://www.youtube.com/watch?v=dGVqrGmwOAw
 – http://research.microsoft.com/en-us/projects/fsharp/