Lachlan's Rambling: scala

Showing posts with label scala. Show all posts

Friday, September 02, 2011

Applying Applicative Functors in Scala

A chap named ittayd has written a nice gentle tutorial on functional programming in Scala. My only minor complaint is that he calls the applicative functor “sequential application” function apply. While this is a logical choice, in Scala apply is used everywhere to implement plain old function application, so it could lead to confusion. I think it’s better to call it <*>, as in Haskell. Read on.

I found the tutorial interesting because it talks a bit about applicative functors, and how to give them a nice syntax in Scala. The one thing that has bugged me about applicatives is that using them in Scala always felt clunkier than Haskell. Ittayd comes up with a scheme I like, but says “we have one difficulty: we can no longer define apply inside Future”. He uses a standard object-pimping implicit conversion to get around this, and that works fine. However, it is possible to define <*> inside your thing-with-an-applicative-functor, using one of my favourite Scala features: generalized type constraints. To continue with ittayd’s example using Future:

trait Future[A] { … def <*>[X, Y](x: Future[X])(implicit ev: A <:< (X => Y)) = new Future[Y] { def get = Future.this.get(x.get) def isDone = Future.this.isDone && x.isDone } }

The compiler will only allow <*> to be called on a Future that provides an appropriate function as its value.

Ittayd bootstraps things simply by lifting the function to be applied into Future, but personally I prefer the Haskell style of using <$>, aka “map”. In Scala, <$> is unsuitable for a few reasons, so I picked **::

trait Future[A] { … def **:[B](f: A => B): Future[B] = map(f) }

Obviously, this is just the standard map method with a different name, what’s the point? Well, because the operator ends with a :, it’s right associative; and because it starts with an *, it has higher precedence than <*>; so you end up with:

(marry _).curried **: joe <*> jane

Or, if you subscribe to the slippery slope theory of gay marriage:

(gayMarry _).curried **: joe <*> bob <*> george <*> fido <*> aDonkey

This is reasonably close to the Haskell idiom:

gayMarry <$> joe <*> bob <*> george <*> fido <*> aDonkey

You may wonder why obsess over the best way of dealing with applicative functors? After all, most of the applicative functors you see have an associated monad, so we can use Scala’s nice for syntax. Yes this is true for many useful applicative functors, but not all of them.

Just a quick asside on termonology. In casual conversation, you may hear it said that “Option is a monad”. But it’s more accurate to say “Option has a monad”. Because many types only have one useful monad associated with them, we tend to think that the type itself is the monad, but that’s not quite right. This doesn’t just apply to monads, of course, the same goes for monoids, applicative functors, etc. For example, numeric types have two useful monoids: one for addition and one for multiplication.

Now consider a Validation type from this Tony Morris tutorial. You can write a useful monad for this type, which implies an applicative functor as well. However, the tutorial is concerned with a second useful applicative functor for Validation. This second applicative is only relevant for the case where the Validation is using a semigroup as its error type (the tutorial explains what a semigroup is), and it allows using the semigroup to accumulate a list of all the validation failures. The classic example of this is validating data entered by a user on a form with several fields. The “plain” applicative/monad will only record the first validation failure it finds, whereas the semigroup applicative will record all the failures.

Now, finally, to my point (I think). This applicative functor that uses semigroup does not have a corresponding monad. Which means that we can’t use it with Scala’s for syntax. Hence the desire for a nice syntax for applicative functors. It is most definitely useful.

Obligatory Scalaz reference: those guys have pretty much worked out the best way to do all this stuff in Scala. But I find I learn a lot by experimenting myself.

Posted via email from lockster's posterous

Friday, April 15, 2011

Ceylon: Interesting for the Wrong Reasons

A couple of days ago, Gavin King announced that he is leading a team at RedHat that’s creating a new JVM language called Ceylon. See here for most of what is known so far. The stated motivation, in a nutshell is that they like Java very much, but it has deficiencies that are preventing progress in key areas, and they are generally feeling “frustrated”. They don’t have a complete compiler yet, but they’re working on it, and I get the impression RedHat is pretty serious about it.

Gavin King’s slides provide a fascinating insight into the kind of thinking that remains so dominant in the Java community. Gavin King is well known in the Java world as the inventor/leader of Hibernate and various other JBoss projects. He certainly knows Java inside out, and his list of “frustrations” is astute, if incomplete. While I haven’t dug into all the details so far revealed about Ceylon, but it seems likely that they’ll succeed in their goal of creating “a better Java”. If given the choice between using Java and Ceylon, I could certainly see myself choosing Ceylon. But Ceylon is a terrible idea. The problem is that they are treating the symptoms, not the disease; and they’re repeating exactly the same mistakes that resulted in them finally hitting a dead-end with Java.

Possibly the best outcome of Ceylon is that it emphatically refutes the argument that “Java is all we need”. Few have put more hard labour into trying to make Java feasible than Gavin King and the JBoss folks. They and a few others have spent the last decade or so producing a staggering number of bloated tools, libraries and frameworks in order to keep Java going. Don’t get me wrong, there’s a lot of extremely useful Java libraries out there, but there’s also a huge amount of guff that exists solely to compensate for the impracticality of the Java language. Ceylon is the “main-stream” admitting, finally, that not all problems can be solved the Java way: more frameworks, more libraries, more tools, more code.

So Mr King and friends have finally hit the wall with Java. They’re a bit late to the party, but we should welcome them none the less. So why do I say Ceylon is a terrible idea? The short answer is: Scala. The long answer is, well, longer. My biased summary of the situation is this: after years of having to build more and more layers of increasingly baroque frameworks in order to make progress, they realise that they can no longer keep up with their competitors (cough_C#cough_) because Java the language is an unavoidable “bottleneck”. They identify certain essential abstractions that are needed to solve their immediate problems, and proceed to design a new language that provides those essential abstractions, while remaining as “familiar” as possible.

In other words, Ceylon is all about making “a better Java”. There is certainly support for a evolutionary approach like this. Some argue that Java itself was (in language terms) “a better C++”, so why not follow that successful approach now? Well, if that worked so well for Java, why are even its greatest fans now trying to replace it? I’m reminded of Linus Torvalds' reaction to “Subversion is a better CVS”. The problem with Subversion isn’t that is poorly designed or built; it isn’t. The problem with Subversion is that the goal itself just isn’t very useful.

This is how it’s going to go: Ceylon will be built, they’ll make a SDK library, then there will be Hibernate-Ceylon, Spring-Ceylon, WebFrameworks-Ceylon 1 through 63,854, Logging4Ceylon, Ceylon.util.logging, commons-logging-ceylon and Simple Logging Facade 4 Ceylon. All these will be an improvement over their Java equivalents. Then, in another five or so years they’ll hit some new pain points and realise that some of those other “academic” abstractions are actually useful in the real world after all. For example, Ceylon supports higher-order functions (but, inexplicably, not lambda expressions). If we could go back in time 5 years and ask Gavin King “how important do you think higher-order functions are?”, what would he say? My guess is “they’re not essential for Real Work™”.

But it’s still an improvement, so why not embrace it? Because we have an already have a much better option that’s already doing the job in the real world: Scala. Or look at the interesting stuff going on in .NET land. Ceylon’s goals look positively narrow and uninspiring by comparison. Of course, Gavin King was immediately asked “why not Scala?”. It is my sincere hope that this question will continue to be asked of him. His answer consists of the same misconceptions that Scala advocates have been trying to dispel since forever. Take this statement from Mr King, from an interview in InfoQ:

One important force that helped make Java so successful was the things Java left out. But you need to be very smart here: Java’s decision to leave out higher-order function support was enormously harmful. Now, obviously the Scala team have gone down a very different path here and have created a very feature-rich type system. As a general rule I would prefer to solve my problems with fewer, more general constructs, than with more, less powerful constructs.

This is the exact inverse of the truth. No one who has made a serious effort to understand Scala could make such a statement. “Fewer, more general constructs” is exactly what Scala is all about! My summary of Scala’s mission statement would be as follows:

Scala aims to maximise the potential for useful abstractions, whilst:
- giving assurance that your abstractions are correct (static typing)
- allowing you to do what is necessary to get the performance you need
- maintaining seamless interoperability with Java

Scala aims to be compatible with Java, but it’s about much more than just “a better Java”. Ironically, it seems as though Ceylon will not interoperate with Java as seamlessly as Scala does (e.g. Scala keeps null and type erasure). It’ll be interesting to see how that works out. To get an idea of how serious the Scala team is about powerful but practical abstractions, look at how they handle the tensions that crop up between abstraction and performance. Great examples of this are how Scala handles the unpleasantness of arrays on the JVM, and also the @specialized annotation.

Scala is by no means perfect. The trade-offs it makes will not suit everyone fully. But I cannot help but scratch my head over statements like this (Gavin King, again):

But I guess I should mention that the number one technical problem that we simply can’t solve to our satisfaction in Java – or in any other existing JVM language – is the problem of defining user interfaces and structured data using a typesafe, hierarchical syntax.

I could very well be missing something, but this would seem like a problem that is right in Scala’s wheelhouse. Perhaps it may not be solved satisfactorily today, but it could done with far less effort than creating a whole new language (and new standard library, and new tool chain…). It is depressing to think of what these guys could achieve working with Scala instead of pouring resources into Ceylon.

You may wonder, “if you think Scala is so good, then use it and be happy, why worry about Ceylon?” Well, I worry because I think Ceylon is worse than Scala, but it could win anyway. That actually seems to be the more common outcome in these situations. I would much prefer a world with Scala jobs in demand than one with Ceylon jobs in demand. So, yes, it’s all about me being selfish. I would say to all Scala fans: don’t be afraid to be a little selfish and evangelise for the better outcome.

Posted via email from lockster's posterous

Thursday, September 30, 2010

Scala's Missing Monad

I've been writing more and more pure functional code in both Scala and Java recently. An issue I found myself running into quite often is this: say you have a function that sometimes returns no results (e.g. looking up a key in a map). A common way to deal with that in Scala is to return an Option:

Simple enough. But then a typical use of such a function is to map it over all members of a collection:

What I find I often want to do in a situation like this is to only bother with the results if all the applications of foo returned a result. In other words, I want to turn the strings list above into an Option[List[String]], which is Some(List(foo(1).get, foo(2).get, foo(3).get)) if all those applications of foo give values, and None otherwise. I couldn't see any method in the Scala standard library that does what I want.

As a bit of a functional programming n00b, one approach I've found to be useful is to work out the Haskell type signature of the function I want, feed that into Hoogle and see what I get, if anything. In this case, the Haskell type is:

The top Hoogle result for that is the sequence function:

And look, the sequence function works on any monad, not just Maybe/Option. For example, if my foo function had been returning Either[SomeErrorType, String], then the sequence function would give the first error in the list if there was an error, or the list of results if there was no error. So this is a useful function, but as I said, I can't find it in the standard Scala library.

When I thought about how to implement sequence in Scala, I immediately ran into trouble. It is often said that "Scala has monads", and we know that the flatMap method is doing a monadic bind. But this support isn't much more than a convention for method names and type signatures, combined with nice syntax in the form of for expressions. There is no monad type declared anywhere in the standard library. Without a monad type, the sequence function would have to be re-implemented over and over: Option.sequence, Either.sequence, List.sequence, etc.

The good news is that Scala's type system can express the monad type. And the better news is that some really smart people have already done all the work for us in a library called Scalaz. So I had a look, and lo and behold there is the sequence function:

That may look nothing like the Haskell function, but that is mostly a result of how typeclasses are expressed in Scala. Trust me, this is the same function. Also, this Scalaz function is more general than the Haskell function above. It turns out that sequence works for any "traversable" thing (not just lists) containing anything that is an applicative functor (not just monads). Haskell also has the equivalent fully generalised sequence function in the Data.Traversable library.

So what did I learn from this exercise? First that the lack of explicit types for monad, functor and friends in Scala's standard library is a greater problem than I'd expected. Second, if you're writing pure functional Scala code, you should be using Scalaz. Since all Scala programmers should be writing pure functional code, it follows that all Scala programmers should be using Scalaz!

Tuesday, December 15, 2009

Some simple Scala null tricks

Scala code often needs to deal with null references, unfortunately. Such is the price of Java interoperability. One of my favourite utility methods is:

Scala 2.8 offers this using Option(ref) instead of ?(ref). However, I can't figure out why the Scala 2.8 method accepts non-reference values as well.

Another simple thing that may come in handy is an extractor for non-null values:

What's the advantage of using an extractor? Say you have a collection of references, some of which may be null, and you want to ignore the null references:

Monday, June 01, 2009

Project Euler Problem 8

I guess I should warn that these Project Euler posts have spoilers, in case you want to try the problems yourself ☺. My Scala solution to problem 8:

val input = "73167176531330624919225119674426574742355349194934" +
"96983520312774506326239578318016984801869478851843" +
"85861560789112949495459501737958331952853208805511" +
"12540698747158523863050715693290963295227443043557" +
"66896648950445244523161731856403098711121722383113" +
"62229893423380308135336276614282806444486645238749" +
"30358907296290491560440772390713810515859307960866" +
"70172427121883998797908792274921901699720888093776" +
"65727333001053367881220235421809751254540594752243" +
"52584907711670556013604839586446706324415722155397" +
"53697817977846174064955149290862569321978468622482" +
"83972241375657056057490261407972968652414535100474" +
"82166370484403199890008895243450658541227588666881" +
"16427171479924442928230863465674813919123162824586" +
"17866458359124566529476545682848912883142607690042" +
"24219022671055626321111109370544217506941658960408" +
"07198403850962455444362981230987879927244284909188" +
"84580156166097919133875499200524063689912560717606" +
"05886116467109405077541002256983155200055935729725" +
"71636269561882670428252483600823257530420752963450"

val digits = input.map(_.asDigit).toArray

def multiply(index: Int) = digits.slice(index, index + 5)
    .foldLeft(1)(_*_)

val multiples: Stream[Int] = {
    def rec(n: Int): Stream[Int] = Stream.cons(multiply(n),
        if (n > digits.length - 5) Stream.empty else rec(n+1))
    Stream.cons(mult(0), rec(1))
}
println(Iterable.max(multiples))

For convenience, this specifies the input as a string, then uses RichChar.asDigit to create a corresponding array of integers. The multiply function uses left fold to multiply together sequences of five digits. The multiples value is a stream (lazy list) of the multiples of all groups of five consecutive digits from the input. And finally, the Iterable object provides a convenient method to find the maximum value in any Iterable containing Orderable things.

Project Euler in Scala

Project Euler is a bunch of mathematical/programming problems. I've been trying to improve my Scala skills by using it on some of the Euler problems. I'm trying to use functional programming styles as much as possible. My solutions definitely haven't been very optimized, but I am learning, which is the aim.

Euler Problem 1

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000.

My solution:

def sum(nums: Iterable[Int]): Int = (0 /: nums)(_ + _)

println(sum((3 until 1000).filter(i => i%3 == 0 || i%5 == 0)))

The 3 until 1000 creates a "range", which is a lazy sequence of numbers from 3 to 999 inclusive. Because it's lazy, you don't get 997 integers sitting around in memory, you only get those integers which pass the filter. The filter itself shows the Scala syntax for a "lambda expression" (anonymous function). The full syntax is (arg1, arg2, ..) => function body, but when you only have one argument (a single Int, in this case) you can leave out the brackets.

The sum function is the operator syntax for a left fold over nums. The _+_ is an even more terse form of a lambda expression that you can use for really simple expressions. The first underscore is replaced by the first argument to the function, and the second is replaced by the second argument.