nil
$SwitchMap$
I recently became the maintainer of the Testability Explorer project. It’s a library that inspects Java bytecode, assessing how difficult the code will be to unit test. Check out the author’s blog: Misko Hevery. For more info about the project, see http://testability-explorer.googlecode.com.
Testability Explorer looks for a few things in your code. One of them is mutable state that’s globally accessible. For example, the Evil Singleton makes it hard to unit test anything that uses it, but a public static final String is fine, since String is immutable.
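To make the distinction concrete, here is a minimal sketch of the two kinds of global. The names GlobalStateSketch, EvilSingleton and recordAndCount are made up for illustration and are not taken from Testability Explorer.

```java
import java.util.ArrayList;
import java.util.List;

class GlobalStateSketch {
    // Harmless global: String is immutable, so nobody can change what others see.
    public static final String GREETING = "hello";

    // Troublesome global: one shared instance whose state can change.
    static final class EvilSingleton {
        private static final EvilSingleton INSTANCE = new EvilSingleton();
        private final List<String> log = new ArrayList<>();

        static EvilSingleton getInstance() { return INSTANCE; }
        void record(String entry) { log.add(entry); }
        int size() { return log.size(); }
    }

    // Code under test that silently reaches for the singleton.
    static int recordAndCount(String entry) {
        EvilSingleton.getInstance().record(entry);
        return EvilSingleton.getInstance().size();
    }
}
```

Calling recordAndCount twice with the same argument gives two different answers, because the first call leaves state behind in the singleton. That leftover state is what makes tests order-dependent.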
So I was investigating a strange issue where a large mutable global cost was assigned to some classes, due to an anonymous inner class. It seemed that there was some globally mutable state hidden in SomethingOrOther$1, whenever SomethingOrOther has a switch statement with a variable of type Enum as the argument. That doesn’t make your code hard to test, so what’s going on?
Try it at home! Compile this with javac:
    public class HasStaticCost {
      public int compare() {
        Fruit fruit = Fruit.APPLE;
        switch(fruit) {
          case APPLE:
            return 1;
          case ORANGE:
            return -1;
          default:
            return 0;
        }
      }

      public enum Fruit {
        APPLE, ORANGE
      }
    }

And run javac HasStaticCost.java. The outputs are:

    HasStaticCost$1.class
    HasStaticCost$Fruit.class
    HasStaticCost.class
Three classes? What’s in that first one?? We’ll have to look at the bytecode:
    class com.google.test.metric.collection.HasStaticCost$1 extends java.lang.Object {
    static final int[] $SwitchMap$com$google$test$metric$collection$HasStaticCost$Fruit;

    static {};
      Code:
       0: invokestatic #1; //Method com/google/test/metric/collection/HasStaticCost$Fruit.values:()[Lcom/google/test/metric/collection/HasStaticCost$Fruit;
       3: arraylength
       4: newarray int
       6: putstatic #2; //Field $SwitchMap$com$google$test$metric$collection$HasStaticCost$Fruit:[I
       9: getstatic #2; //Field $SwitchMap$com$google$test$metric$collection$HasStaticCost$Fruit:[I
       12: getstatic #3; //Field com/google/test/metric/collection/HasStaticCost$Fruit.APPLE:Lcom/google/test/metric/collection/HasStaticCost$Fruit;
       15: invokevirtual #4; //Method com/google/test/metric/collection/HasStaticCost$Fruit.ordinal:()I
       18: iconst_1
       19: iastore
       20: goto 24
       23: astore_0
       // same for Fruit.ORANGE
       39: return
    }
This is pretty crazy. The synthesized inner class holds a static int array, named $SwitchMap$<enumname-encoded-with-dollars>, which is indexed by the ordinal of each Fruit constant and stores the number of the matching case label. The array is accessible to other classes in the same package. And that synthesized class isn’t stored with the enum; instead, each class that switches on the enum gets blessed with some ugly global state. Why? There was a release of the JVM in Java 1.5, along with the new language features, so you’d think the switch statement would understand an enum type. But this bytecode looks like a complete hack to make the switch statement think it’s operating on an int instead. Here’s how the switch is implemented:
    0: getstatic #2; //Field com/google/test/metric/collection/HasStaticCost$Fruit.APPLE:Lcom/google/test/metric/collection/HasStaticCost$Fruit;
    3: astore_1
    4: getstatic #3; //Field com/google/test/metric/collection/HasStaticCost$1.$SwitchMap$com$google$test$metric$collection$HasStaticCost$Fruit:[I
    7: aload_1
    8: invokevirtual #4; //Method com/google/test/metric/collection/HasStaticCost$Fruit.ordinal:()I
    11: iaload
    12: lookupswitch{ //2
        1: 40;
        2: 42;
        default: 44 }
At instruction 12, I would think the switch could branch directly on the ordinal of the enum, without first loading the case number from the static array as it does.
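Reading the two bytecode listings back into source form, the compiler output behaves roughly like the hand-written sketch below. The names SwitchMapSketch and SwitchMap are stand-ins for HasStaticCost and the synthetic HasStaticCost$1. The try/catch around each store is my reading of the goto/astore_0 pair in the static initializer, so treat it as an approximation and not the exact javac output.

```java
class SwitchMapSketch {
    enum Fruit { APPLE, ORANGE }

    // Stand-in for the synthetic HasStaticCost$1 class.
    static final class SwitchMap {
        // One slot per enum constant, indexed by ordinal; 0 means "no case label".
        static final int[] FRUIT = new int[Fruit.values().length];

        static {
            // Each case label gets a number starting at 1.
            try { FRUIT[Fruit.APPLE.ordinal()] = 1; } catch (NoSuchFieldError e) { }
            try { FRUIT[Fruit.ORANGE.ordinal()] = 2; } catch (NoSuchFieldError e) { }
        }
    }

    static int compare(Fruit fruit) {
        // The switch runs on an int looked up through the array, not on the enum.
        switch (SwitchMap.FRUIT[fruit.ordinal()]) {
            case 1: return 1;   // case APPLE
            case 2: return -1;  // case ORANGE
            default: return 0;
        }
    }
}
```

The static int array is the only mutable global here, and nothing writes to it after class initialization, which is why it is harmless for testing.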
It turns out this was just bad decision making by the Java 1.5 committee. They intended to implement Java 1.5 language features in a way that would execute on the 1.4 JVM, I bet because they knew that BigCorp wasn’t going to upgrade their production environments and the language features would sit on the shelf for a few years unless they could be deployed with little risk. So we got stuck with big things like generic type erasure, and small things like this hack for the switch statement… but hey, at least adoption would be easy.
I fixed Testability Explorer by whitelisting field names matching /\$SwitchMap\$.*/. Gross. Although that’s straightened out, it still leaves a bad taste in my mouth, because it turned out that a feature added later in Java 1.5 did require a change to the VM, so all these hacks are in there for no good reason. And, of course, BigCorp stayed on 1.4 for years.
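A whitelist check along those lines could look like the sketch below. The class and method names are hypothetical and this is not the actual Testability Explorer code; only the regular expression comes from the fix described above.

```java
import java.util.regex.Pattern;

class SwitchMapFilterSketch {
    // Same expression as the whitelist above, with the dollar signs escaped for Java.
    private static final Pattern SWITCH_MAP_FIELD = Pattern.compile("\\$SwitchMap\\$.*");

    // True for the compiler-generated array, false for ordinary fields.
    static boolean isCompilerGeneratedSwitchMap(String fieldName) {
        return SWITCH_MAP_FIELD.matcher(fieldName).matches();
    }
}
```

A field such as $SwitchMap$com$google$test$metric$collection$HasStaticCost$Fruit matches and would be skipped when adding up global mutable state, while a field simply called instance would still be counted.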
Assert, Validate, Precondition
Several recent posts have been about what I’d like to see in a programming language, and those ideas prompted me to code. What I’ve been creating is a new programming language, very similar to Java and targeted to the JVM, but with these testability and best practice ideas included.
- Immutability – final should be the default
- Testing – unit tests should be in the core language, not a library
- Dependency Injection – class parameters
The language is called Nil, and the project is hosted at http://nil.googlecode.com.
Some ideas come from work, where I contribute to Testability Explorer (http://testability-explorer.googlecode.com). Some come from blogs I read, like this post from Michael Feathers (author of Working Effectively with Legacy Code): A Wish for the next Mainstream Programming Language. Some are inspired by other languages, like Objective-C, ActionScript, and Scala.
I should afford some introspection here: I realize that, as Steve Yegge says, “your language is doomed to fail, with probability 1 minus epsilon.” So even if contributors and I produce something useful, it’s not going to be used. That’s fine, there are lots of little-known languages with a small, über-geek following.
So, today’s idea is about assertions. In testing, there are various libraries modelled after JUnit, which is nice, and as I’ve said, the language could make life a little easier by allowing the test to live in the class it tests.
Another thing the language could do is provide a context-sensitive assertThat() method. I’ve seen various APIs, like Spring’s Validator class and Preconditions in Google Collections, which are asserts in your production code. I like these, just because you fail faster. They’re also a nice executable way to document the preconditions of your method. You use them because you don’t want to depend on JUnit. But that’s silly. Why not have the same powerful API to express your expectations in your test that you have in production code? If assertThat() were a method automatically mixed into all classes, then it would just fail the current test if called in a test, and throw an IllegalStateException with the same lovely message when called in production code.
Another idea: what if a method could define an internal DSL? Then the equals() and isA() and so on don’t have to be imported into the namespace of the test. When you call assertThat("a", isA(String.class)), you can’t resolve the isA() method from the calling code, but when assertThat() evaluates its arguments, the isA() method is defined in that scope. IDEs would need an easy way to find methods that are legal in such a scope, but that seems doable. The advantage is that the isA() method doesn’t clutter the completions when you’re not inside assertThat()! The API appears to have only the assertThat() method.
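In plain Java the behaviour can only be approximated, but a rough sketch makes the idea concrete. Everything here is hypothetical: the runningInTest flag stands in for whatever a language-level test runner would know, and the expectation is a java.util.function.Predicate, not a real matcher library.

```java
import java.util.function.Predicate;

class ContextAssertSketch {
    // Hypothetical switch that a built-in test runner would flip; not a real API.
    static boolean runningInTest = false;

    static <T> void assertThat(T actual, Predicate<? super T> expectation, String message) {
        if (expectation.test(actual)) {
            return;
        }
        if (runningInTest) {
            throw new AssertionError(message);      // fails the current test
        }
        throw new IllegalStateException(message);   // fails fast in production
    }

    // Production method that documents its precondition with the same call.
    static int half(int evenNumber) {
        assertThat(evenNumber, n -> n % 2 == 0, "expected an even number but got " + evenNumber);
        return evenNumber / 2;
    }
}
```

The same half(3) call produces an AssertionError under the test flag and an IllegalStateException otherwise, with an identical message in both cases.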
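Java has no way to add names to a scope only for the duration of one call, but a lambda that receives the matchers as a parameter gets close. The sketch below is an approximation of the idea and not the proposed language feature; Matchers, isA and equalTo are names made up for the example.

```java
import java.util.function.Function;
import java.util.function.Predicate;

class ScopedMatchersSketch {
    // The matchers live on this object, so they are only reachable inside assertThat().
    static final class Matchers<T> {
        Predicate<T> isA(Class<?> type) { return type::isInstance; }
        Predicate<T> equalTo(T expected) { return actual -> expected.equals(actual); }
    }

    static <T> void assertThat(T actual, Function<Matchers<T>, Predicate<T>> expectation) {
        // The caller picks a matcher from the scope object it is handed.
        Predicate<T> matcher = expectation.apply(new Matchers<T>());
        if (!matcher.test(actual)) {
            throw new IllegalStateException("unexpected value: " + actual);
        }
    }
}
```

A call then reads assertThat("a", m -> m.isA(String.class)). Code completion on m offers only the matchers, and nothing leaks into the namespace of the calling class.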
I’ll go code some of this now!
A test-driven modern language
Here’s another in my recent series of posts about ways I think the Java language could support modern coding practices.
An obvious practice that we use extensively today is unit testing and test-driven development, and again, Java and other languages don’t provide built-in support. Instead, we have a few libraries to create test code, execute it, and provide mock dependencies, and then some standards for naming and directory layout that help us organize the test code and correlate it with production code.
What sorts of facilities would our dream language have for testing?
In Java, tests have a lot of repetitive code, because they are just classes. All test methods have the same form: they are non-static, take no arguments, are marked as a test via an annotation or naming convention, throw all checked exceptions, and return void. We should probably have a test keyword that starts a block, and this should act like a test.
Just like the keywords extends and implements, we should have a way to associate two objects with a new relationship: tests. This way, we can easily see, for a given class, which classes test it, and IDEs can consistently navigate between a class and its test(s). It also would allow some conveniences. Instead of having to mark methods package-private so they are visible to the unit test, we could make private methods visible to tests, like an implicit C++ “friend” relationship.
So, with our two new keywords we have an example like this:
    class Foo {
      private int helperMethod() {
        //stuff
      }
    }

    class FooTest tests Foo {
      test helper {
        int i = new Foo().helperMethod();
        assertThat(i, equals(10));
      }
    }
Now we have created a test class and some tests. Notice the assertion in that test – where do the assertThat() and equals() come from? Instead of importing some utility classes, maybe the tests keyword could also cause my FooTest class to extend from a base Test class rather than from Object by default, so these could be implemented there. Or if the language had mixins, the test methods could come from a mixin and not require polymorphism (we don’t care that our test class may be cast to a Test type).
The example also uses a Hamcrest-style expectation, to make the assertion more fluent.
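The mixin variant can be approximated on a JVM with Java 8 default methods, which is what this sketch assumes. Asserts is a hypothetical interface, and the matcher is called equalTo here because an interface cannot declare a default method named equals(Object) without clashing with java.lang.Object.

```java
import java.util.function.Predicate;

class MixinAssertsSketch {
    // Stand-in for the mixin: default methods add behaviour without a base class.
    interface Asserts {
        default <T> void assertThat(T actual, Predicate<? super T> expectation) {
            if (!expectation.test(actual)) {
                throw new AssertionError("unexpected value: " + actual);
            }
        }

        default <T> Predicate<T> equalTo(T expected) {
            return actual -> expected.equals(actual);
        }
    }

    static class Foo {
        int helperMethod() { return 10; }
    }

    // The test class stays free to extend something else; Asserts is only an interface.
    static class FooTest implements Asserts {
        void helper() {
            int i = new Foo().helperMethod();
            assertThat(i, equalTo(10));
        }
    }
}
```

What this cannot reproduce is access to private members: helperMethod() has to be package-private here, which is exactly the compromise the tests keyword is meant to remove.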
The same thing could also be done if tests could be written directly in the class they test, with the compiler ignoring them when not in testing mode. In that case, the mixin would be automatically added to classes that contain tests, so that the assertions are available:

    class Foo { // implicit mixin Asserts
      int calculateDay(Date date) {
        ...
      }

      test calculateDay {
        // using the enclosing instance, requires that it have a default constructor
        int day = calculateDay(new Date());
        assertThat(day, equals(31));

        // probably more realistic
        int otherDay = new Foo().calculateDay(new Date());
        assertThat(otherDay, equals(31));
      }
    }
What more can we do to make writing tests the most convenient and fastest way to code? For one, we could have a built-in test runner. Having marked our tests with a new syntax, we can find all the tests and execute them using an equivalent program to the compiler. In that test running mode, we can relax the security model of the runtime, avoiding some common problems with testing sensitive code.
Once we have our test runner, we can lower the bar to getting good testing infrastructure in a project. One major pain in the butt is instrumenting the code to get the line-level test coverage. Because libraries like JUnit execute code that’s been compiled by the normal compiler, we have to do something funny like twiddle the bytecode after compilation, or use a custom classloader to do it on the fly. If the language understood testing, then it could also always provide coverage data when the tests are executed. It would be easier for IDEs to show how much of your code is executed in the tests, as well.
Finally, there is the ever-annoying issue of setting up dependencies. The language needs access to different libraries at test-time, and we also want to enforce that production code may not have dependencies on test code. With language-level support for testing, there can be compile-time checks that test code is not used from production code, and we can have a second path of libraries passed only to the test runner.
Does this seem like a good idea, or does the language step over the line here and take over the job of a framework? Should we encourage testing private methods this way, or does it violate encapsulation principles?
Simpler dependency injection
In my last post, I wrote about how Java’s final modifier could be replaced with a mutable modifier, helping us to follow modern coding practices.
Another feature Java and other languages lack is a built-in understanding of dependency injection. I’ve been using Spring and Guice for a few years now, and it’s so clear to me that dependency injection is absolutely essential to writing a complex application. Even more than separating code into classes, we must avoid mixing the wiring with the logic. It makes code much more testable – if you look at the Testability Explorer metric, it is partly based on the expense you incur in testing when a dependency cannot be injected, because you are bound to test the object with that particular collaborator.
In Java, a typical incantation of constructor injection looks like this:
    class Foo {
      private final Dependency1 d1;
      private final Dependency2 d2;

      public Foo(Dependency1 d1, Dependency2 d2) {
        this.d1 = d1;
        this.d2 = d2;
      }
    }
The only information we are providing here is that to construct this class, we’ll need access to two dependencies, and we’ve given them names. This is how most classes should start, and the list of dependencies can be long. Why not treat this just like arguments to a method? We just provide the types and names of the arguments, and then they become new variables in method scope.
What we need to improve this is Properties. Just ask Java Posse co-host Joe Nuxoll – Java needs properties. If you don’t know, a property is just like a public member variable, except that you can write a set or get method for it later if you like. It replaces the silly JavaBeans spec – to client code, it looks just like a public member, you just use object.property = 1 to set and object.property to get. I like how they are implemented in ActionScript, with set and get keywords to override the assignment/dereference:

    public var one:int;

    private var _two:int;

    public function get two():int {
      return _two + 1;
    }

    public function set two(i:int):void {
      _two = i - 1;
    }
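For comparison, here is roughly the same pair of members written against the JavaBeans convention in today's Java. PropertySketch is a made-up class name; the arithmetic mirrors the ActionScript accessors.

```java
class PropertySketch {
    // Plain public field, the counterpart of "public var one:int".
    public int one;

    // Backing field, the counterpart of "_two".
    private int two;

    // Callers must write getTwo() and setTwo(5); "p.two = 5" is not available.
    public int getTwo() { return two + 1; }
    public void setTwo(int i) { two = i - 1; }
}
```

The difference shows up at the call site: switching one from a field to accessors later would break every caller in Java, whereas with properties the client code stays the same.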
Now we can dream up an example:
    class Foo(Dependency1 d1, Dependency2 d2) {
      void doThing() {
        d1.print();
        d2 = null; // not allowed - the property is read-only
      }
    }

    Foo foo = new Foo(new Dependency1(), new Dependency2());
    foo.d1; // also not allowed - the properties are private
This would create private read-only properties d1 and d2, and a default constructor that requires both to be passed in. It’s important that the values are final – we want to reduce the mutable state at runtime, and assigning the properties during construction allows this.
Ok, now setter injection: if the dependency is optional, meaning we could have a Foo in a valid state without it, then it might be appropriate to inject the dependency with a setter instead. (For some reason, setter injection seems to be the preferred way in Spring – which is wrong.) We could allow a convenience for setter injection as well, simply by adding a default value for that dependency in our “class arguments”:
    class Foo(Dependency1 d1, Dependency2 d2 = null) {
    }

    Foo foo = new Foo(new Dependency1());
    foo.d2 = new Dependency2();

Now the default constructor takes only one argument, of type Dependency1, which populates the read-only property d1, and the read/write public property d2 starts off as null. Because it’s a property, outside code can set it. The only aspect that’s missing here is if you want to have additional constructors – but with no additional syntax, you could just write those constructors out in the class body.
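Spelled out in today's Java, the class-parameter declaration with one default value would expand to something like the sketch below. This is my reading of the proposal and not output from any real compiler; Dependency1 and Dependency2 are empty placeholder classes.

```java
class ClassParametersSketch {
    static class Dependency1 { }
    static class Dependency2 { }

    // Hand-written expansion of: class Foo(Dependency1 d1, Dependency2 d2 = null) { }
    static class Foo {
        private final Dependency1 d1;   // required: assigned once, in the constructor
        private Dependency2 d2 = null;  // optional: starts as null, settable later

        Foo(Dependency1 d1) {
            this.d1 = d1;
        }

        // The read/write property becomes a getter and setter pair.
        public Dependency2 getD2() { return d2; }
        public void setD2(Dependency2 d2) { this.d2 = d2; }
    }
}
```

That is a dozen lines of ceremony for information the one-line header already carries.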
The more I have thought about this, the more obvious it seems that injected dependencies of a class are analogous to method parameters, at one scope higher. Why not make the syntax the same, so that this good practice is the easy way? I’d love to hear whether you think this is a good idea.
Java’s final modifier is backwards
In Java, there’s a modifier that can be applied to a variable, and it’s called final. Final can also be applied to classes and methods, to prevent overriding, which is kind of overloading the word. I don’t have a problem with that kind of final, except that I don’t agree that it’s good to use it liberally where you don’t design for inheritance up-front. This post is about final variables and immutable objects.
Good practice dictates that variables should always be marked final, except where the reference needs to be changed. This is true for class members, which should be assigned in instance initialization or a constructor, and can therefore be marked final. It’s true for method parameters, where it stops you from reassigning a parameter and mistakenly thinking that Java is pass-by-reference. And of course block-scoped variables should be final. All of this is good because it reduces mutable state at runtime, which makes your app less complex and more testable, and it expresses your intent in compiler-enforceable constraints.

    class Bear {
      final DateFormat df = new SimpleDateFormat("MMddyy");
      final SalmonService ss;

      Bear() {
        ss = new HuntSalmonService();
      }

      void eat(final Stream s) {
        final Integer hungerLevel = 10;
        ss.takeFish(s, hungerLevel);
      }
    }

The problem is, I only know one coder who is religious about declaring everything possible as final. Which sucks: if a variable is NOT marked final, this tells you nothing. Most likely, the coder didn’t write it because they were in a hurry or didn’t think it was needed, but they never modify the reference. The fact that Java makes you write this modifier everywhere is backwards, because you are rarely forced to use it (only when a local variable is used in an anonymous inner class). It would make more sense to use a modifier to note the exception rather than the rule, so we should have a mutable modifier instead. Now if you see a mutable List foo, you can bet the reference is really changed to another List later, since the developer bothered to mark it as mutable.
In Scala, you are forced to think about this, by marking your identifiers as a var or a val. Vals are final, vars are not. That’s good, Java should do that too. Now our Bear example is:

    class Bear {
      DateFormat df = new SimpleDateFormat("MMddyy"); // non-mutable
      SalmonService ss;

      Bear() {
        // Without this, compiler complains that ss must be marked mutable
        ss = new HuntSalmonService();
      }

      void eat(Stream s) {
        // re-assigning s here would be a compiler error
        mutable Integer hungerLevel;
        hungerLevel = 10;
        ss.takeFish(s, hungerLevel);
      }
    }
So far, this would be an easy change to the Java language, or something a new language could model.
There’s a second problem, however. I love to ask this in interviews: if something is marked final, can it be changed? It’s intentionally vague, and of course the answer is that the reference cannot change, but the value can. Immutability is important for the referenced object even more than for the reference, since immutable objects are thread-safe, can cache expensive operations, may be pooled, and it’s easier to reason about their state. They also have the same benefits as final variables – less mutable state at runtime is a Good Thing. Is there anything to be done at the language level to support the maxim, “Favor Immutability”?
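The interview answer is easy to check in a few lines of today's Java. FinalReferenceSketch is just an illustrative name.

```java
import java.util.ArrayList;
import java.util.List;

class FinalReferenceSketch {
    static List<String> demo() {
        final List<String> names = new ArrayList<>();
        names.add("salmon");            // allowed: the object itself is still mutable
        // names = new ArrayList<>();   // would not compile: the reference is final
        return names;
    }
}
```

The list comes back holding one element, even though the only variable that ever pointed at it was final.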
Let’s imagine what it could look like. We’ll add the mutable modifier to a class that is mutable:
    mutable class MyList<E> implements java.util.List<E> {
      List<E> delegate = new ArrayList<E>();

      public void add(E item) {
        delegate.add(item);
      }

      public E get(int index) {
        return delegate.get(index);
      }
      // More implementation for List methods ...
    }

    class YourList<E> implements java.util.List<E> {
      List<E> delegate = new ArrayList<E>();

      // We don't even want a mutator, but have to satisfy our interface
      public void add(E item) {
        throw new UnsupportedOperationException();
      }

      public E get(int index) {
        return delegate.get(index);
      }
      // More implementation for List methods ...
    }

    //reference and value are both mutable
    mutable List<Object> foo = new MyList<Object>();
    foo = new YourList<Object>(); // now the value is immutable
This is no different from what Java does already: you have two classes, one that is mutable, and one that isn’t, or a method like Collections.unmodifiableList() that returns a wrapper around a list with exceptions thrown by the mutator methods – in the latter case, you don’t find out until runtime that mutation was attempted. Just like with the final modifier, the status quo is that everything is mutable, and thoughtful programmers sometimes use a utility method to produce immutable instances or have to write a second copy of their classes that lacks or hides the mutator methods. Note that there are lots of classes named “Immutable*” out there – maybe that’s backwards too. Again, we’d like the language to give us immutable objects unless we specifically ask for a mutable instance.
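The runtime-only failure is easy to reproduce with the standard library. The sketch below uses the real Collections.unmodifiableList wrapper; the surrounding class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UnmodifiableSketch {
    static boolean mutationRejectedAtRuntime() {
        List<String> backing = new ArrayList<>();
        backing.add("a");
        List<String> readOnly = Collections.unmodifiableList(backing);
        try {
            readOnly.add("b");  // compiles fine: the List type still exposes add()
            return false;
        } catch (UnsupportedOperationException e) {
            return true;        // the mistake only surfaces when this line runs
        }
    }
}
```

Nothing in the declared type of readOnly tells the compiler, or the reader, that add() is off limits.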
So we’ll just have the one MyList class, and we’ll need to mark the methods that are mutators. Those methods are only available, at compile time, if a mutable instance of the object was constructed. Since the mutable keyword is already in use for the referencing variable, we’ll need some other way to request mutable instances, maybe a mutable version of the new() operator. We can also mark methods as returning mutable instances, since we have made mutability part of our type system, and we don’t want callers to be able to cast an immutable return value into a mutable instance.
    class MyList<E> implements java.util.List<E> {
      List<E> delegate = new ArrayList<E>();

      mutator void add(E item) {
        delegate.add(item);
      }

      public E get(int index) {
        return delegate.get(index);
      }

      // Return a mutable copy:
      public mutable MyList<E> copy() {
        // call the copy constructor,
        // using the mutable() keyword instead of new()
        return mutable MyList<E>(this);
      }
      // More implementation for List methods ...
    }

    mutable MyList<Object> foo = new MyList<Object>();
    // Compile-time error: This instance of MyList is immutable
    foo.add("Not allowed");
    foo.copy().add("Can do this");
    foo = mutable MyList<Object>();
    foo.add("Good");
Is this possible? I’m still pondering the implications. Here are a few:
- Can a non-mutator method call a mutator method? Probably that should be a compile-time error.
- Should support reflection on whether an instance is mutable.
- Can the compiler figure out what are mutators without marking the method? Right now, if we forget to mark a method, it violates the immutability of our object. Or it could be reversed: a modifier could be required for the methods that are legal on immutable instances.
- Maybe if a newly created object is assigned to a mutable variable, it should be mutable? Then we could avoid this new brother of the new() operator.
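One way to experiment with these questions in today's Java is to split the type in two, so that the mutators only exist on a sub-interface. This is a sketch of that workaround, not of the proposed mutable and mutator keywords, and all the type names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class MutabilityTypesSketch {
    // Read-only view: this type has no mutators at all.
    interface ReadableList<E> {
        E get(int index);
        int size();
        MutableList<E> copy();  // the explicit way to ask for a mutable instance
    }

    // Mutators are only reachable through the mutable type.
    interface MutableList<E> extends ReadableList<E> {
        void add(E item);
    }

    static final class MyList<E> implements MutableList<E> {
        private final List<E> delegate = new ArrayList<>();

        MyList() { }

        // Copy constructor used by copy().
        MyList(ReadableList<E> source) {
            for (int i = 0; i < source.size(); i++) {
                delegate.add(source.get(i));
            }
        }

        public void add(E item) { delegate.add(item); }
        public E get(int index) { return delegate.get(index); }
        public int size() { return delegate.size(); }
        public MutableList<E> copy() { return new MyList<>(this); }
    }
}
```

With ReadableList<String> foo = new MyList<>(), a call to foo.add("x") fails to compile while foo.copy().add("x") is fine. The weakness is that a caller can still downcast foo to MutableList, which is the kind of cast a language-level feature would need to forbid.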
I’d love to hear your comments.