Thursday, October 13, 2011

Oh dear. I just discovered that some places still link my name to this site. Eep! A better link would probably be my Google Profile, now.

Monday, June 14, 2010

Guava release 05!

Hello guavateers,

I'm happy to (finally) inform you that Guava release 05 was quietly posted two weeks ago! It is also in the central Maven repository (as all future releases will be, so you may stop asking :-)).

Here is a report of the changes between r04 and r05. For that matter, here are the changes from r03 to r04 as well. There have been a few very nice new additions, as I hope you'll agree -- including the humble beginnings of a brand new package,

Remember that most newly-added classes and methods are marked @Beta, and so are still subject to change at any time. Note: this is particularly true of the cool new InternetDomainName class, several methods of which I already need to rename soon (sorry).

About the "Google Collections Library":

Everyone! It's time to stop using the library called the "Google Collections Library"! Guava represents a fully-compatible proper superset of that library. It also contains six months' worth of important bug fixes and improvements to performance and documentation.

Continuing to use the Google Collections may lead to trouble when an application ends up with both that and Guava on the classpath at the same time. If this happens, and Guava comes later in the classpath, unpredictable breakages could result! (And if Guava comes earlier, then the google-collect JAR will never even be seen; either way, the situation is senseless.)

Tools like Maven seek to manage your dependencies in an intelligent way, but cannot tell that Guava represents a newer version of Google Collections, so as long as the latter is still in use, it won't know the right thing to do with it.

Yes, the Guava JAR file is about 60% larger than the Google Collections one, but if this is an issue for you, we strongly recommend you address this using a JAR shrinking tool such as the much-loved ProGuard, and please share your experiences in doing so with the rest of us on this list.


With this release, I now see Guava as truly ready to be evangelized to the corners of the globe. Would you be able to help us spread the word? Blogs, comments, twitter, podcasts, company discussion forums, skywriting, whatever you please. It's especially important to me that we convey the message that Guava is the new Google Collections, and no one should use google-collect-1.0.jar anymore.

Any questions?

Thanks everyone!

Monday, March 1, 2010

The little optimization that couldn't

Let's say you represent two groups of people. If I collect one penny from each member of Group A, then, from the unbounded kindness of my heart, give one dollar to each member of Group B, has the total wealth of the groups combined become greater or less?

If you answered "greater", this article's for you! You see, there's a question you forgot to ask -- and it's a question that we as programmers forget to ask constantly.

A nicely optimized method?

In our internal core Java libraries at Google, we have a method called StringUtil.repeat(). It returns a string consisting of n copies of a base string -- so StringUtil.repeat("hey", 3) produces "heyheyhey". When I first came across it, and cleaned it up a bit, its implementation looked a little bit like this:

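public static String repeat(String base, int count) {
  // Cases one and two: nothing to repeat.
  if (base.equals("") || count == 0) {
    return "";
  }
  // Case three: a single copy is just the original string.
  if (count == 1) {
    return base;
  }
  // Case four: actually build the repeated string.
  StringBuilder sb = new StringBuilder(base.length() * count);
  for (int i = 0; i < count; i++) {
    sb.append(base);
  }
  return sb.toString();
}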

What's going on here? Well, there are four basic cases.

Case one: base is the empty string, so the result should always be the empty string. We don't want to loop and all that, so this optimization returns an empty string straight away.

Case two: count is zero. Here again, why do any actual work? We should return the empty string here and now.

Case three: count is one. We can avoid instantiating a new string by returning the original string directly!

Case four: aww nuts. In this case, we really do have to loop and create the new string. Well, at least we optimized out as much as we could first!

In each case, we're very carefully doing the bare minimum amount of work we can! With me so far? Sounds good?

When I found this, I proceeded to make the method run even faster. Any guesses how I did it?

That's right, I simply removed all of these so-called optimizations.

Remember that to optimize a special case, you must check for that special case... in every case! That small extra cost goes to every single user of the method. And what of the benefits? Hah! Notice that the "optimized" special cases are the cases that are the fastest to compute anyway!

What's more, surprisingly enough, it doesn't really happen that often that users call a repeat() method passing zero or one as the count. Why would they? Commonly, that is. So we're "optimizing" a case that hardly even exists, at the expense of all the cases that do exist. The special-case checks were a net loss, and better off removed.
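For comparison, here is a minimal sketch of what the method might look like with the special cases stripped out. It is a reconstruction for illustration only, not the actual internal code, and the class name RepeatSketch is made up. Negative counts are not handled.

```java
final class RepeatSketch {
  /** Returns count copies of base concatenated together. */
  static String repeat(String base, int count) {
    // No special cases: for an empty base, or a count of 0 or 1,
    // the loop below simply does little or no work.
    StringBuilder sb = new StringBuilder(base.length() * count);
    for (int i = 0; i < count; i++) {
      sb.append(base);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(repeat("hey", 3));
  }
}
```

The "special" inputs still produce the right answers; they just take the same path as everyone else.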

It's not like my removing them radically improved the performance of anyone's application. However, the experience is useful as a lesson. You'll encounter this same situation many times in many guises, and it will often be tempting to think about the benefits to the few rather than the overall aggregate cost to society as a whole (hmm... remind you of any politicians?)

If the Group A of my opening scenario has thousands of times more members than Group B -- not to mention that dollar really turns out to be a dime -- it's a bad deal. Just say no!

Monday, April 6, 2009

Anyone still out there?

Wow, I can't believe it's already been almost a year since I posted anything meaningful on this blog. Blogging is a total ON/OFF switch with me.

However, now I must post, in order to tell you that you should (a) listen to me and Jesse on Java Posse (actually, we were on the previous episode also) and (b) download the first release candidate of Google Collections Library 1.0!

Well, that's my news. How are you?

Tuesday, September 23, 2008

So good I had to come out of blogging hiatus just to share it with you:
"Governor Palin, are you ready at this moment to perform surgery on this child's brain?"

"Of course, Charlie. I have several boys of my own, and I'm an avid hunter."

"But governor, this is neurosurgery, and you have no training as a surgeon of any kind."

"That's just the point, Charlie. The American people want change in how we make medical decisions in this country. And when faced with a challenge, you cannot blink."

The prospects of a Palin administration are far more frightening, in fact, than those of a Palin Institute for Pediatric Neurosurgery. Ask yourself: how has "elitism" become a bad word in American politics? There is simply no other walk of life in which extraordinary talent and rigorous training are denigrated. We want elite pilots to fly our planes, elite troops to undertake our most critical missions, elite athletes to represent us in competition and elite scientists to devote the most productive years of their lives to curing our diseases. And yet, when it comes time to vest people with even greater responsibilities, we consider it a virtue to shun any and all standards of excellence. When it comes to choosing the people whose thoughts and actions will decide the fates of millions, then we suddenly want someone just like us, someone fit to have a beer with, someone down-to-earth—in fact, almost anyone, provided that he or she doesn't seem too intelligent or well educated.
Incredibly well said by Sam Harris.

Tuesday, August 12, 2008

Google Collections presentation

If you'd like to know more about the Google Collections Library, here's a video you can watch:

It continues here. Or if you're in a rush you can just skim the slides, not that I recommend it. :-)

Hope you find it informative!

Wednesday, July 9, 2008

I'm curious: what do you use to parse command-line options in your Java programs?

You know, -r, -w, -rwl, --use-flibbert, --foobar=12 and all that kinda stuff.

Wednesday, July 2, 2008

small favor

Hey, everyone? Like, the next time you're making some clever analogy about something you don't like, involving the concept of "riding the short bus?" You know what I mean -- like "maan, with all these other cool languages nowadays, Java's really, heheh, riding the short bus!" (pat self on back)

... Just kinda keep in mind that some of us have a son or daughter who's the love of our life who actually literally does ride the short bus. And your joke is just kinda less amusing to us. hokay? kthx, that's all.

Monday, June 16, 2008

Common Java unchecked exception types

I've noticed a lot of confusion about what type of unchecked exception is the right one to throw under various circumstances. Here's a very simple explanation of the most common types.

NullPointerException - multiple schools of thought on this one. Of course, it's thrown automatically by the runtime when you try to dereference null. Many say that you should never rely on this behavior, and should always check for null explicitly. Many also believe that when you find a null reference, you should throw IllegalArgumentException instead of NPE. This way, a thrown NPE always indicates some programming error in the implementation of the method, not a failure of the caller to pass valid parameters. I'm not taking a stand on this issue right now.

IllegalArgumentException - throwing this exception implies that there exists at least one other value for this parameter that would have caused the check in question to pass. If the caller can't remedy this exception by substituting another value for the argument in question, it's the wrong exception to throw. Note that in some of these cases IndexOutOfBoundsException is more appropriate (and strangely, IOOBE doesn't extend IAE).

IllegalStateException - this exception implies that there are no argument values that could have caused the check to succeed, yet, there does exist at least one alternate state that the instance in question could have been in, which would have passed the check. Note that this type almost never makes sense for a static method, unless you rely heavily on static state (shame on you). Note also that this exception is appropriate whether or not it is possible to actually mutate this aspect of the instance's state, or it's already too late.

UnsupportedOperationException - this means that the method invoked will always fail for an instance of this class (concrete type), regardless of how the instance was constructed.

AssertionError - this is the right exception to use whenever a statement should by rights be impossible to reach.
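To make the distinctions above concrete, here is a small sketch. Everything in it (Playlist, ReadOnlyPlaylist, Mode, the rating rule) is invented for illustration, and the explicit NullPointerException reflects just one of the schools of thought mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

final class ExceptionChoiceSketch {
  enum Mode { READ, WRITE }

  /** Hypothetical class; none of these names come from a real library. */
  static class Playlist {
    private final List<String> songs = new ArrayList<>();
    private boolean closed;

    void add(String song, int rating) {
      if (song == null) {
        // One school of thought: fail fast with an explicit NPE.
        throw new NullPointerException("song");
      }
      if (rating < 1 || rating > 5) {
        // Some other rating (1 through 5) would have passed this check.
        throw new IllegalArgumentException("rating out of range: " + rating);
      }
      if (closed) {
        // No argument could help, but a different state (not closed) would.
        throw new IllegalStateException("playlist is closed");
      }
      songs.add(song);
    }

    void close() {
      closed = true;
    }
  }

  static class ReadOnlyPlaylist extends Playlist {
    @Override
    void add(String song, int rating) {
      // Always fails for this concrete type, however it was constructed.
      throw new UnsupportedOperationException();
    }
  }

  static String describe(Mode mode) {
    switch (mode) {
      case READ:
        return "reading";
      case WRITE:
        return "writing";
    }
    // Every Mode is handled above, so this should be impossible to reach.
    throw new AssertionError(mode);
  }

  /** Returns the simple name of whatever the action throws, or "none". */
  static String thrownBy(Runnable action) {
    try {
      action.run();
      return "none";
    } catch (RuntimeException | Error e) {
      return e.getClass().getSimpleName();
    }
  }
}
```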

I hope this helps. Any points you want to argue?

Tuesday, April 29, 2008

JavaOne approacheth

Just got my sessions all scheduled. As usual, I chose them more for the speakers than for the topics; there are certain individuals who I just know can speak well and tend to talk about topics I like -- people like Brian Goetz, Bill Pugh, Cliff Click, Joshua Bloch, etc. Whatever they talk about, I go.

If you're attending this year, and you might like to meet me and chat about Java, collections, Guice, working at Google, the smallwig or whatever, well gosh, I'd like that too. What you can do is stop by the Google booth in the pavilion at one of these times:

  • Tuesday May 6, 2:00-3:00 pm
  • Wednesday May 7, 12:30-1:30 pm
  • Thursday May 8, 12:30-1:30 pm
And then look for the guy wearing a Google t-shirt who looks like this but needs a haircut. Say hello and we can talk about whatever. That would be cool.

Also, if you're a Guice user, please see if you can come to BOF-6400, The Future Of Guice, which is on Thursday at 7:30 pm. Bob, Jesse and I will all be there for an informal fireside chat about forthcoming Guice goodness.

Hope to see you!

And for those not coming to JavaOne, what conferences will you be attending over the next year, if any?

I'm a twit

For those who care about such things, I'm on twitter now.

Friday, April 25, 2008

Interesting Stuff I Read

I've added a link to my Google Reader shared items in my sidebar. You can view that, or subscribe to its feed, or whatever! (You don't have to be using Google Reader... though you should). :-)

Thursday, April 24, 2008

I get to break awesome news

I asked Josh if I could have the pleasure of breaking this news on my li'l blog here, and unbelievably, he actually said "sure."

Lucky me! Here's the news:

Effective Java, Second Edition by Joshua Bloch has gone to press and copies will be available at JavaOne in two weeks.

Hooray! We've been waiting for this for a long time.

Having read it (again, lucky), I'll quickly tell you my opinion (personal opinion only, not an endorsement by my employer, and feel free to disregard it as biased if you like).

You probably all know how valuable the first edition is already. The new edition really takes it a step further. It's vastly improved and has entire new sections on generics, enums, annotations, and other recent Java developments. The concurrency chapter was completely redone to reflect the "java.util.concurrent" new world order. There's a wealth of new information about serialization pitfalls and patterns, and the list goes on.

It is not just the Effective Java you know with a few extra chapters tacked on! Josh has painstakingly revisited every single line of every single page. I believe it shows.

This book will certainly replace its predecessor as the bible of our craft. Many of the code reviews I do for Java library code at Google basically end up with me spouting chapter and verse from EJ, and I can't wait for everyone to get the new edition so I can start doing the same with it!

(Not linking to amazon because I'm peeved at them; they let you click "see inside the book" but then they just show you the insides of the first edition, leading you to think that nothing has changed.)

Pure functions

To its detriment and yours, the Java language makes no distinction between a pure function, and any plain old subroutine. Even in the core libraries, the two are freely intermingled, with no obvious distinguishing characteristic. Yet we can all benefit from striving to make this distinction clear in our own code.

By "pure function" I mean a function in the mathematical sense: it performs a calculation with no observable side-effects, and its result depends only on its arguments. Invoke it again on the same instance (or Class if static), and with the same arguments in the same states, and you must always get the same answer.

What are some advantages of pure functions?

  • They're testable
  • They're thread-safe (though not necessarily "thread-correct", more on this later)
  • They're deterministic
  • They never need to be mocked out*
  • They're easier to understand and reason about
  • They're "referentially transparent," so they can be "memoized" (more on this later)

They're the easy kind of functions to work with, just like immutables are the easy variety of data objects.

(*About this particular claim. Have you ever felt compelled to test how your class behaves if the implementation of integer addition were to change? I doubt it, unless you're just plain batshit crazy, or a mathematician (but I repeat myself). In rare cases, if a pure function is very expensive, you may want to mock it anyway just to make your test runs faster. But you didn't "need" to do it.)

When is a function pure?

All its dependencies must be pure functions themselves (or constants, which are basically just pure functions that have no arguments). Impurity, just like it sounds, is a contaminant. If your method calls eight other methods, and just one of those calls a method which sometimes calls a method which uses System.currentTimeMillis(), kaboom: your function is not pure.

So a method which invokes new Random(5) may still be pure (as guaranteed by that class's specification), while one that invokes new Random() certainly is not. Collections.shuffle(), the two-argument form, is pure, while Collections.shuffle() the one-argument form is not. (wait, duh, neither is pure, because they mutate the passed-in list! but maybe you see the point anyway?) Now you see the "intermingling" I was bemoaning before!
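A tiny sketch of the Random half of that comparison, with made-up method names. The seeded version relies on the documented guarantee that two Random instances created with the same seed produce the same sequence.

```java
import java.util.Random;

final class SeededRandomSketch {
  /** Pure: a fresh Random with a fixed seed yields the same value on every call. */
  static int seededDraw() {
    return new Random(5).nextInt(100);
  }

  /** Not pure: the no-argument constructor picks a seed that varies per instance. */
  static int unseededDraw() {
    return new Random().nextInt(100);
  }

  public static void main(String[] args) {
    System.out.println(seededDraw() == seededDraw());     // always true
    System.out.println(unseededDraw() == unseededDraw()); // may be true or false
  }
}
```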

What are the most common sources of impurity in my code?

Some I can think of:
  • mutable state
  • the system clock
  • I/O

I'm sure there are more. Help me out here: what others can you think of?

Are impure functions evil?

No, of course not. If they were, I would never be able to write any, as it would be against company policy. They're simply very different from their pure cousins, and more challenging to work with and to test. Keeping your functions pure, like keeping your value objects immutable, just gives you less to worry about. (Remember that hit song "Mo' Mutatin', Mo' Problems?" Toootally analogous to that. Listen to Biggie, he knew.)

How to deal with impurity?

I've told you that the system clock is a contaminant, that makes everything it touches impure. But, of course, some of your business logic probably needs to know the current time. Are you just hopelessly contaminated as well?

No! You have at your disposal a chlorine tablet called dependency injection! (You just knew it would come to that, didn't you?)


Before:

  public class SignUtils {
    public static String getCurrentMessage() {
      Instant now = new Instant(); // automatically set to now
      return someCalculation(now) ? "OPEN" : "CLOSED";
    }
  }

After (simplified):

  public class SignController {
    @Inject Clock clock;

    // An instance method, so that it can use the injected clock.
    public String getCurrentMessage() {
      Instant now = clock.now(); // assumes Clock exposes a now() method
      return someCalculation(now) ? "OPEN" : "CLOSED";
    }
  }

The result is a function which can be either pure or impure depending on what dependencies are provided for it. In "real life", you need it to be impure, and return a different result at 9:01 than it did at 8:59. But this nondeterminism has now been walled off behind an interface. Because the result of getCurrentMessage() itself now depends only on the states of its arguments (none) and the state of its instance, it will always be just as pure as its provided clock instance is. Now the code is testable, because we properly isolated the impurity.
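Here is a self-contained sketch of the same idea that runs without any injection framework: the clock is simply passed to the constructor by hand. The TimeSource interface, the fixedClock helper and the 9-to-5 opening rule are all assumptions made up for this example, and it uses java.time rather than the Instant class from the snippets above.

```java
import java.time.Instant;
import java.time.ZoneOffset;

final class SignSketch {
  /** Hypothetical stand-in for the injected Clock. */
  interface TimeSource {
    Instant now();
  }

  static final class SignController {
    private final TimeSource clock;

    SignController(TimeSource clock) {
      this.clock = clock;
    }

    String getCurrentMessage() {
      // Made-up business rule: open from 09:00 (inclusive) to 17:00 (exclusive), UTC.
      int hour = clock.now().atZone(ZoneOffset.UTC).getHour();
      return (hour >= 9 && hour < 17) ? "OPEN" : "CLOSED";
    }
  }

  /** Impure clock for production use. */
  static TimeSource systemClock() {
    return Instant::now;
  }

  /** Pure clock for tests: always reports the given ISO-8601 instant. */
  static TimeSource fixedClock(String isoInstant) {
    Instant fixed = Instant.parse(isoInstant);
    return () -> fixed;
  }

  public static void main(String[] args) {
    System.out.println(new SignController(fixedClock("2008-04-24T08:59:00Z")).getCurrentMessage());
    System.out.println(new SignController(fixedClock("2008-04-24T09:01:00Z")).getCurrentMessage());
  }
}
```

With a fixed clock, getCurrentMessage() gives the same answer every time you ask, which is exactly what a test wants.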

In summary:
  • Pay attention to the difference between your pure and impure functions.
  • Use dependency injection to limit the damage radius of impure functions.
  • If you're designing the Next Great Language, ferchrissakes handle these two things differently. Don't make the system time available via a simple static method call.

Thanks for reading. Let me know if this kind of post is helpful to you!

Tuesday, April 22, 2008

fun with IdentityHashMap

What does this program print? (Eliding the generics so you can read it.)

public static void main(String[] args) {
  Map m = new IdentityHashMap();
  m.put("a", 1);
  m.put("a", 2);
  m.put("a", 3);
  System.out.println(new HashSet(m.entrySet()).size());
}

When you've got the answer, scroll down...

The answer is 1. Even though this is an identity-based HashMap, String literals are interned, so after the first entry is created, it is overwritten two times leaving a map of size one. This single entry will then be placed into the HashSet, so the HashSet has size one.
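If the interning part is the bit that tripped you up, this sketch (with invented method names) isolates it: two occurrences of the literal "a" are the same object, while new String("a") is an equal but distinct one.

```java
import java.util.IdentityHashMap;
import java.util.Map;

final class IdentityKeySketch {
  static int sizeWithLiteralKeys() {
    Map<String, Integer> m = new IdentityHashMap<>();
    m.put("a", 1);
    m.put("a", 2); // same interned String object, so this overwrites
    return m.size();
  }

  static int sizeWithDistinctKeyObjects() {
    Map<String, Integer> m = new IdentityHashMap<>();
    m.put("a", 1);
    m.put(new String("a"), 2); // equal to "a", but a different object
    return m.size();
  }

  public static void main(String[] args) {
    System.out.println(sizeWithLiteralKeys());
    System.out.println(sizeWithDistinctKeyObjects());
  }
}
```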

If you got it right, congrats. Now let's make a small change.

public static void main(String[] args) {
  Map m = new IdentityHashMap();
  m.put("a", 1);
  m.put("b", 2);
  m.put("c", 3);
  System.out.println(new HashSet(m.entrySet()).size());
}

Now what does it print?

Once you've decided on your answer, compile and run the code (sorry about the warnings). Were you right?

Update: Ok, this isn't doing the same thing for y'all that it was doing for me. And now it's not doing it for me either. :) Ok look. Try this: remove the call to .size(). Just print out the entry set itself. Guess what it's going to be first. Then see. It'll be worth it, really!

Monday, April 21, 2008

"   "

Who among my readers believes that he/she has a firm grasp on the meaning of the term "whitespace" as it applies to modern Java development?


Whitespace! How much simpler could anything be than that?

Yeah. Well guess what. I've found, so far, six conflicting definitions worth knowing about. They are summarized in this table for your viewing enjoyment. I daresay you will be surprised at how bad the situation is.

Wednesday, April 16, 2008

The real difference between List<Object> and List, illuminated at last

Suppose you're a store clerk, and a customer asks you, "what kinds of credit cards do you accept?"

The difference between List<Object> and List is basically the difference between answering this question "we accept all kinds," and answering it, "duuuuhhhhhhh?"
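In code, the "duuuuhhhhhhh?" looks roughly like this. It is only a sketch with made-up names: the raw List will happily accept anything, and the damage shows up later, somewhere else.

```java
import java.util.ArrayList;
import java.util.List;

final class RawListSketch {
  @SuppressWarnings({"rawtypes", "unchecked"})
  static boolean rawListCorruptsTypedList() {
    List<String> names = new ArrayList<>();

    // List<Object> objects = names; // does not compile: a List<String> is not a List<Object>
    List raw = names;                // compiles (with a warning): a raw List promises nothing
    raw.add(42);                     // unchecked call: an Integer sneaks into a List<String>

    try {
      String first = names.get(0);   // the compiler-inserted cast to String fails here
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(rawListCorruptsTypedList());
  }
}
```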

The smallwig theory of optimization

There are three kinds of optimization.
  1. Optimization by using a more sensible overall approach.
  2. Optimization by making the code less weird.
  3. Optimization by making the code more weird.
You've probably heard, and maybe even spouted yourself, the phrase "premature optimization is the root of all evil." It's exclusively "Type 3 optimization" that this aphorism applies to. Types 1 and 2 are quite fine to engage in pre-emptively.

To make a type 3 optimization, your burdens are six:
  1. Thou shalt have excellent, comprehensive unit tests.
  2. Thou shalt have a reliable benchmark, based on representative inputs.
  3. Thou shalt demonstrate that your change improves the benchmark.
  4. Thou shalt successfully argue that this improvement really matters.
  5. Thou shalt comment the code.
  6. In nontrivial cases, thou shalt also preserve the clear-but-slow implementation, to use in parity tests with your optimized implementation.
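As a sketch of what burden 6 might look like in practice, here is a deliberately trivial pair of implementations with a parity check between them. The example (power-of-two detection) and all the names are made up; a real parity test would live in your unit test suite and use representative inputs.

```java
final class ParitySketch {
  /** Clear-but-slow reference: halve the value for as long as it stays even. */
  static boolean isPowerOfTwoSlow(int x) {
    if (x <= 0) {
      return false;
    }
    while (x % 2 == 0) {
      x /= 2;
    }
    return x == 1;
  }

  /** The "more weird" version: a positive power of two has exactly one bit set. */
  static boolean isPowerOfTwoFast(int x) {
    return x > 0 && (x & (x - 1)) == 0;
  }

  /** Parity check: both implementations must agree on every input in [from, to]. */
  static boolean agreeOnRange(int from, int to) {
    // Note: 'to' must be below Integer.MAX_VALUE, or this loop never ends.
    for (int x = from; x <= to; x++) {
      if (isPowerOfTwoSlow(x) != isPowerOfTwoFast(x)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(agreeOnRange(-1000, 100000));
  }
}
```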
In all things, remember these truths:
  1. Your brain is a terrible profiler.
  2. Hotspot will outsmart you.
  3. It just doesn't matter, until it matters.
If you believe this post, please spread the word!

Friday, January 25, 2008

Now with "customizable sodomy!"

Holy shit, I want the version of Mass Effect that this guy played!

Mine is apparently defective. A damn good game though... but I think the sex-with-blue-chicks was about as hot as Kirk's typical planetary exploits back in the 60s.

Thursday, December 6, 2007

Can you watch this without smiling?

If so... I don't know. You might be a replicant or something.

Why does Set.contains() take an Object, not an E?

Virtually everyone learning Java generics is initially puzzled by this. Why should code like the following compile?

Set<Long> set = new HashSet<Long>();
if (set.contains(10)) {
  // we won't get here!
}
We're asking if the set contains the Integer ten; it's an "obvious" bug, but the compiler won't catch it because Set.contains() accepts Object. Isn't this stupid and evil?
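Here is a runnable version of that bug, as a small sketch with invented method names. The only difference between the two methods is the L suffix on the argument.

```java
import java.util.HashSet;
import java.util.Set;

final class ContainsSketch {
  static boolean containsIntTen() {
    Set<Long> set = new HashSet<>();
    set.add(10L);
    return set.contains(10); // 10 is boxed to an Integer, which never equals a Long
  }

  static boolean containsLongTen() {
    Set<Long> set = new HashSet<>();
    set.add(10L);
    return set.contains(10L); // boxed to a Long, so the lookup succeeds
  }

  public static void main(String[] args) {
    System.out.println(containsIntTen());
    System.out.println(containsLongTen());
  }
}
```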

A popular myth is that it is stupid and evil, but it was necessary because of backward compatibility. But the compatibility argument is irrelevant; the API is correct whether you consider compatibility or not. Here's the real reason why.

Let's say you have a method that wants to read from a Set of Foos:

public void doSomeReading(Set<Foo> foos) { ... }

The problem with this signature is it won't allow a Set<SubFoo> to be passed in (where SubFoo is, of course, a subtype of Foo).

To preserve the substitutability principle, any method that wants to read from a set of Foos should be equally able to read from a set of SubFoos, so let's tweak our signature:

public void doSomeReading(Set<? extends Foo> foos) { ... }


But here's the catch: if Set.contains() accepted type E instead of type Object, it would now be rendered completely unusable to you inside this method body!

That signature tells the compiler, "don't let anyone ask about containment of an object unless you are damn sure that it's of the exact right type." But the compiler doesn't know the type -- it could be a Foo, or SubFoo, or SubSubFoo, or who knows what? Thus the compiler would have to forbid everything -- the only safe parameter to a method like this is null.

This is the behavior you want for a method like Set.add() -- if you can't make damn sure of the type, don't allow it. And that's why add() accepts only type E while contains() accepts anything.
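A compact sketch of the same argument in code. Foo and SubFoo are the placeholder types from above; the extra candidate parameter and the demo() method are additions made up for this example.

```java
import java.util.HashSet;
import java.util.Set;

final class WildcardSketch {
  static class Foo {}
  static class SubFoo extends Foo {}

  static boolean doSomeReading(Set<? extends Foo> foos, Foo candidate) {
    // foos.add(candidate);          // does not compile: the element type is unknown here
    return foos.contains(candidate); // fine, because contains() accepts any Object
  }

  static boolean demo() {
    Set<SubFoo> subFoos = new HashSet<>();
    SubFoo member = new SubFoo();
    subFoos.add(member);
    // A Set<SubFoo> can be passed in, and both questions get sensible answers.
    return doSomeReading(subFoos, member) && !doSomeReading(subFoos, new Foo());
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```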

So the distinction I'm making is between read methods and write methods, right? No, not exactly -- notice that Set.remove() also accepts Object, and it's a write method. The real difference is that add() can cause "damage" to the collection when called with the wrong type, and contains() and remove() cannot.

Uniformly, methods of the Java Collections Framework (and the Google Collections Library too) never restrict the types of their parameters except when it's necessary to prevent the collection from getting broken.

So what to do about this vexing source of bugs, as illustrated at top? Well, when I typed that code into IntelliJ, it flagged a warning for me right away. This let me know to either fix the problem or add an annotation/comment to suppress it. Problem solved.

Static analysis plays an extremely important role in the construction of bug-free software. And the very best kind of static analysis is the kind that pops up in your face the second you write something questionable.

The moral of the story: if you're not coding in a good, modern IDE, you're coding with one hand tied behind your back!

Thursday, November 15, 2007

Occam's Coder

When deciding between two or more choices for a style rule in your team's programming style guide, and all of these lead to code that is equally maintainable, prefer the rule you can explain in the fewest words.

"Readability" is also everyone's goal, but fails as a criterion because it is far too subjective.

Friday, November 9, 2007

What Google Java engineers do while... waiting... for IntelliJ to unhose itself...

This discussion happened on our internal Java mailing list.

Dan: "What libraries do people use for generating PDFs from java?"

Tim: "I used iText. It works pretty well."
Fernando: "+1 for iText."
Steve: "Ditto"
Joseph: "we are using itext..."
Mike: "Calendar uses iText..."
Isaac: "+1 for iText on (Google Spreadsheets)"

Me: "Did six Googlers just AGREE on something in a company-wide mailing list thread?? Did that really just happen?? My God, there really is a first time for everything."

Mike: "I disagree. This assumes that time runs in a single direction. If it doesn't, then if an event occurs more than once then the first time could be called the last time and vice versa so neither can be reliably called a first time."

Me: "I've now gained a new appreciation for the complexity of the problems you folks on the Calendar team have to contend with. Clearly we're moving beyond i18n to i19n: interuniversalization!"

Mike: "Yep. We have an implementation of scheduling in multiple time dimensions ready to go and Jeff allocated a pocket universe for testing but production doesn't want to set up a datacenter there."

Kathrin: "Can't you test in a parallel universe where production has agreed to set up your datacenter?"

Well, I laughed my ass off.

Tuesday, November 6, 2007

Minor API fixes for JDK 7

Josh, Doug and I are proposing a handful of minor API additions to the Java Class Libraries (lang, util, math and reflect) for JDK 7.

The very quick overview of our recommendations:
  1. Static compare() methods for all non-void wrapper classes, not just Double and Float.
  2. Static hashCode() methods on all non-void wrapper classes.
  3. Integer.mod() and Long.mod()
  4. RoundingMode.round(double)
  5. Arrays.concat()
  6. EnumSet.complementOf(Collection)
  7. All JDK maps should implement putIfAbsent(), etc.
  8. Proxy.newProxyInstance(type, handler)
We're pretty confident that these changes will make it into JDK 7, barring any particular controversy that could develop.
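To give a flavor of items 1 and 5, here is a sketch of the kind of helper being asked for. These are hypothetical stand-ins written for this example, not the signatures from the proposal document.

```java
import java.util.Arrays;

final class Jdk7WishlistSketch {
  /** Overflow-safe comparison, unlike the tempting "a - b" idiom. */
  static int compare(int a, int b) {
    return (a < b) ? -1 : ((a == b) ? 0 : 1);
  }

  /** Returns a new array holding the elements of first followed by those of second. */
  static int[] concat(int[] first, int[] second) {
    int[] result = Arrays.copyOf(first, first.length + second.length);
    System.arraycopy(second, 0, result, first.length, second.length);
    return result;
  }

  public static void main(String[] args) {
    System.out.println(compare(Integer.MIN_VALUE, 1));
    System.out.println(Arrays.toString(concat(new int[] {1, 2}, new int[] {3})));
  }
}
```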

Please have a read over the document. What do you think about our proposals? What do you think we've missed?

Saturday, November 3, 2007


if (isBloggingInPseudocode()
    && intendedEffect() == Effect.HUMOR) {
  try {
    // this line unreachable
  } catch (NotFunnyException e) {
  }
}

Thursday, October 25, 2007

Understanding CLS

This post is about all of us helping each other to understand CLS. CLS is an acronym for "Class Loader Stuff."

Man, I just really don't understand Class Loader Stuff.

Doing what I do, it happens fairly often that someone busts into my office or starts an IM and says, "hey, you know a lot about Java, right? I need some help."

And I always get really excited whenever this happens. Because I just loooove answering questions about things I know! Who doesn't?

"Sure, sure!" I say, "make it quick though, cause I gotta run over to Neal's building and help him with some generics problems he's having" (this is what passes for humor in my cubicle).

And then it all goes downhill, because five seconds of the way in, I realize that they're asking about Class Loader Stuff.

Aw shit. Now I have to be all "um, well, remember how you were, um, asking if I was the guy who knew a lot about Java? yeah? Well, hehe, I was just kiddin, see. Pretty funny, huh?"

Nope, I don't understand class loaders and I don't understand all the fucked up problems that they seem to cause.

So here's what I'm going to do. In fact, I'm going to try to do this every time I become flummoxed by some major Java thing that I just don't get:

I'm going to come to this blog and explain the thing I don't understand to you.

How does that sound? It's really quite a novel privilege to be able to learn from someone who has no idea what they're talking about, I mean since graduation anyway, so I hope you'll enjoy this as much as I will!

Understanding CLS (Class Loader Stuff)

This post assumes that you understand, going in:

  • What a class is; sort of, at least

Because if you don't, then you're in even worse shape than me, man.

The easiest way to understand what a class loader is all about would be to understand the one and only one purpose that it has. Unfortunately, it has something like three purposes, so that's just not going to work.

Plan B. Let's just start with the simplest thing it does first. At the most basic level, a class loader is a thing that you can tell it a name and it will give you some bytecodez.

Class loader. n. It's a thing that you can tell it a name and it will give you some bytecodez.

It's a function from class name to class bytecode. You give it a String containing a valid Java class name, and it will use some mechanism to find and return you the bytecode to use for that name. So one of the things that makes different Class Loader implementations different is that they may each use different mechanics for how to come up with that bytecode.

  • Some of them might look at your classpath, read and unpack a JAR file, and get the bytes of a file ending in .class outta there
  • Some of them might make an HTTP request to the porn site you're currently frequenting and retrieve the class files from there (BoobieCam.class, etc.)
  • Some of them might just make some shit up and give it to you, then laugh at you with their friends later
  • Some of them want to abuse you
  • Some of them want to be abused

So that was the easy part. Of course, there's more.

The class loader doesn't just come up with this byte[] containing the bytecode for the class; it also provides these bytes to the JVM in an act known as defining the class. This just means -- well, the instant before it does this, that class does not exist in memory, and the instant after, it does.

Every single class your JRE has in memory was placed there by some class loader. The class loader is an obstetrician, delivering new baby classes into the world.

Quick: what was the name of the person who delivered you when you were born? You probably don't even know, do you? But the weird thing about classes is: they know. Whenever a class loader defines a new class, that class contains an immutable reference back to the class loader instance that defined it. You can ask it yourself: clazz.getClassLoader().
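You can try the question on a couple of classes with a sketch like this one (the class and method names are made up). One wrinkle worth knowing: core classes such as String are defined by the bootstrap loader, and Java represents that loader as null.

```java
final class WhoLoadedMe {
  static ClassLoader loaderOf(Class<?> clazz) {
    return clazz.getClassLoader();
  }

  public static void main(String[] args) {
    // A class from your own program: typically the application class loader.
    System.out.println(loaderOf(WhoLoadedMe.class));
    // A core class: defined by the bootstrap loader, reported as null.
    System.out.println(loaderOf(String.class));
  }
}
```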

But now we're finally getting to the interesting part: every class in memory in your runtime environment can be uniquely identified by the pair of

(a) its full name
(b) the class loader that loaded it

These two things together form, in database terms, the "unique key" of that class within this JVM process. Another way to say it is that each class loader gets its own independent namespace in which it can define classes.

It's as if you asked me my name and I said, "I'm Kevin Dr. Bob Farquar" and you said, "nice to meet you, I'm Kevin Dr. Fenton Pulsifer" -- each of us known by the combination of our own name with the name of our obstetrician who delivered us.

When necessary, I will refer to "the class foo.Bar[A]" as a shorthand for "the unique class with the name foo.Bar which was loaded by class loader A".

You may have heard someone explain, or you may have explained yourself, "see, you can't cast a foo.Bar to a foo.Bar here even though it's the same class, because they came from different class loaders, so there's funny class loader hoodoo going on there." (My explanations are not usually that eloquent and cogent, but I try.)

Or you may have said, "Right, this class is a singleton, but that actually doesn't mean you have only one instance of the class per VM, it means you have one instance of this class per class loader."

But both of these explanations are incorrect. In the situations described, these are multiple different classes that have been loaded. They are not two different "versions" of the same class, they're just two different classes.

The class named foo.Bar defined by class loader A (shorthand: 'foo.Bar[A]') and the class named foo.Bar defined by class loader B ('foo.Bar[B]') have essentially nothing in common with each other. Just a name, and that's just coincidence, really. Having the same name as each other makes it perhaps more likely that they have the same bytecode as well, but this is irrelevant; they very easily may not have the same bytecode at all.

So when "the class foo.Bar" appears to have multiple different static states at the same time, or you sometimes see a ClassCastException for trying to cast a Bar to a Bar -- what's going on is not as mysterious as it first seems. You simply have two classes both using the same name.
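To see "two classes both using the same name" for yourself, here is a rough sketch. Everything in it (SameNameDemo, Counter, IsolatingLoader) is a hypothetical name invented for the demo, and it assumes Java 9 or later and that the compiled .class files are reachable as resources through the parent loader. The loader borrows its parent's way of finding the bytes but performs the defining itself, so every new loader instance gives you a brand-new Counter.

```java
import java.io.IOException;
import java.io.InputStream;

class SameNameDemo {

    /** Guinea pig with a bit of static state. */
    public static class Counter {
        public static int count = 0;
    }

    /** Defines one chosen class itself and hands every other name to its parent. */
    static class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded != null) {
                    return loaded;
                }
                // Borrow the parent's mechanics for finding the bytes...
                String resource = name.replace('.', '/') + ".class";
                try (InputStream in = getParent().getResourceAsStream(resource)) {
                    if (in == null) {
                        throw new ClassNotFoundException(name);
                    }
                    byte[] bytes = in.readAllBytes();
                    // ...but define the class here, in this loader's namespace.
                    return defineClass(name, bytes, 0, bytes.length);
                } catch (IOException e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
        }
    }

    /** Loads Counter through a brand-new loader each time it is called. */
    static Class<?> freshCounterClass() {
        String name = Counter.class.getName();
        ClassLoader parent = SameNameDemo.class.getClassLoader();
        try {
            return new IsolatingLoader(parent, name).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static int getCount(Class<?> counterClass) {
        try {
            return counterClass.getField("count").getInt(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static void setCount(Class<?> counterClass, int value) {
        try {
            counterClass.getField("count").setInt(null, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if an instance of the given class can be cast to the ordinary Counter. */
    static boolean castableToCounter(Class<?> counterClass) {
        try {
            Object instance = counterClass.getDeclaredConstructor().newInstance();
            return Counter.class.isInstance(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Call freshCounterClass() twice and you get two Class objects with the same name that are not equal to each other, that each keep their own copy of count, and whose instances cannot be cast to the plain old Counter.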

So far, I've mentioned two things the class loader does: it (a) gets the bytecode somehow and (b) performs the actual action of defining the class in the VM. These two functions are quite separate: it happens often that class loader B will want to use the same mechanics for obtaining the bytecode as class loader A does, so it will delegate to class loader A for that part. Then once that's done, class loader B will be the one to define the class, so the class will live in class loader B's namespace, and class loader A's noble contribution to this whole affair is just forgotten by everyone.

So. You have this big old soup of classes in memory, some of them have the same names as other ones, but the pair of (name, class loader) is always unique. And there are no hard barriers between these groups of classes; that is, the class Foo[A] can extend, implement, or in any other way refer to the class Bar[B] which comes from a different class loader. There is nothing weird about that.

Except how can that even happen? If I'm loading class Foo, and it extends class Bar, isn't that going to automatically trigger the loading of class Bar, and by the same class loader that's currently loading Foo? Well, yes, it is -- but the class loader can be crafty!

When you ask class loader A to load class Foo, it can say "okay," then when this triggers a request for it to also load class Bar, it can say, "no way, I'M not loading THAT piece of tripe", and it can delegate that operation over to class loader B to carry out. (This is different from the example I gave earlier, where one class loader cruelly exploits another just to get the bytes, but still defines the new class itself. Here, the class loader lets the delegate define the new class itself.)

So now you have class Foo[A] and class Bar[B], and all the references from Foo to Bar will be interpreted as references to Bar[B], not Bar[Z] or whatever other Bars were sitting around.
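A sketch of that crafty delegation, under the same caveats as any toy example: CrossLoaderDemo, FooOnlyLoader, Foo and Bar are names invented here, and it assumes Java 9 or later with the compiled classes readable as resources from the parent loader. The loader agrees to define Foo itself but passes every other request, including the one for Bar that Foo triggers, to its parent.

```java
import java.io.IOException;
import java.io.InputStream;

class CrossLoaderDemo {

    public static class Bar {}

    public static class Foo {
        /** Whatever "Bar" means from this particular Foo's point of view. */
        public static Class<?> whichBar() {
            return Bar.class;
        }
    }

    /** Defines Foo itself; passes every other name along to the parent. */
    static class FooOnlyLoader extends ClassLoader {
        FooOnlyLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(Foo.class.getName())) {
                // "No way, I'M not loading THAT": the parent defines it.
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded != null) {
                    return loaded;
                }
                String resource = name.replace('.', '/') + ".class";
                try (InputStream in = getParent().getResourceAsStream(resource)) {
                    if (in == null) {
                        throw new ClassNotFoundException(name);
                    }
                    byte[] bytes = in.readAllBytes();
                    return defineClass(name, bytes, 0, bytes.length);
                } catch (IOException e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
        }
    }

    /** Loads Foo through a new FooOnlyLoader whose parent is our own loader. */
    static Class<?> loadFooInNewLoader() {
        ClassLoader parent = CrossLoaderDemo.class.getClassLoader();
        try {
            return new FooOnlyLoader(parent).loadClass(Foo.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Asks the given Foo class which Bar its own code refers to. */
    static Class<?> barSeenBy(Class<?> fooClass) {
        try {
            return (Class<?>) fooClass.getMethod("whichBar").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The Foo that comes back is a different class from the ordinary Foo, yet the Bar it refers to is the very same Bar the parent loader defined. Note that Bar has to be public for this to work, because the two classes no longer share a runtime package.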

I'll stop here for now, but if time permits, I will come back and explain how much I don't understand about:

  • the bootstrap class loader
  • the system class loader
  • the extension class loader
  • the application class loader
  • the context class loader

Value objects WTF!!?!!2!

You might recall that here at the smallwig we recently, geologically speaking, discussed the interesting and important topic of how to model a simple "value object" in the Java[TM] Technology Platform Language Technology[TM[TM]]. (note: not its exact name. I can never remember exactly how we're supposed to refer to J*va. F*ckin' J*va.)

Now, in that post, we considered a simple example -- a class with no special behavior, only two simple attributes.

We'd like this class to be simple, straightforward, well-behaved, idiomatic and correct.

Here's a look at how the code came out -- actually no, let's make it immutable this time, because immutable is simpler and easier. Here goes:

public final class Foo implements Serializable {
  private final String text;
  private final Integer number;

  public Foo(String text, Integer number) {
    this.text = text;
    this.number = number;
  }

  public String getText() {
    return text;
  }

  public Integer getNumber() {
    return number;
  }

  @Override public boolean equals(Object object) {
    if (object instanceof Foo) {
      Foo that = (Foo) object;
      // oops! I cheated and used a helper class from the
      // Google Collections Library!
      return Objects.equal(this.text, that.text)
          && Objects.equal(this.number, that.number);
    }
    return false;
  }

  @Override public int hashCode() {
    // oops! I did it again! ha ha!
    return Objects.hashCode(text, number);
  }

  @Override public String toString() {
    return String.format("[Foo: %s, %s]", text, number);
  }

  private static final long serialVersionUID = 0xB0B15C00L;
}

(Incidentally, this is the point in the previous post at which I proceeded to engage in the professionally dubious activity of laying down a few good old-fashioned "F-Bombs". Please note that it is generally considered inadvisable to spew "foul language" in a "technology blog" which you dream will become "respected" one "day." However, in some circumstances this approach is actually appropriate, for the basic reason that THIS IS A ****ING LOT OF CODE FOR SOMETHING SO MOTHER****ING SIMPLE WHAT THE ****.)

So anyway!

We have a problem here. Here's the problem:

  • A value object is a commonly-needed thing.
  • This is too much code to have to write for such a commonly-needed thing.
  • It's easy to get some of the subtle details wrong.
  • If we write tests for these idiotic classes, we're wasting time; if we don't write tests for these idiotic classes, we find out later that they're buggy because, say, we forgot to use a null-safe equality check for a nullable field, or something.
  • Any special behavior you want to add to the class just gets lost in the sea of boilerplate.
  • Uh oh -- now you want your value object to be Comparable too, say by a lexical comparison of its fields. More code to write and rewrite.
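For a feel of what that last bullet costs, here is one way the "lexical comparison of its fields" might be written on a newer JDK. This is only a sketch: OrderedFoo is a made-up name, the java.util.Comparator helpers used here arrived in Java 8 (well after this post), and sorting nulls first is an arbitrary choice on my part.

```java
import java.util.Comparator;

final class OrderedFoo implements Comparable<OrderedFoo> {
    // Compare by text first, then by number; null fields sort before non-null ones.
    private static final Comparator<OrderedFoo> BY_FIELDS =
        Comparator.comparing((OrderedFoo f) -> f.text,
                Comparator.nullsFirst(Comparator.<String>naturalOrder()))
            .thenComparing(f -> f.number,
                Comparator.nullsFirst(Comparator.<Integer>naturalOrder()));

    private final String text;
    private final Integer number;

    OrderedFoo(String text, Integer number) {
        this.text = text;
        this.number = number;
    }

    @Override public int compareTo(OrderedFoo that) {
        return BY_FIELDS.compare(this, that);
    }
}
```

Even with the helpers doing the heavy lifting, it is one more block that has to be kept in sync with the field list by hand.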

Now, what should we do?

Solution 1: Do nothing?

But this answer makes no sense. We've all learned time and time again the perils of code duplication. And this is egregious code duplication. Why should we tolerate it? We shouldn't.

Solution 2: IDE Templates to the rescue?

But wait, you say, I don't have to write this stuff, I just click-click-click-click in my IDE and it generates all of that for me! Problem solved!

The last thing I'll do is argue against this because "not everyone uses an IDE." I'll be totally honest with you: forget people who don't use an IDE. I'm sorry, you know, I believe in "to each his own" and all that, it's just that if "your own" is to "run away from tools that are there to help you and work really well", then I just can't save you from yourself. You know what I mean?

No, that's not it. Look: IDE-generated code is copy-and-paste code. That's all it is. The IDE has a template, it copies it, it pastes it, it changes stuff around. So why people who vehemently detest copy-and-paste coders would then go and have their IDE generate equals() and hashCode() and toString() and compareTo() and clone() methods for them I don't know.

Sure I've alt-Enter'd my way through the creation of many a class. I like generating constructors and automatically extracting fields. But I like it because it's a faster way to write the code that I could have written myself, and would have written the same way anyway.

But no, the equals() and hashCode() methods I've seen IDEs generate are hideously ugly. Which brings me to my other point: IDE templates are not a solution because they only address a small part of the problem. They make classes faster to write the first time, but they do nothing at all towards making your code easier to read or maintain.

Solution 3: Pair! Triplet! Quadruplet! .... McCaugheys?

Don't laugh (all right, laugh). A lot of people really are doing this. They're getting their equals() and all that for free by subclassing their objects from classes like Pair and Triplet and.... well, God, I really hope they just stop there. This brings up all kinds of subtle trouble. For example, you don't want someone's FooPair("a", 1) to be considered as equals() to someone else's BarPair("a", 1), but they kind of have to be, since Pair is a useful (if degenerate) collection class in its own right which demands the customary equals() behavior and subclasses deviating from that breaks "substitutability" and blah blah blah blah.

It's even worse when they don't bother with even this much, and have Pair showing up in their public API and all kinds of garbage.

Anyway, this is totally unscalable, so it's a non-solution and I don't think we should spend another column inch talking about this one.

Solution 4: We need a language change!

I once had this friend, a female friend, who was one of those rare people who had brains and beauty and a fun personality and wasn't a stark raving bitch, etc. But she had this problem that some people have, where she was incapable of ever falling in love with a guy unless that guy was somehow completely unavailable to her. She'd be smitten with him as long as he was married, or gay, or her faculty advisor, or her psychiatrist, or a minor, etc. But she kept never noticing the people who were right in front of her who she could have had any time she wanted. It was really sad.

Huh? Where was I? Oh yeah...

Solution 5: Reflectoporn

Here the idea is that you implement equals(), hashCode(), toString(), and compareTo() like this:

@Override public boolean equals(Object obj) {
  return Reflectomatic.equals(this, obj);
}

And these libraries would use reflection to look at all the fields of your class and do the expected fieldwise thing. If you had a field you didn't want to be considered for these purposes, perhaps you could annotate it to that effect.

And in fact, these methods could be defined in an abstract base class which you could extend so you'd have to write even less code. Our example above might turn into:

public final class Foo extends AbstractValueObject {
  private final String text;
  private final Integer number;

  public Foo(String text, Integer number) {
    this.text = text;
    this.number = number;
  }

  public String getText() {
    return text;
  }

  public Integer getNumber() {
    return number;
  }

  private static final long serialVersionUID = 0xC11FF15C00L;
}

Hrm. Well.... this isn't tooo bad, actually. We're supposed to abhor reflection, though, aren't we? Demonized as being slow, isn't it? Doesn't it just feel like cheating?

Let's bookmark this idea and just go ahead and see if we can do any better.
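For the curious, the reflective helper from Solution 5 could be sketched roughly like this. To be clear, Reflectomatic is imaginary, and FieldwiseDemo and Sample below are equally made-up names. The sketch only looks at the fields declared directly on the object's own class (not on superclasses), skips static fields, and compares array fields by identity, which a real library would want to handle better.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Objects;

final class FieldwiseDemo {
    private FieldwiseDemo() {}

    /** Null-safe, field-by-field equality for two objects of the exact same class. */
    static boolean fieldwiseEquals(Object a, Object b) {
        if (a == b) return true;
        if (a == null || b == null || a.getClass() != b.getClass()) return false;
        try {
            for (Field field : a.getClass().getDeclaredFields()) {
                if (skip(field)) continue;
                field.setAccessible(true);
                if (!Objects.equals(field.get(a), field.get(b))) return false;
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Hash code built from the same fields that fieldwiseEquals compares. */
    static int fieldwiseHashCode(Object o) {
        int result = 1;
        try {
            for (Field field : o.getClass().getDeclaredFields()) {
                if (skip(field)) continue;
                field.setAccessible(true);
                result = 31 * result + Objects.hashCode(field.get(o));
            }
            return result;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    private static boolean skip(Field field) {
        return Modifier.isStatic(field.getModifiers()) || field.isSynthetic();
    }

    /** A value object that leans on the helpers instead of hand-written boilerplate. */
    static final class Sample {
        private final String text;
        private final Integer number;

        Sample(String text, Integer number) {
            this.text = text;
            this.number = number;
        }

        @Override public boolean equals(Object obj) {
            return fieldwiseEquals(this, obj);
        }

        @Override public int hashCode() {
            return fieldwiseHashCode(this);
        }
    }
}
```

An annotation to exclude a field, as floated above, would just be one more condition inside skip().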

Oh noe! This post is another goddamn teaser again!

Tune in next time when I discuss "classic" code generation, bytecode generation... and an idea which, unlike all the rest of these, might possibly be new to you! You can expect that post in... oh, certainly if not this year then definitely in the next one, I'm sure.

Wednesday, October 24, 2007

We're a spotlight feature on JavaLobby today!

Monday, October 22, 2007

Collections update: I've integrated more stuff out to subversion (including a big ol' rewrite of the Preconditions class) and I've built and posted a downloadable zip containing jar, source, and javadocs. I hope you'll check it out.

Friday, October 19, 2007

I got a kick out of this observation from Jesse:

Here's my two problems with using it [the assert keyword] for business logic:

  • it can be turned off, which leads to bugs
  • it can be turned on, which leads to performance problems

AWESOME! Only two problems! So if we can just figure out how to plug those two, it will really work sweet!

I love that guy.

Thursday, October 18, 2007

Boy, I sure am getting in some trouble for leaving that teaser post on value objects and never coming back to it! That should teach me a lesson!

The worst part is that I've been avoiding blogging anything at all, because I keep thinking that I have to finish that topic up before I can say much else.

So: you'll get the conclusion to that when you get it. :-) I hope you'll find it was worth waiting for.

So what's up right now? Man, I am tired! I've really been busting ass here trying to get more and more of our collectiony goodness polished and pushed out to the open-source project. (Sometimes I do those two things in one order, sometimes in the other.)

A couple of things just went out. First, I've finished a major reworking of crazybob's ReferenceMap. (Ever wonder how Bob and I collaborate, on things like this, and Guice? One, he writes amazing, amazing shit. Two, I take his shit and I fuck with it. Big time. That's about how it goes.)

I have to tell you, I am really, really proud of ReferenceMap -- the kind of proud you are of something you know you only helped across the last mile, but still. I really think this thing is a beauty.

So what is it? It's the complete generalization of the concept in WeakHashMap; you tell it whether you want to use strong, soft or weak references for keys, and whether you want strong, soft or weak references for values. It does the rest; all nine combinations.
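To make the "all nine combinations" concrete, here is a tiny sketch of the idea of picking a reference strength per side. This is emphatically not how ReferenceMap is implemented; the Strength enum and its methods are names I made up for illustration, built only on java.lang.ref.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

enum Strength {
    STRONG, SOFT, WEAK;

    /** Wraps a value in a reference of this strength; get() may return null once it is collected. */
    <T> Supplier<T> referenceTo(T value) {
        switch (this) {
            case SOFT: {
                SoftReference<T> ref = new SoftReference<>(value);
                return ref::get;
            }
            case WEAK: {
                WeakReference<T> ref = new WeakReference<>(value);
                return ref::get;
            }
            default:
                return () -> value; // a plain strong reference
        }
    }

    /** Every pairing of key strength with value strength. */
    static List<String> allCombinations() {
        List<String> combos = new ArrayList<>();
        for (Strength keys : values()) {
            for (Strength vals : values()) {
                combos.add(keys + " keys, " + vals + " values");
            }
        }
        return combos;
    }
}
```

The hard part, which this sketch skips entirely, is cleaning up entries whose referents have been collected, and doing so concurrently.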

Was Bob the first person ever to think of this? Probably not. I think Apache Commons made theirs around the same time he first made this. But regardless, and I mean no offense to my Apache friends, I'm pretty sure this is the best implementation of this concept you're gonna find. It's fully concurrent -- it implements the ConcurrentMap interface and is backed by a ConcurrentHashMap. Reclaiming of entries happens concurrently as the garbage collector gets to them -- no extra cost to your application threads. And, of course, it's fully generified -- expect no less from anything in our library.

So try it out. Read the source if you're into that kind of thing. Let us know what you think. We're working on more related stuff... to come.

Next up, Cliff Biffle and I (again, mostly Cliff!) wrote a concurrent implementation of Multiset. We like to call it, ConcurrentMultiset. I won't urge you to run out and read it over, though, since I'm knee-deep in the middle of a huge ground-up revision to the Multiset API and documentation which will have ripple effects throughout all the Multiset implementations. When I'm done, you'll know, cause I'll show up here one day talking about how great multisets are all of a sudden.

I feel great these days about where the Google Collections are and where they're heading. I think when we get to 1.0 we're going to have a product you won't want to live without. I hope so, anyway -- and I guess it's that hope that keeps me at work past 11 on nights like this!

Friday, October 12, 2007

Of course, everyone keeps asking me how much I decided to pay for In Rainbows. They ask me since, well, I love Radiohead enough that, from the perspective of most of my friends, I appear to be the biggest Radiohead fan who ever lived. (I'm not; there are people who FAR outclass me, but yes, I love this band.)

And it was hard for me to figure out how much I should pay. I know that their cut on each of their previous albums was probably about a pound at most per copy sold, and this time without a record label or physical CD production or anything, they probably have only a small list of people they had to pay. So I could probably have paid 3 pounds and felt perfectly good about it.

But then, I just kinda felt like I owe them, you know?

It got me thinking. What if some malevolent supervillain developed the power to actually _unmake_ side one of OK Computer, to snuff it out of existence? It could never be heard again, and if I tried to remember how any of it went, I'd just draw a blank. How much ransom would I be willing to pay to stop that from happening?

The answer is probably an amount so high it would shock you. After all, it's JUST MONEY. WTF is money, anyway? And what I'd be saving, I just can't put a dollar amount on. I can hardly quantify the value of what they've given me, even in just those six songs, and then when you add in the Bends, and the National Anthem and Talk Show Host and all of that?

So anyway, no cheaping out for me. I went up to like 8 pounds or something like that.

Man, I tell you this: if I'm ever to be executed, let them shoot me through the head while I'm listening to Paranoid Android. Loud. I mean REALLY fucking loud. At the very end, when it all hits maximum intensity -- just kill me right then. I won't even know it. And who knows, maybe my consciousness would just somehow stay trapped for eternity in that moment. Is that what Heaven is?

Next post is sure to be something about Java, don't worry.

Wednesday, June 27, 2007

Value objects WTF!!?!!1!

Hmm. I just might try my hand at this "blogging" thing after all.

So here's something that has me really mystified about Java.

Let's say you want a simple class to represent a simple value object. Just a garden-variety thingamabob that has -- let's just say two fields. And these two fields together give the object its identity.

And you want this class of yours to be simple, basic, idiomatic and well-behaved.

This seems like a pretty common thing to want to do, right? So far so good?

Better flex those fingers!

public class Foo implements Serializable, Cloneable {
  private String text;
  private Integer number;

  public Foo(String text, Integer number) {
    this.text = text;
    this.number = number;
  }
  public String getText() {
    return text;
  }
  public void setText(String text) {
    this.text = text;
  }
  public Integer getNumber() {
    return number;
  }
  public void setNumber(Integer number) {
    this.number = number;
  }
  @Override public boolean equals(Object object) {
    if (object instanceof Foo) {
      Foo that = (Foo) object;
      // oops! I cheated and used a Google-internal helper class!
      return Objects.equal(this.text, that.text)
          && Objects.equal(this.number, that.number);
    }
    return false;
  }
  @Override public int hashCode() {
    // oops! I did it again! ha ha!
    return Objects.hashCode(text, number);
  }
  @Override public String toString() {
    return String.format("(%s, %s)", text, number);
  }
  @Override public Foo clone() {
    try {
      return (Foo) super.clone();
    } catch (CloneNotSupportedException e) {
      throw new AssertionError(e); // impossible
    }
  }
  private static final long serialVersionUID = 0xB0B15C00L;
}

Now, in my native language we have an expression for this situation, which loosely translates as, "what the mother****ing ****???"

What the hell happened? What did we do to deserve this? We just typed forty-four lines of code, not counting any documentation, just to create a ****ing struct, for Christ's sake! And we even had helper methods to call on for everything we reasonably could, at that!

And we'll end up typing it again. And again, and again. Except that in each place in our whole system where we discover this pattern, it will have been deviated from in some subtle way or another. Forgot to make it cloneable, forgot to make it serializable, a typo in equals() -- and tell the truth now, have you never once made that most appalling error of forgetting to implement hashCode()? Really?

So, we'd better write unit tests for all these classes too, then.

Object-oriented languages are meant to save us from duplication. But what we have here is a bona fide disaster. Something has failed us. But what can we reasonably do about it?

This I will address.... in the next post!


Update: a crack team of Googlers who constitute the most powery of our Guice power users have assembled a few times now to hash out the design of more than a half-dozen killer Guice 2.0 features with me & Bob. I will say no more (I dropped some hints at the end of the "Becoming More Guicy" tech talk), but just know that I feel ecstatic about how the plans are coming along, and Bob seems to as well. We will give Guice the power to enable you to do more and better things, but we refuse to let Guice bloat with features it doesn't need, and we are extremely protective of the Guice Philosophy.

In the meantime! Christian Schenk has a swell post comparing Guice, PicoContainer, and Spring. Of course, my favorite part is this:

Spring seems to be harder to use than Guice: without autowiring you have to specify the dependencies between your beans in a very verbose manner, i.e. you have to think about the things you’d like to do and write them down in a configuration file. Ideally you don’t have to specify anything with Guice, except the Inject annotations and Guice will handle everything for you.

Ahh, you've just gotta love that shit. I think I'll set it to music.

But Christian's conclusion?

First I opted for Guice because I like experimenting with new, bleeding-egde software. But as I said earlier, all these annotations couple your software tightly to Guice - that’s not very desirable. . . . To draw a conclusion I think choosing between Guice, PicoContainer and Spring, for lack of a hard and fast rule, will be to do what works best for your project.

I want to address this idea of coupling, because I'm noticing that this concern crops up again and again, and I think that the issue has more to do with perception than anything else.

Does embedding @Inject in your code tightly couple your code to Guice? It does in one sense: some version of guice.jar is going to have to be present on your classpath whenever you compile your implementation classes. This much is true. Beyond this, though -- it's really not that bleak. Your classes have no runtime dependency on Guice. If they run as part of an application that doesn't wish to use Guice, the Guice jar file needn't even be present on the server at all! Now, this changes if you want to use more Guice features, like provider injection and injector injection, but this is the case with Spring as well.

Furthermore, the idea of "coupling" implies that "the one cannot function without the other." But with Guice's annotations, this simply isn't the case. It's important to understand that these are only annotations. They are decoration; meta-information. They don't, and can't, actually do anything. They just sit there, innocuously, in case tools will wish to read them, and otherwise have no effect whatsoever. They do absolutely nothing to impede you from testing your code, or from using the classes with Spring.
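A small sketch of what "only annotations" means in practice. The annotation here, InjectHere, is a hypothetical stand-in that I declare myself so the example is self-contained; it is not Guice's @Inject, and AnnotationDemo and Greeter are made-up names too. The point is that the annotated class can be constructed with plain old new, and the annotation only matters to a tool that goes looking for it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class AnnotationDemo {

    /** Hypothetical marker annotation; it declares no behavior at all. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.CONSTRUCTOR)
    @interface InjectHere {}

    static class Greeter {
        private final String greeting;

        @InjectHere
        Greeter(String greeting) {
            this.greeting = greeting;
        }

        String greet(String name) {
            return greeting + ", " + name;
        }
    }

    /** What a framework would do: go looking for the marker via reflection. */
    static boolean constructorIsMarked() {
        return Greeter.class.getDeclaredConstructors()[0]
            .isAnnotationPresent(InjectHere.class);
    }

    public static void main(String[] args) {
        // No container anywhere in sight; the annotation just sits there.
        System.out.println(new Greeter("hi").greet("bob"));
        System.out.println(constructorIsMarked());
    }
}
```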

Perhaps the issue -- and I'm not specifically talking about Christian, but about all the many who have had this objection -- perhaps the issue is a more visceral disdain for seeing import com.google anywhere in your code? Maybe you feel the same way about import org.apache, so you eschew the Apache Commons and all that stuff too? Begone, vile dependencies! Or maybe "that's different," after all, Google is a big old corporation, and I think it might have been on Slashdot the other day that we've Turned Evil now.

In any case, Google doesn't intend to stop open-sourcing our awesome Java shit anytime soon, so you may have to confront that particular issue again and again. Oop, I said too much....

Now, I ask you -- is one extra jar file in your javac classpath really all that bad? Or are there some additional problems that I don't yet understand?

Monday, June 11, 2007

So I went to the new Rasputin...

So I went to the new Rasputin that replaced Tower Records at El Camino and San Antonio, because I had to get the new Elliott Smith double album (awesome). And a strange thing happened.

I was using a new credit card, because my old one expired in 05/07. And the counter dude asked me to sign the back of my card. Whatever, I thought, and so I did. Then he swiped the card, punched his buttons, handed me the stub, and I signed it and gave it back to him.

And he compared the signatures.

Well, I thought it was funny...

Friday, June 8, 2007

There's something fundamentally weird about LinkedHashSet, isn't there?

You use it when you want to have a guaranteed order of iteration (insertion order or LRU order). Except, what would happen if you took your LinkedHashSet and, say, stored it into a HashSet? Later you'd be iterating through the elements of this greater set, and "your" LinkedHashSet will be there, but it might have a completely different iteration order! It might even suddenly be an LRU-ordered set while your original one was insertion-ordered. You'll have no idea what to expect.

So you wanted the iteration order to be respected, and you thought LHS would give you that. But it wasn't respected; it was just discarded.

So... a little bit strange. Just a little.
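The weirdness is easy to reproduce. In this sketch (SetOfSetsDemo is a made-up name, and both inner sets are plain insertion-ordered LinkedHashSets), two sets with the same elements in different orders are equal as far as Set.equals() is concerned, so the outer HashSet keeps whichever one got there first and quietly drops yours.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetOfSetsDemo {

    /** Returns the iteration order of the one inner set that survives in the outer set. */
    static List<String> orderThatSurvives() {
        Set<String> theirs = new LinkedHashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> yours = new LinkedHashSet<>(Arrays.asList("c", "b", "a"));

        Set<Set<String>> outer = new HashSet<>();
        outer.add(theirs);
        outer.add(yours); // returns false: "yours" equals "theirs", so nothing changes

        return new ArrayList<>(outer.iterator().next());
    }

    public static void main(String[] args) {
        System.out.println(orderThatSurvives()); // prints [a, b, c], not [c, b, a]
    }
}
```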

Tuesday, March 20, 2007

It was easier than I thought

I wanted a way to distribute JUnit tests among multiple threads. I couldn't believe how easy it turned out to be.

public class ParallelTestSuite extends TestSuite {
  private final int numWorkers;

  public ParallelTestSuite(int numWorkers) {
    this.numWorkers = numWorkers;
  }

  @Override public void run(TestResult result) {
    final ExecutorService executor
        = Executors.newFixedThreadPool(numWorkers);
    TestResult newResult = new ForwardingTestResult(result) {
      @Override protected void run(final TestCase test) {
        // hand each test case to a worker thread instead of running it inline
        executor.execute(new Runnable() {
          public void run() {
            superRun(test);
          }
        });
      }
      private void superRun(TestCase test) {
        super.run(test);
      }
    };
    super.run(newResult);
    executor.shutdown();
    try {
      executor.awaitTermination(3600, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      throw new RuntimeException(e);
    }
  }
}
ForwardingTestResult is just what it sounds like; a TestResult that forwards all its method calls to another TestResult.

I found it interesting that I couldn't really figure out how to avoid needing that one-line "superRun()" method. What else could I do?
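The reason for the helper can be shown without JUnit at all. In the sketch below (SuperCallDemo and Greeter are invented names), the innermost anonymous class cannot write super.greet(...) because, inside it, super refers to its own superclass. As far as I know, the qualified form Outer.super.method() needs the enclosing class to have a name, which an anonymous class does not, so a one-line forwarding method on the enclosing anonymous class ends up being the workaround.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SuperCallDemo {

    static class Greeter {
        String greet(String name) {
            return "hello " + name;
        }
    }

    /** Runs the inherited greet() on a worker thread from inside an override of greet(). */
    static String greetOnWorkerThread(String name) {
        final ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Greeter greeter = new Greeter() {
                @Override String greet(final String who) {
                    Future<String> future = executor.submit(new Callable<String>() {
                        public String call() {
                            // "super" in here would mean the Callable's superclass,
                            // so we go through the helper on the enclosing class.
                            return superGreet(who);
                        }
                    });
                    try {
                        return future.get();
                    } catch (InterruptedException | ExecutionException e) {
                        throw new IllegalStateException(e);
                    }
                }

                private String superGreet(String who) {
                    return super.greet(who);
                }
            };
            return greeter.greet(name);
        } finally {
            executor.shutdown();
        }
    }
}
```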

Anyone out there who understands JUnit and java.util.concurrent have any helpful criticism for how to make this better?

Monday, March 12, 2007

by the way, what does 'extraordinarily typesafe' mean?

I was feeling pretty proud about Guice. It went 1.0 last week and has been making the rounds in the blogosphere the last few days; it seems to make a positive first impression on most people.

I showed my wife an announcement that I sent to some mailing list or other about the release. This was like a few days ago. And just now she says to me, "by the way, what does 'extraordinarily typesafe' mean?"

I said,

"In the software world we really like to think in terms of objects, and our favorite programming languages are built to support this object-centric view. An object might be the name 'Caryn', or it might be Kevin's Discover card or the font that I like to post my blog in.

"When things are not 'typesafe', someone can walk up to you and ask for your credit card and you can hand him a font, and he'll say, '30 seconds.' And he'll walk away, many kilograms away and only later will he gleefully try to charge your credit card and explode cosmically. You know, like what was supposed to happen in Ghostbusters when they crossed the streams.

"Java has always been typesafe at some level, because you can't just pass a font to a credit-card processor. But there was a huge, yawning chasm in the type-safety landscape that an entire Stay-Puft man could fit in comfortably: collections of things. See, you might have a roster full of names, or a wallet full of credit cards, or a drop-down box full of font choices, and as soon as you put something into any of these, it would immediately forget what the hell kind of a thing you just put in. And when you'd take something out, you'd say, 'take this thingamabob out, and by the way, that's a credit card I'm taking out.' Yeah. Only it might not actually be.

"What this is, is it's not typesafe. So anyway, what this guy Josh I work with, who I'm always telling you about, helped to mastermind was this huge change in Java a few years ago, that finally plugged this leak, so that you can now say 'this is a bag of credit cards' or 'this is a bag of employees' and you're only allowed to do things with these bags that make sense. Now Java is finally very typesafe.

"So then there's this thing called Dependency Injection that our framework supports, and other things like Spring also support it, except Spring doesn't support all the latest advancements in Java, and it wants you to configure things outside your Java code, in a bloody text file. So you can go to that text file and say 'use this font renderer to process credit card payments' and it will think that's okay, until you try to run everything and then the asploding.

"But Bob and I made this framework where you get to only write Java code, and so as soon as you try to do this bogus thing, it's just like that example of the bag of employees above. The words turn red on your screen and you know something's wrong right away.

"And that's why we say we're 'extraordinarily typesafe'."

Do you buy it?
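For the programmers in the audience, the "bag of credit cards" part of the story looks roughly like this in code. BagDemo and its method names are made up for the example. The raw list happily accepts a font name and only blows up when something is taken out and cast; the generic list refuses the bad add at compile time.

```java
import java.util.ArrayList;
import java.util.List;

class BagDemo {

    /** Pre-generics style: the mistake is only discovered at run time. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawBagFailsLate() {
        List wallet = new ArrayList();   // a bag of "whatever"
        wallet.add("Comic Sans");        // a font sneaks in; nobody complains
        try {
            Integer card = (Integer) wallet.get(0); // "by the way, that's a credit card"
            return false;
        } catch (ClassCastException e) {
            return true;                 // the asploding
        }
    }

    /** Generic style: the compiler knows what kind of bag this is. */
    static int typedBagIsChecked() {
        List<Integer> wallet = new ArrayList<>();
        wallet.add(42);
        // wallet.add("Comic Sans");     // would not compile
        return wallet.get(0);
    }
}
```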

Friday, March 9, 2007

This is a blog.

I might start blogging again. I used to blog a ton back in 2001-2004. But, I had the benefit of a job that thoroughly bored me -- and there's nothing like that to inspire a blogger's best and most prolific work. Google came along and fixed all that real good.