Man, I just really don't understand Class Loader Stuff.
Doing what I do, it happens fairly often that someone busts into my office or starts an IM and says, "hey, you know a lot about Java, right? I need some help."
And I always get really excited whenever this happens. Because I just loooove answering questions about things I know! Who doesn't?
"Sure, sure!" I say, "make it quick though, cause I gotta run over to Neal's building and help him with some generics problems he's having" (this is what passes for humor in my cubicle).
And then it all goes downhill, because five seconds of the way in, I realize that they're asking about Class Loader Stuff.
Aw shit. Now I have to be all "um, well, remember how you were, um, asking if I was the guy who knew a lot about Java? yeah? Well, hehe, I was just kiddin, see. Pretty funny, huh?"
Nope, I don't understand class loaders and I don't understand all the fucked up problems that they seem to cause.
So here's what I'm going to do. In fact, I'm going to try to do this every time I become flummoxed by some major Java thing that I just don't get:
I'm going to come to this blog and explain the thing I don't understand to you.
How does that sound? It's really quite a novel privilege to be able to learn from someone who has no idea what they're talking about, I mean since graduation anyway, so I hope you'll enjoy this as much as I will!
Understanding CLS (Class Loader Stuff)
This post assumes that you understand, going in:
- What a class is; sort of, at least
Because if you don't, then you're in even worse shape than me, man.
The easiest way to understand what a class loader is all about would be to understand the one and only one purpose that it has. Unfortunately, it has something like three purposes, so that's just not going to work.
Plan B. Let's just start with the simplest thing it does first. At the most basic level, a class loader is a thing that you can tell it a name and will give you some bytecodez.
Definition
Class loader. n. It's a thing that you can tell it a name and it will give you some bytecodez.
It's a function from class name to class bytecode. You give it a String containing a valid Java class name, and it will use some mechanism to find and return you the bytecode to use for that name. So one of the things that makes different Class Loader implementations different is that they may each use different mechanics for how to come up with that bytecode.
- Some of them might look at your classpath, read and unpack a JAR file, and get the bytes of a file ending in .class outta there
- Some of them might make an HTTP request to the porn site you're currently frequenting and retrieve the class files from there (BoobieCam.class, etc.)
- Some of them might just make some shit up and give it to you, then laugh at you with their friends later
- Some of them want to abuse you
- Some of them want to be abused
So that was the easy part. Of course, there's more.
The class loader doesn't just come up with this byte[] containing the bytecode for the class; it also provides these bytes to the JVM in an act known definining the class. This just means -- well, the instant before it does this, that class does not exist in memory, and the instant after, it does.
Every single class your JRE has in memory was placed there by some class loader. The class loader is an obstetrician, delivering new baby classes into the world.
Quick: what was the name of the person who delivered you when you were born? You probably don't even know, do you? But the weird thing about classes is: they know. Whenever a class loader defines a new class, that class contains an immutable reference back to the class loader instance that defined it. You can ask it yourself: clazz.getClassLoader().
But now we're finally getting to the interesting part: every class in memory in your runtime environment can be uniquely identified by the pair of
(a) its full name
(b) the class loader that loaded it
These two things together form, in database terms, the "unique key" of that class within this JVM process. Another way to say it is that each class loader gets its own independent namespace in which it can define classes.
It's as if you asked me my name and I said, "I'm Kevin Dr. Bob Farquar" and you said, "nice to meet you, I'm Kevin Dr. Fenton Pulsifer" -- each of us known by the combination of our own name with the name of our obstetrician who delivered us.
When necessary, I will refer to "the class foo.bar[A]" as a shorthand for "the unique class with the name foo.Bar which was loaded by class loader A".
You may have heard someone explain, or you may have explained yourself, "see, you can't cast a foo.Bar to a foo.Bar here even though it's the same class, because they came from different class loaders, so there's funny class loader hoodoo going on there." (My explanations are not usually that eloquent and cogent, but I try.)
Or you may have said, "Right, this class is a singleton, but that actually doesn't mean you have only one instance of the class per VM, it means you have one instance of this class per class loader."
But both of these explanations are incorrect. In the situations described, these are multiple different classes that have been loaded. They are not two different "versions" of the same class, they're just two different classes.
The class named foo.Bar defined by class loader A (shorthand: 'foo.Bar[A]') and the class named foo.Bar defined by class loader B ('foo.Bar[B]') have essentially nothing in common with each other. Just a name, and that's just coincidence, really. Having the same name as each other makes it perhaps more likely that they have the same bytecode as well, but this is irrelevant; they very easily may not have the same bytecode at all.
So when "the class foo.Bar" appears to have multiple different static states at the same time, or you sometimes see a ClassCastException for trying to cast a Bar to a Bar -- what's going on is not as mysterious as it first seems. You simply have two classes both using the same name.
So far, I've mentioned that two things the class loader does is that it (a) gets the bytecode somehow and it (b) performs the actual action of defining the class in the VM. These two functions are quite separate: it happens often that class loader B will want to use the same mechanics for obtaining the bytecode as class loader A does, so it will delegate to class loader A for that part. Then once that's done, class loader B will be the one to define the class, so the class will live in class loader B's namespace, and class loader A's noble contribution to this whole affair is just forgotten by everyone.
So. You have this big old soup of classes in memory, some of them have the same names as other ones, but the pair of (name, class loader) is always unique. And there are no hard barriers between these groups of classes; that is, the class Foo[A] can extend, implement, or in any other way refer to the class Bar[B] which comes from a different class loader. There is nothing weird about that.
Except how can that even happen? If I'm loading class Foo, and it extends class Bar, isn't that going to automatically trigger the loading of class Bar, and by the same class loader that's currently loading Foo? Well, yes, it is -- but the class loader can be crafty!
When you ask class loader A to load class Foo, it can say "okay," then when this triggers a request for it to also load class Bar, it can say, "no way, I'M not loading THAT piece of tripe", and it can delegate that operation over to class loader B to carry out. (This is different from the example I gave earlier, where one class loader cruelly exploits another just to get the bytes, but still defines the new class itself. Here, the class loader lets the delegate define the new class itself.)
So now you have class Foo[A] and class Bar[B], and all the references from Foo to Bar will be interpreted as references to Bar[B], not Bar[Z] or whatever other Bars were sitting around.
I'll stop here for now, but if time permits, I will come back and explain how much I don't understand about:
- the bootstrap class loader
- the system class loader
- the extension class loader
- the application class loader
- the context class loader