References 2
Objects can contain references
Last time we looked at references, which are variables containing “pointers” to objects. We can also put these references inside objects. Look at this class definition:
public class Room {
    public String text;
    public Room north;
    public Room south;
    public Room east;
    public Room west;
}
This class describes objects which can each link to four other objects of the same class. It could be part of a text adventure game - imagine a maze of rooms, each of which connects to four others:
For example, there will be a Room whose text is “The Tower”, and whose west field contains a reference to another Room whose text is “The Hall.”
What else can we say about the maze?
Null references
What is the value of the east reference in the room whose text is “The Tower”?
What do any of the references hold just after we have created a Room but haven’t created any links yet?
When we write our class declaration we can choose to give our fields initial values, and when we create an object it will get those values. Alternatively we can just leave them with default values. For example:
public class Enemy {
    public int x;             // new objects will have default value 0
    public int y;             // again, default value 0
    public float hp = 100.0f; // objects will start with hp=100.0
    public int ammo = 50;     // and 50 ammunition
}
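It’s a good idea to set an initial value when you can. But for a field that refers to an object this doesn’t always make sense - we might not have an object to refer to yet! This is certainly true in our Room class when we create the first Room in the maze. In that case the field is left with the default value for any reference: null, which means “this refers to no object at all.”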
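To see these defaults for yourself, here is a small sketch. It repeats the Room and Enemy classes from above, nested inside a demo class only so that everything fits in one file. The class name DefaultsDemo and the describe helper are my own additions for illustration, not part of the game.

```java
class DefaultsDemo {
    // The classes from this page, nested only so the sketch is a single file.
    static class Room {
        public String text;
        public Room north;
        public Room south;
        public Room east;
        public Room west;
    }

    static class Enemy {
        public int x;             // default value 0
        public int y;             // default value 0
        public float hp = 100.0f; // explicit initial value
        public int ammo = 50;     // explicit initial value
    }

    // Hypothetical helper: says whether a reference currently points at anything.
    static String describe(Object reference) {
        return (reference == null) ? "null" : "an object";
    }

    public static void main(String[] args) {
        Enemy enemy = new Enemy();
        System.out.println("x = " + enemy.x);       // int fields default to 0
        System.out.println("ammo = " + enemy.ammo); // initial value from the declaration

        Room room = new Room();
        // Reference fields default to null: no Room or String exists for them yet.
        System.out.println("north refers to " + describe(room.north));
        System.out.println("text refers to " + describe(room.text));
    }
}
```

Every reference field in the new Room starts out null, including text, because a String is an object too.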
Sometimes a null reference is an OK thing; in our class it might simply mean “you can’t go in this direction.” But often having a null reference indicates something has gone wrong, usually that an object hasn’t been created when it should have been.
Luckily we can test for it easily:
if (myReference == null) {
    // the reference is null, do something
} else {
    // the reference isn't null, do something else
}
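Here is how that test might be used in the text adventure, as a rough sketch. The Room class is repeated (nested) so the example runs on its own; the describeExit method and its messages are invented for illustration.

```java
class NullCheckDemo {
    static class Room {
        public String text;
        public Room north;
        public Room south;
        public Room east;
        public Room west;
    }

    // Hypothetical helper: describes what lies in one direction from a room.
    // 'exit' is whichever of the four references the caller passes in.
    static String describeExit(Room exit) {
        if (exit == null) {
            // the reference is null: there is no room that way
            return "You can't go in this direction.";
        } else {
            // the reference isn't null, so it is safe to look inside the object
            return "You can go to " + exit.text + ".";
        }
    }

    public static void main(String[] args) {
        Room hall = new Room();
        hall.text = "The Hall";
        Room tower = new Room();
        tower.text = "The Tower";
        hall.east = tower;

        System.out.println(describeExit(hall.east));  // a room exists to the east
        System.out.println(describeExit(hall.north)); // north is still null
        // Without the test, reading hall.north.text would throw a NullPointerException.
    }
}
```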
The inventor of the null reference is famous computer scientist Tony Hoare. He now wishes he had never done it:
I call it my billion-dollar mistake. It was the invention of the null reference in 1965 […] I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
—Hoare, Tony (2009). “Null References: The Billion Dollar Mistake” (Presentation abstract). QCon London.
In some modern languages (such as Kotlin), null references are not allowed unless you explicitly say they are. This can make code rather more complex, but avoids bugs in large projects.
Creating a maze
Let’s create a few rooms in a maze. The structure I want to create is the one shown below, with just two rooms connected to each other:
This diagram shows null references as black dots: only these two rooms exist.
For example, it is not possible to go east from “The Tower,” because the east reference is null in that Room.
Let’s create that structure - this time I’ll write a complete Main class with a main method - don’t worry if you can’t quite see how this fits into the idea of objects and classes (“static” methods are a little bit of a cheat to help us write actual code that does things without needing an object).
Normally I would use a constructor to set up the room name and setter methods to manage the connections, but here I’m going to do it by hand so you can see what’s going on - again, this is generally a terrible idea:
public class Main {
    public static void main(String[] args) {
        // instantiate two rooms, and assign them to two local
        // variables
        Room hall = new Room();
        Room tower = new Room();
        // Create room links - all other links are null by default
        hall.east = tower;
        tower.west = hall;
        // set up room texts
        hall.text = "The Hall";
        tower.text = "The Tower";
    }
}
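For comparison, here is one way the constructor-and-setter version might look. This is only a sketch: the constructor, the getText, getEast and getWest methods, and the connectEast method are names I have made up for illustration, and a real game would want similar methods for the other directions.

```java
class ConstructorMazeDemo {
    static class Room {
        private final String text;
        private Room north, south, east, west; // all null until linked

        // Constructor: every Room gets its text when it is created.
        Room(String text) {
            this.text = text;
        }

        String getText() { return text; }
        Room getEast() { return east; }
        Room getWest() { return west; }

        // Hypothetical setter: links this room to 'other' in both directions,
        // so the two references cannot get out of step.
        void connectEast(Room other) {
            this.east = other;
            other.west = this;
        }
    }

    public static void main(String[] args) {
        Room hall = new Room("The Hall");
        Room tower = new Room("The Tower");
        hall.connectEast(tower);

        System.out.println(hall.getEast().getText());  // prints "The Tower"
        System.out.println(tower.getWest().getText()); // prints "The Hall"
    }
}
```

The point of the design is that code outside Room can no longer set east without also setting the matching west, which is an easy mistake to make when the fields are assigned by hand.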
What does this look like in a memory diagram? It could get quite complicated, so the little slideshow below will show how the memory changes after each line of code. I’ve changed the diagram notation a little too:
- Data in memory which is part of an object is green (as before);
- Locations in memory which are local variables have a pink comment (as before);
- Locations in memory which are fields inside an object have a purple background and white text, and I’ve also put a “.” in front of the name (to make it clear it is a field).
As before, clicking on the image will move onto the next slide. Read the caption below the slide to see what’s changed.
Pay careful attention to steps 4 and 5. Let’s look at step 4 in more detail - the line of code is
hall.east = tower;
- Remember that the tower variable is just a number - a location in memory. In our diagram, it has the value 7.
- hall.east says “the variable you want is the east field inside the object pointed to by the hall reference.” Here, the dot “.” dereferences the reference, so we can see the fields inside the object it points to.
- We copy the value of tower into this variable - the east variable of the object indicated by hall.
- Now both tower and hall.east have the same value, so if you go east from “The Hall”, the room you arrive in is the one indicated by tower.
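The effect of this step can also be checked in code. The sketch below repeats the Room class so that it runs on its own; the class name AliasDemo and the sameObject helper are assumptions of mine. For references, == compares the references themselves (the “numbers”), not the contents of the objects.

```java
class AliasDemo {
    static class Room {
        public String text;
        public Room north;
        public Room south;
        public Room east;
        public Room west;
    }

    // Hypothetical helper: true when both references point to the same object.
    static boolean sameObject(Room a, Room b) {
        return a == b; // compares the references, not the fields
    }

    public static void main(String[] args) {
        Room hall = new Room();
        Room tower = new Room();

        System.out.println(sameObject(hall.east, tower)); // false: hall.east is still null

        hall.east = tower; // copy the reference held in tower into hall.east

        System.out.println(sameObject(hall.east, tower)); // true: same object

        // Only one object exists, so a change made through one reference
        // is visible through the other.
        tower.text = "The Tower";
        System.out.println(hall.east.text); // prints "The Tower"
    }
}
```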
The diagram is now very, very messy, so I’m going to change it step-by-step into something simpler. Here it is again:
First, we don’t need to show all the memory in a single column - we can split that column up into sections for the local variables and the two objects. We won’t change anything in the computer or in the program, we’ll just move things around in the diagram:
When Java allocates memory, creates an object and returns a reference, we don’t need to know the numerical value of the reference. The actual location in memory is just a number, but we don’t need to see it printed out or ever treat it as a number. That means we can remove the address numbers from the diagram:
Note that the references are now shown as the starting points of arrows, pointing at the block of memory (the object) they refer to. We can swap over the “variable name” section and the contents section, to make it easier to read:
One last small change: we might have objects of classes other than Room, so we’ll add the class name to each object:
Here is the same thing as an object diagram, of the kind you have seen before:
Strings
What about the text strings? I haven’t dealt with those in the diagram above, leaving null references to them. This is because String objects are “black boxes” - we don’t know their internal structure, so I don’t know how big they are. This makes them quite difficult to draw in a memory diagram which shows the addresses (such as in the first set of diagrams above). Now we aren’t showing the memory addresses, we can draw them:
Because I don’t know their internal structure I can’t draw any fields. All I know is that there is some memory which contains a representation of some text - so that’s what I’m drawing.
Some people might go further, drawing this:
This is because although Strings are just objects, and the String type is a reference to a String object, our diagrams would become really messy if we drew them as objects all the time.
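However we choose to draw them, the code treats a String field like any other reference. A minimal sketch, using a cut-down Room with only its text field (the class name StringReferenceDemo and the label variable are mine):

```java
class StringReferenceDemo {
    // Cut-down Room: only the field this example needs.
    static class Room {
        public String text;
    }

    public static void main(String[] args) {
        Room hall = new Room();
        System.out.println(hall.text == null); // true: no String object yet

        hall.text = "The Hall"; // hall.text now refers to a String object

        String label = hall.text; // copies the reference, not the characters
        System.out.println(label == hall.text); // true: both refer to the same String object
        System.out.println(label.length());     // prints 8
    }
}
```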
An end note
You will have noticed that for most of this page I have been very careful about how I refer to objects. I have mostly called my rooms “the room in the top-right corner of the diagram” or “the room whose text is ‘The Tower’” instead of referring to the room as “The Tower.”
This is deliberate. The only thing about that particular object that makes it “The Tower” is its text. It’s very easy to get confused between a thing and the name of a thing - or between a thing and a reference to it. In a lot of ways, the purpose of this entire series of pages is to avoid this confusion. An object is not the same as a reference to an object, although experienced coders sometimes talk as if it were.
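A short sketch makes the distinction concrete. It uses a cut-down Room with only a text field; the variable names and the text “The Dungeon” are invented for the example.

```java
class IdentityDemo {
    // Cut-down Room: only the field this example needs.
    static class Room {
        public String text;
    }

    public static void main(String[] args) {
        Room first = new Room();
        first.text = "The Tower";

        Room second = new Room();
        second.text = "The Tower"; // same text, but a different object

        Room alias = first; // a second reference to the first object

        System.out.println(first == second); // false: two objects that share a text
        System.out.println(first == alias);  // true: one object, two references

        first.text = "The Dungeon"; // changing the text does not change which object it is
        System.out.println(first == alias);  // true, as before
        System.out.println(alias.text);      // prints "The Dungeon"
    }
}
```

Two rooms can both be called “The Tower” without being the same room, and one room can be reached through any number of references.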
I will leave the final word to Lewis Carroll:
‘You are sad,’ the Knight said in an anxious tone: ‘let me sing you a song to comfort you.’
‘Is it very long?’ Alice asked, for she had heard a good deal of poetry that day.
‘It’s long,’ said the Knight, ‘but very, very beautiful. Everybody that hears me sing it—either it brings the tears into their eyes, or else—’
‘Or else what?’ said Alice, for the Knight had made a sudden pause.
‘Or else it doesn’t, you know. The name of the song is called “Haddocks’ Eyes.”’
‘Oh, that’s the name of the song, is it?’ Alice said, trying to feel interested.
‘No, you don’t understand,’ the Knight said, looking a little vexed. ‘That’s what the name is called. The name really is “The Aged Aged Man.”’
‘Then I ought to have said “That’s what the song is called”?’ Alice corrected herself.
‘No, you oughtn’t: that’s quite another thing! The song is called “Ways and Means”: but that’s only what it’s called, you know!’
‘Well, what is the song, then?’ said Alice, who was by this time completely bewildered.
‘I was coming to that,’ the Knight said. ‘The song really is “A-sitting On A Gate”: and the tune’s my own invention.’
We spend a lot of time in computer science thinking about things, and things which refer to them, and this piece from Through the Looking-Glass neatly illustrates the kind of mess you can find yourself in.
The next page has a handy quiz that will help you understand some of the common mistakes that references can lead to.