Chapter 4: Variables

Say you had a program that talked about your dog. It might look like this:

print "I have a dog.  It is called Bort."
print "Bort likes to eat dog food."
print "My children and Bort get along very well."
print "My dog Bort barks at my neighbor, whose name is also Bort."

You might go on like this for a while, asking the computer to display many such interesting things about your dog. If, one day, you changed your dog’s name, or instead wanted to talk about your other dog, or gave this program to a friend and they wanted to change it to refer to their own dog, you’d have to go through the whole program and change every occurrence of “Bort” to the new name. You could use your text editor’s search-and-replace feature to do this, but doing so is tedious and you might make mistakes. For example, you might accidentally replace the neighbor’s name when you only intended to change the dog’s name.

Programming languages solve this problem by allowing you to write the dog’s name only once and then refer to it using a label. This is a bit like using pronouns in natural languages. So you could write:

dog = "Bort"

If you’re “playing computer” and get to the above statement, you’ll think, “Remember the string Bort, and whenever the word dog is used, substitute the word Bort.” This is called an assignment statement because you’re assigning meaning to the word dog. A subsequent statement could say:

print dog

Note that there are no quotation marks here. You don’t literally want to print the word “dog”; you want the computer to recall the text it previously associated with that word. It’s a bit like the distinction between saying “He said his age” and “He said, ‘his age’.” In the first case he said a number, and in the second he said the words “his age”. The word dog is a variable: a symbol that the computer associates with something else, in this case the string “Bort”. Our original program can now be written as:

print "I have a dog.  It is called " + dog + "."
print dog + " likes to eat dog food."
print "My children and " + dog + " get along very well."
print "My dog " + dog + " barks at my neighbor, whose name is also Bort."

Note that string concatenation (using the plus sign) was necessary here, since sometimes we wanted to use quotation marks (to mean literal text) and sometimes we didn’t (to refer to the variable). The computer will see a statement like:

print dog + " likes to eat dog food."

and first substitute the word dog with the string “Bort”:

print "Bort" + " likes to eat dog food."

then concatenate the strings:

print "Bort likes to eat dog food."

then perform the statement, displaying this on the screen:

Bort likes to eat dog food.
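
The three steps above can be checked in a real language. What follows is a minimal sketch, assuming Python 3, where print is written with parentheses, unlike the print statements in this chapter. The names step_one, step_two, and step_three are made up for the illustration.

```python
dog = "Bort"

# What the programmer wrote: a variable joined to literal text.
step_one = dog + " likes to eat dog food."

# After substituting the variable with the string it refers to.
step_two = "Bort" + " likes to eat dog food."

# After concatenating the two strings into one.
step_three = "Bort likes to eat dog food."

# All three are the same string, so each line displays the same text.
print(step_one)
print(step_two)
print(step_three)
```

Notice that the word dog inside the quotation marks (in “dog food”) is literal text, so it is never substituted.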

Note also that the name of the neighbor was not replaced by a reference to dog. The variable dog refers to the name of the dog. That’s the whole point of using the variable: to keep track of a specific concept. You could have a different variable to keep track of the neighbor’s name:

neighbor = "Bort"

and change the last line of our program to:

print "My dog " + dog + " barks at my neighbor, whose name is " + neighbor + "."

That way you can change the name of your dog or your neighbor easily at the top of the file and the rest of the program will run correctly. I removed the word “also” from the line since, now that we’re using variables, I can no longer be sure that the two names are the same. I wouldn’t want to change one of the two and have my program still display the word “also” there. Later we’ll see a way of displaying the word “also” only if the two names are the same.

The whole program now looks like this:

dog = "Bort"
neighbor = "Bort"

print "I have a dog.  It is called " + dog + "."
print dog + " likes to eat dog food."
print "My children and " + dog + " get along very well."
print "My dog " + dog + " barks at my neighbor, whose name is " + neighbor + "."

The blank line between the assignment statements and the print statements is not necessary; the computer ignores it. It’s there for your benefit, so that you can see the break between the two blocks of code. Since the two blocks do fundamentally different things, it’s nice that they’re visually separate. Use blank lines liberally in your code to separate blocks that are conceptually different.
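
To see the benefit of setting the names at the top, here is a sketch of the same program with only the first line changed. It assumes Python 3 (so print takes parentheses, unlike the statements in this chapter) and uses a hypothetical new dog name, Milou.

```python
# Only this block needs to change when a name changes.
dog = "Milou"
neighbor = "Bort"

# The print block is untouched, yet every sentence picks up the new name.
print("I have a dog.  It is called " + dog + ".")
print(dog + " likes to eat dog food.")
print("My children and " + dog + " get along very well.")
print("My dog " + dog + " barks at my neighbor, whose name is " + neighbor + ".")
```

The neighbor is still called Bort because the variable neighbor was left alone, which is exactly the mistake a search-and-replace could have made.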

It’s important to remember two things about variables. Firstly, a variable must be set before it is used. So you must write:

dog = "Bort"
print dog

and not:

print dog                             (Wrong!)
dog = "Bort"

Remember when “playing computer” that you must look at each statement individually and in order from top to bottom. In the second example above you would first get to the “print dog” statement and have no idea what the variable dog was referring to. Different languages react differently when they run into this problem. Some just assume the variable is an empty string (“”), but most stop the program and report the error.
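
As one concrete case, Python 3 is among the languages that stop and report the error. The sketch below assumes Python 3. The try and except lines are not part of this chapter; they are only there so the example can show the error and then carry on. The name error_was_reported is made up for the illustration.

```python
error_was_reported = False

try:
    print(dog)                  # dog has not been set yet
except NameError as error:
    # Python refuses to guess what dog means and reports a NameError.
    error_was_reported = True
    print("Python reported:", error)

dog = "Bort"                    # now the variable is set
print(dog)                      # this time the lookup succeeds
```

Without the try and except lines, the program would simply stop at the first print.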

Secondly, when you’re “playing computer” and see the second line of the correct program:

dog = "Bort"
print dog

it’s very important that you don’t “go back up” to the first line to see what dog refers to. The first line is long gone. You can only look at one line at a time, and always sequentially. The first line makes a note somewhere (in memory) that dog refers to “Bort”, but then the line itself is left behind and forgotten. The second line checks memory and gets the variable’s meaning there, not from the first line directly. This may seem like a trivial distinction, but we’ll use variables in more sophisticated ways later and it won’t work to “go back up” to find where a variable was set; you’ll need to think of the variable and its value as being stored in memory.
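
One way to picture this is to write the computer’s note-taking out by hand. The sketch below assumes Python 3 and uses a dictionary named memory as a rough stand-in for the computer’s memory. This is only a model for “playing computer”; a real language may store its variables differently.

```python
# A stand-in for the computer's memory: names paired with their values.
memory = {}

# Line 1 of the program:  dog = "Bort"
# The computer makes a note in memory, then leaves the line behind.
memory["dog"] = "Bort"

# Line 2 of the program:  print dog
# The computer looks in memory, not back at line 1.
value = memory["dog"]
print(value)
```

The second step never mentions the first line at all. Everything it needs is in memory.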

The word you use to name your variable is only important to you. To the computer it’s only important that you be consistent. So you can write:

dog = "Bort"
print dog

and that will work just as well as:

xyz = "Bort"
print xyz

You have to use the same name in both places so that the computer can figure out that you’re referring to the same thing, but the actual word (dog or xyz) is only ever visible to you, the programmer, not to the user of the program. In this example the first version is better because the name dog describes what you’re using the variable for. To someone reading the program, it’ll be clear that the string “Bort” is the dog’s name. With the variable name xyz, this would not be clear.

The word variable implies that variables can change, and indeed they can. You could write the following program:

dog = "Bort"
print "My first dog was called " + dog + "."
dog = "Milou"
print "And my second dog was called " + dog + "."

The computer will see the first line, associate the string “Bort” with the variable dog, use that string in the second line, change the association of the variable dog to “Milou” in the third line, and use the new association in the fourth. In memory the variable dog can be associated with only one string at a time. The third line above replaces the old association with the new one. The second and fourth lines look into memory to see what that association is, and each finds a different value because the value in memory is different at that point in the program’s execution. You’ll see:

My first dog was called Bort.
And my second dog was called Milou.
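
The same program can be sketched in Python 3 (an assumption; there print takes parentheses). The variables first_sentence and second_sentence are additions for this illustration. They hold each sentence so it can be looked at again later.

```python
dog = "Bort"
first_sentence = "My first dog was called " + dog + "."
print(first_sentence)

# Replace the old association: from here on, dog refers to "Milou".
dog = "Milou"
second_sentence = "And my second dog was called " + dog + "."
print(second_sentence)

# first_sentence was built while dog still referred to "Bort",
# so changing dog afterwards did not change it.
print(first_sentence)
```

A variable is looked up at the moment a statement runs. Text that was already put together from the old value stays as it was.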