Avoid fields for communication between methods

Everyone knows global variables are bad. But why are they bad? They’re bad because they can obscure the path the data takes through the code. You can set a global in one function and read it in another, and it’s not clear that these two functions are passing data between them. It’s clearer to pass the data as function parameters or return values.

And although everyone knows this, they still do it every day in object-oriented programs, except the globals are now called “fields”. Setting a field in one method and reading it in another is just as bad as doing so with a global variable, but because the variable is inside an object on the heap somehow it has become okay. It’s not. It’s just as bad and it should be avoided. You must think of all fields as globals and minimize their use.

For example (which I saw in production code):

public void doSomething() {
    mStatus = OK;
    doSomethingElse();
    if (mStatus != OK) {
        fail();
    }
}

private void doSomethingElse() {
    if (problem) {
        mStatus = FAIL;
    }
}

The status could have just been returned from the second method:

public void doSomething() {
    Status status = doSomethingElse();
    if (status != OK) {
        fail();
    }
}

private Status doSomethingElse() {
    if (problem) {
        return FAIL;
    }

    return OK;
}

Here’s the opposite case:

public void doSomething() {
    mParameter = ...;
    doSomethingElse();
}

private void doSomethingElse() {
    if (mParameter == ...) {
        ...;
    }
}

The variable could have been passed in:

public void doSomething() {
    Parameter parameter = ...;
    doSomethingElse(parameter);
}

private void doSomethingElse(Parameter parameter) {
    if (parameter == ...) {
        ...;
    }
}

I see the above two patterns distressingly often, and from people who would never dream to do the same thing with a global variable in C.

Whenever possible, make fields final and pass mutable data around through method parameters and return values. This can be hard in Java when returning multiple values, but it’s still better to make a one-off class to hold both values than it is to use fields as a way to pass data around. At least it’s clear how the data is moving through your code.

Just to be clear, I’m only addressing the case of a single call from a user of a class. If the user calls the class two different times, then clearly a field might be necessary to keep state between the calls.

There’s a related pattern where some kind of context object is passed through a sequence of functions:

Context context = new Context(request);

analyzeRequest(context);
fetchData(context);
processData(context);
generateOutput(context);

Here each function reads from the context and writes back to it. The flow of data is completely obscured. Use parameters and return values:

Analysis analysis = analyzeRequest(request);
Data data = fetchData(analysis);
data = processData(data);
String output = generateOutput(request, data);

Here you can immediately see that generating output requires information from the request but processing data doesn’t; that processed data is of the same type as the raw data; that the raw data is discarded; etc. If you try as much as possible to make fields read-only, you’ll naturally gravitate to a clearer data flow.

~ See all programming posts ~