Sometimes, when you’re reading code that looks nice and clean but which doesn’t quite make sense, the problem lies in the naming. Naming smells are code smells that come from bad names.
A code smell indicates where you can improve your code, and often points to some deeper problem. A particular code smell often has a corresponding refactoring that removes that particular smell, improving the code. Naming smells appear in many forms, but all have the same refactoring: Rename. So how hard could it be?
This article describes some common naming smells. Most come from names that don’t carry as much meaning as they might, or suffer from ambiguity. Beyond that, bad names lack precision or accuracy.
foo doesn’t mean anything at all - the coding equivalent of
foo indicates a placeholder in an example where the name has no significance - a metasyntactic variable.
foo typically indicates a deliberate refusal to name something, with the excuse that the code won’t last long.
Tens of millions of occurrences on GitHub suggest programmers frequently succumb to this temptation.
The solution to meaningless names like
foo lies in the realisation that people find concrete examples easier to understand than abstract examples.
foo to a concrete real-world concept, such as
What is the worst ever variable name?
What is the second-worst name?
What is the third-worst name ever?
object lack precision.
No doubt they do name data and objects, but you already knew that without the vague name.
You might as well use
thing, which is equally abstract but at least has humour value in its uselessness.
data to a more descriptive name that identifies what kind of data it is.
employee2 is a more subtle case of a name that is too abstract.
employee2 has more meaning than
data, say, but the numeric suffix probably separates it from something already called
When this happens, the reader knows is that there are two employees, but not how one differs from the other.
employee2 to names that communicate the difference between the two employees.
Perhaps you have a
managementEmployee and a
If your programming language has types, both identifiers probably have the type
Employee, so you can remove the
employee prefix, leaving a
manager and a
Note that although code that includes
employee_2 makes you laugh, you’ll stop seeing the funny side when you see it in legacy production code that your team has just inherited.
You’ll probably also stop being able to read The Daily WTF at the same time.
Trust me on this.
pos look like abbreviated forms of words, which causes problems when you realise that you have to choose between words like
accuracy, or between
point of sale.
Abbreviations tend to suffer from ambiguity since multiple words often share the same abbreviation.
auth are more subtly ambiguous, because people may use them as standard abbreviations within specific contexts.
This stops working when you switch context: when
Meanwhile, no-one ever knows whether
authorisation (or both).
Refactor abbreviations to the complete word, which is easy as long as you know which word that is.
a and other single-letter names have more potential explanations than abbreviation, which makes them worse.
a might abbreviate a word, like
answer, but it might also be used as a meaningless placeholder, either because it comes first in the alphabet or because
foo was too long to type.
Single-letter names are meta-ambiguous because they’re ambiguous about which kind of ambiguity you’re dealing with. Refactor to a complete word, using the refactoring for either meaningless names or abbreviated names.
Traditionally, at least in certain programming languages, coders use names like
k for loop variables and would prefer to ignore these problems and carry on using single-letter names.
(If they’re particularly hardcore, they use
kk, which are easier to search for in a text editor with no programming language support.)
If you’re wondering whether it’s okay to make a for-loop index exception, you should probably switch to a more modern programming language that iterates over collection types without using a loop index. And before you whine that single-letter names are okay because Haskell, one letter is too short in functional programming as well.
This method from the
Lens class in Scalaz defies comprehension.
This is what Scala code written with a Haskell accent looks like.
At least there’s a comment that tells you what
modp is supposed to mean.
Haskell presumably uses single-letter names because the mathematicians who invented the language use single-letters in written mathematics. What this Haskell-style code seems to ignore is that written mathematics includes paragraphs of text and precise symbolic definitions that explain what each name means.
>=> ends up being called the fish operator by the Scalaz programmers who don’t know what it’s called.
This ASCII-art fish follows mathematics’ tradition of starting with single-letter names, running out of letters, subsequently exhausting the Greek and Hebrew alphabets, and then inventing new symbols in the vain hope that they’ll catch on.
Meanwhile, we have already reached peak Haskell, as noted in a tweet by @aisamanra:
InvoiceManager suffers from a vague noun.
Who knows what a manager really does?
Alan Green wrote about the ‘-Manager’ problem, and suggests some more meaningful alternatives, such as
getScore, on the other hand, suffers from a vague verb.
Rename names with a
get prefix to use a more specific verb, such as
get prefix is largely a Java-programming anti-pattern that comes from the JavaBean Specification’s requirement that the accessor method for a
score property must be called
Java developers habitually use the prefix for all kinds of methods, not just JavaBean properties.
A related anti-pattern results from the names
hasScore for a Boolean
score property: Boolean variables often have names like
isVictory, with a redundant
Vestigial Hungarian notation
dateCreated have prefixes that indicate Boolean and date types, respectively.
This is a kind of vestigial Hungarian notation.
In naming, Hungarian Notation takes the title of the most spectacular example of something that seemed like a good idea at the time but isn’t anymore, now that we have languages with types.
These names may be useful in stringly-typed languages, but you don’t need them in languages like Java and C#, where you can define your own types. Rename to drop the prefix -
created, and use the appropriate type.
birthDate are exceptions: widely-used domain terms in fields like human resources.
Although renaming to
born would make sense from a coding point of view, the domain likely never uses the term.
appointment_list typifies many names for collections of things that simply concatenate the collection type and the item type.
company_person names a relationship by using the names of two related entities.
This approach can lead to names that form a pidgin language of simple words, like
text_correction_by_editor, as if identifiers were written in Basic English.
We tend to write code in English, which has a rich vocabulary to use for naming.
This code smell is the observation that ‘there’s a word for that!’
shareholder or some more specific relationship name.
In the case of a
text_correction_by_editor, referring to the person who edits a manuscript, rename to just
carrier, depending on the context, can be the right name for the wrong thing, i.e. the wrong name.
In a supply chain context, for example, an order might be related to a shipment but isn’t the same thing.
Similarly, a carrier probably isn’t the same kind of organisation as a broker.
Some names are just wrong.
Spotting and fixing wrong names can prove difficult because you need some other clue that the intended concept is something else. Documentation benefits from a little repetition, which builds confidence in the intended meaning. If you only have one clock, you don’t know if it has the right time. If you have two clocks that differ, you don’t know which one is wrong. Only three clocks give you more confidence in the correct time than one, assuming that it’s unlikely for more than one to be wrong. Naming and documentation are like a pocket watch and a clock, in that respect.