Polls and elections aren’t the same type of thing. The former asks for descriptive data, the latter for prescriptive preferences. To illustrate the hazards of this common misunderstanding, let’s go over some other ways data can be partitioned and the problems that arise from confusing them.
Ordered data vs unordered data
Ordered data has an innate linear ordering, whereas unordered data doesn’t.
- How many apples can you eat in one sitting? (0, 1, 2…)
- What was the last type of apple you ate? (Granny Smith, Red Delicious…)
When unordered data is mistaken for ordered data, a relationship is erroneously implied between the items.
When ordered data is mistaken for unordered data, trends in the data are buried.
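As a minimal sketch of both mistakes, the snippet below uses made-up survey answers. The responses, the integer codes assigned to the apple varieties, and all variable names are assumptions for illustration only.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical answers to "What was the last type of apple you ate?"
last_apple = ["Granny Smith", "Red Delicious", "Granny Smith",
              "Fuji", "Red Delicious", "Granny Smith"]

# Mistake: give the varieties arbitrary integer codes and average them.
# The codes are an assumption; any other assignment would be equally valid,
# which is why the resulting number carries no meaning.
codes = {"Fuji": 0, "Granny Smith": 1, "Red Delicious": 2}
misleading_mean = mean(codes[a] for a in last_apple)

# Appropriate for unordered data: count occurrences per category.
apple_counts = Counter(last_apple)

# Hypothetical answers to "How many apples can you eat in one sitting?"
apples_eaten = [0, 1, 1, 2, 2, 2, 3]

# These values are ordered, so an order-based summary is meaningful.
typical_appetite = median(apples_eaten)

print(f"mean of arbitrary codes: {misleading_mean:.2f}")
print(f"counts per variety: {dict(apple_counts)}")
print(f"median apples per sitting: {typical_appetite}")
```

Averaging the codes suggests that one variety sits "between" the other two, which is the erroneous relationship described above. Conversely, reporting only per-value counts for the apples-per-sitting answers would hide the fact that the values can be ranked.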
Ordered discrete vs ordered continuous
With discrete data, there exist two values for which there is no valid value in between, whereas continuous data can always be broken down into more detail.
- How many thermostats do you have in your house? (0, 1, 2…)
- What is the average temperature in your house? (22.1°C, 20.3°C…)
When discrete data is mistaken for continuous data, inappropriate conclusions can be reached (chances are you have 2.3 thermostats in your house).
When continuous data is mistaken for discrete data, granularity is lost. We can’t tell from the graph below whether there’s a large portion of 0.0 km entries, whether only odd decimal entries exist (0.1, 0.3, 0.5…), or any other detail hidden by these brackets.
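Both failure modes can be sketched with a few lines of code. The household counts, the distance values, the 1 km bracket width, and the helper name `bracket_counts` are all hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical thermostat counts for ten households.
thermostats = [2, 2, 2, 3, 2, 3, 2, 3, 2, 2]

# Treating a discrete count as continuous yields a value no household can have.
average_thermostats = mean(thermostats)

# Two hypothetical sets of distances (km) with very different structure:
# one dominated by exact zeros, one made up only of odd decimals.
mostly_zero = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.9, 1.5]
odd_decimals = [0.1, 0.3, 0.5, 0.7, 0.9, 0.1, 0.3, 1.1]

def bracket_counts(values, width=1.0):
    """Count values per bracket [k*width, (k+1)*width); assumes values >= 0."""
    return Counter(int(v // width) for v in values)

print(f"average thermostats per house: {average_thermostats}")
print(f"brackets for mostly_zero:  {dict(bracket_counts(mostly_zero))}")
print(f"brackets for odd_decimals: {dict(bracket_counts(odd_decimals))}")
```

With these inputs the two distance samples produce identical bracket counts, so a chart built from the brackets alone cannot distinguish them.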
Simple data vs compound data
Compound data have multiple values that bear some relation to each other.
- How big is your television diagonally? (24″, 32″, 34″…)
- What are the dimensions of your television? (20″x18″, 24″x20″…)
Compound data can be misrepresented as multiple independent simple data axes. This ignores the coupling between the values, so the conditional probabilities are lost. Worse, it can be reduced to a single simple data axis, which discards relevant data entirely (e.g. do sampled televisions only exist at certain aspect ratios?).
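To make the loss of conditional probabilities concrete, here is a small sketch with two invented samples of television dimensions. The samples and the helper names `marginals` and `p_height_given_width` are assumptions for illustration.

```python
from collections import Counter

# Hypothetical (width, height) pairs in inches for two samples of televisions.
sample_a = [(20, 18), (20, 18), (24, 20), (24, 20)]
sample_b = [(20, 18), (20, 20), (24, 18), (24, 20)]

def marginals(pairs):
    """Split compound data into two separate simple axes."""
    widths = Counter(w for w, _ in pairs)
    heights = Counter(h for _, h in pairs)
    return widths, heights

def p_height_given_width(pairs, height, width):
    """Estimate P(height | width); assumes at least one pair has that width."""
    matching = [h for w, h in pairs if w == width]
    return matching.count(height) / len(matching)

# Viewed one axis at a time, the two samples are indistinguishable...
print(marginals(sample_a) == marginals(sample_b))

# ...but the coupling between width and height differs.
print(p_height_given_width(sample_a, height=18, width=20))
print(p_height_given_width(sample_b, height=18, width=20))
```

In the first sample every 20″-wide set is 18″ tall, while in the second sample only half of them are. That distinction disappears as soon as width and height are tallied separately.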
Descriptive data vs prescriptive preferences
- What did you have for dinner last night?
- What should we have for dinner tonight?
The difference is subtle, but it’s there. With descriptive data, respondents have no incentive to tactically misrepresent their opinions. With prescriptive preferences, respondents (aka voters) are aware that their response will influence a future that likely affects them and will vote accordingly.
When prescriptive preferences are mistaken for descriptive data, a few things can happen:
- Continuous data can be filtered down into discrete data. Instead of allowing degrees of preference, ballots are restricted to binary expressions of support.
- Compound data can be filtered down into simple data. Although multiple options exist, ballots restrict the voter to supporting a single candidate.
Both confusions are present when Plurality voting is used in lieu of a more expressive ballot, wherein voters could express more than just a single binary preference for a single option. Maybe this comes from ignorance of alternative voting methods, technical restrictions from the online voting application used, or seeing everything as a nail when you’re holding a hammer; take your pick.
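The sketch below illustrates how much information a Plurality ballot discards. It uses score ballots as one example of a more expressive format; that choice, the 0 to 5 scale, the dinner options, and the assumption that every voter fills in the ballot sincerely are all mine, not part of the argument above.

```python
from collections import Counter

# Hypothetical score ballots (0 = no support, 5 = full support) for
# "What should we have for dinner tonight?"
ballots = [
    {"pizza": 5, "sushi": 4, "tacos": 0},
    {"pizza": 5, "sushi": 4, "tacos": 0},
    {"pizza": 5, "sushi": 4, "tacos": 0},
    {"pizza": 0, "sushi": 4, "tacos": 5},
    {"pizza": 0, "sushi": 4, "tacos": 5},
]

def score_totals(ballots):
    """Sum every voter's score for every option."""
    totals = Counter()
    for ballot in ballots:
        totals.update(ballot)
    return totals

def plurality_totals(ballots):
    """Collapse each ballot to one binary vote for its top-scored option."""
    return Counter(max(ballot, key=ballot.get) for ballot in ballots)

score_winner = score_totals(ballots).most_common(1)[0][0]
plurality_winner = plurality_totals(ballots).most_common(1)[0][0]

print(f"score totals: {dict(score_totals(ballots))}, winner: {score_winner}")
print(f"plurality totals: {dict(plurality_totals(ballots))}, winner: {plurality_winner}")
```

In this invented example the option that every voter rates highly never appears in the Plurality tally at all, because each ballot has been reduced from a set of graded preferences to a single yes for a single option.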
When Plurality voting is used to obtain prescriptive preferences, it commits the same type of analytical fallacy detailed in the examples above, all of which lead to a poor understanding of the underlying reality the data represents.