Friday, 19 January 2018

Professional tennis players are optimisers

With plenty of action in Melbourne at the Australian Open this week, it seems timely for me to write a post about tennis. I've already noted in an earlier post that tennis players appear to be loss averse. But are they optimising nonetheless? Do they make decisions that maximise their chances of winning (which would also be consistent with loss aversion)?

A recent paper by Jeffrey Ely (Northwestern University), Romain Gauriot (University of Sydney), and Lionel Page (Queensland University of Technology), published in the Journal of Economic Psychology (sorry I don't see an ungated version), provides us with some answers. The authors look specifically at the risk behaviour of servers on first and second serve:
When serving, players can opt for risky serves which are more likely to fail but are harder to return if successful or more conservative serves which are less likely to fail but are also easier to return.
The key is whether players behave differently on first and second serves (more on that in a moment). However, simply comparing first and second serves is not so straightforward. The authors correctly note that there is:
...a potential caveat with raw data on tennis serve: it can be characterised by a selection problem. First serves are always observed while second serves are only observed when the first serve failed. This means that second serves may be more likely to be observed when serving is harder than usual either for natural reasons (e.g wind conditions), fitness (e.g. tiredness late in the match) or strategic reasons (e.g. opponent having learned how to return the player’s serve).
Their solution is quite ingenious:
To cleanly compare first and second serves one ideally wants to observe some random events which determines in a given situation whether a serve is going to be a first or a second serve. We argue that such a situation occurs when the ball hits the tape (top of the net) on the first serve. The impact with the net gives the ball an unpredictable trajectory leading the ball to be either in or out. It introduces the required randomness as a first serve follows a ball let which lands in the court and a second serve follows a ball which lands outside the court.
The serve immediately following a 'let serve' is randomly either a first serve (if the 'let serve' landed in) or a second serve (if the 'let serve' landed out). Ely et al. use a dataset from 3,188 matches, involving over 690,000 serves, of which 7,605 follow a 'let serve' and are the core sample of interest. They test four conditions which would imply that players are correctly maximising their chance of winning:

  1. That first serves are more risky than second serves (the probability that a serve lands in is lower for first serves);
  2. That first serves are harder to return than second serves (players are more likely to win the point on their first serve);
  3. Using two first serves is a suboptimal strategy (it leads to a lower probability of winning the point); and
  4. Using two second serves is also a suboptimal strategy.
They find that:
...the serves from professional tennis players meet four conditions which make them consistent with the optimal strategy of risk taking between first and second serves. This result is observed both overall and when splitting the sample by gender and ranking.
So, it appears that professional tennis players are optimisers. Which we should expect - they are trained professionals who have developed skills in strategic play over many years.
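
To make conditions 3 and 4 concrete, here is a minimal sketch in Python. The serve probabilities are made-up illustrative numbers (not estimates from Ely et al.), and the `Serve` class and `point_win_probability` function are hypothetical names introduced for this example. The idea is simply that the server wins the point if the first serve lands in and they win the rally, or if the first serve fails, the second serve lands in, and they win the rally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Serve:
    p_in: float               # probability the serve lands in
    p_win_given_in: float     # probability of winning the point if it lands in


def point_win_probability(first: Serve, second: Serve) -> float:
    """P(server wins point) = P(first in and win) + P(first out, second in and win)."""
    win_on_first = first.p_in * first.p_win_given_in
    win_on_second = (1 - first.p_in) * second.p_in * second.p_win_given_in
    return win_on_first + win_on_second


# Hypothetical numbers, chosen only to satisfy conditions 1 and 2:
# the risky serve lands in less often, but is harder to return.
RISKY = Serve(p_in=0.60, p_win_given_in=0.75)
SAFE = Serve(p_in=0.90, p_win_given_in=0.55)

strategies = {
    "risky then safe": (RISKY, SAFE),
    "risky then risky": (RISKY, RISKY),   # "two first serves" (condition 3)
    "safe then safe": (SAFE, SAFE),       # "two second serves" (condition 4)
}

for name, (first, second) in strategies.items():
    print(f"{name}: {point_win_probability(first, second):.4f}")
```

With these assumed numbers, the usual risky-then-safe ordering gives the highest chance of winning the point. With different numbers (say, a risky serve that almost never lands in), one of the other strategies could come out ahead, which is why the four conditions have to be tested against data rather than assumed.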

Thursday, 18 January 2018

Why men earn more when they marry

Alexandra Killewald (Harvard) and Ian Lundberg (Princeton) wrote in the IUSSP online magazine in June last year about their paper published in the journal Demography:
On average, in the United States, men earn more per hour when married than when single, even after adjusting for differences such as age and education. However, despite the suggestive evidence that marriage may exert a causal effect on men’s wages, we argue that closer inspection reveals little evidence of such a link.
There are a number of theories as to why marriage might cause an increase in wages for married men, compared with unmarried men. The Nobel prize-winner Gary Becker suggested that it was because of specialisation - wives contributing to unpaid labour at home freed up men to concentrate more on paid labour. Alternatively, it might be because marriage leads to a change of motivation - men who have become primary breadwinners for a family are more motivated to work harder and earn more to provide for their family. A third explanation is discrimination - employers may see marriage as a signal of stability for a male worker, and be willing to offer more work to men who are married.

However, it is also possible that the causation works in the other direction - that men who earn more are more marriage-worthy suitors and therefore more likely to be able to convince a woman to marry them. Alternatively, maybe there is actually no causal relationship between marriage and wages at all, but there is some third variable that causes both increases in marriage and increases in wages. One example is simply maturity - more mature men are more likely to marry, and as men mature they earn more (due to increased work experience).

In their paper, Killewald and Lundberg aren't able to directly test in which direction causality runs, but they do gather some reasonably convincing evidence that marriage doesn't cause increases in earnings for men, using data on 4,218 men from the National Longitudinal Survey of Youth 1979 (NLSY79). First, they show that there is an apparent marriage wage premium of 3.8%, similar to other studies.

Second, they show that the increase in earnings happens before marriage, which seems to rule out specialisation as an explanation, since specialisation cannot easily occur before marriage [*]. However, that result might strengthen the case for motivation as an explanation, since some men will anticipate future marriage and begin to work harder before marriage. It also suggests that reverse causation might be at play. That is, men who are earning more are more marriage-worthy.

Third, they show that shotgun marriages (those that were followed by a birth within seven months) are no different than other marriages in terms of effects. That would rule out the increase in earnings arising from anticipation of future marriage, since shotgun weddings are less anticipated [**].

Fourth, they compare the results for men who marry at different ages. They find that men who marry after age 26 have no marriage premium, so the marriage premium is entirely among younger men. This seems to rule out both motivation and discrimination as explanations, as well as reverse causality.

What does that leave? Killewald and Lundberg suggest that maturation is the most likely explanation, and that the observed relationship between marriage and earnings for men is therefore spurious. They conclude in their paper that:
These results are consistent with the claim that marriage is associated with wage gains simply because the timing of marriage is correlated with the transition to adulthood. It may also be consistent with delay of marriage until financial thresholds are met, which may especially affect younger men, who have lower average wages.
I guess sometimes even Gary Becker can be wrong.


[*] When I was reading the paper, I thought they had missed the obvious point that cohabitation can precede marriage, but they include a separate control for cohabitation in their models. They also tested models in their robustness checks that "...described wage patterns relative to the start of a first coresidential partnership (either marriage or cohabitation)...", and "...we found results very similar to those in the main models...".

[**] It is worth noting, though, that shotgun weddings only occur for those men who are willing to marry. They will generally be more responsible, and hence more similar to those who plan ahead, than the less responsible men who knock up their girlfriend and then don't marry her. I'm unsure whether it biases their results, but it is certainly one explanation for why there are no differences between those two groups.

Wednesday, 17 January 2018

Dolphins, incentives, and unintended consequences

In ECON110, when I teach about incentives and unintended consequences in the first week of class, one of the tutorial examples involves a story about paleontologists in China, who offered to pay peasant villagers for each dinosaur fossil fragment they found. The villagers responded to the incentive by giving the paleontologists lots of fossil fragments. However, they obtained the fossil fragments by breaking larger fossils into smaller fragments. Incentives can (and often do) lead to unintended consequences.

Now, it turns out, at least one group of dolphins is responding in a very similar way to a similar set of incentives, as the Guardian reports:
At the Institute for Marine Mammal Studies in Mississippi, Kelly the dolphin has built up quite a reputation. All the dolphins at the institute are trained to hold onto any litter that falls into their pools until they see a trainer, when they can trade the litter for fish. In this way, the dolphins help to keep their pools clean.
Kelly has taken this task one step further. When people drop paper into the water she hides it under a rock at the bottom of the pool. The next time a trainer passes, she goes down to the rock and tears off a piece of paper to give to the trainer. After a fish reward, she goes back down, tears off another piece of paper, gets another fish, and so on. This behaviour is interesting because it shows that Kelly has a sense of the future and delays gratification. She has realised that a big piece of paper gets the same reward as a small piece and so delivers only small pieces to keep the extra food coming. She has, in effect, trained the humans.
Her cunning has not stopped there. One day, when a gull flew into her pool, she grabbed it, waited for the trainers and then gave it to them. It was a large bird and so the trainers gave her lots of fish. This seemed to give Kelly a new idea. The next time she was fed, instead of eating the last fish, she took it to the bottom of the pool and hid it under the rock where she had been hiding the paper. When no trainers were present, she brought the fish to the surface and used it to lure the gulls, which she would catch to get even more fish. After mastering this lucrative strategy, she taught her calf, who taught other calves, and so gull-baiting has become a hot game among the dolphins.
No one who creates an incentive will ever be as smart as all the people (or dolphins) out there scheming to take advantage of the incentives.

[HT: Marginal Revolution]

Monday, 15 January 2018

Should we worry about non-randomness in multiple choice answers?

The human brain is built (in part) to recognise and act on patterns - it is "one of the most fundamental cognitive skills we possess". Often, we can see patterns in what is essentially random noise. Another way of thinking about that is that humans are pretty bad at recognising true randomness (for example, see here or here), and perceive bias or patterns in random processes.

Now, consider a multiple choice test, with four options for each question (A, B, C, or D). When most teachers prepare multiple choice tests, they probably aren't thinking about whether the answers will appear sufficiently random to students. That is, they probably aren't thinking about how students will perceive the sequence of answers to each question, and whether the sequence will affect how students answer. That can lead to some interesting consequences. For instance, in a recent semester in ECON100, we had five answers of 'C' in a row (and of the 15 questions, 'C' was the correct answer for eight of them). I didn't even realise we had set that up until I was writing up the answers (just before the students sat the test), and it made me a little worried.
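
For a sense of how unusual that answer key would be if the correct letters really were random, here is a rough sketch in Python. It assumes each question's correct letter is drawn independently and uniformly from the four options (which is almost certainly not how teachers write tests), and the function names `prob_streak` and `prob_at_least` are hypothetical ones made up for this example.

```python
from fractions import Fraction
from math import comb


def prob_streak(n_questions: int, n_options: int, streak_len: int) -> Fraction:
    """Probability of at least one run of streak_len identical letters
    (any letter, not just 'C') in a random answer key."""
    if streak_len <= 1:
        return Fraction(1)
    if n_questions < streak_len:
        return Fraction(0)
    same = Fraction(1, n_options)
    # dist[r] = probability that no streak has occurred yet and the
    # current run of identical letters has length r
    dist = {1: Fraction(1)}
    for _ in range(n_questions - 1):
        new = {}
        for run, p in dist.items():
            # next letter differs: the run resets to length 1
            new[1] = new.get(1, Fraction(0)) + p * (1 - same)
            # next letter matches: the run grows, unless it completes a streak
            if run + 1 < streak_len:
                new[run + 1] = new.get(run + 1, Fraction(0)) + p * same
        dist = new
    return 1 - sum(dist.values(), Fraction(0))


def prob_at_least(k: int, n_options: int, n_questions: int) -> Fraction:
    """Probability that one particular letter is correct for at least k questions."""
    p = Fraction(1, n_options)
    return sum(
        (comb(n_questions, j) * p**j * (1 - p) ** (n_questions - j)
         for j in range(k, n_questions + 1)),
        Fraction(0),
    )


# The ECON100 test described above: 15 questions, four options,
# a run of five identical answers, and eight 'C' answers overall.
print(float(prob_streak(15, 4, 5)))
print(float(prob_at_least(8, 4, 15)))
```

The first number covers a streak of any letter, while the second is for one named letter, so the two are not directly comparable. Both simply describe what a truly random answer key would look like, which is the benchmark students may have in their heads.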

Why was I worried? Consider this 'trembling hand hypothesis': Say that a student is very unsure of the answer, but they think it might be 'C'. But, they are also aware that there are four possible answers, and they believe that the teacher is likely to spread the answers around in a fairly random way. If this student had answered 'C' to the previous question, that might not be a problem. But if they had answered 'C' to the previous four questions, that might cause them to reconsider their answer. Their uncertainty then may cause them to change their answer (or one of the earlier answers that they are unsure of), even though 'C' might be the correct answer (or their preferred answer, even though they are unsure).

Conversations with students after that ECON100 test with the many 'C' answers suggested to me that it probably didn't cause too many students to change their answers, but it did raise their anxiety levels. However, a new paper by Hubert János Kiss (Eötvös Loránd University, Hungary) & Adrienn Selei (Regional Centre For Energy Policy Research, Hungary), published in the journal Education Economics (sorry I don't see an ungated version online), looks at this in more depth.

Kiss and Selei use data from 153 students who sat one (or more) of five exams at Eötvös Loránd University over a two-week period. All five exams were for the same course (students could choose which exam time they attended, but the exam questions were different at each time). The authors ensured that half of students in each exam session had an exam paper where there were 'streaks' of correct answers that were the same letter, and half of the students had an exam paper with a more 'usual' distribution of answers. They then tested the differences between the two groups, and found:
Treatment has a significant effect at date 1 [for the first exam]. Points obtained in the multiple-choice test are 3 points lower in the treated group even if we control for the other variables. However, at the other dates and when looking at the overall data, treatment has no significant effect.
They then concluded that there was no treatment effect - that is, that the 'streaks' did not affect student performance. However, students in the first exam were significantly negatively affected (and received about three fewer marks out of 100). Presumably, students talk to each other, and these students in the first exam would have told other students about the unusual pattern of multiple choice answers they found (even though they didn't know the correct answers at that time). So, students in the later exams would probably have been primed not to be caught out by 'streaks' of answers. To be fair, the authors note this:
One may argue that after the first exam, students learned from their peers that streaks were not uncommon, causing the treatment effect to become insignificant later. Unfortunately, we cannot test if this is the case.
Indeed, but it doesn't seem unlikely. Kiss and Selei then go on to test whether students who give a particular letter answer to a question are more (or less) likely to give the same letter answer to the next question, and find that:
In half of the cases, the effect of having an identical previous correct answer (samecorrect1) is not significant at the usual significance levels... In the control treatments, we tend to observe a significant positive effect. Having two identical previous correct answers (samecorrect2) has a consistently negative impact on the probability of giving the correct answer to a given question, and this effect is significant... However, the effect of having three identical previous correct answers (samecorrect3) goes against our expectations, as in the cases where it has a significant effect, this effect is positive!
These results are a little unusual, but I think the problem is in the analysis. There are essentially two effects occurring here. First, good students are more likely to get the correct answer, regardless of whether it is the same letter answer as the previous question. Second, students may have a trembling hand when they observe a 'streak' of the same letter answer. Students who are willing to maintain a streak are likely to be the better students (since the not-so-good students eventually switch out of the streak due to the trembling hand, especially if the trembling hand effect increases with the length of the 'streak'). So, it doesn't at all surprise me that observing two previous same letter answers leads students on average to switch to an incorrect answer, or that the negative effect disappears for longer streaks - only the good students remain in the streak.

The authors control for student quality by using their results in essay questions, but that only adjusts for average (mean) differences between good and not-so-good students. It doesn't test whether the 'streaks' have different effects on different quality students. In any case, their sample size is probably too small to detect any difference in these effects.

All of which still leaves me wondering, should we worry about non-randomness in multiple choice answers? We'll have to wait for a more robust study for an answer to that question, and in the meantime, I'll make sure to check the distribution of answers to ECONS101 multiple choice questions. Just in case.
