Living in a World Full of Answers


I’m really enjoying Ulrich Boser’s new book Learn Better. It is nicely balanced, humble, serious and informative, with a good tempo and a somewhat uncanny ability to make me spin to the side in my office chair every once in a while to rest my chin in a finger tent.

In particular, a section on the theme of embracing difficulties in learning got my chair spinning and fingers tenting. Here’s just one snippet:

The practical takeaway here is pretty simple. We need to believe in struggle. We need to know that learning is difficult. What’s more, we need the people around us to believe, too. . . .

This idea is at the heart of Lisa Son’s approach. She’s building norms around the nature of effort, the essence of struggle, the path to expertise. As Son told me, laughing, “I think I overdid it, but if someone gives my kid the answer, she’ll kill you.”

What I would add to this, though, is that what typifies “struggle” for students in adulthood is not figuring things out for themselves when no one will give them the answers; it’s figuring things out for themselves in a world awash with answers. Students need to be able to deal with answers, from experts and from their peers, while keeping the lights of critical thinking on upstairs. That’s the struggle. And you don’t get practice with that struggle when you spend a lot of your schooling time stuck in a goobery game of hint-hint hide-and-seek with answers.

Students need practice dealing with “answers” from other people—from people of different races and religions, or no religion; from people whom you don’t like or who don’t like you; and from people who are more expert or less expert than you. And they need to be able to figure out that some of those answers are correct, and some of them are not, and other times there is no solid answer even when everyone else is convinced there is, or there is a solid answer when everyone else is convinced there isn’t. Students need practice listening to other people and understanding what they are saying, without feeling that their identity or cognitive liberty has been threatened.

I have to think that, while withholding answers is a good technique to use occasionally (and deliberately and skillfully), as a strategy it can run the risk of producing a generation of narcissistic idiots who close their ears to “answers” they themselves didn’t come up with.

Postscript: The subject of embracing difficulties comes up a little later in the book as well:

As a learning tool, DragonBox does not seem to teach students all that much, though, and people who play the game don’t do any better at solving algebraic equations, according to one recent study. Researcher Robert Goldstone recently examined the software, and he told me that the app didn’t appear to provide any more grounding in algebra than “tuning guitars.”

In the bluntest of terms, there’s simply no such thing as effortless learning. To develop a skill, we’re going to be uncomfortable, strained, often feeling a little embattled. Just about every major expert in the field of the learning sciences agrees on this point. Psychologist Daniel Willingham writes that students often struggle because thinking is difficult.

It’s true that thinking is hard work, actually, but it’s also true that learning is difficult in part because what you are learning, whatever it is, was created by people who think differently from you (at the moment, if you’re a novice). And school—when it is not mostly a game of “guess what’s in my head”—is one of the first places young people are exposed to this kind of thinking.

Educational Achievement and Religiosity


I outlined a somewhat speculative argument that would support a prediction that increased religiosity at the social level should have a negative effect on educational achievement here, where I suggested that

Educators surrounded by cultures with higher religiosity—and regardless of their own personal religious orientations—will simply have greater exposure to concerns about moral and spiritual harm that can be wrought by science, in addition to the benefits it can bring.

Such weakened confidence in science may not only directly water down the content of instruction in both science and mathematics—by, for example, diluting science content antagonistic to religious beliefs in published standards and curriculum guides—but could also represent an environment in which it is seen as inartful or even taboo for educators of any stripe to lean on scientific findings and perspectives in order to improve educational outcomes (because nurturing children may be seen to be the provenance of more spiritual and less scientific approaches). Both of these effects, one social, one policy-level, could have a negative effect on achievement.

A new paper, coauthored by renowned evolutionary psychologist David Geary, shows that religiosity at a national level does indeed have a strong negative effect on achievement (r = –0.72, p < 0.001). Yet, Stoet and Geary’s research suggests a different, simpler mechanism at work than the mechanisms I suggested above to explain the connection between religiosity and math and science educational achievement. This mechanism is displacement.

The Displacement Hypothesis

It’s a bit much to give this hypothesis its own section heading—not that it isn’t important, necessarily. It’s just self-explanatory. Religiosity may be correlated with lower educational achievement because people have a finite amount of time and attention, and spending time learning about religion or engaging in religious activities necessarily takes time away from learning math and science.

It is not necessarily the content of the religious beliefs that might influence educational growth (or lack thereof), but that investment of intellectual abilities that support educational development are displaced by other (religious) activities (displacement hypothesis). This follows from Cattell’s (1987) investment theory, with investment shifting from secular education to religious materials rather than shifts from one secular domain (e.g., mathematics) to another (e.g., literature). This hypothesis might help to explain part of the variation in educational performance broadly (i.e., across academic domains), not just in science literacy.

One reason the displacement hypothesis makes sense is that religiosity is as powerfully negatively correlated with achievement in mathematics as it is with science achievement.

The Scattering Hypothesis

But certainly a drawback of the displacement hypothesis is that there are activities we engage in—as unrelated to mathematics and science as religion is—which don’t, as far as we know, correlate strongly negatively with achievement. Physical exercise, for goodness’ sake, is one example of such an activity. Perhaps there is something especially toxic about religiosity as the displacer which deserves our attention.

Maybe religiosity (or, better, a perspective which allows for supernatural explanations or, indeed, unexplainable phenomena) has a diluent or scattering effect on learning. If so, here are two analogies for how that might work:

  • Consider object permanence. Prior to developing the understanding that objects continue to exist once they are out of view, children will almost immediately lose interest in an object that is deliberately hidden from them, even if they were attending to it just moments earlier. Why? Because it is possible (to them) that the object has vanished from existence when you move it out of their view. If it were possible for a 4-month-old to crawl up and look behind the sofa to see that grandma had actually disappeared during a game of peek-a-boo, they would have nothing to wonder about. The disappearance was possible, so why shouldn’t it happen? This possibility is gone once you develop object permanence.
  • Perhaps more relevant, not to mention ominous: climate change. It is well known that religiosity and acceptance of the theory of evolution are negatively correlated. And it turns out there is a strong positive link between evolution denialism and climate-change denialism. How might religiosity support both of these denialisms? Here we can benefit from substituting for ‘religiosity’ some degree of subscription to supernatural explanations: If the universe was made by a deity for us, then how can we be intruders in it, and how could we—by means that do not transgress the laws of this deity—defile it? This seems a perfectly reasonable use of logic, once you have allowed for the possibility of an omniscient benevolence who gifted your species the entire planet you live on.

The two of these together seem pretty bizarre. But I’m sure you catch the drift. In each case, I would argue that the constriction of possibilities—to those supported by naturalistic explanations rather than supernatural ones—is actually a good thing. You are less likely to be prodded to explain how the natural world works when supernatural reasons are perfectly acceptable. And supernaturalism can prevent you from fully appreciating your own existence and the effects it has on the natural world. Under supernaturalism, you can still engage in logical arguments and intellectual activity. You can write books and go to seminars. Your neurons could be firing. But if you’re not thinking about reality, it doesn’t do you any good.

Religiosity or supernaturalism does not make you dumb. But perhaps it has the broader effect of making it more difficult to fasten minds onto reality, as it fills the solution space with only those possibilities that have little bearing on the real world we live in. This would certainly show up in measures of educational achievement.

Reference: Stoet, G., & Geary, D. (2017). Students in countries with higher levels of religiosity perform lower in science and mathematics. Intelligence. DOI: 10.1016/j.intell.2017.03.001

Bloom’s Against Empathy


I’m on my way out the door to be on vacation, but I wanted to mention (and recommend) Paul Bloom’s new book, Against Empathy: The Case for Rational Compassion, before I do (you know, to put you in the holiday spirit).

Bloom makes a strong case that empathic concern acts as a spotlight—inducing a kind of moral tunnel vision:

Empathy is a spotlight focusing on certain people in the here and now. This makes us care more about them, but it leaves us insensitive to the long-term consequences of our acts and blind as well to the suffering of those we do not or cannot empathize with. Empathy is biased, pushing us in the direction of parochialism and racism. It is shortsighted, motivating actions that might make things better in the short term but lead to tragic results in the future. It is innumerate, favoring the one over the many.

In line with Bloom’s narrative, I would say that the short-sightedness of empathy is what makes students’ boredom more salient than students’ lack of prior knowledge. The innumeracy of empathic concern leads to a valorization of personalization and individualism at the expense of shared knowledge of a shared reality. And its bias? I’m sure you can think of a few ways it blinkers us, makes us less fair, maybe leads us to believe that a white middle-class definition of “success” is one that everyone shares or that everyone should share.

Perhaps next year we can talk about how in-the-trenches empathy is not such a great thing, and that perhaps we need less of it in education—and more rational compassion.

Instructional Effects: Action at a Distance

I really like this recent post, called Tell Me More, Tell Me More, by math teacher Dani Quinn. The content is an excellent analysis of expert blindness in math teaching. The form, though, is worth seeing as well—it is a traditional educational syllogism, which Quinn helpfully commandeers to arrive at a non-traditional conclusion, that instructional effects have instructional causes, on the right:

| The Traditional Argument | An Alternative Argument |
| --- | --- |
| There is a problem in how we teach: We typically spoon-feed students procedures for answering questions that will be on some kind of test. | “There is a problem in how we teach: We typically show pupils only the classic forms of a problem or a procedure.” |
| This is why students can’t generalize to non-routine problems: we got in the way of their thinking and didn’t allow them to take ownership and creatively explore material on their own. | “This is why they then can’t generalise: we didn’t show them anything non-standard or, if we did, it was in an exercise when they were floundering on their own with the least support.” |

Problematically for education debates, each of these premises and conclusions, taken individually, is true. That is, they exist. At our (collective) weakest, we do sometimes spoon-feed kids procedures to get them through tests. We do cover only a narrow range of situations, which is what Engelmann refers to as the problem of stipulation. And we can be, regrettably in either case, systematically unassertive or overbearing.

Solving equations provides a nice example of the instructional effects of both spoon-feeding and stipulation. Remember how to solve equations? Inverse operations. That was the way to do equations. If you have something like \(\mathtt{2x + 5 = 15}\), the table shows how it goes.

| Equation | Step |
| --- | --- |
| \(\mathtt{2x + 5 \color{red}{- 5} = 15 \color{red}{- 5}}\) | Subtract \(\mathtt{5}\) from both sides of the equation to get \(\mathtt{2x = 10}\). |
| \(\mathtt{\color{white}{+ 5 \,\,} 2x \color{red}{\div 2} = 10 \color{red}{\div 2}}\) | Divide both sides of the equation by \(\mathtt{2}\). |
| \(\mathtt{\color{white}{+ 5 \,\,}x = 5}\) | You have solved the equation. |

Do that a couple dozen times and maybe around 50% of the class freezes when they encounter \(\mathtt{22 = 4x + 6}\), with the variable on the right side, or, even worse, \(\mathtt{22 = 6 + 4x}\).

That’s spoon-feeding and stipulation: do it this one way and do it over and over—and, crucially, doing that summarizes most of the instruction around solving equations.
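For contrast, here is a worked sketch of one of the non-standard forms mentioned above, \(\mathtt{22 = 6 + 4x}\). It is my own illustration rather than something taken from Quinn’s post, and it uses the same two inverse operations as the table; only the layout of the equation has changed.

```latex
\begin{aligned}
22 &= 6 + 4x \\
22 - 6 &= 6 + 4x - 6 && \text{subtract 6 from both sides} \\
16 &= 4x \\
16 \div 4 &= 4x \div 4 && \text{divide both sides by 4} \\
4 &= x
\end{aligned}
```

Substituting back gives \(\mathtt{6 + 4 \cdot 4 = 22}\), so the solution checks out. Nothing about the procedure depends on which side the variable sits on, which is the part that a single stipulated form tends to hide.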

Of course, the lack of prior knowledge exacerbates the negative instructional effects of stipulation and spoon-feeding. But we’ll set that aside for the moment.

The Connection Between Premises and Conclusion

The traditional and alternative arguments above are easily (and often) confused, though, until you include the premise that I have omitted in the middle for each. These help make sense of the conclusions derived in each argument.

| The Traditional Argument | An Alternative Argument |
| --- | --- |
| There is a problem in how we teach: We typically spoon-feed students procedures for answering questions that will be on some kind of test. | “There is a problem in how we teach: We typically show pupils only the classic forms of a problem or a procedure.” |
| Students’ success in schooling is determined mostly by internal factors, like creativity, motivation, and self-awareness. | Students’ success in schooling is determined mostly by external factors, like amount of instruction, socioeconomic status, and curricula. |
| This is why students can’t generalize to non-routine problems: we got in the way of their thinking and didn’t allow them to take ownership and creatively explore material on their own. | “This is why they then can’t generalise: we didn’t show them anything non-standard or, if we did, it was in an exercise when they were floundering on their own with the least support.” |

In short, the argument on the left tends to diagnose pedagogical illnesses and their concomitant instructional effects as people problems; the alternative sees them as situation problems. The solutions generated by each argument are divergent in just this way: the traditional one looks to pull the levers that mostly benefit personal, internal attributes that contribute to learning; the alternative messes mostly with external inputs.

It’s Not the Spoon-Feeding, It’s What’s on the Spoon

I am and have always been more attracted to the alternative argument than the traditional one. Probably for a very simple reason: my role in education doesn’t involve pulling personal levers. Being close to the problem almost certainly changes your view of it—not necessarily for the better. But, roles aside, it’s also the case that the traditional view is simply more widespread, and informed by the positive version of what is called the Fundamental Attribution Error:

We are frequently blind to the power of situations. In a famous article, Stanford psychologist Lee Ross surveyed dozens of studies in psychology and noted that people have a systematic tendency to ignore the situational forces that shape other people’s behavior. He called this deep-rooted tendency the “Fundamental Attribution Error.” The error lies in our inclination to attribute people’s behavior to the way they are rather than to the situation they are in.

What you get with the traditional view is, to me, a kind of spooky action at a distance—a phrase attributed to Einstein, in remarks about the counterintuitive consequences of quantum physics. Adopting this view forces one to connect positive instructional effects (e.g., thinking flexibly when solving equations) with something internal, ethereal and often poorly defined, like creativity. We might as well attribute success to rabbit’s feet or lucky underwear or horoscopes!

Entia Successiva: How to Like Science

The term ‘entia successiva’ means ‘successive entities.’ And, as you may guess, it is a term one might come across in a philosophy class, in particular when discussing metaphysical questions about personhood. For instance, is a person a single thing throughout its entire life or a succession of different things—an ‘ens successivum’? Though there is no right answer to this question, becoming familiar with the latter perspective can, I think, help people to be more skeptical and knowledgeable consumers of education research.

Richard Taylor provides an example of a symphony (in here) that is, depending on your perspective, both a successive and a permanent entity:

Let us imagine [a symphony orchestra] and give it a name—say, the Boston Symphony. One might write a history of this orchestra, beginning with its birth one hundred years ago, chronicling its many tours and triumphs and the fame of some of its musical directors, and so on. But are we talking about one orchestra?

In one sense we are, but in another sense we are not. The orchestra persists through time, is incorporated, receives gifts and funding, holds property, has a bank account, returns always to the same city, and rehearses, year in and year out, in the same hall. Yet its membership constantly changes, so that no member of fifty years ago is still a member today. So in that sense it is an entirely different orchestra. We are in this sense not talking about one orchestra, but many. There is a succession of orchestras going under the same name. Each, in [Roderick] Chisholm’s apt phrase, does duty for what we are calling the Boston Symphony.

The Boston Symphony is thus an ens successivum.

People are entia successiva, too. Or, at least their bodies are. Just about every cell in your body has been replaced from only 10 years ago. So, if you’re a 40-year-old Boston Symphony like me, almost all of your musicians and directors have been swapped out from when you were a 30-year-old symphony. People still call you the Boston Symphony of course (because you still are), but an almost entirely different set of parts is doing duty for “you” under the banner of this name. You are, in a sense, an almost completely different person—one who is, incidentally, made up of at least as many bacterial cells as human ones.

What’s worse (if you think of the above as bad news), the fact of evolution by natural selection tells us that humanity itself is an ens successivum. If you could line up your ancestors—your mother or father, his or her mother or father, and so on—it would be a very short trip down this line before you reached a person with whom you could not communicate at all, save through gestures. Between 30 and 40 people in would be a person who had almost no real knowledge about the physical universe. And there’s a good chance that perhaps the four thousandth person in your row of ancestors would not even be human.

The ‘Successive’ Perspective

Needless to say, seeing people as entia successiva does not come naturally to anyone. Nor should it, ever. We couldn’t go about our daily lives seeing things this way. But the general invisibility of this ‘successiveness’ is not due to its only being operational at the very macro or very micro levels. It can be seen at the psychological level too. Trouble is, our brains are so good at constructing singular narratives out of even absolute gibberish, we sometimes have to place people in unnatural or extreme situations to get a good look at how much we can delude ourselves.

An Air Force doctor’s experiences investigating the blackouts of pilots in centrifuge training provides a nice example (from here). It’s definitely worth quoting at length:

Over time, he has found striking similarities to the same sorts of things reported by patients who lost consciousness on operating tables, in car crashes, and after returning from other nonbreathing states. The tunnel, the white light, friends and family coming to greet you, memories zooming around—the pilots experienced all of this. In addition, the centrifuge was pretty good at creating out-of-body experiences. Pilots would float over themselves, or hover nearby, looking on as their heads lurched and waggled about . . . the near-death and out-of-body phenomena are both actually the subjective experience of a brain owner watching as his brain tries desperately to figure out what is happening and to orient itself amid its systems going haywire due to oxygen deprivation. Without the ability to map out its borders, the brain often places consciousness outside the head, in a field, swimming in a lake, fighting a dragon—whatever it can connect together as the walls crumble. What the deoxygenated pilots don’t experience is a smeared mess of random images and thoughts. Even as the brain is dying, it refuses to stop generating a narrative . . . Narrative is so important to survival that it is literally the last thing you give up before becoming a sack of meat.

You’ll note, I hope, that not only does the report above disclose how our very mental lives are entia successiva—thoughts and emotions that arise and pass away—but the report assumes this perspective in its own narrative. That’s because the report is written from a scientific point of view. And from that vantage point, people are assumed (correctly) to have parts that “do duty” for them and may even be at odds with each other, as they were with the pilots (a perception part fighting against a powerful narrative-generating part). The unit of analysis in the report is not an entire pilot, but the various mechanisms of her mind. Allowing for these parts allows for functional explanations like the one we see.

An un-scientific analysis, on the other hand, is entirely possible. But it would stop at the pilot. He or she is, after all, an indivisible, permanent entity. There is nothing else “doing duty” for him, so there are really only two choices: his experience was an illusion or it was real. End of analysis. Interpret it as an illusion and you don’t really have much to say; interpret it as real, and you can make a lot of money.

Entia Permanentia

Good scientific research in education will adopt an entia successiva perspective about the people it studies. This does not guarantee that its conclusions are correct. But it makes it more likely that, over time, it will get to the bottom of things.

This is not to say that an alternative perspective is without scientific merit. If we want to know how to improve the performance of the Boston Symphony, we can make some headway with ‘entia permanentia’—seeing the symphony as a whole stable unit rather than a collection of successive parts. We could increase its funding, perhaps try to make sure “it” is treated as well as other symphonies around the world. We could try to change the music, maybe include some movie scores instead of that stuffy old classical music. That would make it more exciting for audiences (and more inclusive), which is certainly one interpretation of “improvement.” But to whatever extent improvement means improving the functioning of the parts of the symphony—the musicians, the director, etc.—we can do nothing, because with entia permanentia these tiny creatures do not exist. Even raising the question about improving the parts would be beyond the scope of our imagination.

Further, seeing students as entia permanentia rather than entia successiva stops us from being appropriately skeptical about both ‘scientific’ and ‘un-scientific’ ideas. Do students learn best when matched to their learning style? What parts of their neurophysiology and psychology could possibly make something like that true? Why would it have evolved, if it did? In what other aspects of our lives might this present itself? Adopting the entia successiva perspective would have slowed the adoption of this myth (even if it were not a myth) to a crawl and would have eventually killed it. Instead, entia permanentia, a person-level analysis, holds sway: students benefit from learning-style matching because we see them respond differently to different representations. End of analysis.

Finally, it should be noted, though it goes without saying, that simply putting one’s ideas into a journal article does not guarantee that one is looking for functional explanations. Even nearly a decade later, Deborah Ball is still good on this point, though the situation since then has improved, I think:

Research that is ostensibly “in education” frequently focuses not inside the dynamics of education but on phenomena related to education—racial identity, for example, young children’s conceptions of fairness, or the history of the rise of secondary schools. These topics and others like them are important. Research that focuses on them, however, often does not probe inside the educational process. . . . Until education researchers turn their attention to problems that exist primarily inside education and until they develop systematically a body of specialized knowledge, other scholars who study questions that bear on educational problems will propose solutions. Because such solutions typically are not based on explanatory analyses of the dynamics of education, the education problems that confront society are likely to remain unsolved.

Update: Okay, maybe one last word from a recurring theme in the book Switch:

In a pioneering study of organizational change, described in the book The Critical Path to Corporate Renewal, researchers divided the change efforts they’d studied into three groups: the most successful (the top third), the average (the middle third), and the least successful (the bottom third). They found that, across the spectrum, almost everyone set goals: 89 percent of the top third and 86 percent of the bottom third . . . But the more successful change transformations were more likely to set behavioral goals: 89 percent of the top third versus only 33 percent of the bottom third.

Why do “behavioral” goals work when just “goals” don’t? Behavioral goals are, after all, telling you what to do, forcing you to behave in a certain way. Do you like to be told what to do? Probably not.

But the “you” that responds to behavioral goals isn’t the same “you” whose in-the-moment “likes” are important. You are more than just one solid indivisible self. You are many selves, and the self that can start checking stuff off the to-do list is often pulling the other selves behind it. And when it does, you get to think that “you” are determined, “you” take initiative, “you” have willpower. But in truth, your environment—both immediate and distant, both internal and external—has simply made it possible for that determined self to take the lead. Behavioral goals often create this exact environment.

The Wason Selection Task, Part 1

A variant of the Wason selection task goes as follows: four people are sitting at a bar. They are represented by the cards below, which show an age or a drink type.


Each person has an age and a drink type, but you can see only one of these for each person. Here is a rule: “every person that has an alcoholic drink is of legal age.” Your task is to select all those people, but only those people, that you would have to check in order to discover whether or not the rule has been violated.

Most people have little trouble picking the correct answer above. But, “across a wide range of published literature only around 10% of the general population” finds the correct answer to the original Wason selection task shown below:


Each card has a letter on one side and a number on the other, but you can see only one of these for each card. Here is a rule: “every card that has a D on one side has a 3 on the other.” Your task is to select all those cards, but only those cards, which you would have to turn over in order to discover whether or not the rule has been violated.

In fact, Matthew Inglis and Adrian Simpson (2004) found that mathematics undergraduates as well as mathematics academic staff, though performing significantly better than history undergraduates, performed unexpectedly poorly on the task, with only 29% of math undergrads and a shocking 43% of staff finding the correct answer.

In a chapter from The Cambridge Handbook of Expertise and Expert Performance, Paul Feltovich, Michael Prietula, and K. Anders Ericsson indicate the one factor that explains the differential results on the abstract and real-world versions of the task: knowledge.


Some studies showed reasoning itself to be dependent on knowledge. Wason and Johnson-Laird (1972) presented evidence that individuals perform poorly in testing the implications of logical inference rules (e.g., if p then q) when the rules are stated abstractly. Performance greatly improves for concrete instances of the same rules (e.g., ‘every time I go to Manchester, I go by train’). Rumelhart (1979), in an extension of this work, found that nearly five times as many participants were able to test correctly the implications of a simple, single-conditional logical expression when it was stated in terms of a realistic setting (e.g., a work setting: ‘every purchase over thirty dollars must be approved by the regional manager’) versus when the expression was stated in an understandable but less meaningful form (e.g., ‘every card with a vowel on the front must have an integer on the back’).
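The logic is identical in both versions of the task described above: a rule of the form “if P then Q” can only be broken by something that is P and not Q, so the faces worth checking are the visible P’s and the visible not-Q’s. Here is a minimal sketch of that check. The card faces (D, K, 3, 7), the bar patrons (beer, cola, 25, 16), the legal age of 21, and the function name `faces_to_check` are all assumptions of mine for illustration, since the original images are not reproduced here.

```python
LEGAL_AGE = 21  # assumed threshold; the task as stated does not give one


def faces_to_check(visible_faces, is_p, is_not_q):
    """Return the visible faces whose hidden side could break 'if P then Q'.

    A face must be checked if it shows P (the hidden side might be not-Q)
    or if it shows not-Q (the hidden side might be P).
    """
    return [face for face in visible_faces if is_p(face) or is_not_q(face)]


# Abstract version: "every card that has a D on one side has a 3 on the other."
cards = ["D", "K", "3", "7"]  # assumed card faces
abstract = faces_to_check(
    cards,
    is_p=lambda face: face == "D",
    is_not_q=lambda face: face.isdigit() and face != "3",
)

# Bar version: "every person that has an alcoholic drink is of legal age."
people = ["beer", "cola", "25", "16"]  # assumed drinks and ages
bar = faces_to_check(
    people,
    is_p=lambda face: face == "beer",
    is_not_q=lambda face: face.isdigit() and int(face) < LEGAL_AGE,
)

print(abstract)  # ['D', '7']
print(bar)       # ['beer', '16']
```

Written this way, the two tasks are the same function called with different predicates, which makes the gap in human performance between them all the more striking.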

Reference: Inglis, M. & Simpson, A. Mathematicians and the Selection Task. Proceedings of the 28th Conference of the International Group for the Psychology of Mathematics Education, 2004. (3) 89-96.

A Thought About the Chinese Room

In 1980, philosopher John Searle proposed a thought experiment called The Chinese Room. Here’s a brief—and a bit fast-moving—summary of it:

Although the thought experiment was apparently meant to illustrate the impossibility of “strong AI,” or artificial intelligence capable of consciousness and human-like “mind,” it has clear relevance for thinking about education as well. What differentiates “meaning-based” understanding from mimicry, for example?

Three Little Words: “I Don’t Know”

You are invited to notice that the room’s occupant doesn’t actually know Chinese, but simply matches characters from an input message to a page in a rule book in order to generate an output message. Presumably, the particular method for processing inputs is not important, so long as the person inside the room does not understand the characters he is receiving or sending out.

The problem here is that this observation is not interesting unless you smuggle in some assumption about how the human mind works. It is not interesting that the person in the room doesn’t understand Chinese unless you make the assumption that something in a Chinese speaker’s brain does understand Chinese. Needless to say, this is not the case. Neurons don’t understand Chinese. They are precisely as clueless as the man in the room.

The correct comparisons—at the comparable scales—should be Chinese room : Chinese speaker and person in room : inner workings of the mind. The thought experiment misleads us into comparing the person in the room with the Chinese speaker. This would work if the speaker were identical with the inner workings of her mind. Which is exactly the assumption we make about people, even though we know it has to be wrong.

Exactly how the human mind differs from elaborate mechanical rule-following is beside the point. The point is that for the Chinese Room thought experiment to have its intended effect, it seems you must assume that the human mind has some kind of “meaning” mechanism which is not built out of dumber parts. But John Searle didn’t know this in 1980. And you don’t know it now. It is an unwarranted assumption.

And almost certainly false.


Teach Me My Colors


In the box below, you can try your hand at teaching a program, a toy problem, to reliably identify the four colors red, blue, yellow, and green by name.

You don’t have a lot of flexibility, though. Ask the program to show you one of the four colors, and then provide it feedback as to its response—in that order. Then repeat. That’s all you’ve got. That and your time and endurance.

Of course, I’d love to leave the question about the meaning of “reliably identify the four colors” to the comments, but let’s say that the program knows the colors when it scores 3 perfect scores in a row—that is, if you cycle through the 4 colors three times in a row, and the program gets a 4 out of 4 all three times.

Just keep in mind that closing or refreshing the window wipes out any “learning.” Kind of like summer vacation. Or winter break. Or the weekend.

Death, Taxes, and the Mind

The teaching device above is a toy problem because it is designed to highlight what I believe to be the most salient feature of instruction—the fact that we don’t know a lot about our impact. Can you not imagine someone becoming frustrated with the “teaching” above, perhaps feverishly wondering what’s going on in the “mind” of the program? Ultimately, the one problem we all face in education is this unknown about students’ minds and about their learning—like the unknown of how the damn program above works, if it even does.

One can think of the collective activity of education as essentially the group of varied responses to this situation of fundamental ambiguity and ignorance. And similarly, there are a variety of ways to respond to the painful want of knowing solicited by this toy problem:

Seeing What You Want to See
Pareidolia is the name given to an occurrence where people perceive a pattern that isn’t there—like the famous “face” on Mars (just shadows, angles, and topography). This can happen when incessantly clicking on the teaching device above too. In fact, these kinds of pattern-generating hypotheses jumped up sporadically in my mind as I played with the program, and I wrote the program. For example, I noticed on more than one occasion that if I took a break from incessant clicking and came back, the program did better on that subsequent trial. And between sessions, I was at one point prepared to say with some confidence that the program simply learned a specific color faster than the others. There are a huge number of other, related superstitions that can arise. If you think they can only happen to technophobes and the elderly, you live in a bubble.

Constantly Shifting Strategies
It might be optimal to constantly change up what you’re doing with the teaching device, but trying to optimize the program’s performance over time is probably not why you do it. Frustration with a seeming lack of progress and following little mini-hypotheses about short-term improvements are more likely candidates. A colleague of mine used to characterize the general orientation to work in education as the “Wile E. Coyote approach”—constantly changing strategies rather than sticking with one and improving on it. The darkness is to blame.

Letting the Activity Judge You
This may be a bit out in left field, but it’s something I felt while doing the toy problem “teaching,” and it is certainly caused by the great unknown here—guilt. Did I remember to give feedback that last time? My gosh, when was the last time I gave it? Am I the only one who can’t figure this out, who is having such a hard time with this? (Okay, I didn’t experience that last one, but I can imagine someone experiencing it.) It seems we will happily choose even the distorted feel-bad projections of a hyperactive conscience over the irritating blankness of not knowing. Yet, while we might find some consolation in the truth that we’re too hard on ourselves, we also have the unhappy task of remembering that a thousand group hugs and high-fives are even less effective than a clinically diagnosable level of self-loathing at turning unknowns into knowns.

Conjecturing and Then Testing
This, of course, is the response to the unknown that we want. For the toy problem in particular, what strategies are possible? Can I exhaust them all? What knowledge can I acquaint myself with that will shine light on this task? How will I know if my strategy is working?

Here’s a plot I made of one of my runs through, using just one strategy. Each point represents a test of all 4 colors, and the score represents how many colors the program identified correctly.

Was the program improving? Yes. The mean for the first 60 trials was approximately 1.83 out of 4 correct, and the mean for the back 63 was approximately 2.14 out of 4. That’s a jump from about 46% to about 54%.

Is that the best that can be done? No. But that’s just another way the darkness gets ya—it makes it really hard to let go of hard-won footholds.

Knowing Stuff

Some knowledge about how the human mind works is analogous to knowing something about how programs work in the case of this toy problem. Such knowledge makes it harder to be bamboozled by easy-to-vary explanations. And in general such knowledge works like all knowledge does: it keeps you away, defeasibly, from dead-ends and wrong turns so that your cognitive energy is spent more productively.

Knowing something about code, for example, might instantly give you the idea to start looking for it in the source for this page. It’s just a right click away, practically. But even if you don’t want to “cheat,” you can notice that the program serves up answers even prior to any feedback, which, if you know something about code, would make you suspect that they might be generated randomly. Do they stay random, or do they converge based on feedback? And what hints does this provide about the possible functioning of the program? These better questions are generated by knowledge about typical behavior, not by having a vast amount of experience with all kinds of toy problem teaching devices.

How It Works

So, here’s how it works. The program contains 4 “registers,” or arrays, one for each of the 4 colors—blue, red, green, yellow. At the beginning of the training, each of those registers contains the exact same 4 items: the 4 different color names. So, each register looks like this at the beginning: [‘blue’, ‘red’, ‘green’, ‘yellow’].

Throughout the training, when you ask the program to show you a color, it chooses a random one from the register. This behavior never changes. It always selects a random color from the array. However, when you provide feedback, you change the array for that color. For example, if you ask the program to show you blue, and it shows you blue, and you select the “Yes” feedback from the dropdown, a “blue” choice is added to the register. So, if this happened on the very first trial, the “blue” register would change from [‘blue’, ‘red’, ‘green’, ‘yellow’] to [‘blue’, ‘red’, ‘green’, ‘yellow’, ‘blue’]. If, on the other hand, you ask for blue on the very first trial, and the program shows you green, and you select the “No” feedback from the dropdown, the 3 colors that are NOT green are added to the “blue” register. In that case, the “blue” register would change from [‘blue’, ‘red’, ‘green’, ‘yellow’] to [‘blue’, ‘red’, ‘green’, ‘yellow’, ‘blue’, ‘red’, ‘yellow’].
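The register mechanism described above can be sketched in a few lines of Python. This is not the page's actual source; the names `COLORS`, `make_registers`, `show`, and `give_feedback` are hypothetical, and the sketch assumes that "Yes" appends whatever color was shown while "No" appends the three colors that were not shown.

```python
import random

COLORS = ["blue", "red", "green", "yellow"]

def make_registers():
    """One register per color, each starting with all four color names."""
    return {color: list(COLORS) for color in COLORS}

def show(registers, asked, rng=random):
    """The program's answer: always a uniformly random pick from the register."""
    return rng.choice(registers[asked])

def give_feedback(registers, asked, shown, said_yes):
    """Update the register for the asked color based on Yes/No feedback."""
    if said_yes:
        # Assumed: "Yes" adds one more copy of the color that was shown.
        registers[asked].append(shown)
    else:
        # "No" adds the three colors that were NOT shown.
        registers[asked].extend(c for c in COLORS if c != shown)

registers = make_registers()
give_feedback(registers, "blue", "green", said_yes=False)
print(registers["blue"])
# ['blue', 'red', 'green', 'yellow', 'blue', 'red', 'yellow']
```

Note that `show` never changes its behavior; all of the "learning" lives in what `give_feedback` does to the lists.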

A little math work can reveal that positive feedback on the first trial moves the probability of randomly selecting the correct answer from 0.25 to 0.4. For negative feedback, there is still a strengthening of the probability, but it is much smaller: from 0.25 to about 0.29. These increases decrease over time, of course, as the registers fill up with color names. For positive feedback on the second trial, the probability would strengthen from 0.4 to 0.5. For negative feedback, approximately 0.29 to 0.3.
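The "little math work" is just counting entries in the register. Here is a small check using exact fractions; `p_correct` is a hypothetical helper, and the second "No" assumes another wrong answer was shown (any wrong answer adds one 'blue' among three new entries, so the result is the same).

```python
from fractions import Fraction

def p_correct(register, target):
    """Probability that a uniform random pick from the register is the target."""
    return Fraction(register.count(target), len(register))

start = ["blue", "red", "green", "yellow"]
after_yes = start + ["blue"]                   # "Yes" after blue was shown
after_no = start + ["blue", "red", "yellow"]   # "No" after green was shown

print(p_correct(start, "blue"))      # 1/4
print(p_correct(after_yes, "blue"))  # 2/5
print(p_correct(after_no, "blue"))   # 2/7

# A second round of the same kind of feedback
print(p_correct(after_yes + ["blue"], "blue"))                  # 1/2
print(p_correct(after_no + ["blue", "red", "yellow"], "blue"))  # 3/10
```

As decimals, 2/7 is about 0.29 and 3/10 is 0.3, which matches the figures quoted above.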

Thus, in some sense, you can do no harm here so long as your feedback matches the truth—i.e., you say no when the answer is incorrect and yes when it is correct. The probability of a correct answer from the program always gets stronger over time with appropriate feedback. Can you imagine an analogous conclusion being offered from education research? “Always provide feedback” seems to be the inescapable conclusion here.

But a limit analysis provides a different perspective. Given an infinite sequence of correct-answer-only trials \(\mathtt{C(t)}\) and an infinite sequence of incorrect-answer-only trials \(\mathtt{I(t)}\), we get these results:

\[\mathtt{\lim_{t\to\infty} C(t) = \lim_{t\to\infty}\frac{t + 1}{t + 4} = 1, \qquad \lim_{t\to\infty} I(t) = \lim_{t\to\infty}\frac{t + 1}{3t + 4} = \frac{1}{3}}\]

These results indicate that, over time, providing appropriate feedback only when the program makes a correct color identification strengthens the probability of correct answers from 0.25 to 1 (a perfect score), whereas the best that can be hoped for when providing feedback only when the program gives an incorrect answer is just a 1-in-3 shot at getting the correct answer. When both negative and positive feedback are given, I believe a similar analysis shows a limit of 0.5, assuming an equal number of both types of feedback.
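For what it's worth, the mixed case can be worked the same way. This is a sketch under the stated assumption that, after \(\mathtt{t}\) rounds, the register has received exactly \(\mathtt{t}\) "Yes" updates (one correct entry each) and \(\mathtt{t}\) "No" updates (one correct entry among three added). The label \(\mathtt{M(t)}\) is introduced here only for illustration:

\[\mathtt{M(t) = \frac{1 + t + t}{4 + t + 3t} = \frac{2t + 1}{4t + 4}, \qquad \lim_{t\to\infty} M(t) = \frac{2}{4} = \frac{1}{2}}\]

In an actual run the two kinds of feedback need not arrive in equal numbers, so this is a reference point rather than a prediction.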

Of course, the real-world trials bear out this conclusion. The data graphed above are from my 123 trials giving both correct and incorrect feedback. Below are data from just 67 trials giving feedback only on correct answers. The program hits the benchmark of 3 perfect scores in a row at Trial 53, and, just for kicks, does it again 3 more times shortly thereafter.
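For anyone who wants to try the comparison without all the clicking, here is a rough simulation sketch. It is not the code behind the page: `run` and `first_mastery` are hypothetical names, the seed and the 200 rounds are arbitrary, and it assumes the register rules described above. Results will vary with the seed, so no particular output is claimed.

```python
import random

COLORS = ["blue", "red", "green", "yellow"]

def run(strategy, rounds, seed=0):
    """Simulate `rounds` tests of all 4 colors; return the score (0-4) per test.

    strategy: "both" gives Yes/No feedback on every answer;
              "yes_only" gives feedback only on correct answers.
    """
    rng = random.Random(seed)
    registers = {c: list(COLORS) for c in COLORS}
    scores = []
    for _ in range(rounds):
        score = 0
        for asked in COLORS:
            shown = rng.choice(registers[asked])
            if shown == asked:
                score += 1
                registers[asked].append(shown)  # "Yes"
            elif strategy == "both":
                # "No": add the three colors that were not shown
                registers[asked].extend(c for c in COLORS if c != shown)
        scores.append(score)
    return scores

def first_mastery(scores, streak=3):
    """1-based index of the test completing `streak` perfect scores in a row, or None."""
    run_length = 0
    for i, s in enumerate(scores, start=1):
        run_length = run_length + 1 if s == 4 else 0
        if run_length == streak:
            return i
    return None

for strategy in ("both", "yes_only"):
    scores = run(strategy, rounds=200)
    print(strategy, sum(scores) / len(scores), first_mastery(scores))
```

Each line printed shows the strategy, its mean score out of 4, and the test at which the 3-in-a-row benchmark was first reached (or `None` if it never was).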


Of course, the toy problem here is not a student, and what is modeled as the program’s “cognitive architecture” is nowhere near as complex as a student’s, even with regard to the same basic task of identifying 4 colors. There are obviously a lot of differences.

Yet there are a few parallels as well. For example, behaviorally, we see progress followed by regress with both the program and, in general, with students. Perhaps our minds work in a probabilistic way similar to that of the program. Could it be helpful to think about improvements to learning as strengthening response probabilities? Relatedly, “practice” observably strengthens what we would call “knowledge” in the program just as it does, again in general, for students.

And, I think fascinatingly, we can create and reverse “misconceptions” in both students and in this toy problem. We can see how this operates on just one color in the program by first training it to falsely identify blue as ‘green’ (to a level we benchmarked earlier as mastery—3 perfect responses in a row). Then, we can switch and begin teaching it the correct correspondence. As we can now predict, reversing the misconception will take longer than instantiating it, even with the optimal strategy, because the program’s register will have a large amount of information in it—we will be fighting against that large denominator.
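The "large denominator" effect can be made concrete with a little counting. The sketch below assumes the misconception was instilled by `k` "Yes" updates that each added a 'green' to the "blue" register, and then counts how many 'blue' entries must be added before blue is as probable as green was. The function name is hypothetical, and it counts successful reinforcements only; the number of trials would be larger, since correct answers are rare at first.

```python
from fractions import Fraction

def reinforcements_to_reverse(k):
    """Count 'Yes' updates for blue needed to match the strength green had.

    Assumes the "blue" register starts as the 4 color names plus k extra
    'green' entries, and that each later 'Yes' appends one 'blue'.
    """
    greens, blues, total = 1 + k, 1, 4 + k
    target = Fraction(greens, total)  # how strong the misconception was
    m = 0
    while Fraction(blues, total) < target:
        blues += 1
        total += 1
        m += 1
    return m

for k in (5, 10, 20):
    print(k, reinforcements_to_reverse(k))
# 5 15
# 10 47
# 20 160
```

Under these assumptions the cost of reversal grows roughly with the square of `k`, while the cost of instilling grows only linearly.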


Some Notes on Reductionism


There are two different meanings of reductionism to which I’ve been exposed, mainly by those who find themselves opposed to this particular ‘ism’ in education.

On the one hand, reductionism is what Sir Peter Medawar calls “nothing-buttery” in his brutal review of Pierre Teilhard de Chardin’s book The Phenomenon of Man. That is, tables are “nothing but” collections of atoms, mathematics understanding is “nothing but” a collection of memories about procedures, and so on.

On the other hand, reductionism can be simply talking too much about organizing learning—or using too many technical terms to do so. For instruction, it can refer to mere selectivity or filtering of information—it might be reductionist to say “To add two fractions, find common denominators, add the numerators, and then set the sum over the common denominator,” because there is more to adding fractions than this.

I mention these two meanings up front only to get them out of the way—to set them up as strawmen, between which (or outside which) we have to carve a path. The notion that only one level of analysis—one scale of interaction with the world—can apply to any topic (the “nothing-buttery” notion) is not held by any sensible person; nor is the notion that reductionism should be synonymous with scheduling or selectivity in instruction.

Turning Away from “Nothing-Buttery”

It seems to me that the author of this article in The Curriculum Journal occasionally makes the same general mistake that most everyone makes when arguing against reductionism in education—he steers us rightly away from the first strawman, only to run nearly headlong into the second. Here is how the first turn is made:

There are clearly considerable practical difficulties in converting the rich complexities of a discipline such as mathematics into a curriculum which can be accommodated within the artificial school experience of learning, where days are fragmented into discrete lessons of up to an hour or so. Yet mathematics teaching can become excessively fragmented beyond this. Ollerton (1994) condemns fragmented teaching where: “for one or two lessons children are responding to a set of short questions in an exercise, such as ‘solve the following equations’, and then the following day or week they are working on another skill such as adding fractions or working out areas of triangles.” (Ollerton, 1994, p. 63)

Almost without fail, those who would oppose reductionism will use the word “artificial” to describe school. This always strikes me as bizarre, even though I am completely in touch with the sensibility the use of this word appeals to. But if school is “artificial” because it is an activity divided into discrete time chunks, then so is the road trip I took with my family this past summer, or a young couple’s first date, or the most open and inclusive meeting of professional educators. Of course, we can choose to describe any of these scenarios as “nothing but” blocks of time filled with prescribed activities, but nothing makes them necessarily so outside of those descriptions. This applies even to those apparently awful, disconnected lessons full of short questions. A level of analysis consistent with painting a reductionist picture of school is chosen, and then we are invited to decry how reductionist it all seems.

And Into the Less-Than-Helpful

And what’s the alternative to this ‘artificiality’? Everyone has a limited amount of time, which must be taken up linearly in chunks. We do not regularly find ourselves in states of quantum superposition. Thus, having dodged the nothing-buttery strawman, here we at least graze the second one:

Working more holistically in the mathematics classroom means to some extent relinquishing teacher control (‘teacher lust’) over micromanaging every detail (Boole, 1931; Tyminski, 2010). It also entails a classroom focused on longer timescales. . . .

Features of working more holistically could include:

  • giving students richer, more complex mathematical problems with a deeper degree of challenge, so that solutions are not straightforward or obvious;
  • deliberately using problems which simultaneously call on a range of different areas of the curriculum, encouraging students to ‘see sideways’ and make connections;
  • using ‘open’ tasks, where students can exercise a significant degree of choice about how they define the task and how they approach it–importantly, the teacher does not have one fixed outcome in mind;
  • giving students sufficient time to explore different pathways without the pressure to arrive at ‘an answer’ quickly;
  • encouraging a view that being stuck or confused and not knowing what to do is normal and can be productive, that ambiguities can be beneficial for a time (Foster, 2011a), and that seeking not to ‘move students on’ too quickly can deepen their opportunities to learn (Dweck, 2000).

The second and fourth of these bullet points are good ideas for making teaching more ‘holistic’. The last and first don’t belong at all, and their appearance doesn’t inspire confidence that the word ‘holistic’ actually means anything in the article. As for the rest of this quote, it seems to represent the notion, mind-boggling to me, that teachers or teaching are the cause of this distasteful reductionism; that to make a class or an experience ‘holistic,’ we would do well to get rid of or diminish the teacher’s voice, rather than raise up its quality.

We can and should (and do) avoid the idea that stringing together “nothing but” pieces of content is sufficient to make ‘holistic’ understanding bubble up as an emergent property of student learning. But equally dubious, and equally unsubscribed, is the idea that learning can be transformed from fragmented to holistic by subtracting something from the experience.

The right reductionism and the right holism can work together. See this study, for example, summarized over here.


Foster, C. (2013). Resisting reductionism in mathematics pedagogy. Curriculum Journal, 24(4), 563-585. DOI: 10.1080/09585176.2013.828630

Toward an Education Science

A group known as Deans for Impact recently released this document, called “The Science of Learning,” as a very public beginning of an initiative to improve university teacher preparation. If you have a moment, take a look—it is an eminently brief and readable set of answers taken from cognitive science to questions about student learning. The appearance of the report also marks something of a beginning in building a true education science. I wrote about it nearly a decade ago and have been advocating for this beginning ever since.

The timing of DfI’s announcement also helpfully coincided with my revisiting David Deutsch’s The Beginning of Infinity, which readers will discover places near-fatal pressure on the common notion that the goodness of science is to be found ultimately in its oft-emphasized characteristics of testability, falsifiability, transparency, rejection of authority, openness to criticism, and empirical orientation. Rather, as Deutsch persuasively argues, the desire for good explanations—those that are “hard to vary”—is the real foundation for all of these characteristics, and is what has fundamentally made Enlightenment science so effective at allowing us to both control and make sense of the universe.

Consider, for example, the ancient Greek myth for explaining the annual onset of winter. Long ago, Hades, god of the underworld, kidnapped and raped Persephone, goddess of spring. Then Persephone’s mother, Demeter, goddess of the earth and agriculture, negotiated a contract for her daughter’s release, which specified that Persephone would marry Hades and eat a magic seed that would compel her to visit him once a year thereafter. Whenever Persephone was away fulfilling this obligation, Demeter became sad and would command the world to become cold and bleak so that nothing could grow. . . .

Now consider the true explanation of seasons. It is that the Earth’s axis of rotation is tilted relative to the plane of its orbit around the sun . . .

That is a good explanation—hard to vary, because all its details play a functional role. For instance, we know—and can test independently of our experience of seasons—that surfaces tilted away from radiant heat are heated less than when they are facing it, and that a spinning sphere in space points in a constant direction . . . Also, the same tilt appears in our explanation of where the sun appears relative to the horizon at different times of year. In the Persephone myth, in contrast, the coldness of the world is caused by Demeter’s sadness—but people do not generally cool their surroundings when they are sad, and we have no way of knowing that Demeter is sad, or that she ever cools the world, other than the onset of winter itself.

What’s the connection? Well, a somewhat out-of-focus constellation of legitimate worries appears whenever “education science” gets said a little too often in relation to classroom teaching. And just one star in that constellation seems to be the worry that “education science” doesn’t know what it’s talking about when it comes to teaching—that its methods ignore, among other things, the powerful effects of the relationship between teachers and students, and that the environments it sets up to test its hypotheses are far removed (environment and hypothesis both) from classroom realities.

And this worry has predictably resurfaced following the release of the “Science of Learning” document and announcement.

What Deutsch’s argument can offer us in the face of this worry is the beginning of a convergence—away from feel-good unjustified assertions on the one hand and beating people over the head with stale research methods terminology and evidence mongering on the other—toward a shared desire for good, hard to vary, explanations: those that are functional (does it explain how it works?) and connected (does it help explain other things?).

A good, and necessary, first step toward an education science is not to arrogantly demand that science heed the “values” of practitioners, nor to expect those practitioners to become classroom clinicians. It is to hold one another and ourselves accountable for better and better explanations of effective teaching and learning.

Image mask credit: Siyavula Education.
