Instructional Effects: Action at a Distance

I really like this recent post, called Tell Me More, Tell Me More, by math teacher Dani Quinn. The content is an excellent analysis of expert blindness in math teaching. The form, though, is worth seeing as well—it is a traditional educational syllogism, which Quinn helpfully commandeers to arrive at a non-traditional conclusion, that instructional effects have instructional causes, on the right:

The Traditional Argument | An Alternative Argument
There is a problem in how we teach: We typically spoon-feed students procedures for answering questions that will be on some kind of test. | “There is a problem in how we teach: We typically show pupils only the classic forms of a problem or a procedure.”
This is why students can’t generalize to non-routine problems: we got in the way of their thinking and didn’t allow them to take ownership and creatively explore material on their own. | “This is why they then can’t generalise: we didn’t show them anything non-standard or, if we did, it was in an exercise when they were floundering on their own with the least support.”

Problematically for education debates, each of these premises and conclusions, taken individually, is true. That is, they exist. At our (collective) weakest, we do sometimes spoon-feed kids procedures to get them through tests. We do cover only a narrow range of situations, which Engelmann refers to as the problem of stipulation. And we can be, regrettably in either case, systematically unassertive or overbearing.

Solving equations provides a nice example of the instructional effects of both spoon-feeding and stipulation. Remember how to solve equations? Inverse operations. That was the way to do equations. If you have something like \(\mathtt{2x + 5 = 15}\), the table shows how it goes.

Equation | Step
\(\mathtt{2x + 5 \color{red}{- 5} = 15 \color{red}{- 5}}\) | Subtract \(\mathtt{5}\) from both sides of the equation to get \(\mathtt{2x = 10}\).
\(\mathtt{\color{white}{+ 5 \,\,} 2x \color{red}{\div 2} = 10 \color{red}{\div 2}}\) | Divide both sides of the equation by \(\mathtt{2}\).
\(\mathtt{\color{white}{+ 5 \,\,}x = 5}\) | You have solved the equation.

Do that a couple dozen times and maybe around 50% of the class freezes when they encounter \(\mathtt{22 = 4x + 6}\), with the variable on the right side, or, even worse, \(\mathtt{22 = 6 + 4x}\).
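
As a sketch, the same inverse operations carry over unchanged to that last form, assuming nothing beyond the steps in the table. Only the position of the variable differs:

\(\mathtt{22 \color{red}{- 6} = 6 + 4x \color{red}{- 6}}\), which gives \(\mathtt{16 = 4x}\).
\(\mathtt{16 \color{red}{\div 4} = 4x \color{red}{\div 4}}\), which gives \(\mathtt{4 = x}\).

Checking by substitution: \(\mathtt{6 + 4 \cdot 4 = 22}\).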

That’s spoon-feeding and stipulation: do it this one way and do it over and over—and, crucially, doing that summarizes most of the instruction around solving equations.

Of course, the lack of prior knowledge exacerbates the negative instructional effects of stipulation and spoon-feeding. But we’ll set that aside for the moment.

The Connection Between Premises and Conclusion

The traditional and alternative arguments above are easily (and often) confused, though, until you include the premise that I have omitted from the middle of each. These premises help make sense of the conclusions derived in each argument.

The Traditional Argument | An Alternative Argument
There is a problem in how we teach: We typically spoon-feed students procedures for answering questions that will be on some kind of test. | “There is a problem in how we teach: We typically show pupils only the classic forms of a problem or a procedure.”
Students’ success in schooling is determined mostly by internal factors, like creativity, motivation, and self-awareness. | Students’ success in schooling is determined mostly by external factors, like amount of instruction, socioeconomic status, and curricula.
This is why students can’t generalize to non-routine problems: we got in the way of their thinking and didn’t allow them to take ownership and creatively explore material on their own. | “This is why they then can’t generalise: we didn’t show them anything non-standard or, if we did, it was in an exercise when they were floundering on their own with the least support.”

In short, the argument on the left tends to diagnose pedagogical illnesses and their concomitant instructional effects as people problems; the alternative sees them as situation problems. The solutions generated by each argument are divergent in just this way: the traditional one looks to pull the levers that mostly benefit personal, internal attributes that contribute to learning; the alternative messes mostly with external inputs.

It’s Not the Spoon-Feeding, It’s What’s on the Spoon

I am and have always been more attracted to the alternative argument than the traditional one. Probably for a very simple reason: my role in education doesn’t involve pulling personal levers. Being close to the problem almost certainly changes your view of it—not necessarily for the better. But, roles aside, it’s also the case that the traditional view is simply more widespread, and informed by the positive version of what is called the Fundamental Attribution Error:

We are frequently blind to the power of situations. In a famous article, Stanford psychologist Lee Ross surveyed dozens of studies in psychology and noted that people have a systematic tendency to ignore the situational forces that shape other people’s behavior. He called this deep-rooted tendency the “Fundamental Attribution Error.” The error lies in our inclination to attribute people’s behavior to the way they are rather than to the situation they are in.

What you get with the traditional view is, to me, a kind of spooky action at a distance—a phrase attributed to Einstein, in remarks about the counterintuitive consequences of quantum physics. Adopting this view forces one to connect positive instructional effects (e.g., thinking flexibly when solving equations) with something internal, ethereal and often poorly defined, like creativity. We might as well attribute success to rabbit’s feet or lucky underwear or horoscopes!


Contiguity Effective for Deductive Inference


The discourse that surrounds the technicalities in this paper contains an agenda: to convince readers that the benefits of retrieval practice extend beyond the boring old “helps you remember stuff” caricature to something more “higher order” like deductive inference. But I’m not convinced that the experiments show this. Rather, what they demonstrate fairly convincingly is that informational contiguity, not retrieval practice, benefits inference-making. A related result from the research, on the benefits of text coherence, is explained here.

The Setup

The arch-nemesis of this research is a paper by Tran et al. last year, which appeared to show some domain limitations on retrieval practice:

They found that retrieval practice failed to benefit participants’ later ability to make accurate deductions from previously retrieved information. In their study, participants were presented sentences one at a time to learn . . . The sentences could be related to one another to derive inferences about particular scenarios. Although retrieval practice was shown to improve memory of the sentences relative to a restudy control condition, there was no benefit on a final inference test that required integration of information from across multiple sentences.


So, in this study, researchers replicated Tran et al.’s methods, except in one important way: they did not present the sentences to be learned one at a time but together instead.

Participants were each presented with four scenarios (two of which are outlined at right) consisting of seven to nine premises in the form of sentences. In each scenario, deductions to specific conclusions were possible. For two of the four scenarios, subjects used retrieval practice. They were given a chance to read the sentences in a scenario at their own pace and then were shown the sentences again for five minutes, in cycles where the order of the sentences was randomized. During this five-minute session, subjects were asked to type in the missing words in each premise (between one and three missing words). The complete sentences were then shown as feedback. Each participant used restudy for the other two scenarios. During the five-minute restudy session, subjects simply reread the premises, again in cycles, with the order of the premises randomized for each cycle.

The Results and Discussion

Two days later, participants were given a 32-item multiple choice test which “assessed participants’ ability to draw logical conclusions derived from at least two premises within each scenario.” And consistent with the researchers’ hypothesis, the retrieval practice conditions yielded significantly better results on a test of deductive inference than did the restudy conditions.

Yet, it’s not at all clear that retrieval practice was the cause of the better performance with respect to inference-making. There was another cause preceding it: the improved contiguity of the presented information, as compared with Tran et al.’s one-at-a-time procedure. It’s possible that the effectiveness of retrieval practice is limited to recall of already-integrated information, and the contiguity of the premises in this study allowed for such integration, which, in turn, allowed retrieval practice to outperform restudy. It is a possibility the researchers raise in the paper and one that, in my view, the current research has not effectively answered:

However, other recent studies have failed to find a benefit of retrieval practice for learning educational materials (Leahy et al. 2015; Tran et al. 2015; Van Gog and Sweller 2015). These studies all used learning materials that required learners to simultaneously relate multiple elements of the materials during study and/or test. Such materials that are high in element interactivity need constituent elements to be related to one another in order for successful learning or task completion (element interactivity may also be considered as a measure of the complexity of materials, see (Sweller 2010)).

What we can say, with some confidence, is that even if the benefits of retrieval practice were limited to improvements in recall (as prior research has demonstrated), such improvements do not stand in the way of improvements to higher-order reasoning, such as inference-making. (And shaping the path for students, such as by improving informational contiguity, can have a positive effect too.)

Eglington, L., & Kang, S. (2016). Retrieval practice benefits deductive inference. Educational Psychology Review. DOI: 10.1007/s10648-016-9386-y

Entia Successiva: How to Like Science


The term ‘entia successiva’ means ‘successive entities.’ And, as you may guess, it is a term one might come across in a philosophy class, in particular when discussing metaphysical questions about personhood. For instance, is a person a single thing throughout its entire life or a succession of different things—an ‘ens successivum’? Though there is no right answer to this question, becoming familiar with the latter perspective can, I think, help people to be more skeptical and knowledgeable consumers of education research.

Richard Taylor provides an example of a symphony (in here) that is, depending on your perspective, both a successive and a permanent entity:

Let us imagine [a symphony orchestra] and give it a name—say, the Boston Symphony. One might write a history of this orchestra, beginning with its birth one hundred years ago, chronicling its many tours and triumphs and the fame of some of its musical directors, and so on. But are we talking about one orchestra?

In one sense we are, but in another sense we are not. The orchestra persists through time, is incorporated, receives gifts and funding, holds property, has a bank account, returns always to the same city, and rehearses, year in and year out, in the same hall. Yet its membership constantly changes, so that no member of fifty years ago is still a member today. So in that sense it is an entirely different orchestra. We are in this sense not talking about one orchestra, but many. There is a succession of orchestras going under the same name. Each, in [Roderick] Chisholm’s apt phrase, does duty for what we are calling the Boston Symphony.

The Boston Symphony is thus an ens successivum.

People are entia successiva, too. Or, at least their bodies are. Just about every cell in your body has been replaced over the last 10 years. So, if you’re a 40-year-old Boston Symphony like me, almost all of your musicians and directors have been swapped out from when you were a 30-year-old symphony. People still call you the Boston Symphony of course (because you still are), but an almost entirely different set of parts is doing duty for “you” under the banner of this name. You are, in a sense, an almost completely different person (one who is, incidentally, made up of at least as many bacterial cells as human ones).

What’s worse (if you think of the above as bad news), the fact of evolution by natural selection tells us that humanity itself is an ens successivum. If you could line up your ancestors—your mother or father, his or her mother or father, and so on—it would be a very short trip down this line before you reached a person with whom you could not communicate at all, save through gestures. Between 30 and 40 people in would be a person who had almost no real knowledge about the physical universe. And there’s a good chance that perhaps the four thousandth person in your row of ancestors would not even be human.

The ‘Successive’ Perspective

Needless to say, seeing people as entia successiva does not come naturally to anyone. Nor should it, ever. We couldn’t go about our daily lives seeing things this way. But the general invisibility of this ‘successiveness’ is not due to its only being operational at the very macro or very micro levels. It can be seen at the psychological level too. Trouble is, our brains are so good at constructing singular narratives out of even absolute gibberish, we sometimes have to place people in unnatural or extreme situations to get a good look at how much we can delude ourselves.

An Air Force doctor’s experiences investigating the blackouts of pilots in centrifuge training provide a nice example (from here). It’s definitely worth quoting at length:

Over time, he has found striking similarities to the same sorts of things reported by patients who lost consciousness on operating tables, in car crashes, and after returning from other nonbreathing states. The tunnel, the white light, friends and family coming to greet you, memories zooming around—the pilots experienced all of this. In addition, the centrifuge was pretty good at creating out-of-body experiences. Pilots would float over themselves, or hover nearby, looking on as their heads lurched and waggled about . . . the near-death and out-of-body phenomena are both actually the subjective experience of a brain owner watching as his brain tries desperately to figure out what is happening and to orient itself amid its systems going haywire due to oxygen deprivation. Without the ability to map out its borders, the brain often places consciousness outside the head, in a field, swimming in a lake, fighting a dragon—whatever it can connect together as the walls crumble. What the deoxygenated pilots don’t experience is a smeared mess of random images and thoughts. Even as the brain is dying, it refuses to stop generating a narrative . . . Narrative is so important to survival that it is literally the last thing you give up before becoming a sack of meat.

You’ll note, I hope, that not only does the report above disclose how our very mental lives are entia successiva—thoughts and emotions that arise and pass away—but the report assumes this perspective in its own narrative. That’s because the report is written from a scientific point of view. And from that vantage point, people are assumed (correctly) to have parts that “do duty” for them and may even be at odds with each other, as they were with the pilots (a perception part fighting against a powerful narrative-generating part). The unit of analysis in the report is not an entire pilot, but the various mechanisms of her mind. Allowing for these parts allows for functional explanations like the one we see.

An un-scientific analysis, on the other hand, is entirely possible. But it would stop at the pilot. He or she is, after all, an indivisible, permanent entity. There is nothing else “doing duty” for him, so there are really only two choices: his experience was an illusion or it was real. End of analysis. Interpret it as an illusion and you don’t really have much to say; interpret it as real, and you can make a lot of money.

Entia Permanentia

Good scientific research in education will adopt an entia successiva perspective about the people it studies. This does not guarantee that its conclusions are correct. But it makes it more likely that, over time, it will get to the bottom of things.

This is not to say that an alternative perspective is without scientific merit. If we want to know how to improve the performance of the Boston Symphony, we can make some headway with ‘entia permanentia’—seeing the symphony as a whole stable unit rather than a collection of successive parts. We could increase its funding, perhaps try to make sure “it” is treated as well as other symphonies around the world. We could try to change the music, maybe include some movie scores instead of that stuffy old classical music. That would make it more exciting for audiences (and more inclusive), which is certainly one interpretation of “improvement.” But to whatever extent improvement means improving the functioning of the parts of the symphony—the musicians, the director, etc.—we can do nothing, because with entia permanentia these tiny creatures do not exist. Even raising the question about improving the parts would be beyond the scope of our imagination.

Further, seeing students as entia permanentia rather than entia successiva stops us from being appropriately skeptical about both ‘scientific’ and ‘un-scientific’ ideas. Do students learn best when matched to their learning style? What parts of their neurophysiology and psychology could possibly make something like that true? Why would it have evolved, if it did? In what other aspects of our lives might this present itself? Adopting the entia successiva perspective would have slowed the adoption of this myth (even if it were not a myth) to a crawl and would have eventually killed it. Instead, entia permanentia, a person-level analysis, holds sway: students benefit from learning-style matching because we see them respond differently to different representations. End of analysis.

Finally, it should be noted, though it goes without saying, that simply putting one’s ideas into a journal article does not guarantee that one is looking for functional explanations. Even nearly a decade later, Deborah Ball is still good on this point, though the situation since then has improved, I think:

Research that is ostensibly “in education” frequently focuses not inside the dynamics of education but on phenomena related to education—racial identity, for example, young children’s conceptions of fairness, or the history of the rise of secondary schools. These topics and others like them are important. Research that focuses on them, however, often does not probe inside the educational process. . . . Until education researchers turn their attention to problems that exist primarily inside education and until they develop systematically a body of specialized knowledge, other scholars who study questions that bear on educational problems will propose solutions. Because such solutions typically are not based on explanatory analyses of the dynamics of education, the education problems that confront society are likely to remain unsolved.

Update: Okay, maybe one last word from a recurring theme in the book Switch:

In a pioneering study of organizational change, described in the book The Critical Path to Corporate Renewal, researchers divided the change efforts they’d studied into three groups: the most successful (the top third), the average (the middle third), and the least successful (the bottom third). They found that, across the spectrum, almost everyone set goals: 89 percent of the top third and 86 percent of the bottom third . . . But the more successful change transformations were more likely to set behavioral goals: 89 percent of the top third versus only 33 percent of the bottom third.

Why do “behavioral” goals work when just “goals” don’t? Behavioral goals are, after all, telling you what to do, forcing you to behave in a certain way. Do you like to be told what to do? Probably not.

But the “you” that responds to behavioral goals isn’t the same “you” whose in-the-moment “likes” are important. You are more than just one solid indivisible self. You are many selves, and the self that can start checking stuff off the to-do list is often pulling the other selves behind it. And when it does, you get to think that “you” are determined, “you” take initiative, “you” have willpower. But in truth, your environment—both immediate and distant, both internal and external—has simply made it possible for that determined self to take the lead. Behavioral goals often create this exact environment.

Inference Calls in Text


Britton and Gülgöz (1991) conducted a study to test whether removing “inference calls” from text would improve retention of the material. Inference calls are locations in text that demand inference from the reader. One simple example from the text used in the study is below:

Air War in the North, 1965

By the Fall of 1964, Americans in both Saigon and Washington had begun to focus on Hanoi as the source of the continuing problem in the South.

There are at least a few inferences that readers need to make here. Readers need to infer the causal link between “the fall of 1964” and “1965,” they are asked to infer that “North” in the title refers to North Vietnam, and they need to infer that “Hanoi” refers to the capital of North Vietnam.

The authors of the study identified 40 such inference calls (using the “Kintsch” computer program) throughout the text and “repaired” them to create a new version called a “principled revision.” Below is their rewrite of the text above, which appeared in the principled revision:

Air War in the North, 1965

By the beginning of 1965, Americans in both Saigon and Washington had begun to focus on Hanoi, capital of North Vietnam, as the source of the continuing problem in the South.

Two other versions (revisions), the details of which you can read about in the study, were also produced. These revisions acted as controls in one way or another for the original text and the principled revision.

Method and Predictions

One hundred seventy college students were randomly assigned one of the four texts: the original or one of the three revisions. The students were asked to read the texts carefully and were informed that they would be tested on the material. Eighty subjects took a free recall test, in which they were asked to write down everything they could remember from the text. The other ninety subjects took a ten-question multiple-choice test on the information explicitly stated in each text.

It’s not at all difficult, given this setup, to anticipate the researchers’ predictions:

We predicted that the principled revision would be retrieved better than the original version on a free-recall test. This was because the different parts of the principled revision were more likely to be linked to each other, so the learner was more likely to have a retrieval route available to use…. Readers of the original version would have to make the inferences themselves for the links to be present, and because some readers will fail to make some inferences, we predicted that there would be more missing links among readers of this version.

This is, indeed, what researchers found. Subjects who read the principled revision recalled significantly more propositions from the text (adjusted mean = 58.6) than did those who read the original version (adjusted mean = 35.5). Researchers’ predictions for the multiple-choice test were also accurate:

On the multiple-choice test of explicit factual information that was present in all versions, we predicted no advantage for the principled revision. Because we always provided the correct answer explicitly as one of the multiple choices, the learner did not have to retrieve this information by following along the links but only had to test for his or her recognition of the information by using the stem and the cue that was presented as one of the response alternatives. Therefore, the extra retrieval routes provided by the principled revision would not help, because according to our hypothesis, retrieval was not required.

Analysis and Principles

Neither of the two results mentioned above is surprising, but the latter is interesting. Although we might say that students “learned more” from the principled revision, subjects in the original and principled groups performed equally well on the multiple-choice test (which tests recognition, as opposed to free recall). As the researchers noted, this result was likely due to the fact that repairing the inference calls provided no advantage to the principled group in recognizing explicit facts, only in connecting ideas in the text.

But the result also suggests that students who were troubled by inference calls in the text just skipped over them. Indeed, subjects who read the original text did not read it at a significantly faster or slower rate than subjects who read the principled revision; both groups read the texts in about the same amount of time. Yet students who read the original text recalled significantly less than those who read the principled revision.

In repairing the inference calls, the authors of the study identified three principles for better texts:

Principle 1: Make the learner’s job easier by rewriting the sentence so that it repeats, from the previous sentence, the linking word to which it should be linked. Corollary of Principle 1: Whenever the same concept appears in the text, the same term should be used for it.

Principle 2 is to make the learner’s job easier by arranging the parts of each sentence so that (a) the learner first encounters the old part of the sentence, which specifies where that sentence is to be connected to the rest of his or her mental representation; and (b) the learner next encounters the new part of the sentence, which indicates what new information to add to the previously specified location in his or her mental representation.

Principle 3 is to make the learner’s job easier by making explicit any important implicit references; that is, when a concept that is needed later is referred to implicitly, refer to it explicitly if the reader may otherwise miss it.

Britton, B., & Gülgöz, S. (1991). Using Kintsch’s computational model to improve instructional text: Effects of repairing inference calls on recall and cognitive structures. Journal of Educational Psychology, 83 (3), 329-345 DOI: 10.1037//0022-0663.83.3.329

Are Teaching and Learning Coevolved?


Just a few pages into David Didau and Nick Rose’s new book What Every Teacher Needs to Know About Psychology, and I’ve already come across what is, for me, a new thought, namely that teaching ability and learning ability coevolved:

Strauss, Ziv, and Stein (2002) . . . point to the fact that the ability to teach arises spontaneously at an early age without any apparent instruction and that it is common to all human cultures as evidence that it is an innate ability. Essentially, they suggest that despite its complexity, teaching is a natural cognition that evolved alongside our ability to learn.

Or perhaps this is, even for me, an old thought, but just unpopular enough—and for long enough—to seem like a brand new thought. Perhaps after years of exposure to the characterization of teaching as an anti-natural object—a smoky, rusty gearbox of torture techniques designed to break students’ wills and control their behavior—I have simply come to accept that it is true, and have forgotten that I had done so.

Strauss et al., however, provide some evidence in their research that it is not true. Very young children engage in teaching behavior before formal schooling by relying on a naturally developing ability to understand the minds of others, known as theory of mind (ToM).

Kruger and Tomasello (1996) postulated that defining teaching in terms of its intention—to cause learning, suggests that teaching is linked to theory of mind, i.e., that teaching relies on the human ability to understand the other’s mind. Olson and Bruner (1996) also identified theoretical links between theory of mind and teaching. They suggested that teaching is possible only when a lack of knowledge can be recognized and that the goal of teaching then is to enhance the learner’s knowledge. Thus, a theory of mind definition of teaching should refer to both the intentionality involved in teaching and the knowledge component, as follows: teaching is an intentional activity that is pursued in order to increase the knowledge (or understanding) of another who lacks knowledge, has partial knowledge or possesses a false belief.

The Experiment

One hundred children were separated into 50 pairs—25 pairs with a mean age of 3.5 and 25 with a mean age of 5.5. Twenty-five of the 50 children in each age group served as test subjects (teachers); the other 25 were learners. The teachers completed three groups of tasks before teaching, the first of which (1) involved two classic false-belief tasks. If you are not familiar with these kinds of tasks, the video at right should serve as a delightfully creepy precis—from what appears to be the late 70s, when every single instructional video on Earth was made. The second and third groups of tasks probed participants’ understanding that (2) a knowledge gap between teacher and learner must exist for “teaching” to occur and (3) a false belief about this knowledge gap is possible.

Finally, children participated in the teaching task by teaching the learners how to play a board game. The teacher-children were, naturally, taught how to play the game prior to their own teaching, and they were allowed to play the game with the experimenter until they demonstrated some proficiency. The teacher-learner pair was then left alone, “with no further encouragement or instructions.”

The Results

Consistent with the results from prior false-belief studies, there were significant differences between the 3- and 5-year-olds in Tasks (1) and (3) above, both of which relied on false-belief mechanisms. In Task (3), when participants were told, for example, that a teacher thought a child knew how to read when in fact he didn’t, 3-year-olds were much more likely to say that the teacher would still teach the child. Five-year-olds, on the other hand, were more likely to recognize the teacher’s false belief and say that he or she would not teach the child.

Intriguingly, however, the development of a theory of mind does not seem necessary either to recognizing the need for a special type of discourse called “teaching” or to teaching ability itself, only to a refinement of teaching strategies. Task (2), in which participants were asked, for instance, whether a teacher would teach someone who knew something or someone who didn’t, showed no significant differences between 3- and 5-year-olds in the study. But the groups were significantly different in the strategies they employed during teaching.

Three-year-olds have some understanding of teaching. They understand that in order to determine the need for teaching as well as the target learner, there is a need to recognize a difference in knowledge between (at least) two people . . . Recognition of the learner’s lack of knowledge seems to be a necessary prerequisite for any attempt to teach. Thus, 3-year-olds who identify a peer who doesn’t know [how] to play a game will attempt to teach the peer. However, they will differ from 5-year-olds in their teaching strategies, reflecting the further change in ToM and understanding of teaching that occurs between the ages of 3 and 5 years.

Coevolution of Teaching and Learning

The study here dealt with the innateness of teaching ability and sensibilities but not with whether teaching and learning coevolved, which it mentions at the beginning and then leaves behind.

It is an interesting question, however. Discussions in education are increasingly focused on “how students learn,” and it seems to be widely accepted that teaching should adjust itself to what we discover about this. But if teaching is as natural a human faculty as learning—and coevolved alongside it—then this may be only half the story. How students (naturally) learn might be caused, in part, by how teachers (naturally) teach, and vice versa. And learners perhaps should be asked to adjust to what we learn about how we teach as much as the other way around.

Those seem like new thoughts to me. But they’re probably not.

Strauss, S., Ziv, M., & Stein, A. (2002). Teaching as a natural cognition and its relations to preschoolers’ developing theory of mind. Cognitive Development, 17(3-4), 1473-1487. DOI: 10.1016/S0885-2014(02)00128-4

Eleven Matches

Here’s an interesting problem about eleven matches, which I’ll just work out as I write this. It’s from this book:

On the table are eleven matches (or other objects). The first player picks up 1, 2, or 3 matches. The second player picks up 1, 2, or 3, and so on. The player who picks up the last match loses. (A) Can the first player always win? (B) Can he if there are 30 matches instead of eleven matches? (C) Can he in general, with \(\mathtt{n}\) matches to be picked up 1 through \(\mathtt{p}\) at a time (\(\mathtt{p}\) not greater than \(\mathtt{n}\))?

My first thought was to play a little and see what I notice. So I cooked up a tiny program to simulate a single game, in which the choices made by each player are random whole numbers between 1 and 3 inclusive. Running it over and over plays out several random games.

It’s not entirely random, of course. Players can’t choose a number of matches greater than the current count. Also, a player can’t deliberately lose the game by choosing all the remaining matches (if the number remaining exceeds 1).
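For readers who want to play along, here is a minimal Python sketch of such a simulation. It is not the original program; the function name `play_random_game` and its parameters are made up for illustration, and it simply assumes the two restrictions just described.

```python
import random

def play_random_game(matches=11, max_take=3, rng=None):
    """Simulate one game with both players moving at random.

    Returns the index of the winner: 0 for the first player, 1 for the second.
    """
    rng = rng or random.Random()
    player = 0  # whose turn it is
    while True:
        if matches == 1:
            # The player to move must pick up the last match and loses.
            return 1 - player
        # A player can't take more than what is on the table, and can't
        # deliberately take every remaining match when more than one is left.
        limit = min(max_take, matches - 1)
        matches -= rng.randint(1, limit)
        player = 1 - player

if __name__ == "__main__":
    for _ in range(5):
        print("Winner: Player", "AB"[play_random_game()])
```

Passing a seeded `random.Random` as `rng` makes a particular game repeatable, which is handy when you want to look at one game more closely.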

A Wishful-Thinking Simplification

I’m not sure that did much good. But I got to see different games play out and get my head around the environment I’m dealing with. My next thought was to simplify things, a wishful-thinking simplification: as a player, I would want to be left with 4 matches. That’s a sure win for me: I take 3 and leave my opponent the last match. Ah, but not just 4. Three or two would be a win as well, since I can take \(\mathtt{m = 2}\) matches or \(\mathtt{m = 1}\) match to leave a single match (just as with 4 I take \(\mathtt{m = 3}\) matches).

Keep going by adding 4 to 4, 3, and 2. If I have 8, 7, or 6 matches in front of me, I can take 3, 2, or 1 to make it 5. My opponent must then take 3, 2, or 1 to leave me with 2, 3, or 4—the numbers that I have already decided are instant wins for me. Finally, one more round of adding 4: with 12, 11, or 10 matches, I can take 3, 2, or 1 to reduce the count to 9. And my opponent must reduce the number to 8, 7, or 6. With eleven matches, that means that I can always win by taking 2 to start and following the pattern above.

So . . . what is the pattern? It seems that if I’m Player A I want my opponent to have something like \(\mathtt{4n + 1}\) matches. And I want \(\mathtt{4n}\) or \(\mathtt{4n - 1}\) or \(\mathtt{4n - 2}\). Let’s focus on trying to control what my opponent gets.

Algebraic Thinking

Given that I’m dropped onto a random place on the number line, \(\mathtt{n}\), how do I get to the nearest multiple of 4 (plus 1) to the left of, or at, my location? Well, I take my location, \(\mathtt{n}\), and subtract \(\mathtt{(n \bmod 4) - 1}\), unless \(\mathtt{(n \bmod 4) - 1}\) is \(\mathtt{-1}\), in which case I just subtract 3. (When it is 0, I am already sitting on a multiple of 4 plus 1 and have no good move.) I’ll explain this in the future, but for now let’s change Player A’s strategy to that and keep Player B’s random to see if we can guarantee a win for Player A.

I think we nailed it in the code at the right. That answers Question A, probably a little more completely than we needed to, but it’s still answered mostly experimentally. We can worry about elegance later. There is a way that Player A can always win, when starting with eleven matches. No matter what Player B plays, as long as Player A plays \(\mathtt{(n \bmod 4) - 1}\) (or 3, when \(\mathtt{(n \bmod 4) - 1}\) results in \(\mathtt{-1}\)), then Player A will win.
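Here is a rough Python sketch of that experiment, written for this write-up rather than copied from the original program. The names `player_a_move` and `a_wins` are hypothetical, and the fallback of taking 1 when the strategy says 0 is my own assumption for positions the strategy can't handle.

```python
import random

def player_a_move(n):
    """Player A's move with n matches left: (n mod 4) - 1, or 3 when that is -1."""
    move = (n % 4) - 1
    if move == -1:
        move = 3
    return move  # 0 means n has the form 4k + 1, where no winning move exists

def a_wins(start=11, rng=None):
    """Play one game: A follows the strategy, B plays randomly. True if A wins."""
    rng = rng or random.Random()
    n = start
    a_to_move = True
    while n > 1:
        if a_to_move:
            take = player_a_move(n) or 1  # assumed fallback: take 1 if the strategy says 0
        else:
            take = rng.randint(1, min(3, n - 1))  # B never takes the last match by choice
        n -= take
        a_to_move = not a_to_move
    # Whoever must move with one match left picks it up and loses.
    return not a_to_move

if __name__ == "__main__":
    games = 1000
    wins = sum(a_wins(11) for _ in range(games))
    print(f"Player A won {wins} of {games} games starting from 11 matches")
```

This is still experimental evidence, not a proof: it only shows that a random Player B never beats the strategy in the games that happen to be played.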

Does the same strategy work when starting with 30 matches? My guess is that it should, since the rules haven’t changed (each player can still only pick up 1, 2, or 3 matches) and we built our strategy up from the simplest case. Let’s replicate the strategy in the code but change the count to start at 30 and see what we get.

Question B and the Rest

I think that answers Question B, mostly experimentally, just as we answered Question A. Here the code shows a bit more elegance. Instead of using an if statement, we can write Player A’s choice formula as \(\mathtt{(n+3) \bmod 4}\), and it accomplishes the same thing as above. And, really, we can answer Question C too, at least partly and tentatively. And the answer is no. If we had to start with, say, 29 matches or 9 matches, our formula would tell us to play 0 on the first play. These are numbers that we want for our opponent, not for us, because when our opponent holds them they lead inexorably to a win for us, no matter what our opponent plays. So, starting off with \(\mathtt{4n + 1}\) matches, when picking up 1, 2, or 3 matches, does not guarantee a win.
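A quick way to convince yourself that the one-line formula matches the if-statement version is to compare them directly. The sketch below does that in Python; the function names are made up for illustration.

```python
def best_move(n):
    """Matches to take with n left (1 to 3 allowed per turn); 0 means no winning move."""
    return (n + 3) % 4

def best_move_with_if(n):
    """The same rule written with a conditional: (n mod 4) - 1, or 3 when that is -1."""
    move = (n % 4) - 1
    return 3 if move == -1 else move

if __name__ == "__main__":
    # The two versions agree on every count we try.
    assert all(best_move(n) == best_move_with_if(n) for n in range(1, 101))
    for n in (11, 30, 29, 9):
        print(n, "matches -> take", best_move(n))
```

For 11 and 30 the formula gives a legal move (2 and 1), while for 29 and 9 it gives 0, which is exactly the trouble described above.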

A good guess for a more general answer to Question C (given that the 4 in mod 4 seems to be \(\mathtt{m + 1}\), one more than the maximum number of matches that can be picked up) is that the first player cannot force a win if the starting number of matches is \(\mathtt{(m + 1)n + 1}\), where \(\mathtt{m}\) is the maximum number of matches that can be picked up (the problem’s \(\mathtt{p}\)) and \(\mathtt{n}\) is any natural number, \(\mathtt{\{1, 2, 3,\dots\}}\).
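One way to start poking at that guess is a brute-force check. The Python sketch below works out, for small cases, whether the player to move can force a win, and compares the result with the conjecture. The function name `first_player_wins` is hypothetical, and the check is only as strong as the range of values it tries.

```python
def first_player_wins(n, m):
    """True if the player to move can force a win with n matches on the table,
    taking 1 to m per turn, where whoever takes the last match loses."""
    # wins[k]: can the player to move with k matches left force a win?
    wins = [False] * (n + 1)
    wins[0] = True  # facing zero matches means the opponent just took the last one
    for k in range(1, n + 1):
        # A position is winning if some legal move leaves the opponent in a losing one.
        wins[k] = any(not wins[k - t] for t in range(1, min(m, k) + 1))
    return wins[n]

if __name__ == "__main__":
    for m in range(1, 7):
        for n in range(1, 61):
            conjectured_loss = (n % (m + 1) == 1)
            assert first_player_wins(n, m) == (not conjectured_loss)
    print("Conjecture holds for m up to 6 and n up to 60")
```

Note that the check also treats a single match as a losing start, which corresponds to \(\mathtt{n = 0}\) in the formula, a case the guess above leaves out.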

I’ll leave it to the reader to poke holes in that if it’s wrong or tighten the screws if it’s essentially correct.

Problem Solving, Instruction: Chicken, Egg

We’ve looked before at research which evaluated the merits of different instructional sequences like problem solving before instruction.

In this post, for example, I summarized a research review by Rittle-Johnson that revealed no support for the widespread belief that conceptual instruction must precede procedural instruction in mathematics. The authors of that review went so far as to call the belief (one held and endorsed by the National Council of Teachers of Mathematics) a myth. And another study, summarized in this post, finds little evidence for another very popular notion about instruction—that cognitive conflict of some kind is a necessary prerequisite to learning.

The review we will discuss in this post looks at studies which compared two types of teaching sequences: problem solving followed by instruction (PS-I) and instruction followed by problem solving (I-PS). As far as horserace comparisons, the main takeaway is shown in the table below. Each positive (+) is a study result which showed that PS-I outperformed an I-PS control, each equals sign (=) a result where the two conditions performed the same, and each negative (–) a result where I-PS outperformed PS-I.

Procedural: = = = = = = = = =, – –

Conceptual: + + + +, = = =, – –

Transfer: + + + +, = = = =

Summary of learning outcomes for PS-I vs. I-PS.

Importantly, 8 of the results reviewed are not represented in the table above. In these results, the review authors suggest, participants in the PS-I conditions were given better learning resources than those in the I-PS conditions. This difference confounded those outcomes (see Greg’s post on this) and, unsurprisingly, added 15 plusses, 7 equals, and just 1 negative to the overall picture of PS-I.

Needless to say, when research has more fairly compared PS-I with I-PS, it has concluded that, in general, the sequence doesn’t matter all that much, though there are some positive trends on conceptual and transfer assessments for PS-I. Even if we ignore the Procedural column, roughly 55% of the results are equal or negative for PS-I. It really doesn’t seem to matter all that much whether you place problem solving before instruction or not.
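The percentage is easy to check by tallying the symbols in the table. The sketch below does the arithmetic in Python; the counts are read off the table, and the grouping of symbols into the three columns is my assumption about how the flattened table was laid out.

```python
# Tallies read off the summary table above.
# Assumption: each column lists its +, =, and - results in that order.
results = {
    "Procedural": {"+": 0, "=": 9, "-": 2},
    "Conceptual": {"+": 4, "=": 3, "-": 2},
    "Transfer":   {"+": 4, "=": 4, "-": 0},
}

def share_not_positive(columns):
    """Fraction of results in the given columns that are equal or negative for PS-I."""
    total = sum(sum(results[c].values()) for c in columns)
    not_positive = sum(results[c]["="] + results[c]["-"] for c in columns)
    return not_positive / total

if __name__ == "__main__":
    print(f"{share_not_positive(['Conceptual', 'Transfer']):.0%}")
    print(f"{share_not_positive(list(results)):.0%}")
```

Under that reading, 9 of the 17 conceptual and transfer results are equal or negative, and 20 of all 28 results are.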

Contrasting Cases and Building on Student Solutions

Horserace aside (sort of), an intriguing discussion in this review centers around two of the confounds identified above—those extra benefits provided in some studies to learners in the ‘problem solving before instruction’ conditions. They were (1) using contrasting cases during problem solving and (2) building on student solutions during instruction. Here the authors describe contrasting cases (I’ve included their example from the paper):

problem solving before instruction

Contrasting cases consist of small sets of data, examples, or strategies presented side-by-side (e.g., Schwartz and Martin 2004; Schwartz and Bransford 1998). These minimal pairs differ in one deep feature at a time ceteris paribus [other things being equal], thereby highlighting the target features. In the example provided in the right column of Table 2, the datasets of player A and player B differ with regard to the range, while other features (e.g., mean, number of data points) are held constant. The next pair of datasets addresses another feature: player B and C have the same mean and range but different distribution of the data points.

There’s something funny about this, I have to admit, given the soaring rhetoric one encounters in education about the benefits of “rich” problems and the awkwardness of textbook problems. Although they are confounds in these studies, contrasting cases manage to be helpful to learning in PS-I as sets of (a) small, (b) artificial problems which (c) vary one idea at a time. “Rich” problems, in contrast, do not show the same positive effects.

And here, some more detail about building on student solutions in instruction. The only note I have here is that it seems worthwhile to point out the obvious: that this confound which also improves learning in ‘problem solving before instruction’ has almost everything to do with the I, and very little to do with the PS:

Another way of highlighting deep features in problem solving before instruction is to compare non-canonical student solutions to each other and to the canonical solution during subsequent instruction (e.g., Kapur 2012; Loibl and Rummel 2014a). Explaining why erroneous solutions are incorrect has been found to be beneficial for learning, in some cases even more than explaining correct solutions (Booth et al. 2013). Furthermore, the comparison supports students in detecting differences between their own prior ideas and the canonical solution (Loibl and Rummel 2014a). More precisely, through comparing them to other students’ solution and to the canonical solution, students experience how their solution approaches fail to address one or more important aspects of the problem (e.g., diSessa et al. 1991). This process guides students’ attention to the deep features addressed by the canonical solution (cf. Durkin and Rittle-Johnson 2012).
Loibl, K., Roll, I., & Rummel, N. (2016). Towards a Theory of When and How Problem Solving Followed by Instruction Supports Learning. Educational Psychology Review. DOI: 10.1007/s10648-016-9379-x

The Appeal to Common Practice

At any point in a child’s life or schooling, he or she presents with a number of things he or she can do and a number—which could be 0—of things he or she knows. We can refer to these collectively as the “knowns.” And, of course, the “unknowns” are all those things a child does not know or cannot do at any of the same points. The problem of teaching from the known to the unknown involves making some kind of connection from a student’s knowns to a very restricted set of unknowns, which, taken together at any point, form a kind of immediate curriculum. When we are tempted to justify a teaching practice based on these knowns, we can run the risk of making an appeal to common practice.

Now, of course, it is impossible to teach without going from the known to the unknown in some way. On the one hand, a student can’t learn anything if s/he has absolutely no knowledge or skills (because then s/he wouldn’t exist), and on the other hand, nothing can be described purely in terms of itself. The inevitable connection from known to unknown itself is not at issue. What is at issue is the way this connection is made. What knowns are connected to what unknowns?

The Best “Known”

Over a wide variety of topics, educators will often argue about the quality of the knowns to be connected to specific unknowns. The ongoing debate about whether to teach fractions first or decimals first is an area where this argument pops up, with some making the case that place value is the better “known” to be connected to the unknown of rational numbers (decimals first) while others argue that equal shares is the better known (fractions first). Similarly, one can argue that, for the unknown of improper fractions, proper fractions serve as the best “known,” whereas another can argue that, because improper and proper fractions are used in such diverse situations (e.g., “no one says that they have eight fifths dollars”), we must scrap the use of proper fractions as the “known” in introducing improper fractions and come back to the connection later.

While there are certainly substantive reasons that serve as foundations for these arguments, there are also problems that seem almost impossible to duck. One of those is called the appeal to common practice.

Appeal to Common Practice

This is a fallacy. And it works like this: Such and such an action is justified because it is what everyone else is doing or what we’ve always done. Now, it is pretty rare to see an adult actually commit this fallacy so nakedly. But it does creep up somewhat, um, “un-nakedly.” Here’s Mark falling into the fallacy with repeated multiplication:

Try to give me a simple definition of exponentiation, which is understandable by a fifth or sixth grader, which doesn’t at least start by talking about repeated multiplication. Find me a beginners textbook or teachers class plans that explains exponentiation to kids without at least starting with something like “\(\mathtt{5^2 = 5 \times 5}\), \(\mathtt{5^3 = 5 \times 5 \times 5}\).”

The second of those sentences is pretty clearly the fallacy of appealing to common practice, to the extent that it is used in any way to justify or excuse the teaching of exponentiation as repeated multiplication. But notice what is said in the first sentence: “Try to give me a simple definition of exponentiation, which is understandable by a fifth or sixth grader, which doesn’t at least start by talking about repeated multiplication.” This, too, is an appeal to common practice, but the practice in this case is not necessarily the teaching of exponentiation as repeated multiplication to fifth or sixth graders but, rather, the teaching of everything before that. The argument is that repeated multiplication is the best “known” because currently the 8 to 10 years of schooling prior to teaching the “unknown” of exponentiation don’t prepare students for learning exponentiation any other way (or any better way).

But these circumstances do not make repeated multiplication the best “known,” just the most expedient “known.” The same goes for repeated addition as a “known” connected to the unknown of multiplication.

All that aside, though, the more general argument is more important: Expedience is not a proper basis for determining quality teaching. Yet, it happens all the time without our noticing it—the appeal to common practice makes it devilishly difficult to discern between expedience and quality.

Interleaving Study, Not Learning


The study I briefly discuss here is closely related to one I wrote up on interleaving just a few months ago (no surprise, since the two studies share an author). You can find a free link to the article referenced in that post here.

In the latest research, the authors found that a blocked schedule (presenting examples from one category at a time) outperformed an interleaved schedule (interspersing examples from all the categories) for category learning when the examples to be classified were more highly discriminable. This result was consistent across the two experiments in the study (p = 0.055 and p = 0.04). Importantly, however, although interleaving was a better strategy for learning categories of lower discriminability, the effects across the experiments were much weaker (p = 0.2 and p = 0.08). Blocking had either a significant or close to significant effect, whereas interleaving didn’t get nearly as close (if you like p-values, anyway).

The Study

The participants in the first experiment of the study (we’ll only focus on that one here) were quite a bit older than I’m used to reading about in education studies: between 19 and 57 years old, with a mean age of 30. (This is similar to the previous study.) Subjects were divided into two groups, one of which was presented with images like these:

These images represent the four presented categories: long, steep; long, flat; short, steep; and short, flat. One subset of participants in this group was exposed to 64 of these images in a blocked schedule (16 from each category at a time) while the other was presented with the images in an interleaved schedule. Each example was appropriately labeled with a category letter (A, B, C, or D). After this initial exposure, subjects were then given a test on the same set of 64 images, randomly ordered, requiring them to assign the image to the appropriate category.
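To make the two schedules concrete, here is a small Python sketch of how 64 labeled examples might be ordered under each. It is only an illustration: the function names are made up, and shuffling is my assumption about how interleaving was done, since the exact ordering procedure isn't described here.

```python
import random
from collections import Counter

CATEGORIES = ["A", "B", "C", "D"]

def blocked_schedule(per_category=16):
    """All examples from one category, then all from the next, and so on."""
    return [c for c in CATEGORIES for _ in range(per_category)]

def interleaved_schedule(per_category=16, rng=None):
    """The same examples with categories mixed together (assumed here: a random shuffle)."""
    rng = rng or random.Random()
    schedule = blocked_schedule(per_category)
    rng.shuffle(schedule)
    return schedule

if __name__ == "__main__":
    print("".join(blocked_schedule()))
    print("".join(interleaved_schedule()))
    print(Counter(interleaved_schedule()))
```

Both schedules contain the same 16 examples per category; only the order of presentation differs.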

The other group of participants was presented with similar images (line segments) and in blocked or interleaved subsets. However, for this group the images were rotated 45 degrees. According to the researchers, this created a distinction between the groups in which the first was learning verbalizable, highly discriminable categories (“long and flat,” etc.) whereas the second was learning categories difficult to express in words—categories of low discriminability.

Discussion, Questions, Connections

As mentioned above, the blocked arrangement of the examples produced a learning benefit for categories of higher discriminability when compared with interleaving. The same cannot be said for interleaving examples in the low-discriminability sequences, although the benefits for interleaving in these sequences were headed in the positive direction. So we are left to wonder about the positive effects of blocking here.

The authors suggest an answer by mentioning some data they are collecting in a separate pilot study: blocking makes it easier for learners to disregard irrelevant information.

We compared learning under a third study schedule (n = 26) in which the relevant dimensions were interleaved, but the irrelevant ones were blocked (i.e., this schedule was blocked-by-irrelevant-dimensions, as opposed to blocked-by-category), which was designed to draw learners’ attention to noticing what dimensions were relevant or irrelevant. On both the classification test and a test in which participants had to identify the relevant and irrelevant dimensions, this new blocked-by-irrelevant-dimensions condition yielded performance at a level comparable to the blocked condition and marginally better compared to the interleaved condition.

Therefore, although we initially hypothesized that participants, when studying one category at a time, are better able to compare exemplars from the same category and to generate and test their hypotheses as to the dimensions [that] define category membership (and this may still be true, particularly for Experiment 1), these pilot data suggest that with the addition of irrelevant dimensions . . . the blocking benefit is perhaps more likely driven by the fact that it allows participants to more easily identify and disregard the irrelevant dimensions.

This strikes me as making a good deal of sense. And it points to something that I may have previously confused: interleaving study examples is different from interleaving initial learning examples. When students are first learning something, blocking may be better; after acquisition, interleaving may benefit learners more.

We have a tendency, in my opinion, to ignore acquisition in learning. I’m not sure where this comes from. Perhaps it is believed that if we are justified in rejecting tabula rasa, we are safe to assume there are absolutely no rasas on any kid’s tabula. At any rate, it’s worth being clear about where in the learning process interleaving is beneficial—and where it may not be.

Postscript: It’s not unusual to believe, about a child’s cognitive subjectivity, that it is like a large glop of amorphous dough and that instruction or experience acts like a cookie cutter, shaping the child’s mind according to pre-made patterns and discarding the bits that don’t fit.

But these results could suggest something different—that prior to learning, the world is a hundred trillion things that must be connected together, not a stew of sensation that must be partitioned into socially valuable units.

This may be why blocking could work well for newly learned categories and for so-called highly discriminable categories: because what is new to us is highly discriminable—separate, without precedent, meaningless.

Image credit: Danny Nicholson
Noh, S., Yan, V., Bjork, R., & Maddox, W. (2016). Optimal sequencing during category learning: Testing a dual-learning systems perspective. Cognition, 155, 23-29. DOI: 10.1016/j.cognition.2016.06.007

The Problem of Stipulation

I think it would surprise people to read Engelmann and Carnine’s Theory of Instruction. (The previous link is to the 2016 edition of the book on Amazon, but you can find the same book for free here.) Tangled noticeably within its characteristic atomism and obsession with the naïve learner are a number of ideas that seem downright progressive—a label never ever attached to Engelmann or his work. One of these ideas in particular is worth mentioning—what the authors call the problem of stipulation.

problem of stipulation

The diagram at left shows a teaching sequence described in the book, in all its banal, robotic glory.

To be fair, it’s much harder to roll your eyes at it (or at least it should be) when you consider the audience for whom it is intended—usually special education students.

Anyway, the sequence features a table and a chalkboard eraser in various positions relative to the table. And the intention is to teach the concept of suspended, allowing learners to infer the meaning of the concept while simultaneously preventing them from learning misrules.

Stipulation occurs when the learner is repeatedly shown a limited range of positive variation. If the presentation shows suspended only with respect to an eraser and table, the learner may conclude that the concept suspended is limited to the eraser and the table.

You may know about the problem of stipulation in mathematics education as the (very real) problem of math zombies (or maybe Einstellung)—which is, to my mind anyway, the sine qua non of anti-explanation pedagogical reform efforts.

But of course prescribing doses of teacher silence is only one way to deal with the symptoms of stipulation. Engelmann and Carnine have another.

To counteract stipulation, additional examples must follow the initial-teaching sequence. Following the learner’s successful performance with the sequence that teaches slanted, for instance, the learner would be shown that slanted applies to hills, streets, floors, walls, and other objects. Following the presentation of suspended, the learner would be systematically exposed to a variety of suspended objects.

Stipulation of the type that occurs in initial-teaching sequences is not serious if additional examples are presented immediately after the initial teaching sequence has been presented. The longer the learner deals only with the examples shown in the original setup, the greater the probability that the learner will learn the stipulation.

It’s a shame, I think, that more educators are not exposed to this work in college. It’s a shame, too, that Engelmann’s own commercial work has come to represent the theory in full flower—unjustly, I believe. Theory of Instruction could be much more than what it is sold to be, to antagonists and protagonists alike.

Update (07.13): It’s worth mentioning that, insofar as the problem of stipulation can be characterized as a student’s dependence on teacher input for thinking, complexity and confusion can be even more successful at creating cognitive dependence than monotony and hyper-simplicity. When students—or adults—are in a constant state of confusion, they may learn, incidentally, that the world, either at the moment or in general, is unpredictable and incommensurate with their own attempts at understanding. In such cases, even the unfounded opinions of authority figures will provide relief.

Audio Postscript