## Birds and Worms

Pay attention to your thought process and how you use expert knowledge as you answer the question below. How do you think very young students would think about it?

Here are some birds and here are some worms. How many more birds than worms are there?

Hudson (1983) found that, among a small group of first-grade children (mean age of 7.0), just 64% completed this type of task correctly. However, when the task was rephrased as follows, all of the students answered correctly.

Here are some birds and here are some worms. Suppose the birds all race over, and each one tries to get a worm. Will every bird get a worm? How many birds won’t get a worm?

Interpret the Results

Still, what can we say about these results? Is it the case that 100% of the students used “their knowledge of correspondence to determine exact numerical differences between disjoint sets”? That is how Hudson describes students’ unanimous success in the second task. The idea seems to be that the knowledge exists; it’s just that a certain magical turn of phrase unlocks and releases this otherwise submerged expertise.

But that expert knowledge is given in the second task: “each one tries to get a worm.” The question paints the picture of one-to-one correspondence, and gives away the procedure to use to determine the difference. So, “their knowledge” is a bit of a stretch, and “used their knowledge” is even more of a stretch, since the task not only sets up a structure but animates its moving parts as well (“suppose the birds all race over”).

Further, questions about whether or not students are using knowledge they possess raise questions about whether or not students are, in fact, determining “exact numerical differences between disjoint sets.” On the contrary, it can be argued that students are simply watching almost all of a movie in their heads (a mental simulation)—a movie for which we have provided the screenplay—and then telling us how it ends (spoiler: 2 birds don’t get a worm). The deeper equivalence between the solution “2” and the response “2” to the question “How many birds won’t get a worm?” is evident only to a knowledgeable onlooker.

Experiment 3

Hudson anticipates some of the skepticism on display above when he introduces the third and last experiment in the series.

It might be argued, success in the Won’t Get task does not require a deep level of mathematical understanding; the children could have obtained the exact numerical differences by mimicking by rote the actions described by the problem context . . . In order to determine more fully the level of children’s understanding of correspondences and numerical differences, a third experiment was carried out that permitted a detailed analysis of children’s strategies for establishing correspondences between disjoint sets.

The wording in the Numerical Differences task of this third experiment, however, did not change. The “won’t get” locutions were still used. Yet, in this experiment, when paying attention to students’ strategies, Hudson observed that most children did not mentally simulate in the way directly suggested by the wording (pairing up the items in a one-to-one correspondence).

This does not defeat the complaint above, though. The fact that a text does not effectively compel the use of a procedure does not mean that it is not the primary influence on correct answers. It still seems more likely than not that participants who failed the “how many more” task simply didn’t have stable, abstract, transferable notions about mathematical difference. And the reformulation represented by the “won’t get” task influenced students to provide a response that was correct.

But this was a correct response to a different question. As adults with expert knowledge, we see the logical and mathematical similarities between the “how many more” and “won’t get” situations, and, thus we are easily fooled into believing that applying skills and knowledge in one task is equivalent to doing so in the other.

## Entia Successiva

The term ‘entia successiva’ means ‘successive entities.’ And, as you may guess, it is a term one might come across in a philosophy class, in particular when discussing metaphysical questions about personhood. For instance, is a person a single thing throughout its entire life or a succession of different things—an ‘ens successivum’? Though there is no right answer to this question, becoming familiar with the latter perspective can, I think, help people to be more skeptical and knowledgeable consumers of education research.

Richard Taylor provides an example of a symphony (in here) that is, depending on your perspective, both a successive and a permanent entity:

Let us imagine [a symphony orchestra] and give it a name—say, the Boston Symphony. One might write a history of this orchestra, beginning with its birth one hundred years ago, chronicling its many tours and triumphs and the fame of some of its musical directors, and so on. But are we talking about one orchestra?

In one sense we are, but in another sense we are not. The orchestra persists through time, is incorporated, receives gifts and funding, holds property, has a bank account, returns always to the same city, and rehearses, year in and year out, in the same hall. Yet its membership constantly changes, so that no member of fifty years ago is still a member today. So in that sense it is an entirely different orchestra. We are in this sense not talking about one orchestra, but many. There is a succession of orchestras going under the same name. Each, in [Roderick] Chisholm’s apt phrase, does duty for what we are calling the Boston Symphony.

The Boston Symphony is thus an ens successivum.

People are entia successiva, too. Or, at least their bodies are. Just about every cell in your body has been replaced from only 10 years ago. So, if you’re a 40-year-old Boston Symphony like me, almost all of your musicians and directors have been swapped out from when you were a 30-year-old symphony. People still call you the Boston Symphony of course (because you still are), but an almost entirely different set of parts is doing duty for “you” under the banner of this name. You are, in a sense, an almost completely different person—one who is, incidentally, made up of at least as many bacterial cells as human ones.

What’s worse (if you think of the above as bad news), the fact of evolution by natural selection tells us that humanity itself is an ens successivum. If you could line up your ancestors—your mother or father, his or her mother or father, and so on—it would be a very short trip down this line before you reached a person with whom you could not communicate at all, save through gestures. Between 30 and 40 people in would be a person who had almost no real knowledge about the physical universe. And there’s a good chance that perhaps the four thousandth person in your row of ancestors would not even be human.

The ‘Successive’ Perspective

Needless to say, seeing people as entia successiva does not come naturally to anyone. Nor should it, ever. We couldn’t go about our daily lives seeing things this way. But the general invisibility of this ‘successiveness’ is not due to its only being operational at the very macro or very micro levels. It can be seen at the psychological level too. Trouble is, our brains are so good at constructing singular narratives out of even absolute gibberish, we sometimes have to place people in unnatural or extreme situations to get a good look at how much we can delude ourselves.

An Air Force doctor’s experiences investigating the blackouts of pilots in centrifuge training provides a nice example (from here). It’s definitely worth quoting at length:

Over time, he has found striking similarities to the same sorts of things reported by patients who lost consciousness on operating tables, in car crashes, and after returning from other nonbreathing states. The tunnel, the white light, friends and family coming to greet you, memories zooming around—the pilots experienced all of this. In addition, the centrifuge was pretty good at creating out-of-body experiences. Pilots would float over themselves, or hover nearby, looking on as their heads lurched and waggled about . . . the near-death and out-of-body phenomena are both actually the subjective experience of a brain owner watching as his brain tries desperately to figure out what is happening and to orient itself amid its systems going haywire due to oxygen deprivation. Without the ability to map out its borders, the brain often places consciousness outside the head, in a field, swimming in a lake, fighting a dragon—whatever it can connect together as the walls crumble. What the deoxygenated pilots don’t experience is a smeared mess of random images and thoughts. Even as the brain is dying, it refuses to stop generating a narrative . . . Narrative is so important to survival that it is literally the last thing you give up before becoming a sack of meat.

You’ll note, I hope, that not only does the report above disclose how our very mental lives are entia successiva—thoughts and emotions that arise and pass away—but the report assumes this perspective in its own narrative. That’s because the report is written from a scientific point of view. And from that vantage point, people are assumed (correctly) to have parts that “do duty” for them and may even be at odds with each other, as they were with the pilots (a perception part fighting against a powerful narrative-generating part). The unit of analysis in the report is not an entire pilot, but the various mechanisms of her mind. Allowing for these parts allows for functional explanations like the one we see.

An un-scientific analysis, on the other hand, is entirely possible. But it would stop at the pilot. He or she is, after all, an indivisible, permanent entity. There is nothing else “doing duty” for him, so there are really only two choices: his experience was an illusion or it was real. End of analysis. Interpret it as an illusion and you don’t really have much to say; interpret it as real, and you can make a lot of money.

Entia Permanentia

Good scientific research in education will adopt an entia successiva perspective about the people it studies. This does not guarantee that its conclusions are correct. But it makes it more likely that, over time, it will get to the bottom of things.

This is not to say that an alternative perspective is without scientific merit. If we want to know how to improve the performance of the Boston Symphony, we can make some headway with ‘entia permanentia’—seeing the symphony as a whole stable unit rather than a collection of successive parts. We could increase its funding, perhaps try to make sure “it” is treated as well as other symphonies around the world. We could try to change the music, maybe include some movie scores instead of that stuffy old classical music. That would make it more exciting for audiences (and more inclusive), which is certainly one interpretation of “improvement.” But to whatever extent improvement means improving the functioning of the parts of the symphony—the musicians, the director, etc.—we can do nothing, because with entia permanentia these tiny creatures do not exist. Even raising the question about improving the parts would be beyond the scope of our imagination.

Further, seeing students as entia permanentia rather than entia successiva stops us from being appropriately skeptical about both ‘scientific’ and ‘un-scientific’ ideas. Do students learn best when matched to their learning style? What parts of their neurophysiology and psychology could possibly make something like that true? Why would it have evolved, if it did? In what other aspects of our lives might this present itself? Adopting the entia successiva perspective would have slowed the adoption of this myth (even if it were not a myth) to a crawl and would have eventually killed it. Instead, entia permanentia, a person-level analysis, holds sway: students benefit from learning-style matching because we see them respond differently to different representations. End of analysis.

A different but similar perspective on this, from a recurring theme in the book Switch:

In a pioneering study of organizational change, described in the book The Critical Path to Corporate Renewal, researchers divided the change efforts they’d studied into three groups: the most successful (the top third), the average (the middle third), and the least successful (the bottom third). They found that, across the spectrum, almost everyone set goals: 89 percent of the top third and 86 percent of the bottom third . . . But the more successful change transformations were more likely to set behavioral goals: 89 percent of the top third versus only 33 percent of the bottom third.

Why do “behavioral” goals work when just “goals” don’t? Behavioral goals are, after all, telling you what to do, forcing you to behave in a certain way. Do you like to be told what to do? Probably not.

But the “you” that responds to behavioral goals isn’t the same “you” whose in-the-moment “likes” are important. You are more than just one solid indivisible self. You are many selves, and the self that can start checking stuff off the to-do list is often pulling the other selves behind it. And when it does, you get to think that “you” are determined, “you” take initiative, “you” have willpower. But in truth, your environment—both immediate and distant, both internal and external—has simply made it possible for that determined self to take the lead. Behavioral goals often create this exact environment.

## Provided vs Generated Examples

The results reported in this research (below) about the value of provided examples versus generated examples are a bit surprising. To get a sense of why that’s the case, start with this definition of the concept availability heuristic used in the study—a term from the social psychology literature:

Availability heuristic: the tendency to estimate the likelihood that an event will occur by how easily instances of it come to mind.

All participants first read this definition, along with the definitions of nine other social psychology concepts, in a textbook passage. Participants then completed two blocks of practice trials in one of three groups: (1) subjects in the provided examples group read two different examples, drawn from an undergraduate psychology textbook, of each of the 10 concepts (two practice blocks, so four examples total for each concept), (2) subjects in the generated examples group created their own examples for each concept (four generated examples total for each concept), and (3) subjects in the combination group were provided with an example and then created their own example of each concept (two provided and two generated examples total for each concept).

The researchers—Amanda Zamary and Katharine Rawson at Kent State University in Ohio—made the following predictions, with regard to both student performance and the efficiency of the instructional treatments:

We predicted that long-term learning would be greater following generated examples compared to provided examples. Concerning efficiency, we predicted that less time would be spent studying provided examples compared to generating examples . . . [and] long-term learning would be greater after a combination of provided and generated examples compared to either technique alone. Concerning efficiency, our prediction was that less time would be spent when students study provided examples and generate examples compared to just generating examples.

Achievement Results

All participants completed the same two self-paced tests two days later. The first assessment, an example classification test, asked subjects to classify each of 100 real-world examples into one of the 10 concept definition categories provided. Sixty of these 100 were new (Novel) to the provided-examples group, 80 of the 100 were new to the combination group, and of course all 100 were likely new to the generated-examples group. The second assessment, a definition-cued recall test, asked participants to type in the definition of each of the 10 concepts, given in random order. (The test order was varied among subjects.)

Given that participants in the provided-examples and combination groups had an advantage over participants in the generated-examples group on the classification task (they had seen between 20 and 40 of the examples previously), the researchers helpfully drew out results on just the 60 novel examples.

Subjects who were given only textbook-provided examples of the concepts outperformed other subjects on applying these concepts to classifying real-world examples. This difference was significant. No significant differences were found on the cued-recall test between the provided-examples and generated-examples groups.

Also, Students’ Time Is Valuable

Another measure of interest to the researchers in this study, as mentioned above, was the time used by the participants to read through or create the examples. What the authors say about efficiency is worth quoting, since it does not often seem to be taken as seriously as measures of raw achievement (emphasis mine):

Howe and Singer (1975) note that in practice, the challenge for educators and researchers is not to identify effective learning techniques when time is unlimited. Rather, the problem arises when trying to identify what is most effective when time is fixed. Indeed, long-term learning could easily be achieved if students had an unlimited amount of time and only a limited amount of information to learn (with the caveat that students spend their time employing useful encoding strategies). However, achieving long-term learning is difficult because students have a lot to learn within a limited amount of time (Rawson and Dunlosky 2011). Thus, long-term learning and efficiency are both important to consider when competitively evaluating the effectiveness of learning techniques.

With that in mind, and given the results above, it is noteworthy to learn that the provided-examples group outperformed the generated-examples group on real-world examples after engaging in practice that took less than half as much time. The researchers divided subjects’ novel classification score by the amount of time they spent practicing and determined that the provided-examples group had an average gain of 5.7 points per minute of study, compared to 2.2 points per minute for the generated-examples group and 1.7 points per minute for the combination group.

For learning declarative concepts in a domain and then identifying those concepts in novel real-world situations, provided examples proved to be better than student-generated examples for both long-term learning and for instructional efficiency. The second experiment in the study replicated these findings.

Some Commentary

First, some familiarity with the research literature makes the above results not so surprising. The provided-examples group likely outperformed the other groups because participants in that group practiced with examples generated by experts. Becoming more expert in a domain does not necessarily involve becoming more isolated from other people and their interests. Such expertise is likely positively correlated with better identifying and collating examples within a domain that are conceptually interesting to students and more widely generalizable. I reported on two studies, for example, which showed that greater expertise was associated with a significantly greater number of conceptual explanations, as opposed to “product oriented” (answer-getting) explanations—and these conceptual explanations resulted in the superior performance of students receiving them.

Second, I am sympathetic to the efficiency argument, as laid out here by the study’s authors—that is, I agree that we should focus in education on “trying to identify what is most effective when time is fixed.” Problematically, however, a wide variety of instructional actions can be informed by decisions about what is and isn’t “fixed.” Time is not the only thing that can be fixed in one’s experience. The intuition that students should “own their own learning,” for example, which undergirds the idea in the first place that students should generate their own examples, may rest on the more fundamental notion that students themselves are “fixed” identities that adults must work around rather than try to alter. This notion is itself circumscribed by the research summarized above. So, it is worth having a conversation about what should and should not be considered “fixed” when it comes to learning.

## Sum and Product Loops

It’s something of a truism that mathematical symbolism is difficult. There are some situations, though, where the symbolism is not just difficult, but also annoying and ridiculous. It likely saved a lot of time when people were still mostly writing ideas out by hand, so back then even the annoying and ridiculous could not be righteously pointed at and mocked, but nowadays it is almost certainly more difficult to set some statements in LaTeX than it is to type them—and, if the text is intended to teach students, more difficult to unpack the former than it is to understand the latter.

Examples of symbols that are justly symbolized, even today, are $$\mathtt{\sum}$$ and $$\mathtt{\prod}$$, representing a sum and a product, respectively. More specifically, these symbols represent loops—an addition loop or a multiplication loop.

So, for example, take this expression on the left side of the equals sign, which represents the loop sum on the right of the equals sign: $$\mathtt{\sum_{n=1}^{5}n=1+2+3+4+5}$$. The expression on the left just means (a) start a counter at 1, (b) count up to 5 by 1s, (c) let n = each number you count, then (d) add all the n’s one by one in a loop.

How about this one? $\mathtt{\sum_{n=0}^{4}2n=0+2+4+6+8}$

This one means (a) start a counter at 0, (b) count up to 4 by 1s, (c) let n = each number you count, then (d) add all the 2n’s one by one in a loop.

For products, we just swap out the symbol. Here is the corresponding product for the first loop: $$\mathtt{\prod_{n=1}^{5}n=1\times2\times3\times4\times5}$$. And here’s one for the second loop: $\mathtt{\prod_{n=0}^{4}2n=0\times2\times4\times6\times8}$
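The "loop" reading of these symbols can be sketched directly in code. This is only an illustration, not part of the notation itself: the helper names `loop_sum` and `loop_product` are made up here, and the `term` argument stands in for whatever expression sits to the right of the symbol.

```python
def loop_sum(term, start, stop):
    """Add term(n) for n = start, start + 1, ..., stop (inclusive)."""
    total = 0                          # a sum starts from 0
    for n in range(start, stop + 1):   # count up by 1s
        total += term(n)               # add each term in turn
    return total


def loop_product(term, start, stop):
    """Multiply term(n) for n = start, start + 1, ..., stop (inclusive)."""
    total = 1                          # a product starts from 1
    for n in range(start, stop + 1):
        total *= term(n)               # multiply by each term in turn
    return total


sum_n = loop_sum(lambda n: n, 1, 5)              # 1 + 2 + 3 + 4 + 5 = 15
sum_2n = loop_sum(lambda n: 2 * n, 0, 4)         # 0 + 2 + 4 + 6 + 8 = 20
prod_n = loop_product(lambda n: n, 1, 5)         # 1 * 2 * 3 * 4 * 5 = 120
prod_2n = loop_product(lambda n: 2 * n, 0, 4)    # 0 * 2 * 4 * 6 * 8 = 0
```

Note that the second product is 0, because its first factor is 0.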

Loops and Linear Algebra

You’ll often see the summation loop in linear algebra contexts, because it is an equivalent way to write a dot product, for example. The sum $$\mathtt{\sum_{n=0}^{4}2n=0+2+4+6+8}$$ above can be written as shown below, which looks like more work to write—and is—but when we’re dealing mostly with variables, the savings in writing effort is more evident. $\quad\,\,\,\begin{bmatrix}\mathtt{2}\\\mathtt{2}\\\mathtt{2}\\\mathtt{2}\\\mathtt{2}\end{bmatrix}\cdot \begin{bmatrix}\mathtt{0}\\\mathtt{1}\\\mathtt{2}\\\mathtt{3}\\\mathtt{4}\end{bmatrix}\mathtt{=2\cdot0+2\cdot1+2\cdot2\ldots}$

The loop sum $$\mathtt{\sum_{i}a_{i}x_{i}+b}$$, where $$\mathtt{i}$$ is an index pointing to a component of vector $$\mathtt{a}$$ and vector $$\mathtt{x}$$, can be written more simply as $$\mathtt{a\cdot x+b}$$, as long as the context is clear that $$\mathtt{a}$$ and $$\mathtt{x}$$ are vectors.
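As a rough sketch of the same idea in code, the dot product is just the summation loop applied to matching components of two lists. The function name `dot` and the value chosen for `b` are assumptions made for illustration; the vectors are the ones from the example above.

```python
def dot(a, x):
    """Sum a[i] * x[i] over every index i, as in the summation loop."""
    if len(a) != len(x):
        raise ValueError("vectors must have the same number of components")
    total = 0
    for i in range(len(a)):   # i is the index pointing into both vectors
        total += a[i] * x[i]
    return total


a = [2, 2, 2, 2, 2]
x = [0, 1, 2, 3, 4]
b = 3  # hypothetical constant, chosen only for this example

dot_result = dot(a, x)         # 2*0 + 2*1 + 2*2 + 2*3 + 2*4 = 20
affine_result = dot(a, x) + b  # 20 + 3 = 23
```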

## Subtractive Knowledge

I was intrigued by a pedagogical insight offered by the example below, from the introductory class of a course called Computational Linear Algebra. The setup is that the graph (diagram) at the bottom of the image represents a Markov model, and it shows the probabilities of moving from one stage of a disease to another in a year.

So, if a patient is asymptomatic, there is a 7% (0.07) probability of moving from asymptomatic to symptomatic, a 90% chance of staying at asymptomatic (indicated by a curved arrow), and so on. This information is also encoded in the stochastic matrix shown.

Here’s the problem: Given a group of people in which 85% are asymptomatic, 10% are symptomatic, 5% have AIDS, and of course 0% are deceased, what percent will be in each health state in a year? Putting yourself in the mind of a student right now, take a moment to try to answer the question and, importantly, reflect on your thinking at this stage, even if that thinking involves having no clue how to proceed.

I hope that, given this setup, you’ll be somewhat surprised to learn the following: If a high school student (or even middle school student) knows a little bit about probability and how to multiply and add, they should be able to answer this question.

Why? Well, if 85 out of every 100 in a group are asymptomatic, and there is a 90% probability of remaining asymptomatic in a year, then (0.85)(0.9) = 0.765, or 76.5%, of the group is predicted to be asymptomatic in a year. The symptomatic group has two products that must be added: 10% of the group is symptomatic, and there is a 93% probability of remaining that way, so that’s (0.93)(0.1). But this group also takes on 7% of the 85% that were asymptomatic but moved to symptomatic. So, the total is (0.93)(0.1) + (0.07)(0.85) = 0.1525, or 15.25%. The AIDS group percent is the sum of three products, for a total of 6.45%, and the Death group percent is a sum of four products, for a total of 1.8%.
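The multiply-and-add reasoning above can be sketched as a pair of nested loops. One caution: only the probabilities 0.90, 0.07, and 0.93 are stated in the text. The remaining entries of `transition` below are assumed values standing in for the diagram, chosen so that each row sums to 1 and the results match the 6.45% and 1.8% totals reported above.

```python
states = ["asymptomatic", "symptomatic", "AIDS", "death"]

# transition[i][j] = probability of moving from state i to state j in a year.
# Only 0.90, 0.07, and 0.93 come from the text; the rest are assumed values.
transition = [
    [0.90, 0.07, 0.02, 0.01],  # from asymptomatic
    [0.00, 0.93, 0.05, 0.02],  # from symptomatic
    [0.00, 0.00, 0.85, 0.15],  # from AIDS
    [0.00, 0.00, 0.00, 1.00],  # from death
]

current = [0.85, 0.10, 0.05, 0.00]  # shares of the group today

next_year = []
for j in range(len(states)):        # for each state a year from now
    total = 0.0
    for i in range(len(states)):    # add up everyone who lands in it
        total += current[i] * transition[i][j]
    next_year.append(total)

for name, share in zip(states, next_year):
    print(f"{name}: {share:.2%}")
```

Each entry of `next_year` is a sum of products, which is all the matrix-vector multiplication is doing behind the scenes.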

Probability, multiplication, and addition are all you need to know. No doubt, knowing something about matrix-vector multiplication (and transposes), as we have discussed, can be helpful, but it does not seem to be necessary in this case.

Bamboozled

I think it’s reasonable to suspect that many knowledgeable students—and knowledgeable adults—would be bamboozled by the highfalutin language here into believing that they cannot solve this problem, when in fact they can. If that’s true, then why is that the case?

Knowledge is domain specific, of course (and very context specific), and that would seem to be the best explanation of students’ hypothesized difficulties. That is, given the cues (both verbal and visual) that this problem involves knowledge of linear algebra, Markov models, and/or stochastic matrices, anyone without that knowledge would naturally assume that they don’t have what is required to solve the problem and give up. And even if they suspected that some simple probability theory, multiplication, and addition were all that they needed, being bombarded by even a handful of foreign mathematical terms would greatly reduce their confidence in this suspicion.

Perhaps, then, the reason we are looking for—the reason students don’t believe they can solve problems when in fact they can—has to do with students’ attitudes, not their knowledge. And situations like these during instruction are enough to convince many that knowledge is overrated. The solution to this psychological reticence is, for many people, to encourage students to be fearless, to have problem-solving orientations and growth mindsets. After all, it’s clear that more knowledge would be helpful, but it’s not necessary in many cases. We’ll teach knowledge, sure, but we can do even better if we spend time encouraging soft skills along the way. Do we want students to give up every time they face a situation in life that they were not explicitly taught to deal with?

The problem with this view is that it construes knowledge as only additive. That is, it is thought, knowledge only works to give its owner things to think about and think with. So, in the above example, students already have all the knowledge things to work with: probability, multiplication, and addition. Anything else would only serve to bring in more things to think about, which would be superfluous.

But this isn’t the only way knowledge works. It can also be subtractive; that is, knowing something can tell you that it is irrelevant to the current problem. Not knowing it means that you can’t know about its relevance (and situations like the above will easily bias you toward giving superficial information a high degree of relevance). So, students cannot know with high confidence that matrices are essentially irrelevant to the problem above if they don’t know what matrices are. But even knowing nothing about matrices, knowing that, computationally, linear algebra is fundamentally about multiplying and adding things may be enough. Taking that perspective can allow you to ignore the superficial setup of the problem. But that’s still knowledge.

A better interpretation of students’ difficulties with the above is that, in fact, they do need more knowledge to solve the problem. The knowledge they need is subtractive; it will help them ignore superficial irrelevant details to get at the marrow of the problem.

Knowledge is obviously additive, but it is much more subtly subtractive too, helping to clear away facts that are irrelevant to a given situation. Subtractive knowledge is like the myelin sheaths around some nerve cells in the brain. It acts as an insulator for thinking, making it faster, more efficient, and, as we have seen, more effective.

## The Gricean Maxims

When we converse with one another, we implicitly obey a principle of cooperation, according to language philosopher Paul Grice’s theory of conversational implicature.

This ‘cooperative principle’ has four maxims, which although stated as commands are intended to be descriptions of specific rules that we follow—and expect others will follow—in conversation:

• quality:    Be truthful.
• quantity:  Don’t say more or less than is required.
• relation:  Be relevant.
• manner:    Be clear and orderly.

I was drawn recently to these maxims (and to Grice’s theory) because they rather closely resemble four principles of instructional explanation that I have been toying with off and on for a long time now: precision, clarity, order, and cohesion.

In fact, there is a fairly snug one-to-one correspondence among our respective principles, a relationship which is encouraging to me precisely because it is coincidental. Here they are in an order corresponding to the above:

• precision:  Instruction should be accurate.
• cohesion: Group related ideas.
• clarity:     Instruction should be understandable and present to its audience.
• order:       Instruction should be sequenced appropriately.

Both sets of principles likely seem dumbfoundingly obvious, but that’s the point. As principles (or maxims), they are footholds on the perimeters of complex ideas—in Grice’s case, the implicit contexts that make up the study of pragmatics; in my case (insert obligatory note that I am not comparing myself with Paul Grice), the explicit “texts” that comprise the content of our teaching and learning.

The All-Consuming Clarity Principle

Frameworks like these can be more than just armchair abstractions; they are helpful scaffolds for thinking about the work we do. Understanding a topic up and down the curriculum, for example, can help us represent it more accurately in instruction. We can think about work in this area as related specifically to the precision principle and, in some sense, as separate from (though connected to) work in other areas, such as topic sequencing (order), explicitly building connections (cohesion), and motivation (clarity).

But principle frameworks can also lift us to some height above this work, where we can find new and useful perspectives. For instance, simply having these principles, plural, in front of us can help us see—I would like to persuade you to see—that “clarity,” or in Grice’s terminology, “relevance,” is the only one we really talk about anymore, and that this is bizarre given that it’s just one aspect of education.

The work of negotiating the accuracy, sequencing, and connectedness of instruction drawn from our shared knowledge has been largely outsourced to publishers and technology startups and Federal agencies, and goes mostly unquestioned by the “delivery agents” in the system, whose role is one of a go-between, tasked with trying to sell a “product” in the classroom to student “customers.”

## Spooky Action at a Distance

I really like this recent post, called Tell Me More, Tell Me More, by math teacher Dani Quinn. The content is an excellent analysis of expert blindness in math teaching. The form, though, is worth seeing as well—it is a traditional educational syllogism, which Quinn helpfully commandeers to arrive at a non-traditional conclusion, that instructional effects have instructional causes, on the right:

| Traditional | Alternative |
| --- | --- |
| There is a problem in how we teach: We typically spoon-feed students procedures for answering questions that will be on some kind of test. | “There is a problem in how we teach: We typically show pupils only the classic forms of a problem or a procedure.” |
| This is why students can’t generalize to non-routine problems: we got in the way of their thinking and didn’t allow them to take ownership and creatively explore material on their own. | “This is why they then can’t generalise: we didn’t show them anything non-standard or, if we did, it was in an exercise when they were floundering on their own with the least support.” |

Problematically for education debates, each of these premises and conclusions, taken individually, is true. That is, they exist. At our (collective) weakest, we do sometimes spoon-feed kids procedures to get them through tests. We do cover only a narrow range of situations (what Engelmann refers to as the problem of stipulation). And we can be, regrettably in either case, systematically unassertive or overbearing.

Solving equations provides a nice example of the instructional effects of both spoon-feeding and stipulation. Remember how to solve equations? Inverse operations. That was the way to do equations. If you have something like $$\mathtt{2x + 5 = 15}$$, the table shows how it goes.

| Equation | Step |
| --- | --- |
| $$\mathtt{2x + 5 \color{red}{- 5} = 15 \color{red}{- 5}}$$ | Subtract $$\mathtt{5}$$ from both sides of the equation to get $$\mathtt{2x = 10}$$. |
| $$\mathtt{\color{white}{+ 5 \,\,} 2x \color{red}{\div 2} = 10 \color{red}{\div 2}}$$ | Divide both sides of the equation by $$\mathtt{2}$$. |
| $$\mathtt{\color{white}{+ 5 \,\,}x = 5}$$ | You have solved the equation. |

Do that a couple dozen times and maybe around 50% of the class freezes when they encounter $$\mathtt{22 = 4x + 6}$$, with the variable on the right side, or, even worse, $$\mathtt{22 = 6 + 4x}$$.

That’s spoon-feeding and stipulation: do it this one way and do it over and over—and, crucially, doing that summarizes most of the instruction around solving equations.
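To make the stipulation problem concrete, here is a minimal sketch in Python. The function name `solve_linear` and its three-argument layout are my own illustration, not anything from a curriculum. It shows that the inverse-operations procedure does not care which side the variable is on or what order the terms come in: once you read off the coefficient, the constant, and the total, all three equations above are the same kind of problem.

```python
from fractions import Fraction

def solve_linear(coefficient, constant, total):
    """Solve coefficient*x + constant = total using inverse operations.

    Hypothetical helper for illustration: the caller reads the three
    numbers off the equation, however the equation happens to be written.
    """
    if coefficient == 0:
        raise ValueError("no unique solution when the coefficient is 0")
    # Step 1: undo the addition by subtracting the constant from both sides.
    remaining = Fraction(total) - Fraction(constant)
    # Step 2: undo the multiplication by dividing both sides by the coefficient.
    return remaining / Fraction(coefficient)

# 2x + 5 = 15, the "classic" form
classic = solve_linear(2, 5, 15)
# 22 = 4x + 6, variable on the right side
flipped = solve_linear(4, 6, 22)
# 22 = 6 + 4x, variable on the right and terms reordered
reordered = solve_linear(4, 6, 22)

print(classic, flipped, reordered)  # 5 4 4
```

The last two calls are identical on purpose. The surface differences that freeze students are invisible to the procedure itself, which suggests the difficulty comes from the narrow range of forms shown in instruction rather than from the mathematics.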

Of course, the lack of prior knowledge exacerbates the negative instructional effects of stipulation and spoon-feeding. But we’ll set that aside for the moment.

The Connection Between Premises and Conclusion

The traditional and alternative arguments above are easily (and often) confused, though, until you include the premises that I have omitted from the middle of each. These help make sense of the conclusions derived in each argument.

| Traditional | Alternative |
| --- | --- |
| There is a problem in how we teach: We typically spoon-feed students procedures for answering questions that will be on some kind of test. | “There is a problem in how we teach: We typically show pupils only the classic forms of a problem or a procedure.” |
| Students’ success in schooling is determined mostly by internal factors, like creativity, motivation, and self-awareness. | Students’ success in schooling is determined mostly by external factors, like amount of instruction, socioeconomic status, and curricula. |
| This is why students can’t generalize to non-routine problems: we got in the way of their thinking and didn’t allow them to take ownership and creatively explore material on their own. | “This is why they then can’t generalise: we didn’t show them anything non-standard or, if we did, it was in an exercise when they were floundering on their own with the least support.” |

In short, the argument on the left tends to diagnose pedagogical illnesses and their concomitant instructional effects as people problems; the alternative sees them as situation problems. The solutions generated by each argument are divergent in just this way: the traditional one looks to pull the levers that mostly benefit personal, internal attributes that contribute to learning; the alternative messes mostly with external inputs.

It’s Not the Spoon-Feeding, It’s What’s on the Spoon

I am and have always been more attracted to the alternative argument than the traditional one. Probably for a very simple reason: my role in education doesn’t involve pulling personal levers. Being close to the problem almost certainly changes your view of it—not necessarily for the better. But, roles aside, it’s also the case that the traditional view is simply more widespread, and informed by the positive version of what is called the Fundamental Attribution Error:

We are frequently blind to the power of situations. In a famous article, Stanford psychologist Lee Ross surveyed dozens of studies in psychology and noted that people have a systematic tendency to ignore the situational forces that shape other people’s behavior. He called this deep-rooted tendency the “Fundamental Attribution Error.” The error lies in our inclination to attribute people’s behavior to the way they are rather than to the situation they are in.

What you get with the traditional view is, to me, a kind of spooky action at a distance—a phrase attributed to Einstein, in remarks about the counterintuitive consequences of quantum physics. Adopting this view forces one to connect positive instructional effects (e.g., thinking flexibly when solving equations) with something internal, ethereal and often poorly defined, like creativity. We might as well attribute success to rabbit’s feet or lucky underwear or horoscopes!

## Intuition and Domain Knowledge

Can you guess what the graphs below show? I’ll give you a couple of hints: (1) each graph measures performance on a different task, (2) one pair of bars in each graph—left or right—represents participants who used their intuition on the task, while the other pair of bars represents folks who used an analytical approach, and (3) one shading represents participants with low domain knowledge while the other represents participants with high domain knowledge (related to the actual task).

It will actually help you to take a moment and go ahead and guess how you would assign those labels, given the little information I have provided. Is the left pair of bars in each graph the “intuitive approach” or the “analytical approach”? Are the darker shaded bars in each graph “high knowledge” participants or “low knowledge” participants?

When Can I Trust My Gut?

A 2012 study by Dane et al., published in the journal Organizational Behavior and Human Decision Processes, sets out to address the “scarcity of empirical research spotlighting the circumstances in which intuitive decision making is effective relative to analytical decision making.”

To do this, the researchers conducted two experiments, both employing “non-decomposable” tasks—i.e., tasks that required intuitive decision making. The first task was to rate the difficulty (from 1 to 10) of each of a series of recorded basketball shots. The second task involved deciding whether each of a series of designer handbags was fake or authentic.

Why these tasks? A few snippets from the article can help to answer that question:

Following Dane and Pratt (2007, p. 40), we view intuitions as “affectively-charged judgments that arise through rapid, nonconscious, and holistic associations.” That is, the process of intuition, like nonconscious processing more generally, proceeds rapidly, holistically, and associatively (Betsch, 2008; Betsch & Glöckner, 2010; Sinclair, 2010). [Footnote: “This conceptualization of intuition does not imply that the process giving rise to intuition is without structure or method. Indeed, as with analytical thinking, intuitive thinking may operate based on certain rules and principles (see Kruglanski & Gigerenzer, 2011 for further discussion). In the case of intuition, these rules operate largely automatically and outside conscious awareness.”]

As scholars have posited, analytical decision making involves basing decisions on a process in which individuals consciously attend to and manipulate symbolically encoded rules systematically and sequentially (Alter, Oppenheimer, Epley, & Eyre, 2007).

We viewed [the basketball] task as relatively non-decomposable because, to our knowledge, there is no universally accepted decision rule or procedure available to systematically break down and objectively weight the various elements of what makes a given shot difficult or easy.

We viewed [the handbag] task as relatively non-decomposable for two reasons. First, although there are certain features or clues participants could attend to (e.g., the stitching or the style of the handbags), there is not necessarily a single, definitive procedure available to approach this task . . . Second, because participants were not allowed to touch any of the handbags, they could not physically search for what they might believe to be give-away features of a real or fake handbag (e.g., certain tags or patterns inside the handbag).

Results

As you can see in the graphs at the right (hover for expertise labels), there was a fairly significant difference in both tasks between low- and high-knowledge participants when those participants approached the task using their intuition. In contrast, high- and low-knowledge subjects in the analysis condition in each experiment did not show a significant difference in performance. (The decline in performance of the high-knowledge participants from the Intuition to the Analysis conditions was only significant in the handbag experiment.)

It is important to note that subjects in the analysis conditions (i.e., those who approached each task systematically) were not told what factors to look for in carrying out their analyses. For the basketball task, the researchers simply “instructed these participants to develop a list of factors that would determine the difficulty of a basketball shot and told them to base their decisions on the factors they listed.” For the handbag task, “participants in the analysis condition were given 2 min to list the features they would look for to determine whether a given handbag is real or fake and were told to base their decisions on these factors.”

Also consistent across both experiments was the fact that low-knowledge subjects performed better when approaching the tasks systematically than when using their intuition. For high-knowledge subjects, the results were the opposite. They performed better using their intuition than using a systematic analysis (even though the ‘system’ part of ‘systematic’ here was their own system!).

In addition, while the combined effects of approach and domain knowledge were significant, the approach (intuition or analysis) by itself did not have a significant effect on performance one way or the other in either experiment. Domain knowledge, on the other hand, did have a significant effect by itself in the basketball experiment.

Any Takeaways for K–12?

The clearest takeaway for me is that while knowledge and process are both important, knowledge is more important. Even though each of the tasks was more “intuitive” (non-decomposable) than analytical in nature, and even when the approach taken to the task was “intuitive,” knowledge trumped process. Process had no significant effect by itself. Knowing stuff is good.

Second, the results of this study are very much in line with what is called the ‘expertise reversal effect’:

Low-knowledge learners lack schema-based knowledge in the target domain and so this guidance comes from instructional supports, which help reduce the cognitive load associated with novel tasks. If the instruction fails to provide guidance, low-knowledge learners often resort to inefficient problem-solving strategies that overwhelm working memory and increase cognitive load. Thus, low-knowledge learners benefit more from well-guided instruction than from reduced guidance.

In contrast, higher-knowledge learners enter the situation with schema-based knowledge, which provides internal guidance. If additional instructional guidance is provided it can result in the processing of redundant information and increased cognitive load.

Finally, one wonders just who it is we are thinking about more when we complain, especially in math education, that overly systematized knowledge is ruining the creativity and motivation of our students. Are we primarily hearing the complaints of the 20%—who barely even need school—or those of the children who really need the knowledge we have, who need us to teach them?

Dane, E., Rockmann, K., & Pratt, M. (2012). When should I trust my gut? Linking domain expertise to intuitive decision-making effectiveness. Organizational Behavior and Human Decision Processes, 119(2), 187-194. DOI: 10.1016/j.obhdp.2012.07.009

## Telling vs. No Telling

So, with that in mind, let’s move on to just one of the dichotomies in education, that of “telling” vs. “no telling,” and I hope the reader will forgive my leaving Clarke’s paper behind. I recommend it to you for its international perspective on what we discuss below.

“Reports of My Death Have Been Greatly Exaggerated”

We should start with something that educators know but people outside of education may not: there can be a bit of an incongruity, shall we say, between what teachers want to happen in their classrooms, what they say happens in their classrooms, and what actually happens there. Given how we talk and what we talk about on social media—and even in face-to-face conversations—and the sensationalist tendencies of media reports about education, an outsider could be forgiven, I think, for assuming that teachers have been moving en masse away from the practice of explicit instruction.

There is a large body of research which would suggest that this assumption is almost certainly “greatly exaggerated.”

Typical of this research is a small 2004 study in the U.K. which found that primary classrooms in England remained places full of teacher talk and “low-level” responding by students, despite intentions outlined in the 1998–1999 National Literacy and National Numeracy Strategies. The graph at the right, from the study, shows the categories of discourse observed and a sense of their relative frequencies.

John Goodlad made a similar and more impactful observation in his much larger study of over 1,000 classrooms across the U.S. in the mid-80s (I take this quotation from the 2013 edition of John Hattie’s book Visible Learning, where more of the aforementioned research is cited):

In effect, then, the modal classroom configurations which we observed looked like this: the teacher explaining or lecturing to the total class or a single student, occasionally asking questions requiring factual answers; . . . students listening or appearing to listen to the teacher and occasionally responding to the teacher’s questions; students working individually at their desks on reading or writing assignments.

Thus, despite what more conspiracy-oriented opponents of “no telling” sometimes suggest, the monotonic din of “understanding” and “guide on the side” and “collaboration” we hear today (and have heard for decades) is not the sound of a worldview that has, in practice, taken over education. Rather, it is one of a seemingly quixotic struggle on the part of educators to nudge each other to open up more space for students to exercise independent and critical thinking. This is a finite space, and something has to give way.

Research Overwhelmingly Supports Explicit Instruction

| Teacher as Activator | d | Teacher as Facilitator | d |
| --- | --- | --- | --- |
| Teaching students self-verbalization | 0.76 | Inductive teaching | 0.33 |
| Teacher clarity | 0.75 | Simulation and gaming | 0.32 |
| Reciprocal teaching | 0.74 | Inquiry-based teaching | 0.31 |
| Feedback | 0.74 | Smaller classes | 0.21 |
| Metacognitive strategies | 0.67 | Individualised instruction | 0.22 |
| Direct instruction | 0.59 | Web-based learning | 0.18 |
| Mastery learning | 0.57 | Problem-based learning | 0.15 |
| Providing worked examples | 0.57 | Discovery method (math) | 0.11 |

On the other hand, it is manifestly clear from the research literature that, when student achievement is the goal, explicit instruction has generally outperformed its less explicit counterpart.

The table above, taken from Hattie’s book referenced earlier, directly compares the effect sizes of various explicit and indirect instructional protocols, gathered and interpreted across a number of different meta-analyses in the literature.
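For a rough sense of the gap between the two columns, here is a small Python sketch that averages the effect sizes listed in this post. This is only a simple unweighted mean of the eight rows reproduced here. It is my own back-of-the-envelope summary, not a figure reported by Hattie, and it ignores how many studies stand behind each row.

```python
from statistics import mean

# Effect sizes (d) copied from the rows of the table in this post.
activator = {
    "Teaching students self-verbalization": 0.76,
    "Teacher clarity": 0.75,
    "Reciprocal teaching": 0.74,
    "Feedback": 0.74,
    "Metacognitive strategies": 0.67,
    "Direct instruction": 0.59,
    "Mastery learning": 0.57,
    "Providing worked examples": 0.57,
}
facilitator = {
    "Inductive teaching": 0.33,
    "Simulation and gaming": 0.32,
    "Inquiry-based teaching": 0.31,
    "Smaller classes": 0.21,
    "Individualised instruction": 0.22,
    "Web-based learning": 0.18,
    "Problem-based learning": 0.15,
    "Discovery method (math)": 0.11,
}

# Unweighted means of the rows shown (an illustrative summary only).
activator_mean = mean(activator.values())
facilitator_mean = mean(facilitator.values())

print(round(activator_mean, 2), round(facilitator_mean, 2))  # 0.67 0.23
```

Note also that the smallest value in the activator column (0.57) is larger than the largest value in the facilitator column (0.33), so in this excerpt the two columns do not overlap at all.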

Results like these are not limited to the K–12 space, nor do they involve only the teaching of lower-level skills or teaching only in well-structured domains, such as mathematics. These are robust results across many studies and over long periods of time.

And while research supporting less explicit instructional techniques is out there (as Hattie’s results also attest), there is much less of it, and certainly far less than one would expect given the sheer volume of rhetoric in support of such strategies. On this point, it is worth quoting Sigmund Tobias at some length, from his summarizing chapter in the 2009 book Constructivist Instruction: Success or Failure?:

When the AERA 2007 debate was organized, I described myself as an eclectic with respect to whether constructivist instruction was a success or failure, a position I also took in print earlier (Tobias, 1992). The constructivist approach of immersing students in real problems and having them figure out solutions was intuitively appealing. It seemed reasonable that students would feel more motivated to engage in such activities than in those occurring in traditional classrooms. It was, therefore, disappointing to find so little research documenting increased motivation for constructivist activities.

A personal note may be useful here. My Ph.D. was in clinical psychology at the time when projective diagnostic techniques in general, and the Rorschach in particular, were receiving a good deal of criticism. The logic for these techniques was compelling and it seemed reasonable that people’s personality would have a major impact on their interpretation of ambiguous stimuli. Unfortunately, the empirical evidence in support of the validity of projective techniques was largely negative. They are now a minor element in the training of clinical psychologists, except for a few hamlets here or there that still specialize in teaching about projective techniques.

The example of projective techniques seems similar to the issues raised about constructivist instruction. A careful reading and re-reading of all the chapters in this book, and the related literature, has indicated to me that there is stimulating rhetoric for the constructivist position, but relatively little research supporting it. For example, it is encouraging to see that Schwartz et al. (this volume) are conducting research on their hypothesis that constructivist instruction is better for preparing individuals for future learning. Unfortunately, as they acknowledge, there is too little research documenting that hypothesis. As suggested above, such research requires more complex procedures and is more time consuming, for both the researcher and the participants, than procedures advocated by supporters of explicit instruction. However, without supporting research these remain merely a set of interesting hypotheses.

In comparison to constructivists, advocates for explicit instruction seem to justify their recommendations more by references to research than rhetoric. Constructivist approaches have been advocated vigorously for almost two decades now, and it is surprising to find how little research they have stimulated during that time. If constructivist instruction were evaluated by the same criterion that Hilgard (1964) applied to Gestalt psychology, the paucity of research stimulated by that paradigm should be a cause for concern for supporters of constructivist views.

Both the Problem and the Solution

So, it seems that while a “telling” orientation is better supported by research, it is also identified as a barrier, if not the barrier, to progress. And it seems that a lot of our day-to-day struggle with the issue centers around the negative consequences of continued unsuccessful attempts at resolving this paradox.

Yet perhaps we should see that this is not a paradox at all. Of course it is a problem when students learn to rely heavily on explicit instruction to make up their thinking, and it is perfectly appropriate to find ways of punching holes in teacher talk time to reduce the possibility of this dependency. But we could also research ways of tackling this explicitly—differentiating ways in which explicit instruction can solicit student inquiry or creativity and ways in which it promotes rule following, for example.

It is at least worth considering that some of our problems—particularly in mathematics education—have less to do with explicit instruction and more to do with bad explicit instruction. If dealing with instructional problems head on is more effective (even those that are “high level,” such as creativity and critical thinking), then we should be making the sacrifices necessary to give teachers the resources and training required to meet those challenges, explicitly.

## Interleaving

Inductive teaching or learning, although it has a special name, happens all the time without our having to pay any attention to technique. It is basically learning through examples. As the authors of the paper we’re discussing here indicate, through inductive learning:

Children . . . learn concepts such as ‘boat’ or ‘fruit’ by being exposed to exemplars of those categories and inducing the commonalities that define the concepts. . . . Such inductive learning is critical in making sense of events, objects, and actions—and, more generally, in structuring and understanding our world.

The paper describes three experiments conducted to further test the benefit of interleaving on inductive learning (“further” because an interleaving effect has been demonstrated in previous studies). Interleaving is one of a handful of powerful learning and practicing strategies mentioned throughout the book Make It Stick: The Science of Successful Learning. In the book, the power of interleaving is highlighted by the following summary of another experiment involving determining volumes:

Two groups of college students were taught how to find the volumes of four obscure geometric solids (wedge, spheroid, spherical cone, and half cone). One group then worked a set of practice problems that were clustered by problem type . . . The other group worked the same practice problems, but the sequence was mixed (interleaved) rather than clustered by type of problem . . . During practice, the students who worked the problems in clusters (that is, massed) averaged 89 percent correct, compared to only 60 percent for those who worked the problems in a mixed sequence. But in the final test a week later, the students who had practiced solving problems clustered by type averaged only 20 percent correct, while the students whose practice was interleaved averaged 63 percent.

The research we look at in this post does not produce such stupendous results, but it is nevertheless an interesting validation of the interleaving effect. Although there are three experiments described, I’ll summarize just the first one.

Discriminative-Contrast Hypothesis

But first, you can try out an experiment like the one reported in the paper. Click start to study pictures of different bird species below. There are 32 pictures, and each one is shown for 4 seconds. After this study period, you will be asked to try to identify 8 birds from pictures that were not shown during the study period, but which belong to one of the species you studied.

Once the study phase is over, click test to start the test and match each picture to a species name. There is no time limit on the test. Simply click next once you have selected each of your answers.

Based on previous research, one would predict that, in general, you would do better in the interleaved condition, where the species are mixed together in the study phase, than you would in the ‘massed,’ or grouped condition, where the pictures are presented in species groups. The question the researchers wanted to home in on in their first experiment was about the mechanism that made interleaved study more effective.
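The difference between the two study conditions is easy to see as a sequencing problem. Here is a minimal Python sketch that builds a massed order and an interleaved order for the same 32 pictures. The split into 8 species with 4 pictures each, the placeholder names, and the round-by-round shuffling are all my own assumptions for illustration; the paper's actual stimuli and ordering procedure may differ.

```python
import random

def massed_order(pictures_by_species):
    """All pictures of one species, then all of the next, and so on."""
    return [(s, pic) for s, pics in pictures_by_species.items() for pic in pics]

def interleaved_order(pictures_by_species, seed=0):
    """Mix species so that neighboring pictures never share a species.

    Works in rounds: each round shows one picture per species in shuffled
    order, avoiding a same-species repeat where two rounds meet.
    """
    rng = random.Random(seed)
    species = list(pictures_by_species)
    rounds = max(len(pics) for pics in pictures_by_species.values())
    order = []
    for r in range(rounds):
        this_round = [s for s in species if r < len(pictures_by_species[s])]
        rng.shuffle(this_round)
        # If this round would start with the species that just ended the
        # previous round, swap it to the end of the round.
        if order and len(this_round) > 1 and this_round[0] == order[-1][0]:
            this_round[0], this_round[-1] = this_round[-1], this_round[0]
        order.extend((s, pictures_by_species[s][r]) for s in this_round)
    return order

def species_switches(order):
    """Count how often consecutive pictures come from different species."""
    return sum(a[0] != b[0] for a, b in zip(order, order[1:]))

# Hypothetical stimulus set: 8 species x 4 pictures = 32 pictures.
pictures_by_species = {
    f"species_{i}": [f"species_{i}_photo_{j}" for j in range(1, 5)]
    for i in range(1, 9)
}

massed = massed_order(pictures_by_species)
interleaved = interleaved_order(pictures_by_species)

print(species_switches(massed), species_switches(interleaved))  # 7 31
```

Both sequences contain exactly the same pictures. What changes is how many times a learner sees two different species back to back: 7 times in the massed order, and at every one of the 31 transitions in the interleaved order.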

So, their experiment was conducted much like the one above, except with three groups, which all received the interleaved presentation. However, two of the groups were interrupted in their study by trivia questions in different ways. One group—the alternating trivia group—received a trivia question after every picture; the other group—the grouped trivia group—received 8 trivia questions after every group of 8 interleaved pictures. The third group—the contiguous group—received no interruption in their study.
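The three conditions can be sketched as three schedules built from the same picture sequence. In the Python sketch below, the labels `P1`…`P32` and `T1`…`T32`, the count of 32 pictures, and the helper names are placeholders of mine, not details taken from the paper. The point is only to show how differently the trivia questions break up the stream of pictures.

```python
def contiguous(pictures):
    """No interruptions: pictures only."""
    return list(pictures)

def alternating_trivia(pictures, trivia):
    """One trivia question after every picture."""
    schedule = []
    for picture, question in zip(pictures, trivia):
        schedule += [picture, question]
    return schedule

def grouped_trivia(pictures, trivia, block=8):
    """`block` pictures in a row, followed by `block` trivia questions."""
    schedule = []
    for start in range(0, len(pictures), block):
        schedule += pictures[start:start + block]
        schedule += trivia[start:start + block]
    return schedule

def adjacent_picture_pairs(schedule):
    """Count places where one picture is immediately followed by another."""
    return sum(
        a.startswith("P") and b.startswith("P")
        for a, b in zip(schedule, schedule[1:])
    )

# Placeholder items: 32 already-interleaved pictures and 32 trivia questions.
pictures = [f"P{i}" for i in range(1, 33)]
trivia = [f"T{i}" for i in range(1, 33)]

pairs_contiguous = adjacent_picture_pairs(contiguous(pictures))
pairs_grouped = adjacent_picture_pairs(grouped_trivia(pictures, trivia))
pairs_alternating = adjacent_picture_pairs(alternating_trivia(pictures, trivia))

print(pairs_contiguous, pairs_grouped, pairs_alternating)  # 31 28 0
```

Counting back-to-back pictures is a crude stand-in for opportunities to compare one species directly with another. By that count the grouped trivia schedule keeps most of those opportunities, while the alternating trivia schedule removes every one of them.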

What the researchers discovered is that while the contiguous group performed the best (of course), the grouped trivia group did not perform significantly worse, while the alternating trivia group did perform significantly worse than both the contiguous and grouped trivia groups. This was seen as providing some confirmation for the discriminative-contrast hypothesis:

Interleaved studying might facilitate noticing the differences that separate one category from another. In other words, perhaps interleaving is beneficial because it juxtaposes different categories, which then highlights differences across the categories and supports discrimination learning.

In the grouped trivia condition, participants were still able to take advantage of the interleaving effect because the disruptions (the trivia questions) had less of an effect when grouped in packs of 8. In the alternating trivia condition, however, a trivia question appeared after every picture, frustrating the discrimination mechanism that seems to help make the interleaving effect tick.

Takeaway Goodies (and Questions) for Instruction

The paper makes it clear that interleaving is not a slam dunk for instruction. Massed studying or practice might be more beneficial, for example, when the goal is to understand the similarities among the objects of study rather than the differences. Massed studying may also be preferred when the objects are ‘highly discriminable’ (easy to tell apart).

Yet, many of the misconceptions we deal with in mathematics education in particular can be seen as the result of dealing with objects of ‘low discriminability’ (objects that are hard to tell apart). In many cases, these objects really are hard to tell apart, and in others we simply make them hard through our sequencing. Consider some of the items listed in the NCTM’s wonderful 13 Rules That Expire, which students often misapply:

• When multiplying by ten, just add a zero to the end of the number.
• You cannot take a bigger number from a smaller number.
• Addition and multiplication make numbers bigger.
• You always divide the larger number by the smaller number.

In some sense, these are problematic because they are like the sparrows and finches above when presented only in groups: they are harder to stop because we don’t present them in situations that break the rules, or interleave them. Appending a zero to a number to multiply by 10 does work on counting numbers but not on decimals; addition and multiplication do make counting numbers bigger, but they don’t always make fractions bigger; and you cannot take a bigger counting number from a smaller one and get a counting number. For that, you need integers.
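Each of these rules can be checked against one case inside counting number land and one case outside it. The Python sketch below does exactly that. The specific numbers and the helper names `append_zero` and `is_counting_number` are my own choices for illustration, not examples from the NCTM article.

```python
from decimal import Decimal
from fractions import Fraction

def append_zero(numeral):
    """The 'just add a zero' rule, applied literally to the written numeral."""
    return numeral + "0"

def is_counting_number(value):
    """Counting numbers here means the integers 1, 2, 3, ..."""
    return isinstance(value, int) and value >= 1

checks = {
    # Multiplying by ten by appending a zero.
    "37 -> 370 equals 37 * 10": Decimal(append_zero("37")) == Decimal("37") * 10,
    "3.7 -> 3.70 equals 3.7 * 10": Decimal(append_zero("3.7")) == Decimal("3.7") * 10,
    # Taking a bigger number from a smaller number.
    "3 - 5 is a counting number": is_counting_number(3 - 5),
    "3 - 5 is an integer": isinstance(3 - 5, int),
    # Addition and multiplication make numbers bigger.
    "6 * 4 is bigger than 6": 6 * 4 > 6,
    "6 * 1/2 is bigger than 6": 6 * Fraction(1, 2) > 6,
    "6 + 4 is bigger than 6": 6 + 4 > 6,
    "6 + (-2) is bigger than 6": 6 + (-2) > 6,
    # Always divide the larger number by the smaller number.
    "3 divided by 6 has a value (1/2)": Fraction(3, 6) == Fraction(1, 2),
}

for description, holds in checks.items():
    print(f"{description}: {holds}")
```

Every rule holds in its counting-number case and fails, or stops being necessary, as soon as a decimal, a fraction, or a negative integer shows up. Practice sets that mix number systems would put those rule-breaking cases next to the familiar ones.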

Notice any similarities above? Can we please talk about how we keep kids trapped for too long in counting number land? I’ve got this marvelous study to show you which might provide some good reasons to interleave different number systems throughout students’ educations. It’s linked above, and below.