Independent Events

Honestly, sometimes my first off-the-cuff measure of how well we do collectively with a topic in mathematics is how well I do with it. This is terrible reasoning, of course, but it has some uses as a kind of first measurement that needs to be independently verified. It helps me notice potential weak spots in instruction, anyway, like with independent events.

The concept of independent events is certainly a candidate for being a weak spot. Most of what I’ve seen online doesn’t really crack the surface. Students are allowed to explore the concept of independent events, but all that seems to mean is to take a situation and tell me whether the events are independent. Big whoop.

And the definition often used—that two events are independent if the occurrence of one does not affect the probability of the other—has the potential of really confusing students (and me). Consider this situation.

[Figure: four cards. Two are letter cards (A and B), two are circle cards, and the B card is both.]

Are the events “selecting a letter card” and “selecting a circle card” independent events? The way a student (and I) might reason, given just the definition above, would be to think, “Well, if I pick B, that would definitely change the probability of picking a circle, because the B-card is also a circle card. And, if I pick A, that would change the probability of picking a circle card, because then I would only have three cards to choose from. So, the events are not independent.”

Try This Instead

The above was perfectly sane reasoning; it’s just wrong because of a terrible explanation (or lack of an explanation in most cases). I thought of something maybe a little better. Here it is in sentence form, similar to how we worked on the impenetrability of trig ratios:

You have the same probability of choosing a letter card from all the cards as you have of choosing a letter card from just the circle cards.

This is what makes choosing a letter card and choosing a circle card “independent.” If I know that I’ve drawn a circle, the probability that I’ve also drawn a letter, \(\mathtt{0.5}\), is the same as if I didn’t know I’d drawn a circle. And of course this works automatically the other way around too: if I know that I’ve drawn a letter card, the probability of drawing a circle is the same as if I didn’t know. If I reduce the sample space from all cards to circle cards, the probability of “letter” is the same.

At the heart of independent events (besides the conditional probability flavoring above) are equivalent ratios, or a proportion . . . which gives me an ultra-short way of saying it a little more mathematically:

\[\mathtt{\frac{2}{4} = \frac{1}{2}}\]

Or, “2 letter cards out of 4 cards in all is the same as 1 letter card out of 2 circle cards.” In the symbolism of probability, we actually write this with complex fractions (by giving each numerator and denominator above a denominator of 4), and then disguise the complex fractions with \(\mathtt{P()}\) statements, which is all equivalent to the above proportion:

\[\mathtt{\frac{\color{purple}{\frac{2}{4}}}{\color{red}{\frac{4}{4}}} = \frac{\color{red}{\frac{1}{4}}}{\color{purple}{\frac{2}{4}}} \longrightarrow \frac{\color{purple}{P(\textrm{letter})}}{\color{red}{1}} = \frac{\color{red}{P(\textrm{letter and circle})}}{\color{purple}{P(\textrm{circle})}}}\]

And that gives us the definition of independence using conditional probability, from S-CP.A.3. If we remember our proportion work from way back when, then the “other” test for the independence of two events pops out of the equivalence of the products of means and extremes: \[\mathtt{P(\textrm{letter}) \cdot P(\textrm{circle}) = P(\textrm{letter and circle})}\]
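Here is a minimal sketch of both tests for the card example, using exact fractions. The `cards` list is a hypothetical stand-in for the figure (the text only tells us there are four cards, two letter cards, two circle cards, and that the B card is both), and `prob` is a helper name made up for this illustration.

```python
from fractions import Fraction

# Hypothetical stand-in for the four cards: (is_letter, is_circle).
cards = [
    (True, False),   # the A card
    (True, True),    # the B card, which is also a circle card
    (False, True),   # a circle card with no letter
    (False, False),  # a card that is neither
]


def prob(event, space):
    """Probability of `event` when every card in `space` is equally likely."""
    return Fraction(sum(1 for card in space if event(card)), len(space))


def is_letter(card):
    return card[0]


def is_circle(card):
    return card[1]


# Reduce the sample space from all cards to just the circle cards.
circle_cards = [card for card in cards if is_circle(card)]

p_letter = prob(is_letter, cards)                      # 2 out of 4
p_letter_given_circle = prob(is_letter, circle_cards)  # 1 out of 2
p_circle = prob(is_circle, cards)
p_both = prob(lambda c: is_letter(c) and is_circle(c), cards)

# The proportion test and the product test agree.
print(p_letter == p_letter_given_circle)
print(p_letter * p_circle == p_both)
```

Using `Fraction` rather than floats keeps the comparison exact, so the "equivalent ratios" idea is checked literally rather than up to rounding.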

The complex fraction part of this explanation seems to be the most important, actually. And we don’t really do a good job of letting kids in on that disguise either. But still, stapling the idea of independent events to a pair of equivalent ratios (a proportion) helps the whole idea make a lot more sense to me. And, truthfully, it makes the notion of “independence” as “not having an effect on another probability” seem almost wrong.

Update: This kind of reasoning works for the typical example of independent events. The situation involving separate spinners is fairly easy for kids to identify as being about independent events, but like a lot of other topics in mathematics education, we start off with examples that are easy and also completely misleading. Then we all opine that kids have difficulties because the material gets “harder.” Anyway, spinning a C on the first spinner and a 2 on the second spinner are independent events, but not because there are “independent” spinners.

What’s the proportion (if the events are independent) that matches the situation?

[Figure: two spinners. The first has sections A, B, and C; the second has four numbered sections, one of which is 2.]

Assuming we spin the first spinner and don’t know what we get, there are \(\mathtt{3 \times 1}\), or 3, outcomes that have 2 as the second spin, out of 12 possible outcomes. The outcomes are {(A, 2), (B, 2), (C, 2)}. But, if we know that we have spun a C first, then there is 1 outcome showing 2 on the second spinner, out of 4 possible outcomes. So, our proportion is \[\mathtt{\frac{3}{12} = \frac{1}{4}}\]

This is all we need to show that the two events are independent, actually. If that proportion is true, then the events are independent. But we can cue the complex fraction magic again for reinforcement: \[\mathtt{\frac{\color{purple}{\frac{3}{12}}}{\color{red}{\frac{12}{12}}} = \frac{\color{red}{\frac{1}{12}}}{\color{purple}{\frac{4}{12}}} \longrightarrow \frac{\color{purple}{P(2)}}{\color{red}{1}} = \frac{\color{red}{P(\textrm{C and 2})}}{\color{purple}{P(\textrm{C})}}}\]
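The same check can be sketched by listing the 12 outcomes directly. This assumes, as the counts above imply, that the first spinner has three equally likely sections (A, B, C) and the second has four; labeling the second spinner 1 through 4 is my assumption, since only the 2 is named in the text.

```python
from fractions import Fraction
from itertools import product

first = ["A", "B", "C"]   # assumed equally likely sections
second = [1, 2, 3, 4]     # assumed labels; only "2" is named in the text

# Every (first spin, second spin) pair: 3 x 4 = 12 outcomes.
outcomes = list(product(first, second))

second_is_2 = [o for o in outcomes if o[1] == 2]
first_is_c = [o for o in outcomes if o[0] == "C"]
c_and_2 = [o for o in first_is_c if o[1] == 2]

p_2 = Fraction(len(second_is_2), len(outcomes))        # 3 out of 12
p_2_given_c = Fraction(len(c_and_2), len(first_is_c))  # 1 out of 4
p_c = Fraction(len(first_is_c), len(outcomes))
p_c_and_2 = Fraction(len(c_and_2), len(outcomes))

print(p_2 == p_2_given_c)        # the proportion
print(p_2 * p_c == p_c_and_2)    # the product test
```

Nothing in the code knows that the spinners are physically separate; independence shows up only in the ratios.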

Provided Examples vs. Generated Examples


The results reported in this research (below) about the value of provided examples versus generated examples are a bit surprising. To get a sense of why that’s the case, start with this definition of the concept availability heuristic used in the study—a term from the social psychology literature:

Availability heuristic: the tendency to estimate the likelihood that an event will occur by how easily instances of it come to mind.

All participants first read this definition, along with the definitions of nine other social psychology concepts, in a textbook passage. Participants then completed two blocks of practice trials in one of three groups: (1) subjects in the provided examples group read two different examples, drawn from an undergraduate psychology textbook, of each of the 10 concepts (two practice blocks, so four examples total for each concept), (2) subjects in the generated examples group created their own examples for each concept (four generated examples total for each concept), and (3) subjects in the combination group were provided with an example and then created their own example of each concept (two provided and two generated examples total for each concept).

The researchers—Amanda Zamary and Katharine Rawson at Kent State University in Ohio—made the following predictions, with regard to both student performance and the efficiency of the instructional treatments:

We predicted that long-term learning would be greater following generated examples compared to provided examples. Concerning efficiency, we predicted that less time would be spent studying provided examples compared to generating examples . . . [and] long-term learning would be greater after a combination of provided and generated examples compared to either technique alone. Concerning efficiency, our prediction was that less time would be spent when students study provided examples and generate examples compared to just generating examples.

Achievement Results

All participants completed the same two self-paced tests two days later. The first assessment, an example classification test, asked subjects to classify each of 100 real-world examples into one of the 10 concept definition categories provided. Sixty of these 100 were new (Novel) to the provided-examples group, 80 of the 100 were new to the combination group, and of course all 100 were likely new to the generated-examples group. The second assessment, a definition-cued recall test, asked participants to type in the definition of each of the 10 concepts, given in random order. (The test order was varied among subjects.)


Given that participants in the provided-examples and combination groups had an advantage over participants in the generated-examples group on the classification task (they had seen between 20 and 40 of the examples previously), the researchers helpfully drew out results on just the 60 novel examples.

Subjects who were given only textbook-provided examples of the concepts outperformed other subjects on applying these concepts to classifying real-world examples. This difference was significant. No significant differences were found on the cued-recall test between the provided-examples and generated-examples groups.

Also, Students’ Time Is Valuable

Another measure of interest to the researchers in this study, as mentioned above, was the time used by the participants to read through or create the examples. What the authors say about efficiency is worth quoting, since it does not often seem to be taken as seriously as measures of raw achievement (emphasis mine):

Howe and Singer (1975) note that in practice, the challenge for educators and researchers is not to identify effective learning techniques when time is unlimited. Rather, the problem arises when trying to identify what is most effective when time is fixed. Indeed, long-term learning could easily be achieved if students had an unlimited amount of time and only a limited amount of information to learn (with the caveat that students spend their time employing useful encoding strategies). However, achieving long-term learning is difficult because students have a lot to learn within a limited amount of time (Rawson and Dunlosky 2011). Thus, long-term learning and efficiency are both important to consider when competitively evaluating the effectiveness of learning techniques.


With that in mind, and given the results above, it is noteworthy that the provided-examples group outperformed the generated-examples group on real-world examples after practice that took less than half as much time. The researchers divided subjects’ novel classification scores by the amount of time they spent practicing and determined that the provided-examples group had an average gain of 5.7 points per minute of study, compared to 2.2 points per minute for the generated-examples group and 1.7 points per minute for the combination group.

For learning declarative concepts in a domain and then identifying those concepts in novel real-world situations, provided examples proved to be better than student-generated examples for both long-term learning and instructional efficiency. The second experiment in the study replicated these findings.

Some Commentary

First, some familiarity with the research literature makes the above results not so surprising. The provided-examples group likely outperformed the other groups because participants in that group practiced with examples generated by experts. Becoming more expert in a domain does not necessarily involve becoming more isolated from other people and their interests. Such expertise is likely positively correlated with better identifying and collating examples within a domain that are conceptually interesting to students and more widely generalizable. I reported on two studies, for example, which showed that greater expertise was associated with a significantly greater number of conceptual explanations, as opposed to “product oriented” (answer-getting) explanations—and these conceptual explanations resulted in the superior performance of students receiving them.

Second, I am sympathetic to the efficiency argument, as laid out here by the study’s authors—that is, I agree that we should focus in education on “trying to identify what is most effective when time is fixed.” Problematically, however, a wide variety of instructional actions can be informed by decisions about what is and isn’t “fixed.” Time is not the only thing that can be fixed in one’s experience. The intuition that students should “own their own learning,” for example, which undergirds the idea in the first place that students should generate their own examples, may rest on the more fundamental notion that students themselves are “fixed” identities that adults must work around rather than try to alter. This notion is itself circumscribed by the research summarized above. So, it is worth having a conversation about what should and should not be considered “fixed” when it comes to learning.

Zamary, A., & Rawson, K. (2016). Which Technique is most Effective for Learning Declarative Concepts—Provided Examples, Generated Examples, or Both? Educational Psychology Review DOI: 10.1007/s10648-016-9396-9

Instructional Effects: Action at a Distance

I really like this recent post, called Tell Me More, Tell Me More, by math teacher Dani Quinn. The content is an excellent analysis of expert blindness in math teaching. The form, though, is worth seeing as well—it is a traditional educational syllogism, which Quinn helpfully commandeers to arrive at a non-traditional conclusion, that instructional effects have instructional causes, on the right:

The Traditional Argument | An Alternative Argument
There is a problem in how we teach: We typically spoon-feed students procedures for answering questions that will be on some kind of test. | “There is a problem in how we teach: We typically show pupils only the classic forms of a problem or a procedure.”
This is why students can’t generalize to non-routine problems: we got in the way of their thinking and didn’t allow them to take ownership and creatively explore material on their own. | “This is why they then can’t generalise: we didn’t show them anything non-standard or, if we did, it was in an exercise when they were floundering on their own with the least support.”

Problematically for education debates, each of these premises and conclusions, taken individually, is true. That is, they exist. At our (collective) weakest, we do sometimes spoon-feed kids procedures to get them through tests. We do cover only a narrow range of situations, which Engelmann refers to as the problem of stipulation. And we can be, regrettably in either case, systematically unassertive or overbearing.

Solving equations provides a nice example of the instructional effects of both spoon-feeding and stipulation. Remember how to solve equations? Inverse operations. That was the way to do equations. If you have something like \(\mathtt{2x + 5 = 15}\), the table shows how it goes.

Equation | Step
\(\mathtt{2x + 5 \color{red}{- 5} = 15 \color{red}{- 5}}\) | Subtract \(\mathtt{5}\) from both sides of the equation to get \(\mathtt{2x = 10}\).
\(\mathtt{\color{white}{+ 5 \,\,} 2x \color{red}{\div 2} = 10 \color{red}{\div 2}}\) | Divide both sides of the equation by \(\mathtt{2}\).
\(\mathtt{\color{white}{+ 5 \,\,}x = 5}\) | You have solved the equation.

Do that a couple dozen times and maybe around 50% of the class freezes when they encounter \(\mathtt{22 = 4x + 6}\), with the variable on the right side, or, even worse, \(\mathtt{22 = 6 + 4x}\).

That’s spoon-feeding and stipulation: do it this one way and do it over and over—and, crucially, doing that summarizes most of the instruction around solving equations.
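For contrast, here is one way to write out a non-standard form with the very same two inverse operations. This is just a worked sketch of \(\mathtt{22 = 6 + 4x}\); nothing about the method changes when the variable sits on the right or the constant comes first:

\[\mathtt{22 \color{red}{- 6} = 6 + 4x \color{red}{- 6} \longrightarrow 16 = 4x \longrightarrow 16 \color{red}{\div 4} = 4x \color{red}{\div 4} \longrightarrow 4 = x}\]

Checking the result in the original equation gives \(\mathtt{6 + 4 \cdot 4 = 22}\).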

Of course, the lack of prior knowledge exacerbates the negative instructional effects of stipulation and spoon-feeding. But we’ll set that aside for the moment.

The Connection Between Premises and Conclusion

The traditional and alternative arguments above are easily (and often) confused, though, until you include the premise that I have omitted from the middle of each. These premises help make sense of the conclusions derived in each argument.

The Traditional Argument | An Alternative Argument
There is a problem in how we teach: We typically spoon-feed students procedures for answering questions that will be on some kind of test. | “There is a problem in how we teach: We typically show pupils only the classic forms of a problem or a procedure.”
Students’ success in schooling is determined mostly by internal factors, like creativity, motivation, and self-awareness. | Students’ success in schooling is determined mostly by external factors, like amount of instruction, socioeconomic status, and curricula.
This is why students can’t generalize to non-routine problems: we got in the way of their thinking and didn’t allow them to take ownership and creatively explore material on their own. | “This is why they then can’t generalise: we didn’t show them anything non-standard or, if we did, it was in an exercise when they were floundering on their own with the least support.”

In short, the argument on the left tends to diagnose pedagogical illnesses and their concomitant instructional effects as people problems; the alternative sees them as situation problems. The solutions generated by each argument are divergent in just this way: the traditional one looks to pull the levers that mostly benefit personal, internal attributes that contribute to learning; the alternative messes mostly with external inputs.

It’s Not the Spoon-Feeding, It’s What’s on the Spoon

I am and have always been more attracted to the alternative argument than the traditional one. Probably for a very simple reason: my role in education doesn’t involve pulling personal levers. Being close to the problem almost certainly changes your view of it—not necessarily for the better. But, roles aside, it’s also the case that the traditional view is simply more widespread, and informed by the positive version of what is called the Fundamental Attribution Error:

We are frequently blind to the power of situations. In a famous article, Stanford psychologist Lee Ross surveyed dozens of studies in psychology and noted that people have a systematic tendency to ignore the situational forces that shape other people’s behavior. He called this deep-rooted tendency the “Fundamental Attribution Error.” The error lies in our inclination to attribute people’s behavior to the way they are rather than to the situation they are in.

What you get with the traditional view is, to me, a kind of spooky action at a distance—a phrase attributed to Einstein, in remarks about the counterintuitive consequences of quantum physics. Adopting this view forces one to connect positive instructional effects (e.g., thinking flexibly when solving equations) with something internal, ethereal and often poorly defined, like creativity. We might as well attribute success to rabbit’s feet or lucky underwear or horoscopes!


Making “Connections”

I’m nearing the end of my read of James Lang’s terrific book Small Teaching, and I’ve wanted for the last 100 pages or so to recommend it highly here. While I do that, however, I’d also like to mention a confusion that piqued my interest near the middle of the book: a very common false distinction, I think, between ‘making connections’ and knowing things. Lang sets it up this way:

When we are tackling a new author in my British literature survey course, I might begin class by pointing out some salient feature of the author’s life or work and asking students to tell me the name of a previous author (whose work we have read) who shares that same feature. “This is a Scottish author,” I will say. “And who was the last Scottish author we read?” Blank stares. Perhaps just a bit of gaping bewilderment. Instead of seeing the broad sweep of British literary history, with its many plots, subplots, and characters, my students see Author A and then Author B and then Author C and so on. They can analyze and remember the main works and features of each author, but they run into trouble when asked to forge connections among writers.

What immediately follows this paragraph is what one would expect from a writer who has done his homework on the research: Lang reminds himself that his students are novices and he an expert; his students’ knowledge of British literature and history is “sparse and superficial.”

But then, suddenly, the false distinction, where ‘knowledge’ takes on a different meaning, becoming synonymous with “sparse and superficial,” and his students have it again:

In short, they have knowledge, in the sense that they can produce individual pieces of information in specific contexts; what they lack is understanding or comprehension. And they lack comprehension, even more shortly, because they lack connections.

Nope, Still Knowledge


As we saw here, with the Wason Selection Task, reasoning ability itself is dependent on knowledge. Participants who were given abstract rules had tremendous difficulties with modus tollens reasoning in particular, yet when these rules were set in concrete contexts, the difficulties all but vanished.

One might say, indeed, that in concrete contexts, the connections are known, not inferred. Thus, if you want students to make connections among various authors, it might help to tell them that they are connected, and how.


I’ve come around to thinking about acquisition again, that window of time in which learners make first contact with something to be learned. Ohlsson refers to this as the “Getting Started” phase, one of three phases in “skill acquisition”:

The first stage begins when the learner encounters the practice task and ends when he completes the task correctly for the first time. . . .

There are five distinct types of information that might be available at the outset of practice: direct instructions; declarative knowledge about the task; strategies for analogous tasks; demonstrations, models and solved examples; and outcomes of unselective search. . . .

The second stage of practice begins when the learner completes the task for the first time and it lasts until he can reliably perform the task correctly. During this stage, the main theoretical problem is how the learner can improve his incomplete and possibly incorrect version of the strategy-to-be-learned. . . .

The optimization stage begins when the target skill has been mastered, that is, when it can be executed reliably, and lasts as long as the person keeps practicing.


At right is a graph from the same book which shows “how the relative importance of four learning mechanisms might shift across the three phases of skill acquisition.” Instruction and solved examples are of utmost importance in the acquisition phase, with feedback about performance taking over in Phase 2, and self-directed learning from the statistical regularities of the domain in Phase 3.

You Are Here

At the moment, I would argue that modern educational thought does not pay sufficient attention to that first acquisition phase of learning. This does not seem to be a deliberate shifting of attentional resources away from Phase 1; rather, it is more a matter of conceptualizing “learning” as not having a Phase 1 at all—or a Phase 1 so straightforward and inevitable that it is of little interest to either practitioners or researchers.

We hear a lot about the testing effect, for example (as well we should, and more of it!), but we hear comparatively little about whether the representational content of what we are testing is any good. There is a lot of talk about methodologies such as using worked examples or uncovering the curriculum (differences that are important to talk about), but comparatively little talk about the quality of the content of those examples and whether what is being uncovered is just in-the-moment rules or in-the-moment ‘sense-making’ that won’t make sense later on.


A colleague of mine recently reminded me that we still teach both rectangles and squares at the Kindergarten level, with no discussion about how squares are rectangles. It’s right there in the CC Standards with no caveats.

Consider for a moment that teaching this false distinction requires extra energy on our part. We’re not letting things slide here; we’re reinforcing a misconception, purposely and actively. Consider also that there is no pedagogical justification for this, except within a system that always starts at Phase 2. We take ’em as they are after watching Sesame Street and playing with pre-school toys and try to roll back their misconceptions with feedback and exams. This process then repeats itself at every grade level. Finally, consider that this is Kindergarten math. If we can be moved, by some benevolent or malevolent philosophizing, to talk about rectangles and squares as two different things, there’s no telling what else we can be talked into.


In my view, the study of acquiring representational content, which amounts to the study of representational content itself, is a promising avenue for AI research—which I think will gradually weigh in with increasing credibility on educational issues.

Let’s “teach” one computer to form a solid schema in which rectangles and squares are distinct (but related) objects and then measure the cost (if any) of trying to undo this schema when we try to teach it about quadrilateral classification. I wouldn’t be surprised if there was already work on this. I wouldn’t be surprised if there wasn’t either.

Update: The Weak Argument About the Weakness of Instruction

Education theorizing that features an emaciated acquisition phase is a good fit for the occupational constraints of classroom teaching (as I mention here); conversely, theories that promote a robust acquisition phase are likely to be seen as threatening.

As such, there is a healthy amount of rationalizing discourse in education about “telling” that keeps this dissonance at bay. Ridiculous deepities like “the two lies of teaching,” passed around in pep-rally seminars and online, are of course far more common and easy to dismiss, but the following, from research reported in Ohlsson’s book, demonstrates how virtually anyone can conclude that instruction is ineffective when they are resigned to terrible instruction:

What could the adult mean by saying that the Earth is round? The word “round” is ambiguous; it is used to refer to both circular (a two-dimensional property) and spherical (a three-dimensional property). Which of these meanings will be activated? Unless the child conceives the Earth as extending indefinitely in all directions, the flat Earth must have an edge somewhere, and an edge is a kind of thing that can be circular. The child is likely to conclude that the adult is saying that the flat Earth has a circular edge. . . . In short, if the listener believes that the Earth is flat, the apparently contradictory discourse, the Earth is round has no power to teach him or her otherwise, because the listener’s prior beliefs hold too much power over its interpretation.

Note the bold conclusion that teaching has no power over prior knowledge; and that teaching here is conceptualized as saying a sentence to students. The entire book, of course, rather than a blog post, provides the right context in which to interpret Ohlsson’s thoughts. But I thought I’d point out that the above is repeated a little later, even more boldly:

Because new information is interpreted in terms of a person’s current concepts and beliefs, such information has little power to change those concepts and beliefs. There is no obvious way to circumvent this assimilation paradox. As we cannot see without our eyeballs, so we cannot understand a discourse without our prior concepts.

It’s bizarre that the obvious objection is not dealt with, if only to try to dismiss it. That is, it seems far more likely that children are told that the Earth is round, and this represents incomplete instruction, which students fill in with false notions.

The researchers even noted that demonstrating the shape of the Earth with a globe was sometimes not enough to dislodge misconceptions—children modified their ideas to make the Earth a hollow sphere partially filled with dirt, creating a flat plane on which people walked. Yet this did remove the misconception, because it succeeded in modifying children’s ideas about the shape of the Earth, and the instruction was about the shape of the Earth, not about how people move around on it.

The idea that gravity is like a magnet holding people to the Earth, even “upside down” from a certain perspective, is an idea that you rarely see explicitly demonstrated. But this missing idea is clearly a possible reason for the misconceptions mentioned.

Vosniadou, S., & Brewer, W. (1992). Mental models of the earth: A study of conceptual change in childhood. Cognitive Psychology, 24(4), 535-585. DOI: 10.1016/0010-0285(92)90018-W

Searching the Solution Space


My reading in education has been a bit disappointing lately. This has everything to do with the relationship between what I’m currently thinking about and the specific material I’m looking into, rather than the books and articles by themselves. But Ohlsson’s 2011 book Deep Learning is so far a wonderful exception to the six or seven books collecting digital dust inside my Kindle, waiting for me to be interested in them again. The reason, I think, is that Ohlsson is looking to tackle topics that I am incredibly suspicious about, insight and creativity, in a smart and systematically theoretical way. The desire to provide technical, functional, connected explanations of concepts is evident on every page.

Prior Knowledge Constrains the Solution Space

Of particular interest to me is the idea that prior knowledge constrains a ‘problem space,’ or what Ohlsson wants to re-classify as a ‘solution space’:

A problem solution consists of a path through the solution space, a sequence of cognitive operations that transforms the initial situation into a situation in which the goal is satisfied. In a familiar task environment, the person already knows which step is the right one at each successive choice point. However, in unfamiliar environments, the person has to act tentatively and explore alternatives. Analytical problem solving is difficult because the size of a solution space is a function of the number of actions that are applicable in each situation—the branching factor—and the number of actions along the solution path—the path length. The number of problem states, \(\mathtt{S}\), is proportional to \(\mathtt{b^N}\), where \(\mathtt{b}\) is the branching factor and \(\mathtt{N}\) the path length. \(\mathtt{S}\) is astronomical for even modest values of \(\mathtt{b}\) and \(\mathtt{N}\), so solution spaces can only be traversed selectively. By projecting prior experience onto the current situation, both problem perception and memory retrieval help constrain the options to be considered to the most promising ones.
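To get a feel for how quickly \(\mathtt{b^N}\) grows, here is a small sketch. The function name `path_count` and the sample values of `b` and `n` are mine, chosen only for illustration, and the constant of proportionality is taken to be 1.

```python
def path_count(b, n):
    """Number of length-n action sequences when b actions are available at every step."""
    return b ** n


# A few modest branching factors and path lengths (illustrative values only).
for b, n in [(2, 10), (5, 10), (10, 10), (10, 20)]:
    print(f"b={b}, N={n}: {path_count(b, n):,} paths")
```

Even with only ten choices per step, a twenty-step path sits inside a space far too large to search exhaustively, which is why selective traversal is the only option.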

So, prior knowledge casts a finite amount of light on a select portion of the solution space, illuminating those elements which are consistent with representations in long-term memory and with a person’s current perception of the problem. It may even be the case that the length of the beam from the prior-knowledge flashlight corresponds to the limitations of working memory.

Crucially, this selectivity creates a dilemma. It is necessary to limit the solution space—otherwise, a person would be quickly overwhelmed by multiple, interacting elements of a problem situation—but, as is shown, prior knowledge (among other things) may restrict activation to those elements in the solution space which are unhelpful in reaching the goal.

It may be a goal, for example, for students to have a flexible sense of number, such that they can estimate with sums, products, differences, and quotients over a variety of numbers. Yet, students’ prior knowledge of working with mathematics can lead them to activate (and thus ‘see’) only ‘narrow’ procedural elements of solution spaces. The result can be that procedural mathematics is activated even when it serves no useful purpose at all.

This ‘tyranny’ of prior knowledge can be seen in the classic Einstellung experiments, a version of which is below. It was originally included in Dr Hausmann’s write-up on the topic, which I recommend highly. The goal below is simply to fill up one of the “jars” to the target level (the first target is 100). When you’re done, head over to Dr Bob’s site for the explanation. A similar obstacle to learning, described by S. Engelmann in his work, is called the problem of stipulation.

A: 21
B: 127
C: 3
Target: 100

But Creativity Theories Are Not Learning Theories

If one wanted to provide a slightly more serious intellectual justification for much of the popular folk-theorizing in education over the last decade, and then essentially replay its development idea by idea, misinterpreting insight and creativity theories like Ohlsson’s would be an excellent strategy for doing so. (He never says, for example, that prior knowledge by itself is the culprit in constraining the solution space, only that unhelpful prior knowledge is.)

It all seems to be there for the taking in these kinds of theories: the notion that solving problems is education’s raison d’etre, the idea that an unknowable future—rather than being just a fact that we must accept—can play a part as a premise in some chain of reasoning, the bizarre thought that removing instructional support can represent a game-changing way of restructuring the majority of learning time, a fluttering emphasis on collaboration and distributed cognition. All of this that has been humming in the background (and foreground) for a while in education fits comfortably and rationally inside creativity theories rather than learning theories.

From a learning-theory point of view, the problem of, say, thinking flexibly about number is primarily a problem of constructing better solution spaces: bringing the goal within the flashlight’s view by instructing students (and thus making activation of ‘number sense’ more likely). Unfortunately, this requires a longer-term view, greater political will, and a bit of distance from everyday reality. Insight and creativity theories, on the other hand, assume that this number sense is already there but remains inert (students are only ever experiencing ‘unwarranted impasses’). The problem for insight theory becomes simply how to redirect the flashlight’s beam so that it uncovers the right knowledge. Along with the background assumptions listed above, these further assumptions of insight theories make them remarkably well tuned to the constraints of institutional teaching, both self-imposed and externally imposed. The practical work of teaching is still mostly a one-year-at-a-time affair, and sixth grade teachers, for example, do not have the time to remake solution spaces anew over the course of one year. What is within reach are the redirection techniques suggested by insight theories. In this context, mistaking theories about insight for learning theories is practically inevitable.

Perhaps I’ll find out where Ohlsson makes learning theory and insight/creativity theory connect as I read further. But it’s worth noting that research Ohlsson himself conducted after the publication of this book has produced conclusions that run counter to certain predictions within it.

We’ll see!

Perplexity Is Not Required for Learning

Let me see if I can accurately describe the results of the first experiment from this study. But before I do, watch this video, featuring the perplexity of one student, Justine:

The video features a student, Justine, faced with a ‘cognitive conflict’—one between her notion that objects fall at different rates depending on their weight and the scientific reality that objects fall at the same rate under the force of gravity.

Importantly, according to the video this conflict seems to be a necessary part of teaching and learning, as Miss Reyes suggests at about 0:54:

Look, I’ve been teaching for 12 years, and trust me, these students are anything but blank slates. They may be here for 8 hours a day, but for the other 16, they’re in all kinds of classrooms: the dinner table, karate lessons, their sneeper-peeper feed, trashy vampire novels . . . My point is that these kids are out there, living life, constantly learning things.

Thus, for learning to occur, student thinking must be probed for misconceptions, and these misconceptions must be patiently challenged; that is, cognitive conflict (or in some versions, perplexity or incongruity) must be induced. This, I believe, is the most widely subscribed version of the conceptual change model of learning. It is the prescriptive cousin of empirical and philosophical work that has assigned the cause of conceptual change to “the drive to make sense of anomalous observations that are inconsistent with existing concepts.”

Change Without Conflict

Yet, in recent years—more recent than the papers cited in the link at the end of the video—the construct of conceptual change has met with a great deal of challenge and has not found much robust empirical support, leading the author of the study I examine here to write this fairly bold statement (emphasis mine):

Since Limon’s (2001) review, cognitive change researchers have investigated the influence on conceptual change of a broader range of cognitive and non-cognitive factors, including affect and motivation, individual differences, metacognition, epistemological beliefs, and intention to change. . . .

Taken at face value, the relative lack of effect of such conflicts across a broad range of studies falsifies the cognitive conflict hypothesis: The difficulty of conceptual change must reside elsewhere than in conflict, or rather the lack thereof, between misconceptions and normatively correct subject matter.

In fact, in their first experiment, Ramsburg and Ohlsson found that even inducing conflict via the age-old method of telling students they were wrong did not have a significant impact.

Over several trials, the researchers first taught 120 undergraduates a misconception regarding visual characteristics of bacteria that make them oxygen-resistant (e.g., ‘contain black nuclei’). Seventy-four of these students learned the misconception ‘to mastery.’ Then, for one group of students, called the complete condition, a further set of trials switched up the critical characteristic in the images, allowing students to see immediately the visual evidence disconfirming their misconception. For example, images of bacteria with black nuclei were shown with accompanying feedback informing students that these were not oxygen-resistant.

A second group of students, called the confirmatory-only condition, by contrast, were never presented with disconfirming evidence of their prior misconception in subsequent trials. All images that did not show the new oxygen-resistant characteristic also did not show the black nuclei. Thus, this group saw only positive examples of the new characteristic; their prior misconception was not explicitly challenged in the image trials.

As you might expect at this point, though it is still somewhat surprising in the context of common wisdom, no significant differences were found between the groups’ abilities to learn the new characteristic after the misconception had been taught. No significant differences were identified in the rate of learning either. The second and third experiments in this study replicated these results, even after attempting to control for some weaknesses in the first experiment and strengthening the complete ‘perplexity’ condition.

What Does This Mean for Justine?

Galileo en Pisa

Given the results of this study and similar results over recent years, we can at least conclude that it is wrong to suggest that provoking perplexity or inducing dissonance or cognitive conflict is necessary for learning. And even ‘softer’ claims about conceptual change probably deserve a great deal of suspicion and scrutiny. However, as the authors describe in some detail in the study, there are a number of methodological and conceptual challenges one faces in experimenting with this construct. So, it is by no means time to cut the funding for conceptual change research.

An intriguing question, I think, is why perplexity and incongruity don’t work (to the extent that they don’t). Why might it not be necessary—or even important—to induce a conflict between what students know and what they don’t? I suspect the beneficial effects of cognitive conflict are mediated by prior knowledge (well, what isn’t?), in favor of those with more of it.


Ramsburg, J., & Ohlsson, S. (2016). Category change in the absence of cognitive conflict. Journal of Educational Psychology, 108(1), 98-113. DOI: 10.1037/edu0000050

Teach Me My Colors

In the box below, you can try your hand at teaching a program, a toy problem, to reliably identify the four colors red, blue, yellow, and green by name.

You don’t have a lot of flexibility, though. Ask the program to show you one of the four colors, and then provide it feedback as to its response—in that order. Then repeat. That’s all you’ve got. That and your time and endurance.

Of course, I’d love to leave the question about the meaning of “reliably identify the four colors” to the comments, but let’s say that the program knows the colors when it scores 3 perfect scores in a row—that is, if you cycle through the 4 colors three times in a row, and the program gets a 4 out of 4 all three times.

Just keep in mind that closing or refreshing the window wipes out any “learning.” Kind of like summer vacation. Or winter break. Or the weekend.

Death, Taxes, and the Mind

The teaching device above is a toy problem because it is designed to highlight what I believe to be the most salient feature of instruction—the fact that we don’t know a lot about our impact. Can you not imagine someone becoming frustrated with the “teaching” above, perhaps feverishly wondering what’s going on in the “mind” of the program? Ultimately, the one problem we all face in education is this unknown about students’ minds and about their learning—like the unknown of how the damn program above works, if it even does.

One can think of the collective activity of education as essentially the group of varied responses to this situation of fundamental ambiguity and ignorance. And similarly, there are a variety of ways to respond to the painful want of knowing solicited by this toy problem:

Seeing What You Want to See
Pareidolia is the name given to an occurrence where people perceive a pattern that isn’t there—like the famous “face” on Mars (just shadows, angles, and topography). This can happen when incessantly clicking on the teaching device above too. In fact, these kinds of pattern-generating hypotheses jumped up sporadically in my mind as I played with the program, and I wrote the program. For example, I noticed on more than one occasion that if I took a break from incessant clicking and came back, the program did better on that subsequent trial. And between sessions, I was at one point prepared to say with some confidence that the program simply learned a specific color faster than the others. There are a huge number of other, related superstitions that can arise. If you think they can only happen to technophobes and the elderly, you live in a bubble.

Constantly Shifting Strategies
It might be optimal to constantly change up what you’re doing with the teaching device, but trying to optimize the program’s performance over time is probably not why you do it. Frustration with a seeming lack of progress and following little mini-hypotheses about short-term improvements are more likely candidates. A colleague of mine used to characterize the general orientation to work in education as the “Wile E. Coyote approach”—constantly changing strategies rather than sticking with one and improving on it. The darkness is to blame.

Letting the Activity Judge You
This may be a bit out in left field, but it’s something I felt while doing the toy problem “teaching,” and it is certainly caused by the great unknown here—guilt. Did I remember to give feedback that last time? My gosh, when was the last time I gave it? Am I the only one who can’t figure this out, who is having such a hard time with this? (Okay, I didn’t experience that last one, but I can imagine someone experiencing it.) It seems we will happily choose even the distorted feel-bad projections of a hyperactive conscience over the irritating blankness of not knowing. Yet, while we might find some consolation in the truth that we’re too hard on ourselves, we also have the unhappy task of remembering that a thousand group hugs and high-fives are even less effective than a clinically diagnosable level of self-loathing at turning unknowns into knowns.

Conjecturing and Then Testing
This, of course, is the response to the unknown that we want. For the toy problem in particular, what strategies are possible? Can I exhaust them all? What knowledge can I acquaint myself with that will shine light on this task? How will I know if my strategy is working?

Here’s a plot I made of one of my runs through, using just one strategy. Each point represents a test of all 4 colors, and the score represents how many colors the program identified correctly.

Was the program improving? Yes. The mean for the first 60 trials was approximately 1.83 out of 4 correct, and the mean for the back 63 was approximately 2.14 out of 4. That’s a jump from about 46% to about 54%.

Is that the best that can be done? No. But that’s just another way the darkness gets ya—it makes it really hard to let go of hard-won footholds.

Knowing Stuff

Some knowledge about how the human mind works is analogous to knowing something about how programs work in the case of this toy problem. Such knowledge makes it harder to be bamboozled by easy-to-vary explanations. And in general such knowledge works like all knowledge does: it keeps you away, defeasibly, from dead-ends and wrong turns so that your cognitive energy is spent more productively.

Knowing something about code, for example, might instantly give you the idea to start looking for it in the source for this page. It’s just a right click away, practically. But even if you don’t want to “cheat,” you can notice that the program serves up answers even prior to any feedback, which, if you know something about code, would make you suspect that they might be generated randomly. Do they stay random, or do they converge based on feedback? And what hints does this provide about the possible functioning of the program? These better questions are generated by knowledge about typical behavior, not by having a vast amount of experience with all kinds of toy problem teaching devices.

How It Works

So, here’s how it works. The program contains 4 “registers,” or arrays, one for each of the 4 colors—blue, red, green, yellow. At the beginning of the training, each of those registers contains the exact same 4 items: the 4 different color names. So, each register looks like this at the beginning: [‘blue’, ‘red’, ‘green’, ‘yellow’].

Throughout the training, when you ask the program to show you a color, it chooses a random one from the register. This behavior never changes. It always selects a random color from the array. However, when you provide feedback, you change the array for that color. For example, if you ask the program to show you blue, and it shows you blue, and you select the “Yes” feedback from the dropdown, a “blue” choice is added to the register. So, if this happened on the very first trial, the “blue” register would change from [‘blue’, ‘red’, ‘green’, ‘yellow’] to [‘blue’, ‘red’, ‘green’, ‘yellow’, ‘blue’]. If, on the other hand, you ask for blue on the very first trial, and the program shows you green, and you select the “No” feedback from the dropdown, the 3 colors that are NOT green are added to the “blue” register. In that case, the “blue” register would change from [‘blue’, ‘red’, ‘green’, ‘yellow’] to [‘blue’, ‘red’, ‘green’, ‘yellow’, ‘blue’, ‘red’, ‘yellow’].
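The register rules above can be sketched in a few lines of Python. This is a reimplementation for illustration, not the code running on this page, and the names `new_registers`, `show`, `give_feedback`, and `p_correct` are hypothetical. One assumption is labeled in the code: on “Yes” feedback, the color that was shown is the one added, which matches the description whenever the feedback is truthful.

```python
import random

COLORS = ["blue", "red", "green", "yellow"]


def new_registers():
    """One register per color, each starting with all 4 color names."""
    return {color: list(COLORS) for color in COLORS}


def show(registers, asked, rng=random):
    """Answer a request by drawing uniformly at random from that color's register."""
    return rng.choice(registers[asked])


def give_feedback(registers, asked, shown, said_yes):
    """Update the register for the color that was asked for."""
    if said_yes:
        # Assumption: "Yes" adds one more copy of the color that was shown.
        registers[asked].append(shown)
    else:
        # "No" adds the 3 color names that were NOT shown.
        registers[asked].extend(c for c in COLORS if c != shown)


def p_correct(registers, color):
    """Probability that a random draw from `color`'s register is `color`."""
    return registers[color].count(color) / len(registers[color])


if __name__ == "__main__":
    regs = new_registers()
    give_feedback(regs, asked="blue", shown="blue", said_yes=True)
    print(regs["blue"])              # the register after one "Yes"
    print(p_correct(regs, "blue"))   # 0.4
```

Drawing from a list with repeated names is what makes the selection “random” throughout while still letting feedback shift the odds.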

A little math work can reveal that positive feedback on the first trial moves the probability of randomly selecting the correct answer from 0.25 to 0.4. For negative feedback, there is still a strengthening of the probability, but it is much smaller: from 0.25 to about 0.29. These increases decrease over time, of course, as the registers fill up with color names. For positive feedback on the second trial, the probability would strengthen from 0.4 to 0.5. For negative feedback, approximately 0.29 to 0.3.
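As a quick check of these figures, write each probability as the number of copies of the correct name over the size of the register. A “Yes” on a correct answer adds 1 to both the numerator and the denominator:

\[\mathtt{\frac{1}{4} = 0.25 \;\rightarrow\; \frac{2}{5} = 0.4 \;\rightarrow\; \frac{3}{6} = 0.5}\]

A “No” on an incorrect answer adds 1 to the numerator and 3 to the denominator:

\[\mathtt{\frac{1}{4} = 0.25 \;\rightarrow\; \frac{2}{7} \approx 0.29 \;\rightarrow\; \frac{3}{10} = 0.3}\]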

Thus, in some sense, you can do no harm here so long as your feedback matches the truth—i.e., you say no when the answer is incorrect and yes when it is correct. The probability of a correct answer from the program always gets stronger over time with appropriate feedback. Can you imagine an analogous conclusion being offered from education research? “Always provide feedback” seems to be the inescapable conclusion here.

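But a limit analysis provides a different perspective. Let \(\mathtt{C(t)}\) be the probability of a correct answer after \(\mathtt{t}\) rounds of feedback given only on correct answers, and let \(\mathtt{I(t)}\) be the same probability after \(\mathtt{t}\) rounds of feedback given only on incorrect answers. Letting \(\mathtt{t}\) grow without bound, we get these results: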

\[\mathtt{\lim_{t\to\infty} C(t) = \lim_{t\to\infty}\frac{t + 1}{t + 4} = 1, \qquad \lim_{t\to\infty} I(t) = \lim_{t\to\infty}\frac{t + 1}{3t + 4} = \frac{1}{3}}\]

These results indicate that, over time, providing appropriate feedback only when the program makes a correct color identification strengthens the probability of correct answers from 0.25 to 1 (a perfect score), whereas the best that can be hoped for when providing feedback only when the program gives an incorrect answer is just a 1-in-3 shot at getting the correct answer. When both negative and positive feedback are given, I believe a similar analysis shows a limit of 0.5, assuming an equal number of both types of feedback.
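One way to arrive at the 0.5 figure, sketched under the stated assumption of equal numbers of each type of feedback: after \(\mathtt{k}\) “Yes” responses and \(\mathtt{k}\) “No” responses, the register holds \(\mathtt{1 + k + k}\) copies of the correct name out of \(\mathtt{4 + k + 3k}\) items. Calling that probability \(\mathtt{M(k)}\) (a name introduced only for this sketch):

\[\mathtt{M(k) = \frac{1 + 2k}{4 + 4k}, \qquad \lim_{k\to\infty} M(k) = \frac{2}{4} = \frac{1}{2}}\]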

Of course, the real-world trials bear out this conclusion. The data graphed above are from my 123 trials giving both correct and incorrect feedback. Below are data from just 67 trials giving feedback only on correct answers. The program hits the benchmark of 3 perfect scores in a row at Trial 53, and, just for kicks, does it again 3 more times shortly thereafter.
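For anyone who wants to compare the strategies without all the clicking, here is a rough Python simulation. It is a sketch that follows the register rules described above rather than the page’s actual code; the function name `run_trials` and the strategy labels are hypothetical, and because the draws are random, the scores will not match the data reported here.

```python
import random

COLORS = ["blue", "red", "green", "yellow"]
STRATEGIES = ("both", "correct_only", "incorrect_only")


def run_trials(strategy, rounds, seed=0):
    """Simulate `rounds` tests of all 4 colors under one feedback strategy.

    Returns the per-round scores (0 to 4) and the final registers.
    """
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    rng = random.Random(seed)  # seeded so a run can be repeated
    registers = {color: list(COLORS) for color in COLORS}
    scores = []
    for _ in range(rounds):
        score = 0
        for asked in COLORS:
            shown = rng.choice(registers[asked])
            if shown == asked:
                score += 1
                if strategy != "incorrect_only":
                    # "Yes" feedback: one more copy of the correct name.
                    registers[asked].append(asked)
            elif strategy != "correct_only":
                # "No" feedback: add the 3 names that were not shown.
                registers[asked].extend(c for c in COLORS if c != shown)
        scores.append(score)
    return scores, registers


if __name__ == "__main__":
    for strategy in STRATEGIES:
        scores, _ = run_trials(strategy, rounds=120)
        half = len(scores) // 2
        print(strategy,
              "first-half mean:", sum(scores[:half]) / half,
              "second-half mean:", sum(scores[half:]) / (len(scores) - half))
```

Changing `seed` gives a different run; comparing first-half and second-half means mirrors the comparison made for the plotted data.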


Of course, the toy problem here is not a student, and what is modeled as the program’s “cognitive architecture” is nowhere near as complex as a student’s, even with regard to the same basic task of identifying 4 colors. There are obviously a lot of differences.

Yet there are a few parallels as well. For example, behaviorally, we see progress followed by regress with both the program and, in general, with students. Perhaps our minds work in a probabilistic way similar to that of the program. Could it be helpful to think about improvements to learning as strengthening response probabilities? Relatedly, “practice” observably strengthens what we would call “knowledge” in the program just as it does, again in general, for students.

And, I think fascinatingly, we can create and reverse “misconceptions” in both students and this toy problem. We can see how this operates on just one color in the program by first training it to falsely identify blue as ‘green’ (to a level we benchmarked earlier as mastery: 3 perfect responses in a row). Then, we can switch and begin teaching it the correct correspondence. As we can now predict, reversing the misconception will take longer than instantiating it, even with the optimal strategy, because the program’s register will have a large amount of information in it, so we will be fighting against that large denominator.
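A rough count shows why. It assumes (this is not spelled out above) that a “Yes” after the program shows green for a blue request adds a ‘green’ to the “blue” register, and it counts only “Yes” feedback on correct answers during the reversal. The symbols \(\mathtt{m}\) and \(\mathtt{k}\) are introduced just for this sketch. After \(\mathtt{m}\) confirmations of the false answer:

\[\mathtt{P(green) = \frac{m + 1}{m + 4}, \qquad P(blue) = \frac{1}{m + 4}}\]

After \(\mathtt{k}\) subsequent confirmations of blue:

\[\mathtt{P(blue) = \frac{k + 1}{m + k + 4} > \frac{1}{2} \iff k > m + 2}\]

So just getting blue back above even odds takes more confirmations than were used to instill the misconception.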

‘No Logic in the Knowledge’

Marilyn Burns left this really nice comment over at Dan’s a while ago. It’s one of the few times in my recent memory where I’ve seen an attempt to produce an understandable and functional rationale for a “no-telling” approach that doesn’t mention agency or character. Instead, it references the amount of ‘logic in the knowledge’ being taught. And it’s balanced and sensitive to context to boot. Here’s part of it:

Explicit instruction (teaching by telling?) is appropriate, even necessary, when the knowledge is based in a social convention. Then I feel that I need to “cover” the curriculum. We celebrate Thanksgiving on a Thursday, and that knowledge isn’t something a person would have access to through reasoning without external input―from another person or a media source. There’s no logic in the knowledge. But when we want students to develop understanding of mathematical relationships, then I feel I need to “uncover” the curriculum.

In a nutshell, some concepts are connected in such a way as to make it possible for students to derive one (or many) given another. There is ‘logic in the knowledge,’ and so explicit instruction is not strictly necessary to make connections between nodes in those situations; students can possibly do that themselves.

A good example of a convention that math teachers might think of is the order of operations. One might argue that there’s no logic there; you just have to know the agreed-upon order, and so we have to teach it directly. By contrast, manipulating numerators and denominators when adding fractions has a logic behind it—if you’re adding win-loss records represented as fractions to determine total wins to total losses, then adding across is just fine. But it’s usually not fine, because the denominator often represents a whole rather than another part. Students do not necessarily have to be told this logic to get it. They can be led to discover that one third of a pizza plus two thirds of the pizza can’t possibly mean three sixths, or one half, of the pizza. Then they can build models to pin down exactly what fraction addition does represent, along with the connections to the symbolic representations of those meanings.
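In symbols, the pizza check from the paragraph above looks like this. Treating the denominator as the whole:

\[\mathtt{\frac{1}{3} + \frac{2}{3} = \frac{1 + 2}{3} = \frac{3}{3} = 1}\]

Adding across instead gives a result that contradicts it:

\[\mathtt{\frac{1 + 2}{3 + 3} = \frac{3}{6} = \frac{1}{2}}\]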

Okay, So That’s a Good Start Anyway

Yet, while it seems reasonable to me to suggest that one must teach explicitly when there is ‘no logic in the knowledge,’ it’s too strong, I think, to suggest that when the logic is there one must not teach that way. (And I note that Ms Burns does not go this far in her comment.) Explanations do not make it impossible or even difficult for students to traverse those conceptual nodes in ‘logic-filled’ knowledge, unless they are bad explanations or they are foisted on students who have a lot of background knowledge or both.

Regardless, it seems like a pretty good start to say that when deciding where on the “telling” spectrum we can best situate ourselves, we can think about, among other things, how well connected a concept is to other concepts (for the students) and the conceptual distance between two or more nodes (for the students). There are a lot of good testable hypotheses we might generate by staying in the content weeds while boxing out distracting woo.

Ultimately, what I think is pleasing about Burns’s comment is that it is the beginning of a good explanation. It provides a functional rationale—tied directly to content, but not ignoring students—for choosing one or another general teaching method. And it can be connected to other things we know about, such as the expertise-reversal effect and the generation effect. These are the kinds of explanations we should be producing and looking for in education in my humble opinion. It is not necessary to be a researcher or academic to generate or appreciate them.

Image credit: Mauro Entrialgo.

Help Me Explain These Results

Two questions from a survey I had in my research stash from a while ago (it was real; it just wasn’t public): (1) What is your greatest challenge as a math educator? and (2) What do your students struggle with the most? The responses for motivation and prior knowledge are interesting. Here is a visual for only 4 of the responses to each question:

I was curious to see how folks in the K12 Math Ed Community on Google+ would respond to these questions (the wordings of the choices below are paraphrases of the wordings in the larger survey):


The results show essentially the same shape for both groups: in both (on a superficial reading), educators’ main challenges don’t seem to be a fit for students’ main struggles. And, importantly, they could have been. That is, if boredom and anxiety were higher, one could conclude that these problems were generating the motivation challenge for teachers, but in neither group of results do ‘boredom’ and ‘anxiety’ match ‘lack of foundational knowledge’ as a source of student struggle—even when you add them together.

So, what’s the right spin on these results? Do you see the same disconnect I’m seeing?

P.S.: I was reminded just before posting this about the study explained in the video at the right. This quotation from the video seems to be related to the results above:

As the percentage of students in the first grade classroom with math difficulties increased, teachers tended to increase their use of . . . movement or music to teach mathematics or increased use of manipulatives or calculators.

Reading Between the Lines

Frankly, these results make me worry. I worry because I think that while motivation is a potential problem for every student, motivation is a primary problem mostly for rich and otherwise well-resourced students. I’ve hinted at this before. And I’m not alone in this worry. Former teacher and administrator Eric Kalenze writes, in his book Education Is Upside Down:

The divide continues to grow. America’s “have-not” students spend much of their school career being coaxed into engaging with learning (and not necessarily learning what will help them academically or institutionally), while the “have” students, pre-engaged in school tasks by virtue of birth into an alignment with mainstream institutional expectations, receive the academic rigor that will propel them into rewarding post-secondary study and lucrative careers.

A fifteen-year veteran of the classroom and current dean of the School of Education at the University of Michigan has noticed too. Her presentation at the 2015 meeting of the National Council of Supervisors of Mathematics implored educators to be explicit in their instruction for the sake of equity:

Requesting is not the same as teaching; when rich mathematical tasks and situations are used and students are left to puzzle about them on their own, likely will privilege those who have had opportunities with “experimenting with possibilities” and overcoming “broader societal and cultural views of what mathematics is and who is good at it.” [Slide 17]

And this post, “Knowledge Equality”, by Lisa Hansel from E.D. Hirsch’s Core Knowledge Foundation makes a passionate call for an education system that prioritizes knowledge for all:

What I mean by knowledge equality is all children having equal opportunities to learn the academic knowledge that opens doors. The knowledge that really is power. The knowledge that represents the history of human accomplishment. The knowledge that stands the test of time because it is beautiful.

The knowledge that privileged children acquire at home, in libraries and museums, and in school.

While it certainly makes some sense for motivational techniques to be used to address deficits in foundational knowledge, it doesn’t make that much sense. And I see in these results yet another indication of what my own observations and conversations with teachers and other education stakeholders both on and offline point to: that schooling as an institution is being pulled increasingly toward serving the needs of the few and well connected and away from serving the educational needs of the many. It’s a view of education’s priorities that is being sold by business and technology “gatekeepers” and their accomplices rather than demonstrated and proven by careful public scientific work.

For the next school year, and the one after that, administrators and policymakers should summon the courage to refocus the energy and resources of schools toward the “boring” technical work of building all students’ foundational knowledge—and help their people develop an immunity to untested, unrealistic motivational jibber-jabber.

Image mask: Marc Smith