Sunday, May 8, 2011

A bad apple in Duckworth's IQ-Motivation meta-analysis?

A new paper by Duckworth et al. (2011) that links motivation to performance on IQ tests has interested Sailer, Bryan Caplan, and Tyler Cowen. The regular media claimed the study weakens the evidence around IQ testing. It does not. The study supports the predictive power of IQ tests but claims that they are a composite measure of intelligence and motivation. If that is true, and if motivation is more malleable than intelligence, this could help explain the Flynn effect and make it easier to find genes associated with intelligence. But after reviewing the paper's meta-analysis of randomized studies of motivation and IQ, I have my doubts about the robustness of the "average" improvement of 0.64 SD due to incentives.

The paper's meta-analysis of the 46 studies was very good, but the authors neglected to make a "forest plot" that graphs the effect size and confidence interval from each study. So I made one. The size of each effect circle is proportional to the inverse of the effect's standard error, i.e., bigger studies that get more weight have bigger circles.
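
For readers who want to see the mechanics, here is a minimal sketch in Python of how the rows of such a plot can be built. The three studies are made-up numbers for illustration, not values from the Duckworth data set, and the function names `forest_rows` and `ascii_forest` are hypothetical. It assumes 95% confidence intervals computed as effect ± 1.96 × SE and draws a crude text version of the plot instead of a real graph.

```python
# Hypothetical (label, effect size in SD units, standard error) rows.
# These are illustrative values only, NOT the data from the meta-analysis.
studies = [
    ("Study A", 0.20, 0.30),
    ("Study B", 1.50, 0.15),
    ("Study C", 0.60, 0.25),
]


def forest_rows(studies, z=1.96):
    """Return (label, effect, ci_low, ci_high, weight) with weight = 1 / SE."""
    rows = []
    for label, d, se in studies:
        rows.append((label, d, d - z * se, d + z * se, 1.0 / se))
    return rows


def ascii_forest(rows, lo=-1.0, hi=3.0, width=41):
    """Render one text line per study: '-' spans the CI, '|' marks zero."""
    max_w = max(row[4] for row in rows)

    def col(x):
        # Map an effect size onto a column index, clipping to [lo, hi].
        x = min(max(x, lo), hi)
        return round((x - lo) / (hi - lo) * (width - 1))

    lines = []
    for label, d, ci_lo, ci_hi, w in rows:
        cells = [" "] * width
        cells[col(0.0)] = "|"
        for c in range(col(ci_lo), col(ci_hi) + 1):
            cells[c] = "-"
        # Two marker sizes stand in for circle area: 'O' for heavily weighted studies.
        cells[col(d)] = "O" if w >= 0.66 * max_w else "o"
        lines.append(f"{label:<10}{''.join(cells)}")
    return lines


for line in ascii_forest(forest_rows(studies)):
    print(line)
```

A proper plotting library would scale the circle continuously with 1/SE; the two-level marker here is only a stand-in for that idea.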

Two things are apparent: the study outcomes are highly variable, i.e., heterogeneous, and only three large experiments (2, 3, and 4 in the graph) showed motivation leading to a large improvement in IQ score, which makes them very influential.

The three experiments were run on special ed kids and written up in one paper by Bruening and Zella (1978) while the first author was at the Oakdale Center for Developmental Disabilities. A few years later Bruening admitted to fabricating data for other experiments on retarded kids at the same center, and the case became a textbook example of scientific fraud. Although there were no allegations of fraud on the 1978 paper, I re-ran the meta-analysis without Bruening's data and found that the estimate (using the random effects model) was now 0.48 SD and was no longer significant (p = .07). This makes me uneasy about accepting Duckworth's results without further replication.
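
A sensitivity check of this kind can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the analysis behind the numbers above: it assumes the DerSimonian-Laird estimator for the random effects model (one common choice; the post does not say which estimator was used), the function name `random_effects` is hypothetical, and the effect sizes are made up for illustration.

```python
import math


def random_effects(effects, ses):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (pooled effect, its standard error, tau^2, two-sided p-value).
    """
    k = len(effects)
    w = [1.0 / se ** 2 for se in ses]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, floored at 0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    z = pooled / se_pooled
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value from the normal
    return pooled, se_pooled, tau2, p


# Hypothetical (label, effect, SE, flagged) rows; NOT the real study data.
studies = [
    ("A", 0.10, 0.20, False),
    ("B", 0.30, 0.25, False),
    ("C", 0.50, 0.30, False),
    ("X1", 2.00, 0.15, True),   # stand-ins for the influential experiments
    ("X2", 1.80, 0.15, True),
]

all_fit = random_effects([s[1] for s in studies], [s[2] for s in studies])
kept = [s for s in studies if not s[3]]
kept_fit = random_effects([s[1] for s in kept], [s[2] for s in kept])

print("all studies:      d = %.2f, p = %.3f" % (all_fit[0], all_fit[3]))
print("flagged excluded: d = %.2f, p = %.3f" % (kept_fit[0], kept_fit[3]))
```

Comparing the two fits shows how much the pooled estimate and its p-value depend on the flagged studies, which is the question the re-analysis above is asking.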

14 comments:

jlovborg said...

Thanks. I've been meaning to look at the individual studies in Duckworth's paper myself, because her results simply seemed implausible. She is something of an anti-IQ warrior, and has published sloppy studies before, too.

It's interesting that when a non-specialist journal publishes a paper on IQ, it's almost always something that tries to question and minimize the significance of IQ.

Statsquatch said...

The paper is OK. It seems plausible that you can incentivize retards to improve a few points on an IQ test. I do not see any big implications, though. If Duckworth is the best anti-IQ warrior they have that side is in more trouble than I thought.

Steve Sailer said...

Thanks.

Griffinfuhrer said...

So, the papers that granted most of the weight to Duckworth's meta-analysis were conducted on the mentally retarded?

I've encountered Duckworth before. She did a paper a while ago presumably demonstrating that motivation trumped IQ as a causal variable in determining educational achievement. Have you seen this paper and have any comments on it also?

jlovborg said...

The paper is OK. It seems plausible that you can incentivize retards to improve a few points on an IQ test.

Yes, but Bruening's experiments produced a 2 SD improvement, which is not plausible, and is much larger than the effect sizes in the other studies.

I've encountered Duckworth before. She did a paper a while ago presumably demonstrating that motivation trumped IQ as a causal variable in determining educational achievement. Have you seen this paper and have any comments on it also?

I've read this one. The obvious problem in it is that the IQ range is restricted in the sample they use, while there was apparently no range restriction in their self-discipline measure. They briefly admit the range restriction problem at the end of the paper, but it is not mentioned in the abstract, even though it has a substantial effect on the study's results. I remember that none of the many news reports of the study mentioned the range restriction issue, and academic papers citing it generally disregard the problem as well.

Sackett et al. discussed the study in this article:

For example, Duckworth and Seligman (2005) were interested in the relative impact of self-discipline and IQ on a variety of indices of academic performance. Because standardized IQ scores were used to select the sample, the IQ measure was range restricted. The study’s abstract states that self-discipline accounted for more than twice as much variance in each of six outcomes than did IQ. That conclusion, however, was based on observed correlations and did not take range restriction into account. It is interesting that Duckworth and Seligman acknowledged the range restriction issue and documented the degree of restriction on the IQ measure in their discussion. They applied a range restriction correction to one of the six outcome measures (GPA) and reported that whereas the corrected IQ–GPA correlation (.49) was larger than the uncorrected value (.32), it remained lower than the self-discipline–GPA correlation (.67). Although this is true, note that their conclusion (self-discipline accounts for more than twice as much variance as IQ) no longer holds after one takes range restriction into account. In addition, we applied range restriction corrections to other outcomes; in the case of predicting procrastination (as measured by the time homework was begun), for example, IQ had a higher correlation (-.28 corrected, -.18 observed) with procrastination than did self-discipline (-.26) after we corrected for range restriction, which is clearly at odds with the authors’ conclusion.
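
The arithmetic behind the "twice as much variance" point can be checked with a short Python sketch. The correlations (.67, .32, .49) are the ones quoted above. The correction function assumes the standard Thorndike Case II formula for direct range restriction, which the quoted passage does not name, and the SD ratio `u = 1.5` is a purely hypothetical value chosen for illustration; the function names are hypothetical too.

```python
import math


def correct_range_restriction(r, u):
    """Thorndike Case II correction (assumed formula).

    r: correlation observed in the restricted sample
    u: SD in the unrestricted population / SD in the restricted sample
    """
    return r * u / math.sqrt(1.0 - r ** 2 + (r ** 2) * (u ** 2))


def variance_ratio(r_a, r_b):
    """How many times more variance r_a explains than r_b (ratio of r squared)."""
    return r_a ** 2 / r_b ** 2


# Correlations with GPA as quoted from Sackett et al.
r_self_discipline = 0.67
r_iq_observed = 0.32
r_iq_corrected = 0.49

# With the observed IQ correlation the ratio exceeds 2;
# with the corrected IQ correlation it falls below 2.
print(variance_ratio(r_self_discipline, r_iq_observed))
print(variance_ratio(r_self_discipline, r_iq_corrected))

# Hypothetical SD ratio, only to show the direction of the correction.
u = 1.5
print(correct_range_restriction(r_iq_observed, u))
```

With no restriction (`u = 1`) the formula returns the observed correlation unchanged, and any `u` above 1 pushes the corrected value upward.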

Statsquatch said...

Griffinfuhrer,

The Abstract of the Bruening paper says "children from special education classes." I interpreted that as retarded.

jlovborg,

Thanks for the references. I may check them out. In the authors' defense, they had to include the Bruening papers, but they could have mentioned the issue.

Chris said...

I was just about to cite this article in a paper I am writing, but now I am not so sure... Will you be writing this up as an official, peer-reviewed comment piece on the article? There is obviously a question over the lack of independence of the data in the Bruening studies, if not the accuracy of the data.

Statsquatch said...

Chris,

This does not seem to be worthy of a paper, maybe a letter to the editor, but since I have no training in psychology I should probably leave this to others. I can make my R code available, though.

TGGP said...

You misspelled Cowen.

Anon said...

Statsquatch:

Your post raises serious questions about the validity and conclusions of Duckworth et al.’s (2011) meta-analysis. The data for that analysis are here (in Excel format):

http://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1018601108/-/DCSupplemental/sd01.xls

To recap: The results of Duckworth's analysis appear to be attributable to a single article by Bruening and Zella (1978). When that article is omitted from the analysis, the effect of incentives disappears.

I hope the editor of PNAS reads your blog. My concern is that the results of Duckworth’s analysis may become another “Pygmalion effect”, which presents a hopeful message that is not supported by evidence (or must be seriously qualified). (Pygmalion effects are based on the belief that students will show improvements on ability tests if teachers expect such improvements.)

A meta-analysis by Snow (1995, Pygmalion and Intelligence? Current Directions in Psychological Science, Vol. 4, p. 169) raised serious questions about the validity of Pygmalion effects. Similar to your analysis of incentive effects, Snow noted that Pygmalion effects disappeared when extreme scores were omitted:

“The expectancy effect disappears when extreme scores are omitted. The heightened experimental regression line (and thus the experimental mean) in the total group appears to result solely from five children whose respective pretest-posttest scores were 17-110, 18-122, 133-202, 111-208, and 113-211. If the small average difference in verbal subscores added anything to the total score difference, it was because of one experimental child whose pretest-posttest verbal scores were 133-202 (the same child whose reasoning scores were also 133-202).” [Full article: http://www.jstor.org/stable/20182363]

Thanks for your thoughtful analysis.

Statsquatch said...

Anon,

Sorry your comment got caught in the spam filter. I may check out the Pygmalion results. I am more sympathetic to researchers with outliers in a well-constructed study than to a bad meta-analysis. At least they did a little work.
