Foreign-press reading for public administration | Can data lie, too?

Published 2021-10-18 14:27





Data don’t lie, but they can lead scientists to opposite conclusions

数据不会说谎,但却能让科学家们得出相反的结论

Analytical methods can also influence results

分析方法也会影响结论

One of the biggest concerns in science is bias—that scientists themselves, consciously or unconsciously, may put their thumbs on the scales and influence the outcomes of experiments. 

科学界最令人担忧的问题之一是偏见,即科学家自己可能会有意或无意地在天平上动手脚,从而影响实验结果。

Boffins have come up with all sorts of tactics to try to eliminate it, from having their colleagues repeat their work to the “double blinding” common in clinical trials, when even the experimenters do not know which patients are receiving an experimental drug and which are getting a sugar-pill placebo.

研究人员想出了各种各样的方法试图消除此类问题,从让他们的同事重复他们的工作,到临床试验中常见的“双盲实验”(即实验人员也不知道哪些患者服用了实验药物,哪些患者服用了糖丸安慰剂)。

But gathering the data and running an experiment is not the only part of the process that can go awry. The methods chosen to analyse the data can also influence results. The point was dramatically demonstrated by two recent papers published in a journal called Surgery. 

但收集数据和开展实验并非科学研究过程中唯一可能出错的部分。分析数据所选择的方法也会影响结果。最近发表在《外科》杂志上的两篇论文戏剧性地证明了这一点。

Despite being based on the same dataset, they drew opposite conclusions about whether using a particular piece of kit during appendix-removal surgery reduced or increased the chances of infection.

这两篇论文虽然基于相同的数据集,却就阑尾切除手术中使用某种器械究竟会降低还是提高感染几率,得出了截然相反的结论。

A new paper, from a large team of researchers headed by Martin Schweinsberg, a psychologist at the European School of Management and Technology, in Berlin, helps shed some light on why. 

柏林欧洲管理与技术学院的心理学家马丁·施温斯伯格领导的一个大型研究小组发表的一篇新论文有助于解释其中的原因。

Dr Schweinsberg gathered 49 different researchers by advertising his project on social media. Each was handed a copy of a dataset consisting of 3.9m words of text from nearly 8,000 comments made on Edge.org, an online forum for chatty intellectuals.

施温斯伯格通过在社交媒体上宣传他的项目,召集了49名研究人员。他们每人都收到了一份数据集副本,其中包含约390万个单词的文本,这些文本来自 Edge.org(一个供健谈的知识分子交流的在线论坛)上的近8000条评论。

Dr Schweinsberg asked his guinea pigs to explore two seemingly straightforward hypotheses. The first was that a woman’s tendency to participate would rise as the number of other women in a conversation increased. 

施温斯伯格让这些研究人员去检验两个看似简单的假设。其一,对话中其他女性的人数越多,女性参与对话的倾向就越高。

The second was that high-status participants would talk more than their low-status counterparts. Crucially, the researchers were asked to describe their analysis in detail by posting their methods and workflows to a website called DataExplained. That allowed Dr Schweinsberg to see exactly what they were up to.

其二,地位高的参与者会比地位低的参与者发言更多。关键在于,研究人员需要将他们的研究方法和工作流程发布在 DataExplained 网站上,从而详细描述他们的分析过程。这使得施温斯伯格能够清楚地了解他们是如何进行研究的。

In the end, 37 analyses were deemed sufficiently detailed to include. As it turned out, no two analysts employed exactly the same methods, and none got the same results. 

最终,有37位研究人员详尽记录了研究过程,从而被纳入统计。结果证明,没有任何两位研究人员采用完全相同的方法,也没有任何两个人得到同样的结果。

Some 29% of analysts reported that high-status participants were more likely to contribute. But 21% reported the opposite. (The remainder found no significant difference.) 

约29%的研究人员表示,地位高的参与者可能会更积极地参与谈话。但有21%的研究人员得出了相反的结论。(其余的研究人员没有发现显著差异。)

Things were less finely balanced with the first hypothesis, with 64% reporting that women do indeed participate more, if plenty of other women are present. But 21% concluded that the opposite was true.

第一个假设的结果则没有那么势均力敌:64%的研究人员报告称,如果有许多其他女性在场,女性确实会参与得更多;但有21%的人得出了恰恰相反的结论。

The problem was not that any of the analyses were “wrong” in any objective sense. The differences arose because researchers chose different definitions of what they were studying, and applied different techniques. 

问题并不在于其中哪一项分析在客观意义上是“错误的”。之所以会出现此类差异,是因为研究人员对他们所研究的内容选择了不同的定义,并应用了不同的技术。

When it came to defining how much women spoke, for instance, some analysts plumped for the number of words in each woman’s comment. Others chose the number of characters. Still others defined it by the number of conversations that a woman participated in, irrespective of how much she actually said.

例如,在定义女性发言的数量时,一些研究人员倾向于统计单词数,其他人采用了字符数的指标,还有一些人则根据女性参与对话的次数进行定义,而不管她实际说了多少。
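
To make the point concrete, here is a minimal Python sketch of how the three definitions mentioned above (word count, character count, number of conversations joined) can rank the same speakers differently. The comments, speaker names and function names are hypothetical toy values invented for illustration; they are not taken from the Edge.org dataset or from any analyst's actual workflow.

```python
from collections import defaultdict

# Hypothetical toy data: (speaker, conversation id, comment text).
# Invented for illustration only; not from the Edge.org dataset.
comments = [
    ("Ann", "c1", "Interesting point, but the evidence is thin."),
    ("Ann", "c2", "Agreed."),
    ("Ann", "c3", "Yes."),
    ("Bea", "c1", "I think the methodological assumptions underlying this "
                  "argument deserve considerably more scrutiny than they "
                  "have received so far."),
]


def participation(comments):
    """Measure 'how much each person spoke' under three definitions."""
    words = defaultdict(int)   # total whitespace-separated words
    chars = defaultdict(int)   # total characters
    convs = defaultdict(set)   # distinct conversations joined
    for speaker, conv, text in comments:
        words[speaker] += len(text.split())
        chars[speaker] += len(text)
        convs[speaker].add(conv)
    return {
        "words": dict(words),
        "characters": dict(chars),
        "conversations": {s: len(c) for s, c in convs.items()},
    }


def most_active(metric):
    """Return the speaker with the highest value under one definition."""
    return max(metric, key=metric.get)


measures = participation(comments)
for name, metric in measures.items():
    print(name, metric, "-> most active:", most_active(metric))
```

In this toy example Bea is the most active speaker by words and by characters, while Ann is the most active by number of conversations, so the answer to "who participates more" depends on the definition chosen before any statistics are run.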

Key vocabulary:

placebo [pləˈsiːboʊ] n. 安慰剂;无效对照剂

counterpart [ˈkaʊntərpɑːrt] n. 对应的人(或事物);副本

irrespective [ˌɪrɪˈspektɪv] adv. 不考虑,不顾

