Incentive structures for scientists are interfering with scientific output
Goodhart's Law: 'When a measure becomes a target, it ceases to be a good
measure'. Universities optimise for grades instead of knowledge. Politicians
seek popularity, not the public good. Tomatoes are bred into heavy, flavourless
sacks of water. Soviet nail factories, when instructed to produce a certain
number of nails per month, produced tiny, useless nails. Science is no
different.
Science began
small and informal but is now performed on an industrial scale. Industrialisation
requires quantification and targets, so during the 20th century, funding bodies
began to assess scientists based on metrics such as their citation index – the
number of times their work has been cited by others [1]. Today, scientific
achievement is synonymous with publication in the most cited journals – an
aspiring scientist must ‘publish or perish’.
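Note [1] points to the Hirsch index (h-index), one such metric: the largest
number h such that a scientist has h papers with at least h citations each. As
a rough illustration of how a metric like this can reward many small papers
over one important one, here is a minimal sketch in Python. The function name
and the citation counts are invented for the example and do not come from any
real dataset.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts, made up purely for illustration.
one_big_paper = [250, 3, 2, 1, 0]       # one landmark paper, four minor ones
many_small_papers = [6, 6, 6, 6, 6, 6]  # six modestly cited papers

print(h_index(one_big_paper))      # 2
print(h_index(many_small_papers))  # 6
```

Under these assumed numbers, the scientist with six modest papers scores three
times higher than the one with a single landmark paper, despite having far
fewer citations in total.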
Imagine you are trying to maximise your publication rate and citations without
any regard for the usefulness of your work to other scientists or the public.
How do you go about it? What are the scientific equivalents of tiny nails?
Fraud is the obvious answer, and fraud is indeed on the rise [2]. A close
second to straight-up fraud is selective publishing: perform many studies and
report only the positive results. Here, statistics provide a host of ways to
'massage the data', particularly in fields such as biology and the social
sciences, where one's peers often lack mathematical expertise.
The majority of scientists, however, are too scrupulous or cautious for the
above. How else can you optimise your publication record to maximise
citations? One way is to produce 'minimum publishable units' (MPUs) – the
smallest amount of information acceptable to a journal. You can exaggerate the
importance and/or novelty of each MPU by exploiting tenuous links to human
disease, neglecting relevant prior research, and resorting to good
old-fashioned hyperbole. You can also exaggerate the certainty of your
conclusions by not performing replicate experiments, or experiments that might
disprove your hypothesis. Finally, you can increase your citations by citing
yourself and your friends whenever possible.
Now, as people go, scientists are a fairly principled bunch and most try to
avoid these practices as much as they can. However, Goodhart's Law is deeply
embedded in the system [3]. PhD students must publish, and quickly, to be
competitive. Senior scientists are in a constant battle for funding and job
security. Even those with tenure employ scientists on short-term contracts, who
need publications, and quickly, for their next grant. All of these factors
drive the desire to publish ‘tiny nails’. When biological research is proving
so profoundly unreliable for the private sector [4], something is very wrong.
It has been suggested that the scientific literature is 'self-correcting' –
that fraud or lax experimentation gets discovered eventually. But science is
becoming more and more expensive, and replication is becoming increasingly
difficult to perform. Correction may or may not take place, but in the
meantime the public's faith in science is eroded, and with it the benefits
that science brings. We can do better. The engine of knowledge could run more
smoothly. So what ought to be done about it?
Biologically
Determined has a number of suggestions. We will resist the temptation to
exaggerate their novelty.
Firstly, shallow measures such as publication counts and citation indices need
to be dropped, or at least relied on less. This will require a shift in the
culture of science, one that is to a large extent already underway. Journals
like PLoS One are providing a valuable service by publishing articles
regardless of their novelty. Systems should be organised to publish negative
results and replications, which should be taken into account when evaluating
researcher performance.
The emphasis on short-term contracts and career pressure in science should
also be removed. Private sector firms like Google or Valve, which require
creative intellectual output, have listened to the scientific research on the
matter: a stressed brain produces tiny nails [5]. Science should do the same by
providing more
long-term contracts, and reforming the current system [6] in which huge numbers
of PhD students are trained, providing cheap labour before leaving the field,
disappointed and embittered. This would have the side benefit of efficiency -
repeatedly training fresh PhD students in the same techniques is inefficient,
and discourages streamlining and automation of protocols. It would also allow
scientists to perform innovative, and therefore risky, experiments that the
current system discourages.
Finally, and
probably most importantly, studies need to be reproduced in an organised
manner. Several organisations exist or have been proposed [7] to accomplish
just this. Ultimately, it will be necessary for us to accept that the stamp of
'peer reviewed' cannot, and does not, amount to a vote of absolute certainty.
Scientists make sophisticated judgment calls in evaluating peer-reviewed
evidence, and it is these judgments that need to be communicated to the public.
Organised reproduction of papers would provide a further level of confidence.
Different fields tend to have their own standards of certainty. Physics, for
instance, has much stricter statistical standards than the soft sciences, while
fields like psychology are plagued by retractions and low-confidence results
[8]. Reproducibility will always be easier to achieve in the ‘harder’ sciences,
but this should mean that the soft sciences are held to higher, not lower,
standards. Fields such as dietary science and psychology, which offer directly
actionable advice to the public, should be less willing to offer shaky
conclusions. The media’s reporting of such studies will always be imperfect,
and scientists must act with this in mind.
Much of this is
accomplishable from within science. But it is dangerous to assume that science
can be entirely self-regulating. Scientists, especially those at the top, are quite
conservative about its structure. For instance, in recent years governments
have begun to demand publication in open access journals. Individual
researchers, who would otherwise grumble about the system but still publish
behind ridiculous paywalls, can no longer do so without risking their funding.
This change was made possible only by outside regulation.
Science is in
many ways the last remaining sacred cow of our age. And in many ways this is
justified – scientists are not, after all, in it for the cars or money, nor are
they afforded particularly high social status. But scientists are still human
beings and as susceptible to incentive structures as the rest of us. The public
bodies that fund scientific research must ensure that the scientific market is
properly regulated. It is, after all, their money.
1 – Hirsch index, the most commonly used measure of scientific output
2 – Fraud on the rise
3 – Poor practice in science
4 – Lack of reproducibility of biomedical results
5 – Creativity in the private sector
6 – The Economist
7 – Organizations publishing replications and negative results
8 – Retractions in psychology