How Do We Gather Scientific Knowledge

October 29, 2011

In an enlightening talk, Victoria Stodden takes us on a walk through
the scientific method and how we gather scientific knowledge.


The full talk is available at:


Stodden introduces the principles of scientific research
by going back to 
Roger Bacon in 1267 and his concepts of: 

  • Verification of conclusions by Direct Experiment

  • Importance of Independent Verification

  • Recording experiments with enough detail that others could reproduce the work


Continuing with Francis Bacon in 1620:

  • Introducing the idea of inductive reasoning:
    going from experimental observations to generalizations.

  • The influence of this philosophy on the Royal Society of London,
    around 1660, where the first scientific journal,
    “Philosophical Transactions”, was created in 1665.

Stodden then brings us back to the present to make the points that:

“Scientific Computation is becoming central to the Scientific Method 

  • Changing how research is conducted in many fields

  • Changing the nature of how we learn about our world”


and shares her conjecture that:


“Today’s academic scientist probably has more in common 
with a large corporation’s information technology manager
than with a Philosophy or English professor at the same university.”


She then points out that the pervasive use of computation in
scientific research is unfortunately not being accompanied
by an equal effort to make available the source code and
materials that were used during the research process.


In particular, there is a lag in making scientific data publicly
available under the terms of Open Data.


She then brings our attention to a significant contemporary problem:


“Relaxed practices regarding the communication of 
computational details is creating a credibility crisis 
in computational science, not only among scientists 
but as a basis for policy decisions and in the public mind.”


Questions are also raised about whether modern peer-reviewed
journals really provide an effective platform for scientific
discussion.

As an example, she presents the case of the cancellation of
clinical trials at Duke, where the deficiencies in the 
computational practices of the original papers were not 
detected during peer review, due to the superficial way in
which peer review is currently conducted.

The emergence of Computational Research as a third approach
to the scientific process (besides inductive and deductive reasoning)
is challenged by the lack of open sharing of data and source code.
As a result, most published computational results are
simply impossible to replicate.


Stodden surveyed the reasons behind the lack of willingness to
share data and code on the part of a community of authors and 
found them to include:


  • Time required to clean and document

  • Time required to deal with questions from users

  • Concern about not receiving attribution

  • Possibility of pursuing patents

  • Legal barriers (e.g. Copyrights)

  • Potential loss of future publications

  • Competitors may gain an advantage

  • Web / Disk space

  • and…The Pursuit of Tenure…


while the top reasons to share were:


  • Encourage Scientific Advancement 

  • Encourage sharing with others

  • Be a good community member

  • Set a standard for the field

  • Improve the caliber of research

  • Get others to work on the problem

  • Increase in publicity

  • Opportunity for feedback

  • Finding collaborators



She closes with a discussion on:


  • How do we deal with large bodies of source code?

  • How do we deal with massive data?

  • When we share software, who will maintain it?

  • The need for tools on data provenance.

  • How do we train users on the proper use of shared code?

  • The fragility of software


A very interesting talk for anyone involved
in the practice of scientific research:
