

Critical Thinking in Critical Care Medicine
Study Design... Why would that matter?!
Understanding validity and risk of bias in published journal articles.

When asking questions in medicine, each type of question has a different study design that provides the best type of information.
For evaluating therapeutic interventions, a well-done Randomized Controlled Trial (RCT) will provide the best data. If randomization is performed appropriately, the groups of patients are expected to have the same prognosis, differing only in the intervention that one of the groups receives. This balance in prognosis covers both known characteristics and unknown ones that we didn't even consider.
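As a rough illustration of why randomization balances even characteristics nobody measured, the sketch below simulates it in Python. Everything in it is hypothetical: the `hidden_risk` score, the number of patients, and the seeds are illustrative assumptions, not data from any trial.

```python
import random
from statistics import mean

def randomize(patients, seed=0):
    """Shuffle patients and split them into two arms of (near) equal size."""
    rng = random.Random(seed)      # fixed seed so the sketch is reproducible
    shuffled = patients[:]         # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical cohort: each patient carries a prognostic factor nobody measured.
cohort_rng = random.Random(42)
patients = [{"hidden_risk": cohort_rng.random()} for _ in range(10_000)]

treatment, control = randomize(patients)

# With enough patients, the unmeasured factor ends up nearly equal in both arms.
risk_treatment = mean(p["hidden_risk"] for p in treatment)
risk_control = mean(p["hidden_risk"] for p in control)
print(f"treatment arm: {risk_treatment:.3f}, control arm: {risk_control:.3f}")
```

The allocation never looks at `hidden_risk`, yet the two arms end up with similar averages. With small trials the balance is less reliable, which is one reason sample size matters.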
For harm questions, having groups that are equal except for the intervention makes RCTs excellent at showing harm, but only if the harm is frequent enough. The big caveat to using RCTs for harm is that it is unethical to randomize patients to harm, so harm can only be found as an unwelcome side effect of a potentially beneficial intervention.
For infrequent harms, Cohort Studies can provide good data. For very rare events, Case-Control Studies are often the only practical way of answering those questions. In general, however, the lower a study type sits on the pyramid of evidence, the less we can trust its data, and the conclusions could change completely once a better study type is available.
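To see why a trial can miss an infrequent harm, one can compute the chance of observing at least one event in a treated arm. The sketch below assumes independent patients who all share the same per-patient risk. The function name and the numbers are illustrative and not taken from any study.

```python
def prob_at_least_one_event(risk_per_patient, n_patients):
    """Chance of seeing one or more events among n independent patients."""
    return 1 - (1 - risk_per_patient) ** n_patients

# Hypothetical harm occurring in 1 of every 10,000 treated patients.
risk = 1 / 10_000

for n in (500, 5_000, 50_000):
    p = prob_at_least_one_event(risk, n)
    print(f"{n:>6} patients in the treated arm: P(at least one event) = {p:.2f}")
```

Under these assumptions, an arm of a few hundred patients will usually not see a single case, so a harm of this frequency can only be detected by designs that cover far more patients.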
For diagnostic questions, a prospective cohort of undiagnosed patients at risk of the disease we are interested in will provide the best data. Diagnostic uncertainty is a must to avoid bias and to establish all the test characteristics in the population at risk: sensitivity, specificity, Likelihood Ratios, and the area under the ROC (receiver operating characteristic) curve. In brief, every patient in whom we consider the diagnosis should have both the new test and the reference standard (formerly known as the gold standard).
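The test characteristics named above follow from a 2x2 table that compares the new test with the reference standard. The sketch below is minimal and assumes both the diseased and the non-diseased groups are non-empty. The counts are hypothetical, and the area under the ROC curve is left out because it needs the test's underlying continuous values.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Test characteristics from a 2x2 table against the reference standard."""
    sensitivity = tp / (tp + fn)   # diseased patients the new test detects
    specificity = tn / (tn + fp)   # non-diseased patients the new test clears
    # Likelihood ratios; guard the divisions when specificity is exactly 1 or 0.
    lr_positive = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_negative = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }

# Hypothetical counts: 100 patients with the disease, 200 without.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

These numbers are only meaningful if every enrolled patient received both tests. If only selected patients get the reference standard, the counts in the table are biased before any arithmetic happens.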
For prognostic questions, the control group of an RCT will provide the best answer (as long as the trial follows the patients long enough and records enough of the important outcomes). Most prognostic questions are, however, answered with Cohort Studies, because of cost and the inability to run RCTs for a long enough period of time.
Pooling the data together for an answer
When we are looking at the big picture, we need to bring together as much evidence as possible. Remember that every RCT or other study could have biases or random error we are not aware of, even after doing our due diligence. Practice should never completely change based on only one study, no matter how exciting or well done. The more data we have, the less likely the results are to change in the future.
The way we look at the big picture is with Systematic Reviews (SRs). Authors try to gather most of the evidence available for a specific question using exhaustive and reproducible search strategies. The limitation of systematic reviews is that their quality depends not only on their own validity but, most importantly, on the quality of the data available in the literature. In well-done SRs, authors will analyze the primary literature and rate its quality accordingly. An example of an excellent source of SRs with meta-analysis is the Cochrane Library.
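As a sketch of what pooling means numerically, the example below combines study estimates with inverse-variance (fixed-effect) weighting, one common meta-analysis approach. The choice of method is an assumption made for illustration, and the effect sizes (log risk ratios) and standard errors are made up.

```python
import math

def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted average of study estimates and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]   # precise studies count more
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1 / total)
    return pooled, pooled_se

# Hypothetical log risk ratios and standard errors from three trials.
log_rr = [-0.35, -0.10, -0.22]
se = [0.20, 0.12, 0.30]

pooled, pooled_se = pool_fixed_effect(log_rr, se)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled RR = {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})")
```

The pooled standard error is smaller than that of any single trial, which is the sense in which more data make the estimate less likely to shift. The weighting does nothing to correct biased inputs: poor primary studies still give a poor pooled result.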
What is the Pyramid of Evidence?
The Pyramid of Evidence is a graphic representation of study design quality and risk of bias. It has evolved recently because previous versions led to heated arguments. It assigns higher quality and potential validity to the study designs at the top and more risk of bias to those at the bottom. In the past there have been disagreements about what should be at the top: SRs or RCTs. The new Pyramid takes into account that SRs are a category of their own. Many people have argued that a large number of SRs are proven wrong by large RCTs. I believe this happens when the original data are poor, which makes the output of the SRs similarly poor.
Of note, the smudged, blended color borders between the levels signal that there may be some overlap: a well-done study of a lower design type could be better than a poorly done study of a higher design type.