Scientific Evidence Portal
Definitions Of Terms Used In The Scientific Studies
Epidemiology is the branch of medical science that applies statistical analysis to the incidence and prevalence of
disease within a population in order to uncover possible sources of epidemics.
- Case Control Study
Case-control studies are divided into two groups: those with a condition/disease (cases) and those without (controls).
Comparisons can then be made between the risks for each group by quantifying exposure to possible causes; this is usually
done by questioning the participants of the study.
It can be difficult to select suitable participants for the control group because of known and unknown variables and the
problem of recall bias. Additionally, case-control studies are not reliable when the 'case' is already deceased, as
questions of exposure must then be put to third parties.
Case-control studies are used when it is only feasible to observe differences in postulated toxic exposures between groups
of people with or without a disease. Case-control studies are necessarily retrospective, and their weakness is that they do
not measure differences in disease incidence: the incidence is 0% in the controls and 100% in the cases, so such studies
infer risks from differentials of exposure, not from true differentials of incidence. In other words, these studies
estimate the subjects' exposure to passive smoke based solely on what the subjects say they remember, then infer the risk
purely from the difference in exposure recalled by the subjects during the interview. Finally, the disease that already
exists, and that could have been caused by any combination of co-factors, is attributed to passive smoke. The overwhelming
majority of the studies on passive smoke are retrospective and case-control, and this is the "mountain of evidence" we keep
hearing about. It is on this methodological quality that passive smoke is said to "kill" or "hurt" others, and this is the
claimed basis of every single smoking ban in the world.
- Cohort Study
A cohort study is one of the better methods of analyzing epidemics within populations because it starts with a group
of healthy people and follows them over many years; it can therefore be far more accurate in determining exposures and
responses, without the effects of recall bias. These studies, however, require large population samples, and because of
the long latency of some diseases they must be conducted over many years, even several decades.
- Cross Sectional Study
A cross sectional study is a 'survey' of affected participants at a given point in time. Although these studies are
very cheap to conduct in comparison with other types, they can be very unreliable, as it is difficult to account for
the effects of recall bias and for changes in people's habits over time.
- Ecological Study
Ecological studies gather data on groups or populations for comparison. These types of study are very cheap and fast,
but because they work with group-level rather than individual data they are prone to inaccuracy due to the plethora
of confounders.
- Meta Analysis
A meta-analysis is the statistical pooling of estimates from individual studies. Meta-analyses have been heavily
criticised by many in the scientific community because they can only hope to achieve any kind of accuracy if each of the
individual studies included was performed according to the same methodology; this very rarely occurs in practice.
- Prospective studies
Prospective studies identify groups of subjects and follow them over time, often many years.
- Retrospective studies
Retrospective studies identify groups of subjects with different incidences of disease and attempt to reconstruct their
past exposures.
- Risk Ratios
A risk ratio compares the probability of an event in each group of the study, expressed as a change in risk.
It is important to remember when analyzing risk ratios that they are relative to another group or 'base' level, and in some
studies they are referred to directly as 'relative risk'. A figure of relative risk alone is therefore meaningless
without knowing the 'base' risk, which can lead to ambiguity (a 40% increased risk tells us nothing without asking
'40% more than what?'). For this reason a 'relative risk' is not really a measure of risk at all, and the term is very
misleading; in other words, a risk ratio can convey how likely the finding is to represent a possible effect, but it cannot
in isolation convey the size of that effect. Even a 100% increase (a risk ratio of 2) is a small figure that could be
explained solely by the quality of the data, the methodology of the study, or various biases and confounders, and it would
therefore be considered a weak statistical association.
Frequency of the disease (incidence) that appears in people exposed
RR = -------------------------------------------------------------------------
Frequency of the disease (incidence) that appears in people not exposed
If the incidence of disease is the same in the exposed and in the non-exposed groups, the ratio is 1 and there is no change
in risk. If the incidence of disease is greater in the people exposed, the ratio becomes greater than 1 and the risk has
increased. Conversely, the ratio is smaller than 1 when the incidence is lower in the exposed people, implying that the
exposure protects against the disease.
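The formula above can be sketched in Python; all counts below are hypothetical, chosen only to illustrate the arithmetic:

```python
# Relative risk from the incidence in the exposed and unexposed groups.
# All counts are hypothetical, for illustration only.

def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """RR = incidence among the exposed / incidence among the unexposed."""
    incidence_exposed = cases_exposed / n_exposed
    incidence_unexposed = cases_unexposed / n_unexposed
    return incidence_exposed / incidence_unexposed

# 30 cases among 1,000 exposed vs 20 cases among 1,000 unexposed:
rr = relative_risk(30, 1000, 20, 1000)
print(round(rr, 2))  # 1.5, i.e. a 50% relative increase
```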
Important Points To Remember When Analyzing Risk Ratios.
- Risk ratios do not establish a cause/effect relationship, only a statistical correlation
- Risk ratios between 0.5 and 2 do not reach a standard level of statistical significance; only a 'weak statistical
association' can be claimed
- Risk ratios are relative; an RR of 1.4 (40%) is only meaningful if absolute figures are quoted in the study; if an
event is very unlikely to begin with, it will still be very unlikely after a 40% increase.
- Risk ratios can only be judged in combination with other data i.e. absolute data, confidence intervals,
study size, adjustments for confounders, study methodology and the spread of data (width).
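The point about relative versus absolute figures can be illustrated with a short Python sketch (the baseline risk below is an invented number, not taken from any study):

```python
# A 40% relative increase (RR = 1.4) applied to a rare baseline risk.
# The baseline is a hypothetical figure for illustration.
base_risk = 0.0001              # hypothetical: 1 in 10,000
rr = 1.4
absolute_risk = base_risk * rr  # still only 1.4 in 10,000
print(round(absolute_risk, 5))  # 0.00014
# The absolute increase is 4 in 100,000 - tiny despite the "40%" headline.
print(round(absolute_risk - base_risk, 6))
```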
- Odds Ratio
An odds ratio compares the probability of an event in each group of the study as the odds of that event occurring. The odds
ratio, like a risk ratio, is also relative and so care needs to be taken when analyzing a result. (see Risk Ratios)
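A hypothetical odds-ratio calculation for a case-control table, sketched in Python (all counts are invented):

```python
# Odds ratio from a hypothetical case-control table:
#               exposed   unexposed
#   cases          40         60
#   controls       25         75

def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """OR = (odds of exposure among cases) / (odds of exposure among controls)."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls

print(round(odds_ratio(40, 60, 25, 75), 2))  # 2.0
```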
- P-Values
In statistics there is no direct way of knowing whether a result has occurred through the coincidence of random sampling;
statisticians quantify this effect by calculating probabilities. A P-value represents the probability of obtaining values
of a statistic as large or larger in magnitude than the observed statistic, assuming the null hypothesis is true; a null
hypothesis typically describes no difference or effect. A P-value ranges from zero to one; the smaller the value, the more
unlikely it is that the result occurred by chance. The epidemiological standard level of significance is 0.05
(corresponding to a 95% confidence interval), and many studies simply give <0.05 to indicate significance. A P-value of
0.03 indicates that, if the null hypothesis were true, results as large or larger than those observed would be expected in
only 3% of experiments. It is important to note that low P-values do not demonstrate causation; they only show that a
statistical association is unlikely to be due to chance.
Important Points To Remember When Analyzing P-Values.
- A study that does not include P-values or confidence intervals has not adequately shown the probability that their
results occurred by chance
- P-values are only tests of probability of the result being as large or larger than the given risk; they do not give us
the risk in the first place
- P values alone can be misleading
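As an illustration of the idea behind a P-value, here is a minimal exact binomial test in Python using only the standard library; the coin-toss scenario and its numbers are hypothetical:

```python
from math import comb

def binomial_p_value(k, n, p0=0.5):
    """One-sided exact P-value: the probability of seeing k or more
    successes in n trials if the null hypothesis (probability p0) is true."""
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

# Hypothetical example: 16 heads in 20 tosses of a supposedly fair coin.
p = binomial_p_value(16, 20)
print(round(p, 4))  # 0.0059 -> below the 0.05 standard, so "statistically significant"
```

Note that even here the low P-value says nothing about why the coin came up heads so often; it only says the result would be unusual under the null hypothesis.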
- Confidence Intervals
Confidence intervals indicate the lack of precision in the estimate of risk; the scientific standard for measuring this
'uncertainty' is 95%, that is to say, there is a 1 in 20 chance that the real risk lies outside our range of data.
e.g. 95% CI (1.11 to 1.88) tells us that 19 times out of 20 the result would lie somewhere between 1.11 and 1.88.
Important Points To Remember When Analyzing Confidence Intervals.
- The range of the data (width) indicates that the true result could lie anywhere within that range to the specified
level of confidence
- Values of less than 1 suggest negative (protective) effects; a value of 1 indicates no effect (no risk)
- Widths that span 1 (e.g. 0.88 to 1.44) are not statistically significant, because the true value could equally be a
decreased risk, no risk or an increased risk.
- Widths that are close to 1 can only demonstrate a weak statistical association
- As with the other results, confidence intervals can only be effectively judged alongside the study methodology,
the accounting for confounders and the sample size.
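One common way to obtain such an interval for a risk ratio is the log-scale approximation; here is a Python sketch under that assumption, with hypothetical counts:

```python
from math import exp, log, sqrt

def rr_confidence_interval(a, n1, b, n2, z=1.96):
    """Approximate 95% CI for a relative risk, computed on the log scale
    (a/n1 = cases/total among the exposed, b/n2 among the unexposed)."""
    rr = (a / n1) / (b / n2)
    se = sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)  # standard error of log(RR)
    return exp(log(rr) - z * se), exp(log(rr) + z * se)

# Hypothetical counts: 30/1000 exposed cases vs 20/1000 unexposed cases.
low, high = rr_confidence_interval(30, 1000, 20, 1000)
print(round(low, 2), round(high, 2))  # 0.86 2.62
```

Note that although the point estimate here is RR = 1.5, the interval spans 1, which by the criteria above would be read as no demonstrated effect.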
- Confounders, Concomitant Factors
Factors and circumstances that contribute to the occurrence of the disease. As we shall see later, many diseases are
multifactorial, meaning that they can be caused by different factors acting by themselves or in concert with others –
as opposed to mono-factorial diseases that have one cause, and whose risks can be reliably measured.
- Statistical significance
Statistical significance: a numerical criterion indicating only that the data show either benefit or risk, as opposed to
both, as happens for the large majority of the studies on passive smoke. Careful now, don't let them con you: "statistical
significance" does not mean that the data are accurate, nor does it mean that the risk/benefit exists, nor that it is
large. In short, a risk with "statistical significance" (or that is "statistically significant") does not mean a
significant risk, as "public health" and antismoking activists want us to believe so that we keep on hating smokers.