Importance of biological replicates in biological research
Tania, a PhD student, is running an experiment on chickpea plants to measure
the expression level of a specific gene under phosphorus deficiency stress.
She is repeating the experiment for the third time and still cannot reach a
conclusion. Her data show a consistent pattern (expression is higher in the
stressed plants than in the controls), but the difference is not statistically
significant, so she cannot claim that the effect is real.
Most researchers have probably faced a similar problem during an experiment.
It usually happens when the sample size is small because too few biological
replicates (sometimes also called experimental replicates) were used.
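
To see how the number of replicates limits what a test can detect, here is a minimal sketch using an exact permutation test on the difference in group means. The test is chosen only for illustration (it is not necessarily the analysis Tania used), and the expression values and the `permutation_p_value` helper are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(control, stressed):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(control) + list(stressed)
    n_control = len(control)
    observed = abs(mean(stressed) - mean(control))
    as_extreme = 0
    total = 0
    # Try every possible way of relabelling the pooled values as "control".
    for idx in combinations(range(len(pooled)), n_control):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if abs(mean(group_b) - mean(group_a)) >= observed - 1e-12:
            as_extreme += 1
        total += 1
    return as_extreme / total

# Hypothetical relative expression values; stressed is always above control.
control_3 = [1.0, 1.2, 0.9]
stressed_3 = [1.8, 2.1, 1.6]
control_6 = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3]
stressed_6 = [1.8, 2.1, 1.6, 1.9, 1.7, 2.0]

p_3 = permutation_p_value(control_3, stressed_3)
p_6 = permutation_p_value(control_6, stressed_6)
print(f"3 replicates per group: p = {p_3:.4f}")
print(f"6 replicates per group: p = {p_6:.4f}")
```

With three replicates per group there are only 20 possible relabellings, so the smallest two-sided p-value this particular test can return is 2/20 = 0.1, even when the two groups are perfectly separated. With six replicates per group the same clean pattern falls well below 0.05.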
The number of independent sample sets on which you run the same experiment is
the number of replicates. In biological research these are called 'biological
replicates'. The number of replicates matters a great deal for a valid
statistical analysis of the data obtained from an experiment. A sound
statistical analysis indicates how precise and robust your experiment is and
how reproducible your data are. Reproducibility is an important factor in
getting your work published in high-impact journals, so replicates are crucial
for scientific research. The replicate number is popularly called N (the 'n
number'); it is the sample size, that is, the number of individuals used in the
statistical analysis. This number also determines whether you can detect a
statistically significant difference between two or more data sets, or a
significant correlation between two data sets. The following points show why a
high number of biological replicates is so important for researchers in the
biological sciences and how to choose the replicate number.
Replicate numbers ensure the robustness of your experimental data. Suppose you
run your experiment on 10 sample sets and get a specific pattern in 7 sets and
the reverse pattern in the remaining 3. There is a good chance that the pattern
seen in the 7 sets will be significant even when the 3 opposing sets are
included. Those 3 sets may be outliers caused by experimental error or sample
impurity.
A high replicate number reduces the chance of having to repeat the experiment.
If you run the experiment once with a high number of replicates, you are likely
to get robust data and will not need to repeat the whole experiment a second
time.
A high replicate number gives you the flexibility to remove outliers. Go back
to the first point, where 3 of the 10 sample sets showed the reverse pattern.
If those 3 sets can be traced to experimental error or sample impurity, you can
exclude them as outliers and still have 7 data points showing a consistent
pattern, which gives you a better chance of obtaining a significant result.
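
Outlier removal is easier to defend when the rule is fixed before looking at the results. Below is a minimal sketch, assuming Tukey's 1.5 x IQR fences as that rule (a common convention, not something this article prescribes). The expression values and the `flag_outliers` helper are hypothetical.

```python
from statistics import quantiles

def flag_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _median, q3 = quantiles(values, n=4)  # quartiles of the sample
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical relative expression values from 10 biological replicates.
expression = [2.1, 2.3, 1.9, 2.2, 2.0, 2.4, 2.2, 0.4, 2.1, 2.3]

flagged = flag_outliers(expression)
kept = [v for v in expression if v not in flagged]
print("flagged as outliers:", flagged)
print("replicates kept:", len(kept))
```

With 10 replicates, dropping one flagged value still leaves 9 data points for the analysis. With only 3 replicates, there would be no room to drop anything.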
There is no fixed number of biological replicates. For a valid statistical
analysis, however, at least three biological replicates are required. There is
no upper limit on the number of biological replicates.
Always try to take more than three biological replicates for samples collected
after an uncontrolled treatment. For example, if you harvest plant samples
after real insect feeding, or harvest field samples of plants or crops, you
cannot control the conditions: you cannot control how hungry the insects are,
nor the temperature, humidity or soil microbes in the field. In such cases take
more than three biological replicates, preferably 4 to 10 or more, to get
robust data.
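
Why noisier conditions call for more replicates can be sketched with the standard normal-approximation formula for comparing two group means, n = 2 (z_(1-alpha/2) + z_(power))^2 (sd / delta)^2 per group. The formula, the 5% significance level, the 80% power, the example standard deviations and the `replicates_per_group` helper are all assumptions for illustration and do not come from this article. The approximation is optimistic for very small n, so treat the result as a rough guide only.

```python
from math import ceil
from statistics import NormalDist

def replicates_per_group(sd, delta, alpha=0.05, power=0.80):
    """Approximate replicates per group needed to detect a mean difference `delta`.

    sd    : assumed standard deviation within each group
    delta : smallest difference between group means worth detecting
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    n = 2 * (z_alpha + z_power) ** 2 * (sd / delta) ** 2
    return ceil(n)

# Assumed scenario: the same expected difference, but field samples vary twice
# as much between plants as growth-chamber samples do.
controlled = replicates_per_group(sd=0.5, delta=1.0)
field = replicates_per_group(sd=1.0, delta=1.0)
print("controlled conditions:", controlled, "replicates per group")
print("field conditions:", field, "replicates per group")
```

Under these assumed numbers the sketch gives 4 replicates per group for the controlled case and 16 for the field case: doubling the variability roughly quadruples the number of replicates needed.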
Pooling is a good practice when collecting biological replicates. You can pool
samples to reduce the number of samples you have to process. For example, if
you are running an expensive experiment (e.g. RNA-seq or untargeted LC-MS) and
want to reduce the sample number while still minimizing batch effects, you can
pool three or more samples into a single sample and prepare two more pools in
the same way. You then collect 3 samples x 3 pools = 9 samples but only have to
process three experimental replicates. This is cost-effective, and there is
less risk of compromising the robustness of the data.
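
The pooling arithmetic can be sketched as follows. The plant identifiers and the `make_pools` helper are hypothetical, and the pools are formed in listing order for simplicity; in practice you may prefer to assign plants to pools at random.

```python
def make_pools(sample_ids, pool_size):
    """Split sample identifiers into consecutive pools of equal size."""
    if len(sample_ids) % pool_size != 0:
        raise ValueError("number of samples must be a multiple of pool_size")
    return [
        sample_ids[i:i + pool_size]
        for i in range(0, len(sample_ids), pool_size)
    ]

# Nine individually harvested plants (hypothetical identifiers).
plants = [f"plant_{i}" for i in range(1, 10)]

pools = make_pools(plants, pool_size=3)
for number, members in enumerate(pools, start=1):
    print(f"pool {number}: {', '.join(members)}")

print("samples collected:", len(plants))
print("replicates to process:", len(pools))
```

Each pool is processed as one replicate, so nine harvested plants result in three runs of the expensive assay.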
