Writing Better Bioinformatics Methods: Best Practices for Manuscripts

At the end of a long experiment, preparing manuscript text to describe your bioinformatics analyses can be a daunting task. Which details should be included to satisfy reviewers and support fellow researchers? Here, we outline the most important content for a high-quality computational Methods section.

In the era of high-throughput sequencing and machine learning, the quality of a manuscript depends heavily on its computational Methods section. This section is also surprisingly difficult to write well! Software versions, workflow logic, analysis parameters, and other critical details are sometimes missing from publications, making even the most careful analysis impossible to reproduce.

At Genevia Technologies, we routinely provide publication-ready computational Methods descriptions for our clients. This post summarises the best practices we recommend for describing computational workflows, with a focus on genomics and transcriptomics datasets.

Writing Better Bioinformatics Methods: At a Glance

  • Explain the logic of the analysis, not just the tools
  • Report software, versions, and non-default parameters
  • Make QC decisions explicit (metrics, thresholds, exclusions)
  • Document model design, covariates, and significance criteria
  • State where code, data, and custom pipelines are publicly available
  • Use supplements strategically to support reproducibility

Why computational methods matter

Reproducibility is a shared expectation across reputable journals, research institutions, and funding agencies. In bioinformatics, reproducibility depends on careful documentation of software, versions, reference files, and analytical logic.

Since the requirements for reporting bioinformatics methods can differ significantly between journals, it is important to be aware of best practices and follow them while writing. Even if your target journal does not require all of the details you provide, including them will ensure that your manuscript is as reproducible as possible.

Transparent reporting benefits everyone:

  • Reviewers can assess whether the appropriate methods, tools, and assumptions were used
  • Readers can reproduce the analysis or adapt it for a related research question
  • Authors strengthen the credibility of their conclusions and reduce the risk of post-publication confusion or disputes

The goal is not to list every command ever executed, but to provide a clear, structured description of how the reported biological results were generated from the raw data.

Structuring computational methods

A strong bioinformatics Methods section follows the logic and order of the analysis itself. The reader should be able to follow the journey of the data without having to guess which steps were performed, or in what order.

For most genomics and transcriptomics analyses, the following outline provides a strong foundation.

Data acquisition, processing, and quality control

This section should first describe how the raw data were generated and processed prior to downstream analysis. Documenting these steps establishes the starting material and demonstrates that the input data met quality expectations.

To begin, report the sequencing platform, library/read configuration, and approximate sequencing depth.

Also document:

  • Quality control tools and parameters, including adapter trimming and quality filtering thresholds
  • Read-level exclusions, such as host, contaminant, or rRNA removal
  • Post-QC metrics, including read retention rates and summary statistics
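
As a minimal sketch of the last point, read retention rates can be computed directly from per-sample read counts before and after QC. The function name, sample IDs, and read counts below are hypothetical and only illustrate the calculation; in practice the counts would come from your QC tool's reports.

```python
from statistics import mean, median


def summarise_read_retention(raw_counts, retained_counts):
    """Compute per-sample read retention (%) and summary statistics.

    Both arguments map sample IDs to read counts (before and after QC).
    """
    per_sample = {}
    for sample, raw in raw_counts.items():
        if raw <= 0:
            raise ValueError(f"Sample {sample} has no raw reads")
        per_sample[sample] = 100.0 * retained_counts[sample] / raw

    values = list(per_sample.values())
    summary = {
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
    }
    return per_sample, summary


# Hypothetical read counts, for illustration only
raw = {"S1": 1000, "S2": 2000, "S3": 4000}
kept = {"S1": 900, "S2": 1900, "S3": 3000}

per_sample, summary = summarise_read_retention(raw, kept)
for sample, pct in per_sample.items():
    print(f"{sample}: {pct:.1f}% of reads retained")
print(f"Median retention: {summary['median']:.1f}%")
```

The per-sample values fit naturally in a supplementary table, while the summary statistics (range and median) are usually enough for the main Methods text.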

Alignment, quantification, and feature generation

Next, describe how reads were aligned or pseudo-aligned, which reference genome and annotation versions were used (for example, GRCh38.p14 with GENCODE v45), and how features were quantified. Include the relevant tools, their versions, and any non-default parameters.
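
One way to keep these details consistent is to record each step in a small structure and generate the Methods sentence from it. The sketch below is only an illustration of that idea: the `ToolStep` class, the tool name "ExampleAligner", its version, and the `--max-mismatches` parameter are all hypothetical placeholders, not real software or flags.

```python
from dataclasses import dataclass, field


@dataclass
class ToolStep:
    """One analysis step as it should be reported in the Methods text."""
    purpose: str
    tool: str
    version: str
    non_default_params: dict = field(default_factory=dict)

    def to_sentence(self):
        """Render the step as a single Methods-style sentence."""
        sentence = f"{self.purpose} was performed with {self.tool} v{self.version}"
        if self.non_default_params:
            params = ", ".join(
                f"{name}={value}" for name, value in self.non_default_params.items()
            )
            sentence += f" using non-default parameters ({params})"
        else:
            sentence += " using default parameters"
        return sentence + "."


# Hypothetical tool, version, and parameter, for illustration only
alignment = ToolStep(
    purpose="Read alignment to GRCh38.p14 (GENCODE v45 annotation)",
    tool="ExampleAligner",
    version="1.2.3",
    non_default_params={"--max-mismatches": 2},
)
print(alignment.to_sentence())
```

Generated sentences like this are a starting point for the text rather than a replacement for it, but they make it harder to forget a version number or a changed parameter.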

Once you have described the preprocessing steps and how quality was assessed (for example, per-base quality scores or alignment rates above a threshold), state how samples that failed QC were handled. Publications in journals that prioritise computational methods can help you find appropriate quality cutoffs for your dataset.
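
To make a sample-level QC decision explicit, it helps to apply the threshold in code and report the outcome in the same terms. The following is a minimal sketch; the 70% alignment-rate cutoff, the sample IDs, and the rates are hypothetical values chosen purely for illustration, not recommended cutoffs.

```python
def flag_failing_samples(alignment_rates, min_rate):
    """Split samples into passing and failing lists by alignment rate (%)."""
    passed = sorted(s for s, rate in alignment_rates.items() if rate >= min_rate)
    failed = sorted(s for s, rate in alignment_rates.items() if rate < min_rate)
    return passed, failed


# Hypothetical threshold and alignment rates, for illustration only
MIN_ALIGNMENT_RATE = 70.0
rates = {"S1": 92.4, "S2": 65.1, "S3": 88.0}

passed, failed = flag_failing_samples(rates, MIN_ALIGNMENT_RATE)
print(
    f"{len(failed)} of {len(rates)} samples had an alignment rate below "
    f"{MIN_ALIGNMENT_RATE}% and were excluded: {', '.join(failed)}"
)
```

The printed sentence contains the three things a reader needs: the metric, the threshold, and which samples were excluded.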

Normalisation and statistical analysis

With preprocessing out of the way, the next section should clearly explain how counts or other signals were transformed and compared to generate the described biological inferences.

This section should include:

  • Normalisation methods applied (e.g. library-size normalisation, TPM)
  • Tools, transformations, and visualisations applied for exploratory analysis (e.g. PCA)
  • The statistical framework and software packages used for comparative testing
  • Model design formulas, covariates, and contrasts of interest
  • Cutoff values applied for determining statistical or biological significance (e.g. p-values, fold changes)
  • Multiple testing correction methods
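
Naming a normalisation method is most useful when the reader knows exactly what was computed. As an illustration, here is a minimal sketch of TPM for a single sample. It assumes plain feature lengths in base pairs (many quantification tools use effective lengths instead), and the gene names, counts, and lengths are hypothetical.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw counts for one sample to transcripts per million (TPM).

    counts and lengths_bp both map feature IDs to numbers; lengths are in bp.
    """
    # Step 1: length-normalise each count (reads per kilobase)
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("All counts are zero; TPM is undefined")
    # Step 2: scale so that the values sum to one million
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical counts and feature lengths, for illustration only
counts = {"geneA": 10, "geneB": 20, "geneC": 30}
lengths = {"geneA": 1000, "geneB": 4000, "geneC": 3000}

tpm = counts_to_tpm(counts, lengths)
for gene, value in tpm.items():
    print(f"{gene}: {value:.0f} TPM")
```

If your tool computes TPM differently (for example, with effective lengths), say so in the text, since the resulting values are not directly comparable.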

Use phrases like “default settings were used” with caution, and ensure that tool defaults are well-documented and appropriate for the analysis at hand. The input data for exploratory analyses and statistical comparisons often differ (for example, transformed values versus raw counts), so state clearly which input was used for each step.
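
To illustrate how significance criteria and multiple testing correction fit together, the sketch below applies the Benjamini-Hochberg procedure and then filters on an adjusted p-value and a fold-change cutoff. It is a plain-Python illustration rather than the implementation used by any particular package, and the gene names, p-values, fold changes, and cutoffs (adjusted p < 0.05, |log2 fold change| >= 1) are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical test results and cutoffs, for illustration only
results = {
    "geneA": {"p": 0.01, "log2fc": 2.1},
    "geneB": {"p": 0.04, "log2fc": -1.5},
    "geneC": {"p": 0.03, "log2fc": 0.4},
    "geneD": {"p": 0.20, "log2fc": 3.0},
}
PADJ_CUTOFF = 0.05
LOG2FC_CUTOFF = 1.0

genes = list(results)
padj = benjamini_hochberg([results[g]["p"] for g in genes])
significant = [
    g
    for g, q in zip(genes, padj)
    if q < PADJ_CUTOFF and abs(results[g]["log2fc"]) >= LOG2FC_CUTOFF
]
print("Significant genes:", significant)
```

Note that geneB passes a raw p-value cutoff of 0.05 but not the adjusted one, which is why the Methods text should state both the correction method and the cutoff it was applied to.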

Downstream analysis and visualisation

Describe any additional analyses that informed the biological conclusions of the manuscript, such as clustering, pathway enrichment, or gene set analysis. For each one, specify the input data (e.g. counts, gene symbols), software, versions, and relevant parameters.

Figure generation methods are an important but commonly forgotten detail. For figures and plots, note the tools or libraries used and whether any data transformations (e.g. log-scaling or variance stabilisation) were applied prior to visualisation.

Citations and software versions

With the bulk of the Methods text out of the way, the next step is to ensure the text includes accurate citations and software version notes. Some researchers prefer to add these details throughout the writing process, while others fill them in after completing a first draft. Use whichever strategy works best for you.

Citations and versions are generally given at the software’s first appearance in the manuscript. Follow the developer’s citation instructions where available; the software’s documentation often specifies what to cite, particularly when there are multiple associated publications.

Also report exact version numbers (e.g. Seurat 5.4.0 instead of Seurat 5), as defaults and outputs can change meaningfully between minor releases. For tools without an associated publication or citation instructions, cite the software’s website.
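
Exact versions are easiest to report when they are captured at analysis time rather than reconstructed later. For Python-based workflows, a minimal sketch using the standard library's `importlib.metadata` might look like the following. The helper names and the package list are our own examples; tools run outside Python (command-line aligners, R packages) need to be recorded separately.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def collect_versions(package_names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            versions[name] = None
    return versions


def format_version_report(versions):
    """Format the interpreter and package versions as a plain-text report."""
    lines = [f"Python {platform.python_version()}"]
    for name, ver in sorted(versions.items()):
        lines.append(f"{name} {ver if ver is not None else 'not installed'}")
    return "\n".join(lines)


# The package names below are examples; list the ones your analysis imports
print(format_version_report(collect_versions(["numpy", "pandas", "scanpy"])))
```

Saving this report alongside the analysis outputs gives you the exact version numbers when it is time to write the manuscript.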

Data accessibility and public repositories

Most journals now require deposition of raw sequencing data in public repositories (such as SRA, ENA, or GEO) with a short section in the manuscript describing their availability.

When uploading data to public repositories, it is best to begin the process sooner rather than later! Some journals allow the data upload to be in progress during manuscript review, while others require authors to provide a study ID at submission. Public data repositories also have specific metadata, quality, and formatting requirements that all included files must meet before being uploaded.

In the manuscript text, state where the raw data are deposited, the study accession number, and the provided file format. If custom scripts or pipelines were used, describe how they can be accessed (for example, via a GitHub repository) and the release version if applicable.

Manuscript supplemental materials

It can be challenging to decide which supplemental materials to include with a manuscript submission. Consider including the following as supplementary materials where appropriate:

  • Quality control reports
  • Sample metadata tables
  • Exploratory analysis figures
  • Outlier testing results
  • Count matrices or feature tables (can be uploaded to a repository like Zenodo or GEO if the file size is large)
  • Statistics tables from differential expression or pathway testing
  • Pipeline diagrams or workflow summaries

These materials can improve reviewer confidence, reduce requests for clarification during peer review, and provide valuable context for other researchers.

Final thoughts

A well-written computational Methods section is not an afterthought; it is a core component of any credible manuscript containing high-throughput sequencing data. Clear structure, precise reporting, and thoughtful organisation make your work easier to understand and reproduce.

Even if your target journal does not have strict reporting requirements for computational Methods, following the best practices described here will increase the value of your contribution to the scientific literature.

Antti Ylipää CEO, co-founder Genevia Technologies Oy +358 40 747 7672