A tutorial for microbial genome-scale metabolic modelling: [3] understanding the SBML format

The systems biology markup language (SBML) is a community-driven, software- and platform-independent standard for expressing systems models and exchanging them between different simulation and analysis tools. It is defined using the unified modelling language (UML) and represented in the extensible markup language (XML) [1]. SBML does not aim to produce model files that humans can read easily; it aims to give different software a common medium for exchanging models. Each piece of software can then translate an imported model into its own internal format [1]. Understanding SBML matters for metabolic modelling and engineering because it has quickly become the most widely used standard for model files since it was first published in 2003 [1].

1. Incremental updates of SBML

The SBML community defines and updates the language incrementally in order to keep it stable and compatible across releases. As a result, SBML releases form an interconnected hierarchy in which each release builds on the previous one, unlike an image composed of independent layers. At the base of this hierarchy lies the first SBML release, SBML Level 1, described by Hucka et al. in 2003 [1]. I outline this level below. Readers are advised to read the article by Hucka et al. for its detailed definitions.

2. Model components at SBML Level 1

The components of the model element in each model file (usually named *.xml or *.sbml) are organised hierarchically as follows.

  • Model: one model per file.
    • Compartment: a place of finite volume where chemical species are located and reactions occur. SBML assumes that chemical species are well stirred within every compartment [1].
      • Species: chemical substance or entity.
      • Reaction: a transformation of species for which an associated rate law (kinetic function) is defined. A reaction is either reversible (reversible = “true”) or irreversible (reversible = “false”).
        • reactants
        • products
        • kineticLaw: a formula written as a plain text string
      • Parameter: a named value for a symbol that appears in the formula of a kinetic law. A parameter is either global to the model or local to a single reaction.
      • Unit definition: users can take a compositional approach and define new units from SBML’s standard units [1].
      • Rule: a mathematical expression that constrains model variables, such as parameter values, species concentrations or compartment volumes.
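
To make the outline above concrete, here is a minimal sketch in Python that reads a hand-written toy model and collects the components listed. It makes several assumptions: the XML fragment follows my reading of SBML Level 1 Version 2 (Version 1 spells some element names differently), the names toy_model, cell, S1, S2, R1 and k1 are hypothetical, and the helper summarise() is invented for illustration and is not part of any SBML library. Note that in the file itself the lists of compartments, species and reactions all sit directly under the model element, and each species points to its compartment through an attribute.

```python
import xml.etree.ElementTree as ET

# Hand-written toy model in the style of SBML Level 1 Version 2.
# Every name in it (toy_model, cell, S1, S2, R1, k1) is hypothetical.
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level1" level="1" version="2">
  <model name="toy_model">
    <listOfCompartments>
      <compartment name="cell" volume="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species name="S1" compartment="cell" initialAmount="10"/>
      <species name="S2" compartment="cell" initialAmount="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction name="R1" reversible="false">
        <listOfReactants>
          <speciesReference species="S1" stoichiometry="1"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="S2" stoichiometry="1"/>
        </listOfProducts>
        <kineticLaw formula="k1*S1">
          <listOfParameters>
            <parameter name="k1" value="0.5"/>
          </listOfParameters>
        </kineticLaw>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

# Namespace assumed for SBML Level 1 documents.
NS = {"sbml": "http://www.sbml.org/sbml/level1"}


def _species_names(reaction, list_tag):
    """Return the species named in a reaction's reactant or product list."""
    path = f"sbml:{list_tag}/sbml:speciesReference"
    return [ref.get("species") for ref in reaction.findall(path, NS)]


def summarise(sbml_text):
    """Translate an SBML Level 1 string into a plain dict (an 'internal format')."""
    root = ET.fromstring(sbml_text)
    model = root.find("sbml:model", NS)
    summary = {
        "level": root.get("level"),
        "model": model.get("name"),
        "compartments": [
            c.get("name")
            for c in model.findall("sbml:listOfCompartments/sbml:compartment", NS)
        ],
        # Map each species to the compartment it is located in.
        "species": {
            s.get("name"): s.get("compartment")
            for s in model.findall("sbml:listOfSpecies/sbml:species", NS)
        },
        "reactions": [],
    }
    for reaction in model.findall("sbml:listOfReactions/sbml:reaction", NS):
        law = reaction.find("sbml:kineticLaw", NS)
        local_parameters = {}
        if law is not None:
            for p in law.findall("sbml:listOfParameters/sbml:parameter", NS):
                local_parameters[p.get("name")] = float(p.get("value"))
        summary["reactions"].append({
            "name": reaction.get("name"),
            # Assumption: a missing 'reversible' attribute is treated as "true".
            "reversible": reaction.get("reversible", "true") == "true",
            "reactants": _species_names(reaction, "listOfReactants"),
            "products": _species_names(reaction, "listOfProducts"),
            # The kinetic law is kept as the plain string found in the file.
            "formula": law.get("formula") if law is not None else None,
            "local_parameters": local_parameters,
        })
    return summary


if __name__ == "__main__":
    print(summarise(SBML_TEXT))
```

The sketch only reads a few attributes and ignores unit definitions, global parameters and rules. For real models, a dedicated SBML library is a safer choice than hand-written XML parsing.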

3. Model reconstruction

In statistics, reconstructing a model means estimating the parameters of a mathematical model from observations of random variables. The reconstruction of metabolic models is essentially the same: we use knowledge of real reactions and cellular phenotypes to iteratively curate elements of a draft model until it predicts the phenotype accurately. Nonetheless, the process is generally much more complicated than fitting a typical statistical model because a large number of parameters and kinetic functions are involved.

Reference

[1] Hucka, M. et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19, 524–531 (2003).