# Software

# FastPros

## How to use FastPros

FastPros is a new iterative screening algorithm that identifies sets of reaction knockouts which result in production of a target compound under biomass production maximization. Starting from a reduced metabolic model of a specific organism, *u*_{TARGET} is first calculated for networks with all possible single and double reaction knockouts. The top N networks with regard to this score are then chosen as the parent networks, where N stands for the total number of reaction sets. In each generation, all possible single reaction knockouts are further added to the parent networks, and the N networks with the largest, increased *u*_{TARGET} are selected as the parent networks of the next generation. If *u*_{TARGET} of a mutated network becomes positive, this network is excluded from the iterative screening, and its set of reaction knockouts is stored as a candidate knockout set. The cycle of mutation and selection continues until the number of iterations (i.e., the number of knockouts) reaches a maximum number, yielding various sets of reaction knockouts whose addition to the wild-type network results in positive *u*_{TARGET} values.
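The mutate-and-select cycle described above can be illustrated with a small toy sketch. This is not the FastPros source code: `scoreFn`, the weights `w`, and the offset are made-up stand-ins for the *u*_{TARGET} calculation, and the toy sizes are chosen only so the loop runs quickly.

```matlab
% Toy sketch of the mutate-and-select loop; NOT the FastPros source code.
% scoreFn is a made-up stand-in for the u_TARGET calculation.
nRxns    = 6;                          % knockout candidates (toy value)
maxKoNum = 4;                          % maximum number of knockouts
N        = nRxns;                      % parents kept per generation
w        = [-3 1 -2 0.5 1.5 -1];       % made-up per-reaction contributions
scoreFn  = @(mask) double(mask) * w' - 1.8;  % one score per row of mask

% Score all single and double knockouts, stored as logical row masks.
pairs = nchoosek(1:nRxns, 2);
pool  = [logical(eye(nRxns)); false(size(pairs, 1), nRxns)];
for i = 1:size(pairs, 1)
    pool(nRxns + i, pairs(i, :)) = true;
end
u = scoreFn(pool);

candidates = pool(u > 0, :);           % positive score: store, stop screening
pool = pool(u <= 0, :);
u    = u(u <= 0);
[u, order] = sort(u, 'descend');
nKeep   = min(N, numel(u));
parents = pool(order(1:nKeep), :);     % top N networks become parents
parentU = u(1:nKeep);

for gen = 3:maxKoNum
    children = false(0, nRxns);
    childU   = zeros(0, 1);
    for p = 1:size(parents, 1)
        for r = find(~parents(p, :))   % add one further knockout
            child    = parents(p, :);
            child(r) = true;
            uc = scoreFn(child);
            if uc > 0
                candidates(end + 1, :) = child;   % store as candidate set
            elseif uc > parentU(p)                % keep only increased scores
                children(end + 1, :) = child;
                childU(end + 1, 1)   = uc;
            end
        end
    end
    if isempty(children), break; end
    [children, ia]  = unique(children, 'rows');   % drop duplicate sets
    [childU, order] = sort(childU(ia), 'descend');
    nKeep   = min(N, numel(childU));
    parents = children(order(1:nKeep), :);
    parentU = childU(1:nKeep);
end

candidates = unique(candidates, 'rows');
for k = 1:size(candidates, 1)
    fprintf('Candidate knockout set: %s\n', mat2str(find(candidates(k, :))));
end
```

Storing each knockout set as a logical row mask keeps sets of different sizes in one matrix and makes duplicate removal a single `unique(..., 'rows')` call.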

### Equipment

- COBRA Toolbox (http://opencobra.sourceforge.net/openCOBRA/Welcome.html)
- A computer capable of running MATLAB
- MATLAB (MathWorks, http://www.mathworks.com/)
- libSBML programming library (http://www.sbml.org)
- SBMLToolbox for MATLAB
- FastPros toolbox

### Set up

- Install MATLAB
- Install libSBML and the SBML toolbox
- Unpack COBRA Toolbox archive
- Unpack FastPros toolbox archive
- Start a MATLAB session and add paths to the COBRA Toolbox and FastPros toolbox
- Initialize the COBRA Toolbox using the function “initCobraToolbox”
- Load model with COBRA structure
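
As a minimal sketch, the last three steps might look like the following in a MATLAB session. The directory names and `myModel.mat` are hypothetical placeholders, and the snippet assumes the model file stores a single COBRA model structure.

```matlab
% Hypothetical install locations; adjust to where you unpacked the archives.
cobraDir    = 'C:\toolboxes\cobra';
fastProsDir = 'C:\toolboxes\FastPros';

% Add both toolboxes (including subfolders) to the MATLAB path.
addpath(genpath(cobraDir));
addpath(genpath(fastProsDir));

% Initialize the COBRA Toolbox.
initCobraToolbox;

% Load a model stored as a COBRA structure.
% 'myModel.mat' is a placeholder; the first variable in the file is assumed
% to be the model structure.
loaded = load('myModel.mat');
names  = fieldnames(loaded);
model  = loaded.(names{1});
```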

### Time estimates

All time estimates for the functions below were estimated using a genome-scale metabolic model of *E. coli*, iAF1260 (Feist *et al*., 2007), which contains 1,260 genes, 2,077 metabolic and transport reactions, and 1,039 unique metabolites, on a 64-bit Windows machine with Intel Xeon 2.66 GHz processors running GLPK and MATLAB. The COBRA function “optimizeCbModel”, which solves a flux balance analysis problem, did not work with the MATLAB Parallel Computing Toolbox in our computing environment, so you may need to modify the source code if you want to use it with that toolbox.

### Run the test code of FastPros

Run the test code with the following command:

>> testFastPros()

“testFastPros” loads a MATLAB MAT-file representing a core metabolic model of *Escherichia coli* (http://gcrg.ucsd.edu/Downloads/EcoliCore), reduces the model with “reduceModelForFP”, and finally runs “FastPros” to find knockout strains for succinate production. Upon completion, it displays whether the test was completed successfully. The estimated run time is ~1 minute.

### Reduce model

Reduce model size for faster FastPros computation and create fields needed for FastPros by the following command:

>> [modelReduced, biomassRxn, targetRxn, oxygenRxn, reductionStatus] = reduceModelForFP(model, biomassRxn, targetRxn, oxygenRxn, options)

The time was estimated at <10 minutes.

**INPUTS**

| Name | Description |
| --- | --- |
| model | Structure containing the following required fields to describe a COBRA stoichiometric model |
| rxns | Reaction name abbreviation; reaction ID; order corresponds to S matrix |
| mets | Metabolite name abbreviation; metabolite ID; order corresponds to S matrix |
| S | Stoichiometric matrix in sparse format |
| b | RHS of Sv = b (usually zeros) |
| c | Objective coefficient for corresponding reactions |
| lb | Lower flux bound for corresponding reactions |
| ub | Upper flux bound for corresponding reactions |
| rev | Logical array; true for reversible reactions, otherwise false |
| genes | List of all genes within the model |
| rxnGeneMat | Matrix with rows corresponding to reactions and columns corresponding to genes |
| grRules | Rules field in a readable format |
| metFormulas | Elemental formula for each metabolite |
| biomassRxn | Reaction representing biomass objective function |
| targetRxn | Reaction whose flux is to be maximized |
| oxygenRxn | Reaction representing oxygen uptake |

**OPTIONAL INPUTS**

options

| Name | Description |
| --- | --- |
| verbFlag | Verbose flag (default: false) |
| loadFVAFlux | Load maximum and minimum fluxes of each reaction calculated by flux variability analysis (default: false) |

**OUTPUTS**

| Name | Description |
| --- | --- |
| modelReduced | COBRA model structure with the following generated fields added. In the field "rxns", combined reactions are represented using "/", as "reactionA/reactionB". |
| rxnFormulas | Reaction formulas |
| rxnAssociations | Reactions derived from the original model |
| rxnAssocMat | Matrix of reaction associations between reduced and original model (row: rxns in reduced model, column: rxns in original model) |
| unqGeneSetsMat | Matrix of genes-geneSets associations |
| geneSets | List of gene sets |
| geneSetRxnMat | Matrix of geneSets-rxns associations (row: geneSets, column: rxns) |
| geneSetAssocRxns | List of reactions associated with gene sets |
| essentialGeneSets | Gene sets essential for cell growth or target production |
| oriRxns | Reactions in the original model |
| reductionStatus | Status indicating whether model reduction succeeded. 1: Model reduction succeeded. 2: The growth rate of the wild-type strain was changed by model reduction. 3: The growth rates of single-knockout strains were changed by model reduction. |
| biomassRxn | Reaction representing biomass objective function |
| targetRxn | Reaction whose flux is to be maximized |
| oxygenRxn | Reaction representing oxygen uptake |
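
A call to “reduceModelForFP” might be sketched as follows. The reaction IDs are placeholders to be replaced with IDs from your own model, and the snippet assumes that the reactions are passed as strings and that `options` is a structure whose field names match the optional inputs listed above.

```matlab
% Reaction IDs below are placeholders; use the IDs from your own model.
biomassRxn = 'BIOMASS_REACTION_ID';
targetRxn  = 'TARGET_EXCHANGE_REACTION_ID';
oxygenRxn  = 'OXYGEN_EXCHANGE_REACTION_ID';

% Assumed form of the options argument: a struct with the documented fields.
options = struct('verbFlag', true, 'loadFVAFlux', false);

[modelReduced, biomassRxn, targetRxn, oxygenRxn, reductionStatus] = ...
    reduceModelForFP(model, biomassRxn, targetRxn, oxygenRxn, options);

% Interpret the documented status codes.
switch reductionStatus
    case 1
        disp('Model reduction succeeded.');
    case 2
        warning('Wild-type growth rate changed after reduction.');
    case 3
        warning('Growth rate of single-knockout strains changed after reduction.');
    otherwise
        warning('Unexpected reductionStatus: %d', reductionStatus);
end
```

Checking `reductionStatus` before running FastPros guards against screening a reduced model that no longer reproduces the growth behavior of the original.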

### Perform FastPros

Perform FastPros algorithm using the following command:

>> [FastProsSolution, FastProsStatus, FastProsResult] = FastPros(model, biomassRxn, targetRxn, oxygenRxn, options)

The time was estimated at ~3 hours (with a maximum of 10 knockouts).

**INPUTS**

| Name | Description |
| --- | --- |
| model | COBRA model structure containing the required fields to perform FastPros. Some fields in the model are generated by the function "reduceModelForFP". |
| biomassRxn | Reaction representing biomass objective function |
| targetRxn | Reaction whose flux is to be maximized |
| oxygenRxn | Reaction representing oxygen uptake |

**OPTIONAL INPUTS**

options

| Name | Description |
| --- | --- |
| rxnList | Reaction list as knockout candidates (default: all reactions in the model) |
| maxKoNum | Maximum knockout number (default: 10) |
| selStrainNum | Number of knockout strains to be selected as parent strains in each generation (default: number of allowable knockouts) |
| selIncUtargetStrains | Select only strains whose *u*_{TARGET} increased by the knockout (default: true) |
| verbFlag | Verbose flag (default: false) |

**OUTPUTS**

| Name | Description |
| --- | --- |
| FastProsSolution | Solution structure of FastPros |
| koNum | Knockout number of reaction or gene sets |
| koGeneSetIDs | Each row represents IDs of knocked out gene sets |
| koGeneSets | Each row represents knocked out gene sets |
| koRxnSets | Each row represents reactions of knocked out gene sets |
| utarget | Each row represents *u*_{TARGET} change by knockout of gene sets |
| prodRates | Each row represents production rates of target compound in the corresponding knockout strains |
| flux | Each column represents flux distributions in the corresponding knockout strains |
| FastProsStatus | Status structure of FastPros |
| existEssentialKoStrains | Column i represents whether essential knockout strains exist in the i-th generation |
| existProdKoStrains | Column i represents whether knockout strains with target production exist in the i-th generation |
| firstProdKoNum | Number of knockouts when the first production strain was identified |
| firstUnqProdKoNum | Number of knockouts when the first production strain without alternative optima of zero target production was identified |
| allowableKoGeneSetIDs | Gene set IDs of allowable knockouts |
| allowableKoGeneSets | Gene sets of allowable knockouts |
| allowableKoRxnSets | Reaction sets of allowable knockouts |
| maxKoNum | Maximum knockout number |
| maxProdStrain | Structure of the identified strain with the maximum target production rate |
| model | COBRA model |
| modelGetUtarget | COBRA model used for *u*_{TARGET} calculation |
| selStrainNum | Number of knockout strains to be selected as parent strains in each generation |
| targetInfo | Information of target compound and reaction |
| targetRxn | Target reaction whose flux is to be maximized |
| targetRxnID | Target reaction ID |
| theorFlux | Theoretical flux distribution for maximized target production |
| theorProdRate | Theoretical maximum rate of target production |
| time | Column i represents cumulative CPU time until the end of the i-th generation |
| FastProsResult | Result structure of FastPros |
| essentialGeneSetCmbs | Each row represents essential gene set combinations to reach minimum cell growth threshold or target production |
| koNum | Total knockout number of reaction or gene sets in a strain |
| productionKoStrains | Same structure as the solution structure |
| selKoGeneSetIDs | Gene set IDs knocked out in strains selected as parent strains in the next generation |
| selKoStrainUtarget | *u*_{TARGET} in strains selected as parent strains in the next generation |
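
Running FastPros on the reduced model and inspecting the result might look like the sketch below. It assumes that `modelReduced` and the reaction variables come from a previous “reduceModelForFP” call, that `options` is a structure with the documented field names, and that `prodRates` holds one production rate per identified strain so that its index matches the rows of `koRxnSets`. None of these assumptions are confirmed by the text above.

```matlab
% Assumed form of the options argument; maxKoNum is lowered here only to
% shorten the run compared with the default of 10.
options = struct('maxKoNum', 5, 'verbFlag', true);

[FastProsSolution, FastProsStatus, FastProsResult] = ...
    FastPros(modelReduced, biomassRxn, targetRxn, oxygenRxn, options);

% Assumes prodRates holds one production rate per identified strain.
[bestRate, bestIdx] = max(FastProsSolution.prodRates(:));
if isempty(bestIdx)
    disp('No production strains were identified.');
else
    fprintf('Best production rate: %g\n', bestRate);
    % Assumes row bestIdx of koRxnSets belongs to the same strain.
    disp(FastProsSolution.koRxnSets(bestIdx, :));
end
```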

### Example codes

Two example scripts that use the FastPros function are provided. “example1” aims to identify knockout sets for D-lactate production, and “example2” aims to identify knockout sets for succinate production.

### Reference

Feist,A.M. *et al.* (2007) A genome-scale metabolic reconstruction for *Escherichia coli* K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. *Mol. Syst. Biol.*, **3**, 121.