2. Simulation Workflow

[2]:

import pandas as pd
import os
import cometspy as c

from scarcc.preparation.find_directory import find_directory
from scarcc.preparation.metabolic_model import BasicModel

data_directory = find_directory('Data', os.path.abspath(''))
model_directory = find_directory('models', os.path.abspath(''))
sim_chamber_directory = find_directory('SimChamber', os.path.abspath(''))

E0, S0, all_components = BasicModel(flux_weighting=True, model_directory=model_directory).load_ES_models()
alpha_table = pd.read_csv(os.path.join(data_directory, 'alpha_table_m1.csv'), index_col=0)

p = c.params()
p.set_param("defaultKm", 0.00001) # M
p.set_param("defaultVmax", 10) #mmol/gDw/hr
p.set_param("maxCycles", 30)
# p.set_param("maxCycles", 150)
p.set_param("timeStep", 1)
p.set_param('writeFluxLog', True)

[3]:

S0.medium
smedium = S0.medium
smedium['EX_bulk_ac_e'] = 2
S0.medium = smedium

[3]:

{'EX_ca2_e': 10,
 'EX_cl_e': 10,
 'EX_cobalt2_e': 10,
 'EX_cu2_e': 10,
 'EX_bulk_ac_e': 3,
 'EX_mg2_e': 10,
 'EX_fe2_e': 10,
 'EX_fe3_e': 10,
 'EX_mn2_e': 10,
 'EX_mobd_e': 10,
 'EX_nh4_e': 10,
 'EX_ni2_e': 10,
 'EX_o2_e': 10,
 'EX_pi_e': 10,
 'EX_zn2_e': 10,
 'EX_so4_e': 10,
 'EX_k_e': 10}

[8]:

S0.medium
smedium = S0.medium
smedium['EX_bulk_ac_e'] = 0
smedium['EX_gal_e'] = 2
S0.medium = smedium

[9]:

current_gene = 'folP'
Sgn = S0.slim_optimize()
from scarcc.preparation.perturbation import alter_Sij
with S0:
    alter_Sij(S0, alphas=22.8, genes=current_gene, ko=False)
    print(S0.slim_optimize()/Sgn)

0.7634558054931522

[7]:

current_gene = 'folP'
Sn = S0.slim_optimize()
from scarcc.preparation.perturbation import alter_Sij
with S0:
    alter_Sij(S0, alphas=22.8, genes=current_gene, ko=False)
    print(S0.slim_optimize()/Sn)

0.7879863087652861

After configuring basic parameters, we can start to simulate the combined antibiotics. The following code will simulate the combined antibiotics for all the possible combinations of the folA and folP inhibitors. The simulation results will be saved in the directory specified by the variable sim_chamber_directory.

2.1. Multiprocessing to derive results

run_sim_workflow is a function that accept list of gene combinations and map each to perform simulation. Then takes the simulation output data to generate the classification of gene combination effect, concatenating all results and save to excel spreadsheet.

df_containers stored all the data frames and break down of it is summarized in the last section

This example provides a minimum arguments being passed to the function

method_list = [‘m1]
- as suffix to the alpha_table and all the output data frames
supply either SG_list or DG_list or both
simulation_kwargs are passed to the class SimulateCombinedAntibiotics directly, will be discussed in the section “Breakdown of steps”

Prerequisite:

alpha_table correspond to chosen method list stored in data_directory, or run get_alpha_biomass_df before calling run_sim_workflow

Failed due to insufficient memory

If multiprocessing module fails, it could be the reason of insufficient memory being allocated OR amount of processes being set for too high. Please try setting max_cpus to lower values.

[ ]:

from scarcc.sim_engine.simulation_workflow import run_sim_workflow

SG_list = ['folA', 'folP'] # Optional, Example: ['folA', 'folP', 'folC']
DG_list = [['folA', 'folP']] # Optional, Example: [['folA', 'folP'], ['folC', 'folP'], ['folA', 'folC']]
method_list = ['m1']
simulation_kwargs = {
    'E0': E0,
    'S0': S0,
    'base': sim_chamber_directory,
    'p': p}

df_container = run_sim_workflow(method_list=method_list, data_directory=data_directory,
                                DG_list=DG_list, **simulation_kwargs)

reading results from existing files
biomass, growth_rate, normalized_growth_rate, drug_response_classification, flux data derived from m1 were saved in d:\scarccpy\Data

[7]:

from scarcc.sim_engine.simulation_workflow import run_sim_workflow

SG_list = ['folA', 'folP'] # Optional, Example: ['folA', 'folP', 'folC']
DG_list = [['folA', 'folP']] # Optional, Example: [['folA', 'folP'], ['folC', 'folP'], ['folA', 'folC']]
method_list = ['m1']
simulation_kwargs = {
    'E0': E0,
    'S0': S0,
    'base': sim_chamber_directory,
    'p': p}

df_container = run_sim_workflow(method_list=method_list, data_directory=data_directory,
                                DG_list=DG_list, **simulation_kwargs)

reading results from existing files
biomass, growth_rate, normalized_growth_rate, drug_response_classification, flux data derived from m1 were saved in d:\scarccpy\Data\flux_analysis_m1.csv

Customization of simulation parameters

custom_simulation_kwargs are passed to the class SimulateCombinedAntibiotics

Note : For a same set of single gene target that have run before, if “BM_SG_m1.csv” and “flux_SG_m1.csv” already exist in Data directory, and want to test new set of gene combinations, user can omit the SG_list argument. The program will use the available data that already exist.

[6]:

from scarcc.sim_engine.simulation_workflow import run_sim_workflow
custom_simulation_kwargs = {
    'E0': E0,
    'S0': S0,
    'base': sim_chamber_directory, # directory to run each simulation in multiprocess
    'p': p,
    'checker_suffix': None, # checkerboard suffix for excel spreadsheet
    'ko': False, # knockout reactions if reaction is encoded by target gene

    # layout configuration parameters
    'co': True, # Run coculture simulation
    'mono': True, # Run monoculture simulation
    'mono_E': True, # Run E0 monoculture  simulation
    'mono_S': True, # Run S0 monoculture simulation
    'Smono_carbon_source': 'bulk_ac_e', # Carbon source for S0 monoculture
    'carbon_source': None, # Dictionary for specification of carbon source for each culture
    # comets model specifications
    'initial_pop': 1e-8, # Initial population for simulation
    'obj_style': 'MAX_OBJECTIVE_MIN_TOTAL', # pfba optimization
    'SG_list': ['folA', 'folP'], # List of gene targets to simulate
}

df_container = run_sim_workflow(method_list=['m1'], data_directory=data_directory,
                                DG_list=[['folA', 'folP']], max_cpus=12, **custom_simulation_kwargs)

biomass, growth_rate, normalized_growth_rate, drug_response_classification, flux data derived from m1 were saved in d:\scarccpy\Data\flux_analysis_m1.csv

2.2. Detailed break down of steps

pseudo code for simulation procedure

with E0, S0:
    alter_Sij(...)
    set_comets_models(...)
    set_layout_object(...)
    sim_culture(...)

Note: To prepare genes to perform simulation, see Prepare Genes and Prepare Alpha

Warning

Modification of stoichiometry should be within context manager (with E0, S0:) such that in each run, the metabolic model restores to unperturbed state

2.2.1. Step 1: Stiochiometric Scaling

Reduction of biomass yield under perturbation (see Stoichiometry Scaling)

Note : If both target gene in gene combination leads to the inhibition of the same reaction, the inhibition effect to that particular reaction is set to the maximum one rather than scaled twice

[10]:

# Outside context manager ONLY for illustration of workflow, use with E0，S0: during implementation
from scarcc.preparation.perturbation import alter_Sij
current_gene = 'folA'
alpha_table.columns = [col.replace('_alpha', '') for col in alpha_table.columns]

alter_Sij(E0, alphas=alpha_table.loc[current_gene, 'E0'], genes=current_gene, ko=False)
alter_Sij(S0, alphas=alpha_table.loc[current_gene, 'S0'], genes=current_gene, ko=False)

[10]:

['DHFR', 'DHFR_v1']

2.2.2. Step 2: Construction layout object for cultures with perturbed metabolic models

[12]:

from scarcc.sim_engine.simulation_configuration import LayoutConfig
# E0, S0 are perturbed
base_nutrient = [
    "ca2_e", "cl_e", "cobalt2_e", "cu2_e","fe2_e", "fe3_e", "k_e", "mg2_e",
    "mn2_e", "mobd_e", "ni2_e", "o2_e", "pi_e", "so4_e", "zn2_e", "nh4_e"]

layout_kwargs = {
    'E0': E0, 'S0': S0, 'carbon_source_val': .1, 'base_nutrients': base_nutrient,
    'nutrients_val': 100, 'Smono_carbon_source': 'bulk_ac_e'
}
layout_config = LayoutConfig(**layout_kwargs)
layout_config.set_comets_model(E0, S0)
co_layout, E0_layout, S0_layout = layout_config.set_layout_object()

Or pass different carbon sources inside a dictionary for each cultures with ‘co’, ‘mono_E’, ‘mono_S’ as keys

[25]:

layout_kwargs['carbon_source'] = carbon_source = {'co': 'lcts_e', 'mono_E': 'lcts_e', 'mono_S': 'bulk_ac_e'}
layout_config = LayoutConfig(**layout_kwargs)
layout_config.set_comets_model(E0, S0)
co_layout, E0_layout, S0_layout = layout_config.set_layout_object()

2.2.3. Step 3: Run simulation with layout

[13]:

sim = c.comets(co_layout, p)
sim.run()


Running COMETS simulation ...
Done!

Extract biomass and flux data frames from COMETS sim object

[40]:

print('biomass: \n', sim.total_biomass.head(),
      '\n E0 fluxes:\n', sim.fluxes_by_species['E0'].head(),
      '\n S0 fluxes:\n', sim.fluxes_by_species['S0'].head())

biomass:
    cycle            E0            S0
0      0  1.000000e-08  1.000000e-08
1      1  1.000000e-08  1.000000e-08
2      2  1.000000e-08  1.000000e-08
3      3  1.000000e-08  1.014568e-08
4      4  1.047327e-08  1.014568e-08
 E0 fluxes:
    cycle  x  y  CYTDK2  XPPT  ...  MOCOS  BMOGDS2  FESD2s  OCTNLL  DHFR_v1
0      5  1  1     0.0   0.0  ...    0.0      0.0     0.0     0.0      0.0
2     10  1  1     0.0   0.0  ...    0.0      0.0     0.0     0.0      0.0
4     15  1  1     0.0   0.0  ...    0.0      0.0     0.0     0.0      0.0
6     20  1  1     0.0   0.0  ...    0.0      0.0     0.0     0.0      0.0
8     25  1  1     0.0   0.0  ...    0.0      0.0     0.0     0.0      0.0

[5 rows x 2716 columns]
 S0 fluxes:
    cycle  x  y   3HAD180  ...  OA4L_ST       OA5L_ST  OA4VL_ST  DHFR_v1
1      5  1  1  0.000092  ...      0.0  1.990201e-05       0.0      0.0
3     10  1  1 -0.008731  ...      0.0  5.364746e-07       0.0      0.0
5     15  1  1 -0.000136  ...      0.0  6.088823e-06       0.0      0.0
7     20  1  1 -0.000117  ...      0.0  5.213432e-06       0.0      0.0
9     25  1  1 -0.000151  ...      0.0  6.747233e-06       0.0      0.0

[5 rows x 2549 columns]

2.3. Integration of all steps

2.3.1. Streamlined Process for Simulation Workflow

The implementations illustrated above are integrated in the class SimulateCombinedAntibiotics

takes a target gene or a tuple of gene combination as input
performs a simulation, and returns the biomass and flux results

[3]:

from scarcc.sim_engine.simulation_workflow import SimulateCombinedAntibiotics
alpha_table = pd.read_csv(os.path.join(data_directory, 'alpha_table_m1.csv'), index_col=0)
alpha_table.columns = [col.replace('_alpha', '') for col in alpha_table.columns]
cas = SimulateCombinedAntibiotics(current_gene=['folP','folA'], E0=E0, S0=S0, alpha_table=alpha_table,
                                p=p, mono=True, base=sim_chamber_directory)

biomass_df, flux_df = cas.get_BM_df()


Running COMETS simulation ...
Done!

Running COMETS simulation ...
Done!

Running COMETS simulation ...
Done!

[20]:

from scarcc.sim_engine.output import MethodDataFiller
df_container = {}
df_container['checkerboard', 'SG']  = {'biomass': SG_biomass_df,
                            'flux': SG_flux_df}
df_container['checkerboard', 'DG']  = {'biomass': DG_biomass_df,
                            'flux': DG_flux_df}
mdf = MethodDataFiller(df_container, data_directory=data_directory)
mdf.fill_container()
mdf.write_to_csv()

C:\Users\wongt\AppData\Roaming\Python\Python311\site-packages\scipy\optimize\_minpack_py.py:906: OptimizeWarning: Covariance of the parameters could not be estimated
  warnings.warn('Covariance of the parameters could not be estimated',

biomass, growth_rate, normalized_growth_rate, drug_response_classification, flux data derived from checkerboard were saved in d:\scarccpy\Data\flux_analysis_checkerboard.csv

2.4. Display of data frames inside `df_container`

All the result data frames can be queried with tuple key: (method_chosen, XG)

where method chosen is any method in the method_list and XG is either ‘SG’ or ‘DG’

[5]:

df_container[('m1', 'DG')].keys()

[5]:

dict_keys(['biomass', 'flux', 'growth_rate', 'normalized_growth_rate', 'drug_response_classification'])

“growth_rate” and “drug_response_classification” are generated in class MethodDataFiller
- To import, run:
  
  from scarcc.sim_engine.output import MethodDataFiller
- Initialize a df_container and pass to MethodDataFiller:
  
  pass a df_container with keys “(method_chosen, XG)” for each XG, provide a dictionary with keys “biomass” and “flux” and the values as the corresponding data frame

Example:

df_container = {}
df_container['m1', 'SG']  = {'biomass': SG_biomass_df,
                            'flux': SG_flux_df}
df_container['m1', 'DG']  = {'biomass': DG_biomass_df,
                            'flux': DG_flux_df}
MethodDataFiller(df_container)
mdf.fill_container()
mdf.write_to_csv()

SG data frames

[30]:

pd.set_option('display.max_columns', 3)
[print(k, ': \n', df.head(), '\n') for k, df in df_container[('m1', 'SG')].items()]

biomass :
        E0_Normal_coculture  S0_Normal_coculture  E0_Normal_monoculture  \
cycle
0             1.000000e-08         1.000000e-08           1.000000e-08
1             1.000000e-08         1.000000e-08           1.754153e-08
2             1.000000e-08         1.000000e-08           3.077053e-08
3             1.000000e-08         1.029133e-08           5.397622e-08
4             1.094643e-08         1.029133e-08           9.468255e-08

       S0_Normal_monoculture  E0_folA_coculture  S0_folA_coculture  \
cycle
0               1.000000e-08       1.000000e-08       1.000000e-08
1               1.077695e-08       1.000000e-08       1.000000e-08
2               1.161426e-08       1.000000e-08       1.000000e-08
3               1.251663e-08       1.000000e-08       1.014568e-08
4               1.348911e-08       1.047327e-08       1.014568e-08

       E0_folA_monoculture  S0_folA_monoculture  E0_folP_coculture  \
cycle
0             1.000000e-08         1.000000e-08       1.000000e-08
1             1.332445e-08         1.041132e-08       1.000000e-08
2             1.775410e-08         1.083955e-08       1.000000e-08
3             2.365636e-08         1.128541e-08       1.000000e-08
4             3.152081e-08         1.174959e-08       1.045757e-08

       S0_folP_coculture  E0_folP_monoculture  S0_folP_monoculture
cycle
0           1.000000e-08         1.000000e-08         1.000000e-08
1           1.000000e-08         1.304388e-08         1.041559e-08
2           1.000000e-08         1.701428e-08         1.084844e-08
3           1.014085e-08         2.219323e-08         1.129929e-08
4           1.014085e-08         2.894858e-08         1.176887e-08

flux :
                                    Unnamed: 0  cycle  x  y  CYTDK2  XPPT  \
Gene_inhibition Species culture
AB              E0      coculture           0     25  1  1       0     0
                S0      coculture           1     25  1  1       0     0

                                   HXPRT  NDPK5    SHK3Dr  NDPK6  ...  \
Gene_inhibition Species culture                                   ...
AB              E0      coculture      0      0  0.042396      0  ...
                S0      coculture      0      0  0.099704      0  ...

                                         pe2_ST        pg2_ST      clpn2_ST  \
Gene_inhibition Species culture
AB              E0      coculture           NaN           NaN           NaN
                S0      coculture  8.020000e-07  2.200000e-07  7.330000e-09

                                         pa2_ST        ps2_ST     12dgr2_ST  \
Gene_inhibition Species culture
AB              E0      coculture           NaN           NaN           NaN
                S0      coculture  2.090000e-09  2.090000e-09  9.840000e-08

                                   OAL_ST  OA4L_ST  OA5L_ST  OA4VL_ST
Gene_inhibition Species culture
AB              E0      coculture     NaN      NaN      NaN       NaN
                S0      coculture     0.0      0.0      0.0       0.0

[2 rows x 3101 columns]

growth_rate :
                  E0_coculture  S0_coculture  E0_monoculture  S0_monoculture
Gene_inhibition
Normal               0.074819      0.040716        0.547661        0.074824
folA                 0.021745      0.009762        0.287016        0.040308
folP                 0.021060      0.009690        0.265734        0.040718

normalized_growth_rate :
                  E0_coculture  S0_coculture  E0_monoculture  S0_monoculture
Gene_inhibition
Normal               1.000000      1.000000        1.000000        1.000000
folA                 0.290640      0.239757        0.524075        0.538707
folP                 0.281477      0.237998        0.485216        0.544185

[30]:

[None, None, None, None]

DG data frames

[8]:

[print(k, ': \n', df.head(), '\n') for k, df in df_container[('m1', 'DG')].items()]

biomass :
        E0_folA.folP_coculture  S0_folA.folP_coculture  \
cycle
0                1.000000e-08            1.000000e-08
1                1.000000e-08            1.000000e-08
2                1.000000e-08            1.000000e-08
3                1.000000e-08            1.000954e-08
4                1.000217e-08            1.000954e-08

       E0_folA.folP_monoculture  S0_folA.folP_monoculture
cycle
0                  1.000000e-08              1.000000e-08
1                  1.000217e-08              1.003103e-08
2                  1.000434e-08              1.006215e-08
3                  1.000650e-08              1.009337e-08
4                  1.000867e-08              1.012469e-08

flux :
                                      cycle  x  y  CYTDK2  XPPT  HXPRT  NDPK5  \
Gene_inhibition Species culture
folA.folP       E0      coculture       25  1  1     0.0   0.0    0.0    0.0
                S0      coculture       25  1  1     0.0   0.0    0.0    0.0
                E0      monoculture     25  1  1     0.0   0.0    0.0    0.0
                S0      monoculture     25  1  1     0.0   0.0    0.0    0.0

                                       SHK3Dr  NDPK6  NDPK8  ...  \
Gene_inhibition Species culture                              ...
folA.folP       E0      coculture    0.042396    0.0    0.0  ...
                S0      coculture    0.099704    0.0    0.0  ...
                E0      monoculture  0.042396    0.0    0.0  ...
                S0      monoculture  0.295397    0.0    0.0  ...

                                           pe2_ST        pg2_ST      clpn2_ST  \
Gene_inhibition Species culture
folA.folP       E0      coculture             NaN           NaN           NaN
                S0      coculture    8.016775e-07  2.199246e-07  7.330820e-09
                E0      monoculture           NaN           NaN           NaN
                S0      monoculture  2.375154e-06  6.515772e-07  2.171924e-08

                                           pa2_ST        ps2_ST     12dgr2_ST  \
Gene_inhibition Species culture
folA.folP       E0      coculture             NaN           NaN           NaN
                S0      coculture    2.094520e-09  2.094520e-09  9.844244e-08
                E0      monoculture           NaN           NaN           NaN
                S0      monoculture  6.205497e-09  6.205497e-09  2.916584e-07

                                       OAL_ST  OA4L_ST   OA5L_ST  OA4VL_ST
Gene_inhibition Species culture
folA.folP       E0      coculture         NaN      NaN       NaN       NaN
                S0      coculture    0.000000      0.0  0.000000       0.0
                E0      monoculture       NaN      NaN       NaN       NaN
                S0      monoculture  0.000003      0.0  0.000004       0.0

[4 rows x 3100 columns]

growth_rate :
                  E0_coculture  S0_coculture  E0_monoculture  S0_monoculture  \
Gene_inhibition
Normal               0.074819      0.040716        0.547661        0.074824
folA.folP            0.000000      0.000000        0.000000        0.000000

                First_gene  E0_monoculture_First_gene  \
Gene_inhibition
Normal              Normal                   0.547661
folA.folP             folA                   0.287016

                 S0_monoculture_First_gene Second_gene  \
Gene_inhibition
Normal                            0.074824      Normal
folA.folP                         0.040308        folP

                 E0_coculture_Second_gene  S0_coculture_Second_gene  \
Gene_inhibition
Normal                           0.074819                  0.040716
folA.folP                        0.021060                  0.009690

                 E0_monoculture_Second_gene  S0_monoculture_Second_gene
Gene_inhibition
Normal                             0.547661                    0.074824
folA.folP                          0.265734                    0.040718

normalized_growth_rate :
                  E0_coculture  S0_coculture  \
Gene_inhibition
Normal                    1.0           1.0
folA.folP                 0.0           0.0

                 Predicted_additive_effect_E0_coculture  \
Gene_inhibition
Normal                                         1.000000
folA.folP                                      0.081808

                 Predicted_additive_effect_S0_coculture  E0_monoculture  \
Gene_inhibition
Normal                                         1.000000             1.0
folA.folP                                      0.057062             0.0

                 S0_monoculture  Predicted_additive_effect_E0_monoculture  \
Gene_inhibition
Normal                      1.0                                   1.00000
folA.folP                   0.0                                   0.25429

                 Predicted_additive_effect_S0_monoculture First_gene  \
Gene_inhibition
Normal                                           1.000000     Normal
folA.folP                                        0.293156       folA

                 E0_monoculture_First_gene  ...  E0_monoculture_Second_gene  \
Gene_inhibition                             ...
Normal                            1.000000  ...                    1.000000
folA.folP                         0.524075  ...                    0.485216

                S0_monoculture_Second_gene  po_diff_E0_coculture  \
Gene_inhibition
Normal                            1.000000              0.000000
folA.folP                         0.544185             -0.081808

                 po_diff_S0_coculture  po_diff_E0_monoculture  \
Gene_inhibition
Normal                       0.000000                 0.00000
folA.folP                   -0.057062                -0.25429

                 po_diff_S0_monoculture  Drug_comb_effect_E0_coculture  \
Gene_inhibition
Normal                         0.000000                       Additive
folA.folP                     -0.293156                   Antagonistic

                 Drug_comb_effect_S0_coculture  \
Gene_inhibition
Normal                                Additive
folA.folP                         Antagonistic

                 Drug_comb_effect_E0_monoculture  \
Gene_inhibition
Normal                                  Additive
folA.folP                           Antagonistic

                 Drug_comb_effect_S0_monoculture
Gene_inhibition
Normal                                  Additive
folA.folP                           Antagonistic

[2 rows x 24 columns]

drug_response_classification :
                 Drug_comb_effect_E0_coculture Drug_comb_effect_E0_monoculture  \
Gene_inhibition
Normal                               Additive                        Additive
folA.folP                        Antagonistic                    Antagonistic

                Drug_comb_effect_S0_coculture Drug_comb_effect_S0_monoculture
Gene_inhibition
Normal                               Additive                        Additive
folA.folP                        Antagonistic                    Antagonistic

[8]:

[None, None, None, None, None]