8.13. Introduction to Integrase Enumerator

In BioCRNpyler, Integrase_Enumerator is a way to modularly define DNA constructs that can be recombined into other forms using integrases. Integrases in general are proteins that bind to a specific sequence and then cause a recombination event between two such sequences. In the case of Serine Integrases, these sequences are called attP and attB. So one very simple way to express an integrase reaction (where Bxb1 is an integrase) would be like this:

\(attB + 2 Bxb1 \rightleftharpoons attB:Bxb1\)

\(attP + 2 Bxb1 \rightleftharpoons attP:Bxb1\)

\(attB:Bxb1 + attP:Bxb1 \xrightarrow{k_{int}} attL:Bxb1 + attR:Bxb1\)

(Bxb1 unbinding from attL and attR is not shown)

With this simple schematic, very complex dna manipulations can be performed.

[1]:

try:
    #dnaplotlib is a cool library for plotting DNAs.
    #please use my fork located at https://github.com/dr3y/dnaplotlib
    #to install it type: pip install git+git://github.com/dr3y/dnaplotlib.git@master
    import dnaplotlib as dpl # type: ignore[import]
    %matplotlib inline
    dpl_enabled = True
except (ModuleNotFoundError,ImportError) as e:
    dpl_enabled = False

def plotNetwork(inCRN,use_pretty_print=True,colordict = None,iterations=2000,rseed=30,posscale=1,export=False,layout="force",**keywords):
    try:
        from bokeh.models import (Plot , Range1d)
        from biocrnpyler.utils.plotting import render_network_bokeh
        import bokeh.plotting
        import bokeh.io
        bokeh.io.output_notebook() #this makes the graph appear in line with the notebook
        layout_str = "force"
        plot = render_network_bokeh(inCRN,use_pretty_print=use_pretty_print,\
            colordict=colordict,layout=layout_str,posscale=posscale,\
                iterations=iterations,rseed=rseed,**keywords) #this creates the networkx objects

        bokeh.io.show(plot) #if you don't type this the plot won't show
        if(export):
            plot.output_backend = "svg"
            bokeh.io.export_svgs(plot, "plot_file.svg")

    except ModuleNotFoundError:
        return None

[2]:

from biocrnpyler.components import RegulatedPromoter, Promoter, RBS, CDS, Terminator, Origin
from biocrnpyler.components import IntegraseSite, IntegraseRule, Integrase_Enumerator, DNA_construct
from biocrnpyler.mixtures import TxTlExtract
from biocrnpyler.utils.plotting import CRNPlotter, render_network_bokeh
#first, we define basic construct components

ptet = RegulatedPromoter("ptet",["tetr"],leak=True) #this is a repressible promoter
pconst = Promoter("pconst") #constitutive promoter
utr1 = RBS("UTR1") #regular RBS
gfp = CDS("GFP")
rfp = CDS("RFP")
t16 = Terminator("t16") #a terminator stops transcription
gen_ori = Origin("genome") #put this Origin part in the genome so it is not considered for intermolecular reactions
gen_ori.attributes = ["no_inter"] #this is what tells us that any dna that contains this part should not be considered for intermolecular reactions

#some parameters are useful also. note the "kint" parameter which determines the rate of recombination
parameters={"cooperativity":2,"kb":100, "ku":10, "ktx":.05, "ktl":.2, "kdeg":2,"kint":.05}
#here is where we define the integrase attachment sites

attP = IntegraseSite("attP","attP",integrase="Bxb1") #the first argument is the name of this attachment site, the second argument is the type of attachment site it is ("attP", "attB", "attL" or "attR") and the integrase denotes which integrase binds to this site (default is "int1")

attB = IntegraseSite("attB","attB",integrase="Bxb1") #we define two attachment sites, as one site doesn't do anything on its own besides bind integrases

bxb1_mechanism = IntegraseRule("Bxb1", reactions={("attB","attP"):"attL",("attP","attB"):"attR"})
#an integrase rule explains how the integrase functions. It is a dictionary which has valid combinations of sites for the keys, and the values are the names of the new sites generated. When two sites combine, they create two different sites, which are the result of the sites themselves getting recombined. In the real system, it's a bit like this:
#A==attB1:attB2==B + C==attP1:attP2==D -> A==attB1:attP2==D + C==attP1:attB2==B
#so then we say, attB1:attB2 is attB, attP1:attP2 is attP, attB1:attP2 is attL, attP1:attB2 is attR. the == represents a strand of DNA, labeled with A-D to keep them distinct
#If we are tracking what happens to the "A" strand, then we can simplify this a bit to say:
#A:attB + attP -> A:attL
#and likewise for the "C" strand,
#C:attP + attB -> C:attR
#this is what the "reactions" argument represents, and this is its default value.


bxb1 = Integrase_Enumerator("Bxb1", int_mechanisms={"Bxb1":bxb1_mechanism}) #we must also define an integrase enumerator. The default integrase is always "int1", but here we are specifying "Bxb1" as the integrase. The Integrase enumerator gets a name, and the mechanism we defined above.

#now that the parts are defined, we can put together our construct.
promoter_flip = DNA_construct([[t16,"reverse"],[rfp,"reverse"],[utr1,"reverse"],[attP,"forward"],[pconst,"forward"],[attB,"reverse"],[utr1,"forward"],[gfp,"forward"],[t16,"forward"],gen_ori])
promoter_flip.attributes = ["no_inter"]

#this is organized in a "promoter flipping" arrangment where the integrase causes an intramolecular reaction that results in the promoter facing the other direction, and transcribing a different set of genes.

myMixture = TxTlExtract(name = "txtl", parameters = parameters, components = [promoter_flip],global_component_enumerators=[bxb1],global_recursion_depth=2) #to activate integrase reactions you must add our `Integrase_Enumerator` as a global_component_enumerator in the mixture.
enumerated_constructs = myMixture.global_component_enumeration()
#then, we can plot the DNA constructs that will be present in our CRN:

if(dpl_enabled):
    plotter = CRNPlotter()
    for construct in enumerated_constructs:
        if(isinstance(construct,DNA_construct)):
            plotter.renderConstruct(construct)


#and, simulate the CRN
myCRN = myMixture.compile_crn()
print(len(myCRN.species))
print(len(myCRN.reactions))
for reaction in myCRN.reactions:
    rxcount = myCRN.reactions.count(reaction)
    if(rxcount>1):
        print(f"{reaction} found {rxcount} times!")
plotNetwork(myCRN,colordict = {"complex": "cyan", "protein": "green",
                     "dna": "white", "rna": "orange",
                     "ligand": "pink", "phosphate": "yellow",\
                     "nothing": "purple",\
                     "GFP":"lightgreen", "RFP":"red",\
                     "Bxb1":"yellow", "kint":"brown"},reactioncolordict = {"kint":"yellow","kb":"cornflowerblue","ktx":"orange"},use_pretty_print=False)

28
42

Loading BokehJS ...

/Users/murray/Library/CloudStorage/Dropbox/macosx/src/biocrnpyler/biocrnpyler/utils/plotting.py:203: UserWarning: Node keys in 'layout_function' don't match node keys in the graph. These nodes may not be displayed correctly.
  reaction_renderer = from_networkx(DGreactions, positions, center=(0, 0))
/Users/murray/Library/CloudStorage/Dropbox/macosx/src/biocrnpyler/biocrnpyler/utils/plotting.py:204: UserWarning: Node keys in 'layout_function' don't match node keys in the graph. These nodes may not be displayed correctly.
  species_renderer = from_networkx(DGspecies, positions, center=(0, 0))

[3]:

try:
    import numpy as np
    import matplotlib.pyplot as plt
    timepoints = np.linspace(0, 2000, 3000)
    x0 = {promoter_flip.get_species():6.0,
        "protein_Bxb1":5, "protein_RNAP":10., "protein_Ribo":50.}
    Re1 = myCRN.simulate_with_bioscrape_via_sbml(timepoints, initial_condition_dict = x0)
    if(Re1 is not None):
        ln1 = plt.plot(timepoints,Re1["protein_GFP"], label = "protein_GFP", color = "green")
        ln2 = plt.plot(timepoints,Re1["protein_RFP"], label = "protein_RFP", color = "red")
        unflipped = []
        flipped = []
        for a in Re1:
            if("attP" in a and "part" in a):
                unflipped += [a]
            elif("attL" in a and "part" in a):
                flipped += [a]
        ax1 = plt.gca()
        plt.ylabel("protein")
        ax2 = ax1.twinx()
        ln3 = ax2.plot(timepoints,Re1[unflipped].sum(axis=1), label = "initial construct", color='lightgreen')
        ln4 = ax2.plot(timepoints,Re1[flipped].sum(axis=1), label = "flipped promoter",color='pink')
        plt.title("Time trace of GFP -> RFP promoter-flipping system")
        plt.xlabel("time")
        plt.ylabel("dna",color="forestgreen")
        lns = ln1+ln2+ln3+ln4
        labs = [l.get_label() for l in lns]
        plt.legend(lns, labs, loc="lower right")
        #plt.legend(loc="lower right")
except ModuleNotFoundError:
    print('please install the plotting libraries: pip install biocrnpyler[all]')

../_images/examples_6._Integrase_Examples_3_0.png

Integrase reactions are depicted as yellow squares. You can also see that GFP goes up first, but then slows down as all the DNA is converted into the flipped form. The flipped DNA is also plotted

[4]:

plasmid1_construct = DNA_construct([t16,attP,attB,gfp],circular=True)
genome_construct = DNA_construct([pconst,attB,rfp,gen_ori])

recursion_depth = 1 #this determines how many times the components are asked about what integrase reactions they make

myMixture = TxTlExtract(name = "txtl", parameters = parameters, components = [plasmid1_construct,genome_construct],global_component_enumerators=[bxb1],global_recursion_depth=recursion_depth)

#the return_enumerated_components keyword causes the Mixture to also return all Components created during compilation
myCRN, enumerated_constructs = myMixture.compile_crn(return_enumerated_components = True)
#then, we can plot the DNA constructs that will be present in our CRN:

if(dpl_enabled):
    plotter = CRNPlotter()
    for construct in enumerated_constructs:
        if(isinstance(construct,DNA_construct)):
            plotter.renderConstruct(construct)

print(len(myCRN.species))
print(len(myCRN.reactions))

52
97

[5]:

try:
    import numpy as np
    import matplotlib.pyplot as plt
    timepoints = np.linspace(0, 2000, 3000)
    x0 = {plasmid1_construct.get_species():6.0, genome_construct.get_species():3,
        "protein_Bxb1":5, "protein_RNAP":10., "protein_Ribo":50.}
    Re1 = myCRN.simulate_with_bioscrape_via_sbml(timepoints, initial_condition_dict = x0)
    if(Re1 is not None):
        #ln1 = plt.plot(timepoints,Re1["protein_GFP"], label = "protein_GFP", color = "green")
        #ln2 = plt.plot(timepoints,Re1["protein_RFP"], label = "protein_RFP", color = "red")
        unflipped = []
        plasmid = []
        genome = []
        plasmid_inter = []
        dead_end = []
        dead_end2 = []
        integration_count = {}
        for a in Re1:
            if(("attP" in a) and ("attR" not in a) and  ("attL" not in a) and ("part" in a) and ("circular" in a) and ("attL" not in a)):
                #this is the unflipped plasmid
                plasmid += [a]
            elif(("attB" in a) and ("RFP" in a) and ("attL" not in a) and ("attR" not in a) and ("part" in a) and ("pconst" in a)):
                #this is the "genome" where the plasmid integrates
                genome += [a]
            elif(("attL" in a) and ("attR" not in a)  and ("circular" in a)):
                #this is the dead end byproduct
                dead_end += [a]
            elif(("attR" in a) and ("attL" not in a) and ("circular" in a)):
                #the other byproduct which cannot integrate
                dead_end2 += [a]
            elif(("attL" in a) and ("part" in a) and ("pconst" in a) and ("genome" in a)):
                int_count = a.count("attL")
                if(int_count in integration_count):
                    integration_count[int_count] += [a]
                else:
                    integration_count[int_count] = [a]
            elif(("attR" in a) and  ("circular" in a)):
                plasmid_inter += [a]
        plt.plot(timepoints,Re1[plasmid].sum(axis=1), label = "plasmid", color='lightgreen')
        plt.plot(timepoints,Re1[genome].sum(axis=1), label = "genome", color='black')
        plt.plot(timepoints,Re1[dead_end].sum(axis=1), label = "dead end 1", color='grey')
        plt.plot(timepoints,Re1[dead_end2].sum(axis=1), label = "dead end 2", color='darkgrey')
        plt.plot(timepoints,Re1[plasmid_inter].sum(axis=1), label = "plasmid intermolecular", color='red')

        for count, speciesnames in integration_count.items():
            plt.plot(timepoints,Re1[speciesnames].sum(axis=1), label=str(count))
        plt.legend()
        #plt.ylim(0,10)
except ModuleNotFoundError:
    print('please install the plotting libraries: pip install biocrnpyler[all]')

../_images/examples_6._Integrase_Examples_6_0.png

Plot the CRN

[6]:

plotNetwork(myCRN,colordict = {"complex": "cyan", "protein": "green",
                     "dna": "white", "rna": "orange",
                     "ligand": "pink", "phosphate": "yellow",\
                     "nothing": "purple",\
                     "GFP":"lightgreen", "RFP":"red",\
                     "Bxb1":"yellow", "kint":"brown"},reactioncolordict = {"kint":"yellow","kb":"cornflowerblue","ktx":"orange"},use_pretty_print=False)

Loading BokehJS ...

/Users/murray/Library/CloudStorage/Dropbox/macosx/src/biocrnpyler/biocrnpyler/utils/plotting.py:203: UserWarning: Node keys in 'layout_function' don't match node keys in the graph. These nodes may not be displayed correctly.
  reaction_renderer = from_networkx(DGreactions, positions, center=(0, 0))
/Users/murray/Library/CloudStorage/Dropbox/macosx/src/biocrnpyler/biocrnpyler/utils/plotting.py:204: UserWarning: Node keys in 'layout_function' don't match node keys in the graph. These nodes may not be displayed correctly.
  species_renderer = from_networkx(DGspecies, positions, center=(0, 0))

Integrase Enumerator Compilation Optimizations

Component Enumeration, especially with integrases, can take a long time. In the following cell Mixture.compile_crn(…) is called with a number of compilation directives that can dramatically speed up CRN compilation. The catch is that these directives can make the compilation process more fragile, prone to errors, and miss particular details. Each directive is explained below:

initial_concentrations_at_end (default = False): adds initial concentrations to the CRN at the very end of compilation. This means that only the Mixture’s ParameterDatabase is searched for concentrations, not Component’s or their ParameterDatabases. Setting to True speeds up compilation for CRNs with huge numbers of Species.
copy_objects (default = True): deep copies all Species and Reactions as they are added to the CRN. This ensures that changing a Component after the CRN is compiled does not effect the CRN. Setting to False significantly speeds up compilation, but means the compiled CRN may be subject to change.
add_reaction_species (default = True): Adds the Species inside each Reaction to the CRN whenever Reactions are added. This helps ensures no Species are missed. Setting to False speeds up compilation, in particularly for large numbers of reactions, but can occasionally result in errors.

[7]:

counts = []
depths = range(4)
# Add the genome origin so we don't compute intermolecular reactions for this test, or it will take too long
plasmid1_construct_test = DNA_construct([t16,attP,attB,gfp,gen_ori],
                                        circular=True)
for recursion_depth in depths:
    myMixture = TxTlExtract(name = "txtl", parameters = parameters, components = [plasmid1_construct_test,genome_construct],global_component_enumerators=[bxb1],global_recursion_depth=recursion_depth)
    myCRN, comps = myMixture.compile_crn(recursion_depth = recursion_depth,
                                         return_enumerated_components = True,
                                         initial_concentrations_at_end = True,
                                         copy_objects = False,
                                         add_reaction_species = False)
    print(recursion_depth, "C:", len(comps), "S:", len(myCRN.species), "R:", len(myCRN.reactions))
    counts.append((len(comps), len(myCRN.species),len(myCRN.reactions)))

try:
    import matplotlib.pyplot as plt
    plt.figure()
    plt.plot(depths, [c[0] for c in counts], label = "# Components")
    plt.plot(depths, [c[1] for c in counts], label = "# Species")
    plt.plot(depths, [c[2] for c in counts], label = "# Reactions")
    plt.legend()
    plt.xlabel("Recursion Depth")
except ModuleNotFoundError:
    print('please install the plotting libraries: pip install biocrnpyler[all]')

0 C: 14 S: 14 R: 12
1 C: 52 S: 36 R: 61
2 C: 246 S: 102 R: 303
3 C: 1272 S: 360 R: 1521

../_images/examples_6._Integrase_Examples_10_1.png