Over the last decades, technological advances in genome editing have led to more refined approaches for allele-specific manipulation, among them programmable endonucleases (e.g. TALEN, ZFN, CRISPR/Cas9) [1]. These tools allow the genome to be engineered not only with higher precision but also with greater feasibility and ease. Thanks to their versatility, they enable better investigation and understanding of gene function as well as the genomic underpinnings of human diseases. That said, genome editing can also result in undesired on-target sequence variants, ranging from single-base-pair variants up to large structural rearrangements [2].
Similarly, new methods and procedures have improved transgenesis and facilitated the development of various genetically engineered models, further stimulating drug discovery and target validation. Nevertheless, the suitability of a transgenic line will often depend on the location of the transgene integration site(s) and the integrity of the integrated transgene sequences, since these genetic outcomes determine whether gene expression is as desired or aberrant [3].
Despite developments in targeted gene sequencing and whole-genome analysis techniques, the robust detection of all genetic variation (including structural variants) in and around (trans)genes of interest remains a challenge [4]. In this Opinion & Review blog, we take a deeper dive into some of the most pressing genetic QC needs and highlight the increasing need for in-depth characterization of genetically engineered models to safeguard interpretable and reproducible experiments in genetic research.
I. A reliable method to precisely identify integration sites and detect structural variations
Traditional pronuclear microinjection, used to generate transgenic rodent models, is known to result in random integration. On top of that, the insertion can occur at one or more loci and can involve tandem repeat insertions (multiple copies, concatemerization) [1,5]. This reduces the certainty of reaching the desired level of gene expression in the resulting transgenic lines.
Although the importance of insertion site discovery is widely acknowledged, the identification of transgene integration sites was not routinely implemented because suitable discovery tools were scarce or inadequate. New opportunities arose with the introduction of high-throughput sequencing. Genentech [6] and The Jackson Laboratory (JAX) [7], for example, applied our TLA-based solutions to QC several of their transgenic mouse lines. In both projects, the exact genomic positions of the integration sites were identified, and unexpected structural variations (i.e. large genomic rearrangements) accompanying the engineering were also detected.
Figure 1. Schematic depictions of (TLA) sequencing coverage profiles resulting from different rearrangements.
In JAX’s project, TLA-based analysis also identified co-integration of bacterial (E. coli) sequences together with the transgene in 10 of the 40 analyzed lines [7]. These observations underline the importance of careful quality control strategies as part of genetic research.
II. Relationship between insertion sites and transgene expression patterns
There is also growing evidence of frequent interactions between transgenes and endogenous genes. Professor Sanes’ team at Harvard University documented such complex interactions and provided new insights into insertion site-dependent transgene expression [8]. Based on their results, the authors state that their findings further “strengthen the argument that determination of insertion sites can be useful both for gene discovery and for assessing effects of transgene insertion that would otherwise go undetected.”
Figure 2. HB9-GFP locus identification. (A) Schematic representation of the HB9-GFP transgene. Black arrows indicate the positions of TLA primer sets.
(B) Genome-wide TLA coverage plots with Mnx1 primers. Peak at Chromosome 5 shows endogenous Mnx1 (black circle) and peak at Chromosome 15 shows inserted sequence (red circle).
(C) Genome-wide TLA coverage using GFP primers, showing a peak at Chromosome 15 (red circle) [8].
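Coverage plots such as those in Figure 2 are read by looking for genomic regions with far more reads than the background. The sketch below is a toy illustration of that idea only, not the actual TLA analysis pipeline: the function name, the bin layout, the threshold and all read counts are made up for the example.

```python
from statistics import median


def find_coverage_peaks(binned_coverage, fold_over_median=10.0):
    """Return (chromosome, bin_index, reads) for bins far above background.

    binned_coverage: dict mapping chromosome name -> list of read counts per bin.
    fold_over_median: hypothetical threshold, not a published TLA parameter.
    """
    all_counts = [c for counts in binned_coverage.values() for c in counts]
    if not all_counts:
        return []
    # Use the genome-wide median as background; floor it at 1 to avoid a zero threshold.
    background = max(median(all_counts), 1)
    threshold = fold_over_median * background
    peaks = []
    for chrom, counts in binned_coverage.items():
        for index, reads in enumerate(counts):
            if reads >= threshold:
                peaks.append((chrom, index, reads))
    # Strongest peak first.
    return sorted(peaks, key=lambda peak: peak[2], reverse=True)


# Made-up read counts per bin, for illustration only.
toy_coverage = {
    "chr5": [2, 3, 410, 4],
    "chr15": [1, 2, 3, 380],
    "chr9": [2, 1, 3, 2],
}
for chrom, index, reads in find_coverage_peaks(toy_coverage):
    print(chrom, index, reads)
```

With primers that anneal to both an endogenous gene and the transgene, two such peaks would be expected (the endogenous locus and the insertion site), whereas transgene-specific primers would be expected to yield a peak at the insertion site only.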
III. Mapping insertion site(s) of large vector constructs
Transgenesis can also be achieved by means of larger constructs, such as bacterial artificial chromosomes (BACs). The carrying capacity of these large genomic clones is in the range of several hundred kilobases. In a recent study led by NIH researchers, a transgenic mouse was created using a 245 kb BAC to enable tissue-specific Cre expression (i.e. exclusively in white and brown adipose tissues). However, the sheer size of the transgene and the limited information on its potential recombination sites made it difficult for the scientists to determine the insertion site via standard sequencing methods. As a result, our capabilities were called upon for the in-depth genetic characterization of their Adipoq-cre BAC transgenic mice to determine the integrity of the integrated transgenic sequence [9].
Figure 3. (a, b) Results of TLA mapping of the Tg(Adipoq-cre)1Evdr mouse BAC insertion point.
For the analysis of large vectors (i.e. >50 kb), additional TLA primer pairs are added to ensure that sufficient sequencing coverage is generated across the vector and its integration site(s). Given the roughly 250 kb size of the BAC used to generate the transgenic mouse, five primer pairs internal to the transgene were used in this project, and our data revealed integration of the transgene on mouse chromosome 9, between exons 6 and 7 of Tbx18 [9].
IV. Developing reliable genotyping PCR strategies to determine zygosity
To investigate the physiological and pathophysiological roles of alkylglycerol monooxygenase (AGMO), scientists at the Medical University of Innsbruck generated an AGMO knockout mouse model using EUCOMM stem cells. However, their genotyping results matched neither the expected Mendelian distribution nor the enzymatic data. This peculiar outcome motivated them to perform a thorough genomic screening of their mouse model. In this study, TLA and nanopore sequencing together solved the enigma by revealing an unexpected tandem duplication event (following recombineering) at the exact locus of the homologous recombination [10]. This paper further illustrates the inability of conventional technologies to detect and reliably map more complex genome lesions cost-effectively, whereas TLA-based breakpoint sequences can be used to design and develop reliable PCR genotyping strategies.
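A mismatch between genotyping results and the expected Mendelian distribution can be made concrete with a chi-square goodness-of-fit check. The sketch below is a minimal illustration under stated assumptions: it assumes a heterozygous x heterozygous cross with an expected 1:2:1 ratio, and the offspring counts are invented. The cited study does not specify that this test was used, and the function names are hypothetical.

```python
def mendelian_chi_square(observed, expected_ratio=(1, 2, 1)):
    """Chi-square goodness-of-fit statistic for genotype counts.

    observed: counts of (wild-type, heterozygous, homozygous) offspring.
    expected_ratio: assumed Mendelian ratio for a het x het cross.
    """
    if len(observed) != len(expected_ratio):
        raise ValueError("observed and expected_ratio must have the same length")
    total = sum(observed)
    if total == 0:
        raise ValueError("at least one genotyped animal is required")
    ratio_sum = sum(expected_ratio)
    statistic = 0.0
    for count, part in zip(observed, expected_ratio):
        expected = total * part / ratio_sum
        statistic += (count - expected) ** 2 / expected
    return statistic


# Chi-square critical value for 2 degrees of freedom at alpha = 0.05.
CRITICAL_VALUE_DF2 = 5.991


def deviates_from_mendelian(observed):
    """True if the counts deviate from 1:2:1 at the 5% significance level."""
    return mendelian_chi_square(observed) > CRITICAL_VALUE_DF2


# Invented offspring counts, for illustration only.
print(deviates_from_mendelian((25, 50, 25)))  # False: exactly 1:2:1
print(deviates_from_mendelian((10, 80, 10)))  # True: strong excess of apparent heterozygotes
```

A significant deviation of this kind does not identify the cause; it only signals that the genotyping assay or the assumed allele structure deserves a closer look.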
V. Pitfalls of conventional targeted gene sequencing and whole-genome analysis technologies
Our comparison table (technological overview) zeroes in on the depth and breadth of genetic insights that different technologies (including WGS, PCR-based/capture approaches, qPCR/ddPCR, Southern blot or FISH) are able to yield.
If the intention is to scrutinize the entire genome, then WGS is the obvious choice. However, if the relevant (trans)genes are already known, this approach is neither time-efficient nor cost-effective: the large volume of data generated makes analysis and interpretation more complex. PCR-based approaches, on the other hand, are suitable for point mutations and small indels, but become rather cumbersome for larger (trans)genes. Moreover, given the hypothesis-driven nature of the technology, you will inherently miss anything you are not specifically looking for. Southern blot allows the detection of structural variations (e.g. rearrangements) but generates no sequence information. FISH (fluorescence in situ hybridization) provides only low-resolution mapping of transgenes and offers little information regarding mutagenesis at the integration site. Hence, data generated via Southern blot and FISH are both incomplete.
A complete and unbiased method for the selection and QC of genetically modified models
It is important to realize that many techniques, whether used for random or targeted integration, can give rise to undesired (off-target) integrations, multiple integration sites, unexpected integration of backbone sequences, and undesired sequence or structural variants in the integrated transgene sequence and the surrounding host genome sequence. In turn, these events can have phenotypic consequences and can confound experiments. For instance, random integration via traditional pronuclear injection to generate transgenic rodent models can disrupt key endogenous gene functions. Moreover, awareness of and concern about genetic mosaicism, which can lead to "false-positive" founders, have become more prominent in biomedical research. Therefore, the identification of integration sites and the development of reliable PCR genotyping strategies are highly beneficial for proper genotype-phenotype correlation, breeding optimization and intercrossing.
All in all, adopting a robust analytical tool and making in-depth genetic characterization common practice in genetic research will (1) help ensure interpretable and reproducible experiments and (2) facilitate the cost-effective management of transgenic animal facilities.
As illustrated by the peer-reviewed publications cited above, conventional technologies are either suboptimal or unable to fully resolve all the relevant genetic characteristics following genetic manipulation, as thorough QC requires. Our TLA-based method therefore represents a powerful and cost-effective QC solution for assessing the consequences of your genetic engineering at single-nucleotide resolution. In fact, JAX’s researchers state that “TLA should be considered a ‘first pass’ tool for integration locus discovery.” This sentiment is echoed by Dr. Welcker (Director Molecular Biology & Scientific Development at Taconic), who asserts that "TLA has become an indispensable tool to characterize transgene insertion events and complex gene targeting approaches."
To learn more about our customers’ experience regarding our TLA-based solutions and/or services, head to our Testimonials page! If you would like to get in touch to further discuss and/or assess how our proprietary TLA-based assays can best support you and your colleagues in your work, please reach out to our sales team (via email@example.com) and we will happily schedule a no-obligation consultation at your convenience.
[1] Gurumurthy CB, Lloyd KCK. Generating mouse models for biomedical research: technological advances. Dis Model Mech. 2019 Jan 8;12(1):dmm029462. doi: 10.1242/dmm.029462. PMID: 30626588; PMCID: PMC6361157.
[2] Michal Kosicki et al. Repair of double-strand breaks induced by CRISPR-Cas9 leads to large deletions and complex rearrangements. Nature Biotechnology 36(8):765-771 (2018).
[3] Christian S. Kaas et al. (2015). Deep sequencing reveals different compositions of mRNA transcribed from the F8 gene in a panel of FVIII-producing CHO cell lines. Biotechnology Journal 10(7): 1081-1089.
[4] de Vree, P., de Wit, E., Yilmaz, M. et al. Targeted sequencing by proximity ligation for comprehensive variant detection and local haplotyping. Nat Biotechnol 32, 1019–1025 (2014). https://doi.org/10.1038/nbt.2959
[5] Taconic Biosciences, Inc. Transgene Mapping Analysis by Targeted Locus Amplification Technology. https://www.taconic.com/pdfs/Transgene-Mapping-Analysis-A4.pdf
[6] Cain-Hom C, Splinter E, van Min M, Simonis M, van de Heijning M, Martinez M, Asghari V, Cox JC, Warming S. Efficient mapping of transgene integration sites and local structural changes in Cre transgenic mice using targeted locus amplification. Nucleic Acids Res. 2017 May 5;45(8):e62. doi: 10.1093/nar/gkw1329. PMID: 28053125; PMCID: PMC5416772.
[7] Goodwin, L. O., Splinter, E., Davis, T. L., Urban, R., He, H., Braun, R. E., Chesler, E. J., Kumar, V., van Min, M., Ndukum, J., Philip, V. M., Reinholdt, L. G., Svenson, K., White, J. K., Sasner, M., Lutz, C., & Murray, S. A. (2019). Large-scale discovery of mouse transgenic integration sites reveals frequent structural variation and insertional mutagenesis. Genome research, 29(3), 494–505. https://doi.org/10.1101/gr.233866.117
[8] Laboulaye, M. A., Duan, X., Qiao, M., Whitney, I. E., & Sanes, J. R. (2018). Mapping Transgene Insertion Sites Reveals Complex Interactions Between Mouse Transgenes and Neighboring Endogenous Genes. Frontiers in molecular neuroscience, 11, 385. https://doi.org/10.3389/fnmol.2018.00385
[9] Wong AM, Patel TP, Altman EK, Tugarinov N, Trivellin G & Yanovski JA. (2021). Characterization of the adiponectin promoter + Cre recombinase insertion in the Tg(Adipoq-cre)1Evdr mouse by targeted locus amplification and droplet digital PCR, Adipocyte, 10:1, 21-27, DOI: 10.1080/21623945.2020.1861728
[10] Sailer S, Coassin S, Lackner K, Fischer C, McNeill E, Streiter G, Kremser C, Maglione M, Green CM, Moralli D, Moschen AR, Keller MA, Golderer G, Werner-Felmayer G, Tegeder I, Channon KM, Davies B, Werner ER, Watschinger K. When the genome bluffs: a tandem duplication event during generation of a novel Agmo knockout mouse model fools routine genotyping. Cell Biosci. 2021 Mar 16;11(1):54. doi: 10.1186/s13578-021-00566-9. PMID: 33726865; PMCID: PMC7962373.