Document Type : Original Articles
Authors
1 Department of Microbiology, Faculty of Medicine, Shahed University, Tehran, Iran.
2 Razi Vaccine and Serum Research Institute (RVSRI), Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran.
The DTaP vaccine is a traditional polyvalent vaccine that protects infants and young children against the life-threatening diseases diphtheria, tetanus, and pertussis (1,2). Diphtheria is caused by infection with toxin-producing strains of Corynebacterium diphtheriae, a Gram-positive bacillus. Tetanus is caused by neurotoxins secreted by the Gram-positive bacillus Clostridium tetani (1,2). Pertussis is an acute, prolonged and highly contagious human respiratory disease characterized by coughing and caused by Bordetella pertussis (B. pertussis), a fastidious Gram-negative coccobacillus (3, 4). The whole-cell DTwP vaccine can cause side effects, including severe allergic reactions following vaccination that may be life-threatening and present with swelling of the face and throat, difficulty breathing, fast heartbeat, dizziness, and weakness with cyanosis. For this reason, a promising DTaP vaccine containing acellular pertussis components in place of killed pertussis cells has been licensed since 1991. However, countries using DTaP are now experiencing a resurgence of whooping cough, with the characteristic peak every 2-5 years, similar to the pre-vaccine era. The less effective long-term protection of DTaP (waning after 3-5 years) has allowed the pertussis epidemic cycles to resume. Accordingly, pertussis continues to be a global concern, with widely increased incidence in spite of high DTaP vaccination coverage in developed countries, mainly due to waning immunity and pathogen adaptation. Therefore, there is a need to develop a new generation of vaccine(s) that overcome the weaknesses of the current vaccines, improve disease control, and reduce the incidence of disease (5). Pertussis toxin (PT) is an AB toxin produced by Bordetella pertussis, with the A protomer (S1 subunit) as the toxic subunit and the B oligomer as the pentamer that binds to cell receptors. The S1 subunit consists of 234 amino acids and is immunodominant.
Antibodies to the S1 subunit have been shown to neutralize pertussis toxin in vitro and to protect mice against aerosol infection with B. pertussis. The B oligomer is composed of one each of the S2, S3 and S5 subunits and two S4 subunits. S2 and S3 mediate adherence of the toxin to host cells. As with the S1 subunit, antibodies against the B oligomer or the S2 and S3 subunits confer protection against B. pertussis infection in animal models, but they are less effective than antibodies to S1. Diphtheria toxin (DT), produced by Corynebacterium diphtheriae, is a secreted 535 amino acid protein responsible for the symptoms of diphtheria. After synthesis, DT is proteolytically cleaved into two fragments, A and B. The catalytic domain is located on fragment A, and the receptor-binding and translocation domains are located on fragment B. Fragment B is responsible for binding DT to specific cell surface receptors and for translocating fragment A into the cytosol. Fragment A catalyzes the ADP-ribosylation of elongation factor 2, resulting in inhibition of protein synthesis and cell death. Tetanus toxin (TT) is a 150 kDa protein produced by C. tetani as a single polypeptide, which is subsequently cleaved into a 100 kDa heavy (B) chain and a 50 kDa light (A) chain linked by two disulfide bonds. The carboxyl-terminal portion (TTC, 50 kDa) of the heavy chain binds to GT1b gangliosides on neuronal cells. The light chain, a zinc endopeptidase, is then internalized, transported to the central nervous system, and released at the presynaptic nerve terminals. TT acts by blocking the release of neurotransmitters at inhibitory synapses, leaving excitatory circuits unopposed and resulting in tetanic spasms. Toxin-neutralizing antibodies mediate immune protection against tetanus (6). DT and TT are well-characterized antigens currently used in licensed diphtheria-tetanus vaccines combined with whole-cell or acellular pertussis (DTwP and DTaP).
In DTaP vaccines, the pertussis antigens are a combination of pertussis toxin (PT), filamentous hemagglutinin (FHA), pertactin and fimbriae. FHA contains two immunodominant domains, named type 1 and type 2, corresponding to the carboxyl and amino termini of the protein, respectively. Vaccines containing one or more of the PT and FHA antigens have been shown to be very similar in efficacy to those containing all four pertussis antigens (6). However, while the efficacy of DTaP vaccines is well documented, the vaccines are expensive to produce because the antigens are derived and purified from three different bacteria. Developing antigens that can be produced by a single organism would therefore simplify vaccine production and reduce cost. Producing recombinant antigens may also reduce the side effects of chemicals, such as formalin, used in toxin inactivation (7). The aim of this project was to design a single protein containing three immunodominant parts of the Corynebacterium diphtheriae, Clostridium tetani and Bordetella pertussis toxins, with a view to a higher immune response and fewer side effects. The current study generated a fusion protein containing the N-terminal 179 amino acid fragment of the PT S1 subunit, which has been shown to be protective in an animal infection model; the full-length genetically detoxified DT; and the 50 kDa tetanus toxin fragment C (TTC), which has also been shown to confer protection in toxin challenge assays in animal models. The work represents a first step in demonstrating proof of principle (concept) that such an approach could be explored to introduce a new generation of DTaP vaccines.
2.1. Bioinformatics study
This research was conducted in the Research and Development department of the Razi Vaccine and Serum Research Institute of Iran. Data preparation and classification, sequence editing and repair, and selection of the best analysis methods were performed using BioEdit 6.6.2 and MEGA X software. The linear epitopes, spatial (discontinuous) epitopes, domains, and immunodominant regions of the crystallized toxin proteins were then studied for all three bacteria. Experimental spatial and linear B-cell epitopes were examined in the IEDB database, and consensus epitopes were predicted using a variety of physicochemical and biological methods. The DiscoTope, ElliPro, and CBTOPE servers were used to predict spatial and continuous epitopes. The final gene construct was designed on the basis of the bioinformatics analysis, and the desired gene was then cloned in Escherichia coli strain DH5α (Gibco-BRL). The complete amino acid sequences of Bordetella pertussis toxin subunit 1 (P04977), Corynebacterium diphtheriae toxin (Tox) (Q6KE85) and Clostridium tetani toxin (P04958) were obtained from the UniProtKB database (https://www.uniprot.org/uniprot/). The sequences were then inspected for signal peptides (SP), chains (domains), disulfide bonds, and binding and active sites by reading the annotations (with prediction where required) and by analysis in BioEdit 6.6.9 software. To find the conformational/discontinuous immunogenic regions, three-dimensional structures of the proteins were generated by modeling, in addition to using the sequences. PTXA was modeled with the SWISS-MODEL server (https://swissmodel.expasy.org/) by homology modeling, because a three-dimensional structure with more than 40% similarity was available among the templates in the PDB (Protein Data Bank, the protein 3D structure database).
In this server, GMQE (Global Model Quality Estimation) is a quality estimate that combines properties from the target-template alignment and the template structure (9). In addition, the expected similarity of each model residue to the native structure is shown in the "Local Quality" plot. Typically, residues with a score below 0.6 are expected to be of low quality. Different model chains are displayed in different colors. If the model is downloaded, the local score is reported in the B-factor column of the PDB file (10, 11). The local quality estimation plot and the comparison with a non-redundant set of PDB structures are also provided. In the "Comparison" plot (8), the quality scores of individual models are related to the scores obtained for experimental structures of similar size. The x-axis is the protein length (number of residues) and the y-axis is the normalized QMEAN score. Each point represents an experimental protein structure. Black dots are experimental structures with a normalized QMEAN score within 1 standard deviation of the mean (|Z-score| between 0 and 1), and experimental structures with a |Z-score| between 1 and 2 are gray. Experimental structures that are even further from the mean are light gray. The actual model is shown as a red star. The mean and standard deviation of the experimental structures around the x-location of the star are used to calculate the QMEAN Z-score of the model, i.e. how many standard deviations the model's score lies from the mean score of the experimental structures (10, 11). The TetX and Tox 3D structures were modeled by the threading method using the I-TASSER (Iterative Threading ASSEmbly Refinement) server (https://zhanglab.dcmb.med.umich.edu/I-TASSER/), a hierarchical approach to protein structure prediction and structure-based function annotation.
Structural templates from the PDB were first identified by multiple threading approaches using LOMETS, and full-length atomic models were then constructed by iterative template-based fragment assembly simulations. Energy minimization and model validation were performed using the GROMOS96 implementation in the Swiss-PdbViewer software version 4 (http://www.expasy.org/spdbv/). Structural validation was also carried out by assessing the Ramachandran plot (with the PDB file as input) using PROCHECK as implemented in the SAVES v6.0 (UCLA-DOE LAB) server (https://saves.mbi.ucla.edu/). The DiscoTope 2.0 (12) and ElliPro (13) servers were used to predict discontinuous/conformational B cell epitopes from 3D protein structures in PDB format. CBTOPE (14), which applies a Support Vector Machine (SVM) based method to the amino acid composition of the query sequence(s), was also used to predict conformational epitopes from the amino acid sequence, with a reported prediction accuracy of more than 85% (14). Specifically, DiscoTope calculates and combines contact numbers (a measure of surface accessibility) and the propensity scores of residues in spatial proximity (12). Based on CASP (Critical Assessment of Protein Structure Prediction) experiments, DiscoTope (12) is more valid than other conformational epitope prediction servers. ElliPro was also used to predict antibody epitopes from an input protein structure in PDB format, based on solvent accessibility and flexibility, with a score defined as the Protrusion Index (PI) value averaged over the epitope residues (13). In addition, the Kolaskar and Tongaonkar method (15) was used to predict linear antigenic epitopes based on the physicochemical properties of amino acid residues, with a reported prediction accuracy of 75%. Similarly, Emini Surface Accessibility Prediction was used for linear epitope prediction.
The ABCpred server was also used to predict linear epitopes with a machine learning approach based on an artificial neural network (ANN) algorithm (https://webs.iiitd.edu.in/raghava/abcpred/ABC_method.html). Finally, to determine precisely the immunodominant epitopes targeted by neutralizing antibodies, the predicted epitopes and the regions/residues with immunogenic potential based on the 3D structures of the toxins were analyzed to find consensus immunogenic regions in the native proteins. In addition, the IEDB server (https://www.iedb.org/) was searched for previously reported experimental epitopes. The amino acid sequences of the selected consensus epitopes (immunogenic regions) of the three toxins were manually assembled. Different patterns of glycine and serine (gsgg, ggggs, ggg, ggss, ggssg, ggssgg, ssg) were used as flexible linkers. The final preferred epitopes were designed and selected through peptide engineering, joining the epitopes with appropriate linkers according to criteria such as greater immunogenicity and the physicochemical properties of the vaccine construct, including molecular weight, pH, half-life, solubility and stability. The essential tags, expression and structure of the construct were then evaluated in terms of immunogenicity, the estimated occurrence of the desired characteristics in the body, and proper processing. The fusion peptides (poly-epitope construct) were ordered as TetX (tetanus toxin)-Tox (diphtheria toxin)-PTXA (from pertussis toxin A). The construct also began with an attached secretory signal peptide. Post-translational modifications (PTMs) such as glycosylation and enzymatic cleavage sites were neglected because the epitopes originate from bacteria. Where it was necessary to maintain the structural conformation of immunogenic regions in the new synthetic protein relative to the native protein, this was considered during the design.
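The consensus selection and linker-based assembly described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function names `consensus_residues` and `assemble`, the voting threshold, the residue positions and the placeholder peptides are all hypothetical and are not the study's data or its exact selection procedure, which also weighed structural and experimental criteria.

```python
from collections import Counter

def consensus_residues(predictions, min_votes=2):
    """Return sorted residue positions predicted by at least min_votes methods."""
    votes = Counter(pos for residues in predictions.values() for pos in set(residues))
    return sorted(pos for pos, n in votes.items() if n >= min_votes)

def assemble(segments, linkers):
    """Join peptide segments with one flexible linker between each adjacent pair."""
    if len(linkers) != len(segments) - 1:
        raise ValueError("need exactly one linker per junction")
    parts = [segments[0]]
    for linker, segment in zip(linkers, segments[1:]):
        parts.append(linker.upper())  # linkers are listed in lower case in the text
        parts.append(segment)
    return "".join(parts)

# Hypothetical predictor outputs (residue positions), not the study's results.
predictions = {
    "DiscoTope": [10, 11, 12, 40],
    "ElliPro": [11, 12, 13, 40],
    "CBTOPE": [12, 13, 41],
}
consensus = consensus_residues(predictions)

# Hypothetical placeholder peptides in TetX-Tox-PTXA order, joined by two of the listed linkers.
construct = assemble(["AAAK", "DDDE", "FFFW"], ["ggggs", "ggssg"])
print(consensus, construct)
```

A simple majority vote like this only captures the "agreement between servers" criterion; the remaining criteria (surface exposure, flexibility, known experimental epitopes) would need to be applied on top of it.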
Various physicochemical parameters, such as the theoretical isoelectric point (pI), molecular weight, total number of positive and negative residues, extinction coefficient, instability index, aliphatic index and Grand Average of Hydropathicity (GRAVY), were estimated using the Expasy ProtParam server (https://web.expasy.org/protparam). The protein sequence of the cassette was reverse translated (back-translated) into DNA using the online JCat (Java Codon Adaptation Tool) server (16) without further manipulation. JCat is a tool for adapting the codon usage of a target gene to that of most sequenced prokaryotes and some eukaryotic gene expression hosts. It is used to improve heterologous protein production and does not require manual definition of the highly expressed genes. The reverse translation tool takes a protein sequence as input and uses a codon usage table to generate a DNA sequence representing the most likely non-degenerate coding sequence. The settings also take into account the avoidance of unwanted cleavage sites for restriction enzymes. Finally, the NcoI restriction enzyme (RE) site (5'-C/CATGG-3') was added at the beginning, and the BamHI RE site (5'-G/GATCC-3') and a stop codon were added at the end of the annotated cassettes, for insertion into the expression vector (16).
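The reverse translation and cloning-site steps above can be illustrated with a short Python sketch. It is not JCat's algorithm: the `PREFERRED` table is an illustrative one-codon-per-residue choice assumed here for E. coli, the toy peptide is hypothetical, and the helper names are invented for this example. The sketch also ignores the reading frame relative to the ATG inside the NcoI site.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codon -> amino acid ('*' marks a stop codon).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative preferred codons (an assumption, not the table used by JCat).
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

# Both recognition sites are palindromic, so scanning one strand is sufficient.
SITES = {"NcoI": "CCATGG", "BamHI": "GGATCC"}

def reverse_translate(protein):
    """Back-translate a protein using one preferred codon per residue."""
    return "".join(PREFERRED[aa] for aa in protein.upper())

def translate(dna):
    """Translate complete codons of a DNA string with the standard code."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def gc_percent(dna):
    """Percentage of G and C bases in the sequence."""
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)

def internal_sites(dna):
    """Names of the cloning enzymes whose sites occur inside the sequence."""
    return [name for name, site in SITES.items() if site in dna]

def add_cloning_ends(cds):
    """NcoI site at the 5' end; stop codon followed by BamHI site at the 3' end."""
    return SITES["NcoI"] + cds + "TAA" + SITES["BamHI"]

toy_protein = "MKTGSW"  # hypothetical peptide, not part of the construct
cds = reverse_translate(toy_protein)
print(cds, round(gc_percent(cds), 2), internal_sites(cds), add_cloning_ends(cds))
```

Checking `internal_sites` before calling `add_cloning_ends` mirrors the order in the text: the coding sequence is first cleared of unwanted sites, and only then are the flanking sites added.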
3.1. Modeling, Energy Minimization and Toxin Validation
The 3D structural model of PTXA was generated using the SWISS-MODEL server based on 99.15% identity to the template with PDB ID 1prt. The GMQE and QMEAN values for PTXA were 0.91 and -1.68, respectively. As the GMQE score is close to 1, it indicates a high degree of accuracy of the modeled structure (Figure 1). Also, for its size (number of residues), the model is in a good position compared with other characterized 3D protein structures. The TetX and Tox sequences were modeled using the I-TASSER server, as the similarity between these sequences and the templates in the PDB database was no higher than 36.97% (PDB ID=3v0a) and 22% (PDB ID=6dkk), respectively. The predicted structures (Figure 2) were further subjected to constrained energy minimization with a harmonic constraint of 100 kJ·mol⁻¹·Å⁻². The constraint was applied to all protein atoms using the steepest descent and conjugate gradient techniques with the GROMOS96 43B1 parameter set, as implemented in Swiss-PdbViewer, to eliminate bad contacts between structural water molecules and protein atoms and to improve the stereochemistry of the model.
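As an illustration of how a Z-score such as the QMEAN value reported above is obtained, the following minimal Python sketch computes how many standard deviations a model score lies from the mean of a reference set. The function name `z_score` and the reference scores are hypothetical and chosen only for illustration; they are not SWISS-MODEL data, and the server itself selects experimental structures of similar size as its reference.

```python
from statistics import mean, stdev

def z_score(model_score, reference_scores):
    """Number of standard deviations between model_score and the reference mean."""
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference scores have zero spread")
    return (model_score - mu) / sigma

# Hypothetical normalized scores of experimental structures of similar size.
reference = [0.70, 0.75, 0.80, 0.85, 0.90]
model = 0.70  # hypothetical model score
print(round(z_score(model, reference), 2))
```

A negative value, as for PTXA, means that the model scores below the mean of the reference structures; its magnitude indicates how far below.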
In addition, the stereochemical quality of the TetX and Tox structures was evaluated by analyzing the geometry of each residue and the overall structure geometry using the PROCHECK program. For TetX, the Ramachandran plot showed 90.9% of residues in the favored region, 8.6% in the allowed region and 0.4% (27 labeled residues out of 1313) in the outlier region (Figure 3). Similarly to TetX, the Ramachandran plot for the Tox structure showed 91.8% of residues in the favored region, 6.9% in the allowed region and 1.1% (14 labeled residues out of 511) in the outlier region (Figure 4). Taken together, these results showed that all modeled structures were suitable and valid. Energy minimization was performed in vacuum with the GROMOS96 43B1 parameter set, without a reaction field, for the three modeled bacterial toxins. After energy minimization, the calculated total energies of the models for PTXA, TetX and Tox were E = -23447.215 kJ/mol, E = -78556.586 kJ/mol and E = -23549.840 kJ/mol, respectively. Since the minimization lowered the energy relative to the initial state (negative ∆G), the predicted structures were considered energetically favorable.
3.2. Conformational and Linear B-cell Epitope Prediction by Hybrid Approach
Tables 1-3 show the B-cell epitope residues identified by the different methods. Figure 1 also shows the results of the SEPPA and ElliPro servers visualized on the 3D modeled structures. An informative 3D visualization of the predicted consensus immunogenic regions in alpha and NetB toxin, in ribbon and surface representations, is shown in Figure 2.
3.3. Final Selection of Consensus Immunogenic Regions
Finally, the consensus immunogenic regions were selected on the three-dimensional structures of the modeled toxins based on the following criteria: the scores reported by the epitope prediction servers, especially those predicting conformational/discontinuous epitopes; the degree of agreement between servers on the epitopes; hydrophilicity, surface exposure and flexibility; the configuration of the epitopes or immunogenic regions in 3D space; the structural adaptability of the new synthetic protein; and consistency with known experimental epitope data. The sequences and 3D visualizations (ribbon representation) of the predicted consensus immunogenic regions for PTXA, TetX and Tox are shown in Figures 5, 6 and 7. For the prediction of linear epitopes, different methods were used to increase the prediction accuracy, although the machine learning method provides more reliable results. Prediction of spatial epitopes is nevertheless much more important than prediction of linear epitopes, since most epitopes (more than 80%) that are immunogenic and recognized by B cells are conformational/discontinuous rather than linear (19). The protein sequence of the fusion multi-epitope construct was also searched in GenBank for corresponding similarities (Figure 8). The construct was arranged in the order tetanus toxin (TetX)-diphtheria toxin (Tox)-pertussis toxin (PTXA), from the 5' to the 3' direction.
3.4. Prediction of the Physicochemical Properties of Construct
To predict the in vitro function and properties of the candidate vaccine construct, its physicochemical properties were evaluated so that any critical corrections could be made. The results showed that the total number of amino acids was 546, the theoretical pI was 5.94, the total number of negatively charged residues (Asp + Glu) was 64, and the total number of positively charged residues (Arg + Lys) was 57. The estimated half-life was 30 hours (mammalian reticulocytes, in vitro), >20 hours (yeast, in vivo) and >10 hours (Escherichia coli, in vivo). The instability index (II) was calculated as 35.58, which classifies the protein as stable, since a protein with an instability index of less than 40 is predicted to be stable. The aliphatic index (AI) of the fused peptide was 71.67; being above 40, this reflects thermal stability of the protein. The grand average of hydropathicity (GRAVY) was -0.475, which indicates a hydrophilic protein and reflects suitable solubility for the fusion toxin construct.
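Two of the indices reported above can be reproduced from a sequence with a few lines of Python. The sketch below uses the standard definitions (GRAVY as the mean Kyte-Doolittle hydropathy value, and the aliphatic index as a weighted sum of the mole percentages of Ala, Val, Ile and Leu); the helper names and the toy peptide are hypothetical, and the values for the actual construct would come from the ProtParam server rather than from this sketch.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

def aliphatic_index(seq):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)
    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))

def charged_counts(seq):
    """Return (Asp + Glu, Arg + Lys) residue counts."""
    negative = seq.count("D") + seq.count("E")
    positive = seq.count("R") + seq.count("K")
    return negative, positive

toy = "AVILDEKR"  # hypothetical peptide used only to exercise the functions
print(gravy(toy), aliphatic_index(toy), charged_counts(toy))
```

A negative `gravy` result corresponds to a hydrophilic sequence, which is the interpretation applied to the construct's value of -0.475.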
3.5. Reverse Translation, Codon Optimization and Final Optimizations of Synthetic Fusion Protein
The final fusion protein sequence was designed as a cassette containing the tetanus toxin (TetX)-diphtheria toxin (Tox)-pertussis toxin (PTXA) fusion peptides:
Reverse translation was performed with codon usage adapted to Escherichia coli (strain K12), and codons were changed where necessary to remove restriction enzyme (RE) sites from the optimized construct. An optimized polyhistidine tag motif (deca (8x)-histidine tag) was added to the 3' end of the construct, just before the TAA stop codon. The GC content before and after optimization was 50.734% and 50.183%, respectively. The trivalent fusion protein, designed as a recombinant poly-epitope vaccine candidate, showed optimal physicochemical properties. The construct sequence was further refined to ensure the absence of internal NcoI, BamHI and some other restriction enzyme sites. NcoI (5'-C/CATGG-3') and BamHI (5'-G/GATCC-3') sites were then added to the 5' and 3' ends, respectively. To avoid frameshift problems, the ATG start codon of the signal peptide was replaced by GC nucleotides. Consequently, the final nucleotide construct sequence after reverse translation and optimization was as follows:
Vaccination of Iranian children began in the 1940s with the use of domestically manufactured vaccines against three deadly diseases: diphtheria, tetanus and pertussis. In the rare cases of adverse reactions in children, complications may be related to the vaccine formulation methods as well as to the cellular components of the bacteria used in the production processes (17, 18). The use of whole cells or several proteins in the vaccine composition can cause side effects, including high fever, weakness, lethargy and, in rare cases, anaphylactic shock. Recombinant DNA procedures allow genes to be fused in a simple way in order to focus the immune response, facilitate production, and reduce side effects and costs (19, 20). It was therefore decided to use modern immunobioinformatics methods and cost-effective recombinant protein technology to design better immunogenic antigens in a safe system, without the side effects of chemicals such as formalin, which is used to inactivate toxins and is often allergenic (3).
Numerous studies have shown that recombinant bacterial vaccines successfully stimulate immune responses and that their production, in contrast to conventional inactivated antigen processing and purification, allows high expression levels and improves vaccine safety (19, 20). Aminian et al. used two copies of the Bordetella pertussis toxin S1 fragment together with modified toxin fragments of Corynebacterium diphtheriae (DT) and Clostridium tetani (TT) as a recombinant fusion protein-based vaccine candidate (6). They cloned the three desired sequences in order to build a multi-step cassette construct and then expressed it in an expression vector, with the multi-step procedure adding complexity. They concluded that the antisera produced against the recombinant fusion protein recognized native PT, DT and TT, but the protective efficacy of the fusion protein in animal models was not examined. In another study, Torkashvand et al. used a double combination of Bordetella pertussis toxin and FHA fragments expressed in E. coli, which successfully stimulated immune responses (8). They showed an adequate systemic and mucosal immune response after administration of the recombinant fusion protein (F1S1) via the subcutaneous (SC) or intranasal (IN) route. Immunization with the F1S1 fusion protein induced high levels of serum-specific IgG and lung IgA antibodies as well as Th1-specific T cell responses, which were considered appropriate immune responses. In addition, the high production of recombinant F1S1 protein in E. coli was regarded as another advantage of this fusion protein (3). Ruth et al. also developed an oral triple vaccine against diphtheria, tetanus and pertussis expressed in tomato plants and showed protective specific antibodies in mice (21).
In the current study, several databases and tools were used to investigate the distribution of the protein sequences, to find similar proteins, and to detect mutation sites and removed or deleted segments along the protein sequences. Protein sequence annotation was performed and the correctness of the sequences was checked in UniProt. Sequence similarity was examined, including the location and extent of protein changes, the mutation sites along the protein sequence, and the coverage of the removed sequence. The variability of the whole sequence was then examined by drawing the entropy curve in the BioEdit program. DiscoTope, CBTOPE and the PDB (Protein Data Bank) were used to study the three-dimensional structure, and the final model was generated in PDB format. Moreover, Expasy tools and the Kolaskar algorithm were used to identify the physicochemical properties of the fusion protein (chimera) sequence in the form TetX-Tox-PTXA, in terms of hydrophobicity, hydrophilicity and potential curvature. Modeling was performed using the SWISS-MODEL program, and linear and spatial B cell epitopes were predicted using the IEDB Analysis Resource. The epitopes were ranked by score, and those with values above the threshold for the target proteins were selected; continuous and discontinuous conformational epitopes were then predicted with the ElliPro and DiscoTope servers. Finally, the epitopes joined with serine and glycine linkers were checked and the desired structure was finalized. Once the structure was designed, the hydrophilicity curve, the processing of the chimeric protein in the body and the physicochemical properties of the candidate chimeric fragment were determined, reverse translation and codon optimization were performed, and the desired construct was ultimately ordered.
Moreover, a construct was designed based on the three diphtheria, tetanus and pertussis toxin antigens, with the accession numbers TOX (Q6KE85), TEX (P04958), and PTX (P04977), respectively. It also included the cleavage sites required for its ligation, the nucleotide sequences encoding the glycine and serine linkers, and other sequences required for expression and purification (7). The designed sequence was cloned between the XbaI and XhoI cleavage sites of the expression vector pET-28a (+). To place such genes together, researchers have used various methods, including synthesis of the desired genes in full length, which is not only expensive but also time-consuming because of the synthesis complexity caused by the full length of the sequence. In another approach, the initial cloning was performed in a non-expression vector followed by subcloning, which was also time-consuming and complex because of the non-specificity of the vectors at the ligation stage. Here, for the first time, a specific pET-based vector carrying the designed genes was constructed for the production of large amounts of the bioactive fusion protein with tag linkage capability. In addition, the pET vector carried a 6×His-tag sequence, which normally does not affect the structure or activity of the recombinant protein, and the tag at the C-terminal end of the expression construct was in frame with the coding sequence of the fusion protein. This tag consisted of six consecutive histidine residues, allowing the recombinant fusion protein to be purified using affinity resins. Using the pET system for purification of the recombinant protein, with the combined method of histidine tagging and purging, significantly reduces the presence of contaminants in the final solution.
Also, the intact protein sequence of the final construct of the Bordetella pertussis, Corynebacterium diphtheriae and Clostridium tetani toxins was used to analyze probable allergenic reactions. The use of whole proteins in vaccine production can cause side effects, including local and systemic allergic reactions ranging from skin rash to, in rare cases, anaphylactic shock with a high mortality rate; bioinformatics solutions should therefore be considered for their cost effectiveness and speed (19, 22). The use of a modified recombinant fusion protein can also be considered as a way to reduce the side effects of chemicals such as formalin used in the inactivation process, which can be a potent local and systemic allergen. Finally, a fusion protein construct composed of immunogenic regions from three bacterial organisms may provide a novel, potent, safe, cheap and broad-spectrum alternative to the DTP vaccine and may improve vaccine efficacy/effectiveness for the prevention and eradication of diphtheria, tetanus and pertussis infections in the future.