Welcome to Flow Pharma Inc.

## New Research: Paired SARS-CoV-2 spike protein mutations observed during ongoing SARS-CoV-2 viral transfer from humans to minks and back to humans

### Abstract

A mutation analysis of SARS-CoV-2 genomes collected around the world sorted by sequence, date, geographic location, and species has revealed a large number of variants from the initial reference sequence in Wuhan. This analysis also reveals that humans infected with SARS-CoV-2 have infected mink populations in the Netherlands, Denmark, the United States, and Canada. In these animals, a small set of mutations in the spike protein receptor binding domain (RBD), often occurring in specific combinations, has transferred back into humans. The viral genomic mutations in minks observed in the Netherlands and Denmark show the potential for new mutations on the SARS-CoV-2 spike protein RBD to be introduced into humans by zoonotic transfer. Our data suggest that close attention to viral transfer from humans to farm animals and pets will be required to prevent build-up of a viral reservoir for potential future zoonotic transfer.


## 1. Introduction

Coronaviruses are thought to have ancient origins extending back tens of millions of years with coevolution tied to bats and birds (Wertheim et al., 2013). This subfamily of viruses contains proofreading mechanisms, rare in other RNA viruses, reducing the frequency of mutations that might alter viral fitness (Cyranoski, 2020). The D614G mutation, which lies outside the RBD, is an example of a fitness-enhancing mutation on the spike glycoprotein that became the most prevalent variant as the virus spread through human populations (Korber et al., 2020). The recently observed, and more infectious, D796H mutation paired with ΔH69/V70, also outside the RBD, first observed in January 2020, has spread throughout Southeast England (Kemp et al., 2020). Data from next-generation sequencing has shown that the SARS-CoV-2 viral genome mutates at about half the rate of influenza and about a quarter of the rate seen for HIV, with about 10 nucleotides of average difference between samples (Callaway, 2020). The Global Initiative on Sharing Avian Influenza Data (GISAID) (Elbe and Buckland-Merrett, 2017; Shu and McCauley, 2017) has catalogued over 235,299 SARS-CoV-2 sequences to date from samples provided by laboratories around the world. This diversity is profound with well over 12,000 mutations having been shown to exist, with the potential for non-synonymous substitutions, insertions, or deletions resulting in amino acid changes which could result in structural and functional changes in virus proteins (Callaway, 2020; Mercatelli and Giorgi, 2020; Sheikh et al., 2020; Pokhrel et al., 2021). While coronaviruses initially developed in animals and transferred to humans, transfer back to animals and then back to humans again has recently been observed in the Neovison vison species of mink, currently being raised in farms around the world.

Non-silent mutations in the SARS-CoV-2 spike glycoprotein gene can produce structural and functional changes impacting host receptor binding, and viral entry into the cell. Specifically, a mutation in the RBD region, residues 333–527 (Lan et al., 2020) can potentially either increase or decrease the affinity of spike protein for the human ACE2 receptor, directly influencing the ability of the virus to enter a cell and successfully infect the host. The receptor binding motif (RBM), located within the RBD from 438 to 506 (Lan et al., 2020), plays a key role in enabling virus contact with ACE2 and is an optimal target for neutralizing antibodies (Shang et al., 2020). Mutations in this region, which also corresponds to the complementarity-determining regions (CDR) of a potentially binding immunoglobulin, could affect the antibody’s ability to neutralize the virus (Greaney et al., 2020).

Here we examined the mutations that were associated with the transfer of SARS-CoV-2 from humans to minks and back to humans. This example of zoonotic transfer highlights the importance of paired mutations in the RBD domain and suggests potential challenges for sustained efficacy of neutralizing antibodies focusing on that region.

## 2. Methods

### 2.1. Viral sequencing analysis

All 235,299 spike glycoprotein amino acid sequences available through December 5th were downloaded from GISAID (Elbe and Buckland-Merrett, 2017; Shu and McCauley, 2017). Before downloading, full genome nucleotide sequences were individually aligned via MAFFT (Katoh and Standley, 2013) to the WIV04 (MN996528.1) (Zhou et al., 2020) internal reference sequence. Aligned sequences were translated to amino acids based on spike glycoprotein reference sequence positions 21,563 to 25,384.
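The coordinate arithmetic behind this translation step can be sketched as follows. This is a minimal illustration and not the pipeline used for the study; the function names are hypothetical, and the sketch assumes that the two reference positions quoted above are 1-based and inclusive.

```python
from typing import Tuple

SPIKE_START = 21563  # first nucleotide of the spike gene (1-based, from the text)
SPIKE_END = 25384    # last nucleotide of the spike gene (1-based, inclusive)


def spike_codon_count(start: int = SPIKE_START, end: int = SPIKE_END) -> int:
    """Number of codons in the spike open reading frame, stop codon included."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("region length is not a multiple of three")
    return length // 3


def codon_coordinates(residue: int, start: int = SPIKE_START) -> Tuple[int, int, int]:
    """Genomic positions (1-based) of the three nucleotides encoding a spike residue."""
    if residue < 1:
        raise ValueError("residue numbering is 1-based")
    first = start + 3 * (residue - 1)
    return (first, first + 1, first + 2)
```

Under these assumptions the region spans 3,822 nucleotides (1,274 codons including the stop codon), and residue 614 maps to genomic positions 23,402 to 23,404.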

The downloaded spike glycoprotein file was re-aligned via MAFFT (Katoh and Standley, 2013) against WIV04 (MN996528.1) (Zhou et al., 2020) with position numbering kept constant. Sequences were split between human (206,591 sequences) and the Neovison vison species of mink (332 sequences).

A custom Python (Python 3, 2020) script was utilized to compare the variants seen in mink and humans differing from the WIV04 reference sequence. Only mutations with at least three occurrences in minks were retained. After initial examination of the whole genome, all mutations inside of, or within 25 amino acids of, RBD (333–527) (Lan et al., 2020) were also kept for analysis. The script was rerun using only sequences and meta-data matching those criteria. Thirteen human and three mink sequences were removed that were found to contain 25 or more gaps or mutations compared to the WIV04 reference. This was done to avoid the potential for improper mutation identification confounding downstream analysis. After all these filtering procedures, 782 human and 251 mink sequences remained for analysis.
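The filtering logic described above might look roughly like the sketch below. It is not the custom script used in the study: the function names and the dictionary-based input format are hypothetical, insertions relative to the reference are ignored, and the filters are applied in a simplified order (divergent sequences are dropped first).

```python
from collections import Counter

RBD_START, RBD_END = 333, 527   # RBD residues, as given in the text
FLANK = 25                      # residues kept on either side of the RBD
MIN_MINK_COUNT = 3              # minimum occurrences in mink sequences
MAX_DIFFERENCES = 25            # sequences with this many differences or more are dropped


def call_mutations(reference: str, sample: str) -> list:
    """Return substitutions such as 'F486L' for an aligned sample (gaps appear as '-')."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt
    ]


def near_rbd(mutation: str) -> bool:
    """True when the position lies inside the RBD or within FLANK residues of it."""
    position = int(mutation[1:-1])
    return RBD_START - FLANK <= position <= RBD_END + FLANK


def filter_mutations(reference: str, mink: dict, human: dict) -> dict:
    """Return {(species, name): retained mutations} for samples that pass all filters."""
    calls = {}
    for species, sequences in (("mink", mink), ("human", human)):
        for name, sequence in sequences.items():
            mutations = call_mutations(reference, sequence)
            if len(mutations) < MAX_DIFFERENCES:  # drop heavily divergent sequences
                calls[(species, name)] = mutations

    # Count each mutation once per mink sample that carries it.
    mink_counts = Counter(
        m for (species, _), muts in calls.items() if species == "mink" for m in muts
    )
    keep = {m for m, n in mink_counts.items() if n >= MIN_MINK_COUNT and near_rbd(m)}

    # Keep only samples that carry at least one retained mutation.
    return {
        key: [m for m in muts if m in keep]
        for key, muts in calls.items()
        if any(m in keep for m in muts)
    }
```

Human samples are retained only when they carry a mutation that passed the mink-count and position filters, which mirrors the intent of comparing variants shared between the two species.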

The statistical package R (Core Team, 2017) was utilized to plot mutation frequency by geography, date, and species to reveal patterns indicative of zoonotic transfer. Sequence identifiers, and the respective authors, utilized for results are shown in supplementary table 1.

### 2.2. SARS alignment comparisons

The SARS-CoV-2 reference sequence WIV04 (MN996528.1) (Zhou et al., 2020) was aligned against a SARS-CoV-1 reference sequence (NC_004718.3) (He et al., 2004) via MAFFT (Katoh and Standley, 2013). Residue positions were visualized in Jalview (Waterhouse et al., 2009).

### 2.3. 3D structure visualization

The PDB file, 7A98 (Benton et al., 2020), was downloaded from RCSB.org. Positions of interest were visualized in MOE (Chemical Computing Group, 2020) for Fig. 1b.

Fig. 1. A: Illustration of positions of mutation variants described on the SARS-CoV-2 spike glycoprotein for the N-terminal domain (NTD), receptor binding domain (RBD), and receptor binding motif (RBM). B: Left: Crystal Structure (PDB ID: 7A98) of SARS-CoV-2 receptor binding domain (green) in complex with ACE2 (teal) with residues highlighted in red. Right: Interaction of highlighted residues (red) with ACE2 (teal). Interaction denoted with magenta clouds.

### 2.4. Molecular effect calculations

Molecular Operating Environment (MOE) (Chemical Computing Group, 2020) software was used with PDB 7A98 (Benton et al., 2020), and prepared with QuickPrep functionality at the default settings, to optimize the H-bond network and perform energy minimization on the system. Affinity calculations were performed using 7A98.A (spike protein monomer) and 7A98.D (ACE2) chains. Residues in spike protein (7A98.A) within 4.5 Å from ACE2 (7A98.D) were selected and the residue scan application was run by defining ACE2 (7A98.D) as the ligand. Residue scans and change in affinity calculations were also performed on L452. Stability calculations were performed by running the residue scan application using residues in RBD (331–524) of spike protein (7A98.A). Residue scans and change in stability calculations were also performed on Q314. The changes in stability and affinity (kcal/mol) between the variants and the reference sequence were calculated as per MOE’s definition (Chemical Computing Group, 2020). The potential effect of variants was predicted using PROVEAN (Choi and Chan, 2015) with R (Core Team, 2017), and SIFT (Ng, 2003). Residues in ACE2 protein (7A98.D) within 4.5 Å from spike protein (7A98.A) were selected and the residue scan application was run by defining the spike protein (7A98.A) as the ligand. The changes in affinity (kcal/mol) between mink, mouse, and hamster ACE2 sequences compared to human ACE2 were calculated as per MOE’s definition (Chemical Computing Group, 2020).

### 2.5. Phylogenetic tree

IQ-TREE 2 (Minh et al., 2020) was used to generate a phylogenetic tree via maximum likelihood calculations. The RBD region, plus 25 amino acids on each end, of the filtered, processed human and mink samples was used as input for analysis. The “FLU+I” model was chosen as the best-fitting model, with subsequent ultrafast bootstrapping until convergence and branch testing. FigTree (http://tree.bio.ed.ac.uk/software/figtree/) was used to visualize and color the phylogenetic tree.
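Preparing the tree input amounts to cutting the same window out of every aligned spike sequence. A minimal sketch is given below, assuming sequences that are already aligned to the reference numbering; the function names and the dictionary input are hypothetical.

```python
RBD_START, RBD_END = 333, 527  # RBD residues, as given in the text
FLANK = 25                     # residues added on each end


def rbd_window(spike: str, flank: int = FLANK) -> str:
    """Return the RBD plus `flank` residues on each end of an aligned spike sequence."""
    first = max(RBD_START - flank, 1)        # 1-based, inclusive
    last = min(RBD_END + flank, len(spike))  # 1-based, inclusive
    return spike[first - 1:last]


def to_fasta(records: dict) -> str:
    """Format {name: aligned spike sequence} as FASTA text containing only the window."""
    return "".join(f">{name}\n{rbd_window(seq)}\n" for name, seq in records.items())
```

With the default flank, the window covers residues 308 to 552, that is, 245 alignment columns per sequence.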

### 2.6. ACE2 sequence alignment

ACE2 protein sequences were obtained from UniProt (The UniProt Consortium, 2019) for human (identifier: Q9BYF1-1), mouse (identifier: Q8R0I0-1), and hamster (identifier: A0A1U7QTA1-1), and from the NCBI protein database for mink (identifier: QPL12211.1). The alignment was visualized in Jalview (Waterhouse et al., 2009) and colored according to sequence identity. The similarity scores for the entire protein sequence and the spike receptor binding domain motif were calculated in MOE (Chemical Computing Group, 2020). Residues in ACE2 within 4.5 Å of spike receptor binding domain were defined as the receptor binding motif.
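As a rough stand-in for the similarity scores computed in MOE, percent identity can be calculated over either the whole alignment or a chosen subset of columns, such as the interface residues. The sketch below is an assumption-laden illustration: it scores identity rather than MOE's similarity measure, and the function name and the treatment of gaps as mismatches are choices made here, not taken from the study.

```python
def percent_identity(seq_a: str, seq_b: str, positions=None) -> float:
    """Percent of compared alignment columns where both sequences share a residue.

    `positions` holds 1-based alignment columns; None compares every column.
    Columns where the sequences differ, or where both carry a gap ('-'),
    are counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    columns = list(range(1, len(seq_a) + 1)) if positions is None else list(positions)
    if not columns:
        raise ValueError("no positions to compare")
    matches = sum(
        1
        for col in columns
        if seq_a[col - 1] == seq_b[col - 1] and seq_a[col - 1] != "-"
    )
    return 100.0 * matches / len(columns)
```

Passing the alignment columns of the ACE2 residues within 4.5 Å of the spike RBD as `positions` would give the interface-only comparison, while omitting the argument gives the whole-protein value.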

## 3. Results

The following six mutations met all the criteria described in the methods section and demonstrate the potential for zoonotic transfer between species. Mutations outside of the spike RBD and other proteins were examined but did not result in significant findings.

### 3.1. Netherlands

#### 3.1.1. F486L

The F486L mutation has a substitution of leucine for phenylalanine occurring within the RBM on the SARS-CoV-2 spike glycoprotein (Fig. 1 A, B). These amino acids are similar in physicochemical properties, with an aromatic ring being replaced by an aliphatic chain. This new variant in F486L, conserved across SARS-CoV-1 and SARS-CoV-2, was first seen via the strain RaTG13 in a Rhinolophus affinis bat sample collected in Yunnan, China during 2013. In 2017, this variant was also found in Manis javanica, a species of pangolin. F486L was not present in human sequences from the dataset at the start of the pandemic, but started to appear in minks at the end of April 2020. We found 125 sequences from mink samples with this mutation collected in the Netherlands since that time.

A sample submission date places the first known potential transfer back from minks to humans in August 2020, also in the Netherlands. Although sample collection dates are unavailable for these human samples, submission dates show a larger number of human samples with this variant were reported in the Netherlands in October and November 2020. One human sample in Scotland, collected in October 2020 shows that F486L may be viable alone and can occur de novo, without evidence of zoonotic transfer. Based on mutations seen with F486L, L452M and Q314K, and considering the potential for sequencing error, this case is likely not linked to the Netherlands sequences (Table 1).

Table 1. Mutations counted by species, location, and linkage with other mutations. Highlighted mutations illustrate inheritance patterns.

The L452M and Q314K variants were almost always observed to appear concurrently with F486L, apart from the first few known mink sequences (Fig. 2A). The only other sequences in the database with these variants outside the Netherlands were from two human samples collected in Michigan, coinciding with an October 2020 mink farm outbreak there (Michigan.gov, 2020). These latter human samples with F486L contain a second variant, N501T, which was also observed in four mink samples obtained in the Netherlands. The co-occurrence of some mutations with others (Fig. 3) suggests the potential requirement of a specific second mutation, L452M or Q314K, to be present in order for the virus to jump back from minks to humans.
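Co-occurrence of this kind can be tallied directly from per-sample mutation lists. The following is a minimal sketch with hypothetical function names and input format (a mapping from sample name to the mutations called in that sample); it is not the analysis code used for Table 1 or Fig. 3.

```python
from collections import Counter
from itertools import combinations


def pair_counts(samples: dict) -> Counter:
    """Count how many samples carry each unordered pair of mutations."""
    counts = Counter()
    for mutations in samples.values():
        for pair in combinations(sorted(set(mutations)), 2):
            counts[pair] += 1
    return counts


def partners_of(mutation: str, samples: dict) -> Counter:
    """Count mutations seen together with `mutation`, plus samples where it occurs alone."""
    partners = Counter()
    for mutations in samples.values():
        present = set(mutations)
        if mutation not in present:
            continue
        others = present - {mutation}
        if others:
            partners.update(others)
        else:
            partners["(alone)"] += 1
    return partners
```

Splitting the samples by species, location, and date before calling `partners_of("F486L", ...)` would separate the early mink sequences that carry F486L alone from the later paired ones.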

Fig. 2. Total mutation counts stratified by mutation, location, and species (human and mink). Dates reported for human samples from the Netherlands are predominantly submission dates, in contrast to the reported collection dates seen in metadata associated with sequences submitted from other countries.

Fig. 3. Cumulative counts for mutation F486L individually and in pairs with L452M or Q314K in the Netherlands.

#### 3.1.2. L452M

The L452M variant in the spike RBM involves substitution of methionine for leucine at position 452 (Fig. 1 A, B). The side chains for these amino acids are quite similar, with the addition of a sulfur. This variant appeared first in mink samples from the dataset dated July 2020 and was found to co-occur with F486L. We also observed that L452M and F486L occurred together in all human samples to date from the Netherlands (Table 1), submitted between August and November 2020. The five human samples containing L452M that were collected outside the Netherlands were not paired with F486L in the sequence database (Fig. 2B), suggesting independent emergence.

#### 3.1.3. Q314K

Q314K is located close to the RBD, and has lysine substituted for glutamine at position 314 (Fig. 1A) resulting in the addition of a positive charge to the side chain. This new variant in Q314K is found in both SARS-CoV-1 and SARS-CoV-2. The earliest sequences in the database with this mutation were found in five human samples taken in Northern California and one human sample taken in Mexico in May 2020. This variant first appeared in human samples submitted from the Netherlands in October and November 2020 (Fig. 2C). The dataset showed that Q314K was first seen in minks starting in August 2020 and was always observed to occur with F486L, but only in the absence of L452M. The ten human samples from outside the Netherlands all had Q314K without the other variants described here (Table 1), suggesting it can occur de novo.

### 3.2. Denmark

#### 3.2.1. Y453F

Y453F is an RBM mutation substituting phenylalanine for tyrosine at position 453 that we observed in a substantial number of human sequences in the database from samples submitted from around the world since the start of the pandemic (Fig. 1A, B). The substituted amino acid is similar to the reference, with the change being less polar. This new variant in Y453F is found in both SARS-CoV-1 and SARS-CoV-2. The sequences analyzed for this study show this mutation appearing in minks starting in the Netherlands in April, and Denmark in June 2020. We saw rapid expansion of this mutation, present in 629 human sample sequences from Denmark, beginning in June 2020 (Fig. 2D, Table 1). Of the five samples collected outside the Netherlands or Denmark, one came from Utah, one of the top producers of mink fur in the United States, which was experiencing a SARS-CoV-2 outbreak on mink farms at the same time (Aleccia, 2020).

### 3.3. United States

#### 3.3.1. N501T

N501T is a mutation in the RBM and involves a change from asparagine to threonine at position 501 (Fig. 1A, B). This new N501T variant is found in both SARS-CoV-1 and SARS-CoV-2 and was first described in 2017 in a species of pangolin, where it was seen to co-occur with F486L. N501T was found only sixteen times in humans or minks in the samples we analyzed. Four mink samples with N501T were submitted from the Netherlands between April 2020 and June 2020. The twelve human sequences with the N501T variant were seen in samples collected from March to October 2020 but were not seen in samples collected in the Netherlands or Denmark (Table 1). As mentioned previously, two of the three sequences among the twelve collected outside the Netherlands with F486L contain this variant and were from Michigan, a state with documented mink farm outbreaks. Two of the twelve human sequences with N501T alone were found in Wisconsin on October 3rd. This variant was present in a human sample collected in Taylor County, where a mink farm was reported on October 8th to have confirmed cases, now with a second farm outbreak and over 5400 mink deaths from the virus (Kirwan, 2020; Schulte, 2020).

#### 3.3.2. V367F

V367F at the RBD, position 367 (Fig. 1 A, B), results in the addition of an aromatic group (substituting phenylalanine for valine). This new variant in V367F is found in both SARS-CoV-1 and SARS-CoV-2. Our analysis identified this variant in 97 human samples collected around the world, not localized to any one region (Table 1). V367F was present in five minks and ten humans in samples collected in the Netherlands. We observed that this variant can occur de novo without zoonotic transfer. Some cases in humans were seen in samples collected between August and October 2020 in Oregon, another state that was experiencing a mink farm outbreak at the same time (Bellware, 2020). We found nine human samples with this variant in Washington State between February and August 2020; although there were no reports of SARS-CoV-2 outbreaks in minks there, the state has a small number of farms (USDA, 2020).

### 3.4. Mutation effects

The six mutations we studied result from a Single Nucleotide Polymorphism (SNP), and are among those with the least potential consequence on the stability (kcal/mol, supplementary table 2) of the spike glycoprotein. Analysis for human ACE2 binding affinity (kcal/mol, supplementary table 3) determined that F486L, Y453F, N501T, and L452M likely cause minimal effects on the affinity of the spike protein to bind ACE2. The calculated difference in affinity score for the Y453F mutation indicates a slightly increased binding potential of this spike variant for hACE2 as compared to the reference amino acid sequence. Predicted effects of all six mutations illustrate neutral and tolerated changes (supplementary table 4) suggesting that the function of the protein is not drastically changed, and likely remains fit for replication and infection.
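Whether an amino acid substitution is reachable by a single nucleotide change can be checked against the standard genetic code. The sketch below is illustrative only: the function names are hypothetical, codons are written in DNA letters (T standing in for U), and the check asks whether any codon pair permits the change rather than inspecting the actual reference codon in the viral genome.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_paths(ref_aa: str, alt_aa: str) -> list:
    """List (ref_codon, alt_codon) pairs that differ at exactly one nucleotide."""
    ref_codons = [c for c, aa in CODON_TABLE.items() if aa == ref_aa]
    alt_codons = [c for c, aa in CODON_TABLE.items() if aa == alt_aa]
    return [
        (r, a)
        for r in ref_codons
        for a in alt_codons
        if sum(x != y for x, y in zip(r, a)) == 1
    ]


def is_single_nucleotide_substitution(mutation: str) -> bool:
    """True when a substitution such as 'F486L' is reachable by one nucleotide change."""
    return bool(single_nucleotide_paths(mutation[0], mutation[-1]))
```

Each of the six substitutions described here (F486L, L452M, Q314K, Y453F, N501T, and V367F) passes this check, consistent with the statement that they arise from single nucleotide changes.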

### 3.5. Phylogenetic tree

To examine the statistical relationships between variant sequences, a phylogeny via maximum likelihood estimation was created. Sequences with one of the described mutations were utilized by taking the protein sequences of the RBD, with an additional 25 amino acids on either side. Visualization reveals distinct grouping by mutation category (Fig. 4). These groupings further show that these mutations can occur individually or together. Human and mink sequences also group together, illustrating the similarities for SARS-CoV-2 between these two species. Once these mutations are present, additional mutations do not appear to be required to facilitate virus movement between humans and minks. A plot without these distinct groupings by the identified mutation categories in both humans and minks could indicate additional variant positions present in the samples.

Fig. 4. The phylogeny uses maximum likelihood with ultrafast bootstrapping until convergence and likelihood testing to generate the tree; support values are shown only where they meet the high-certainty threshold. Distinct, significant groupings were detected between all combinations of mutations, as opposed to a mixture of mutation types. Branch lengths, which help determine differentiation or relationships, are shown to scale at 0.003.

Bootstrapping for the plot completed early, at 709 iterations, due to a correlation coefficient of 0.996 as well as the absence of a new bootstrapped tree with better log-likelihood cutoffs. An SH-like approximate likelihood ratio test (Guindon et al., 2010) was performed on each bootstrap. The bootstrap and Shimodaira–Hasegawa-like test values meeting IQ-TREE's recommended threshold are shown in Fig. 4. Four of the 10 categories had branches meeting the strict threshold of 80% SH-like testing and 95% bootstrapping. The ultrafast bootstrap resampling and an additional SH-like branch test utilize imputed data to detect a significant difference from the original computed tree. Multiple full generations of the tree resulted in similar stratification of mutation categories. Bootstrapping takes about a 66% subset of a sequence’s positions to decrease time complexity, so the model will not contain certain mutations in some iterations. Since groupings generally differ by at most one or two mutations, there are slight differences from run to run in placement, but the observation of groupings remains true.
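Applying the joint threshold to branch labels can be sketched as below. The sketch assumes each internal node label has the form "SH-aLRT/UFBoot" in percent (for example "85.3/97"), which is how IQ-TREE annotates a tree when both tests are run; the function names are hypothetical.

```python
from typing import Tuple

SH_ALRT_MIN = 80.0  # SH-like test threshold quoted in the text (%)
UFBOOT_MIN = 95.0   # ultrafast bootstrap threshold quoted in the text (%)


def parse_support(label: str) -> Tuple[float, float]:
    """Split a node label of the assumed form 'SH-aLRT/UFBoot' into two floats."""
    sh_alrt, ufboot = label.split("/")
    return float(sh_alrt), float(ufboot)


def is_well_supported(label: str) -> bool:
    """True when a branch meets both the SH-like and the bootstrap threshold."""
    sh_alrt, ufboot = parse_support(label)
    return sh_alrt >= SH_ALRT_MIN and ufboot >= UFBOOT_MIN
```

A branch must clear both cutoffs to count; a high bootstrap value with an SH-like value below 80% is treated as unsupported.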

## 4. Discussion

The mutations outlined for SARS-CoV-2 show a pattern indicative of zoonotic transfer from humans to minks and back to humans.

The presence of F486L paired with either L452M or Q314K in humans and in minks indicates two separate transfer events between the species. These mutation pairs were observed in the Netherlands, but not elsewhere (Table 1). The mutation F486L did not make a correlated jump from minks to humans until L452M, Q314K, or N501T was simultaneously present as a second mutation (Fig. 3). This illustrates the possibility that multiple mutations are required to preserve fitness and facilitate inter-species transfer, particularly in relation to the host’s ACE2 protein. Mutations in viruses have been described previously to occur in pairings for functional purposes (Altschuh et al., 1987) and furthermore have been shown to have evolutionary relationships involving pairs of variants (Marks et al., 2011).

In Denmark, Y453F showed a pattern potentially arising from transmission to humans, then to minks, and back to humans again. Thousands of miles away in the US, isolated incidences of F486L, N501T, and V367F present evidence for the same type of transfer event, in the same species of mink. These data support the interpretation that paired mutations facilitate, or are required for, zoonotic transfer.

Although some submitted human sequences from the Netherlands do not have recorded collection dates, the submission timeline supports that the spread and transfer from minks to humans is occurring, as suggested by the submitters of the sequences (Oude Munnink et al., 2020), within the GISAID database (Elbe and Buckland-Merrett, 2017; Shu and McCauley, 2017).

Current antibody-based therapeutics are focused on antibody binding to the SARS-CoV-2 spike protein RBD (Wu et al., 2020). Similarly, spike protein antibody response-based vaccines may be dependent on the stability of the primary amino acid sequence in the RBD to maintain their ability to generate neutralizing antibody responses (Poland et al., 2020). It is therefore critical to understand the extent to which SARS-CoV-2 mutations are occurring in regions targeted by antibodies.

The extent to which mutations in the RBD could have a beneficial or deleterious effect on viral fitness, on RBD binding affinity to ACE2, and/or on infectivity is also not known. The mutation N501T does not appear to be spreading rapidly and may be showing decreased fitness in humans. Y453F however, now present in 629 human samples from Denmark beginning in June 2020, may be conferring increased viral fitness, potentially facilitating its spread into human populations. Our calculations also suggest that the N501T variant decreases protein stability and affinity for hACE2, but that the change is minimal. The pairing of certain mutations should be tested in vitro in the future. A single mutation could have a combinatory effect when paired with another, producing a completely different effect. The data suggest that two mutations together may be required for bi-directional zoonotic transfer to occur. While the number of cases with these pairings is not growing exponentially around the world, a third mutation could occur, further changing binding affinity and stability characteristics.

In Denmark, 629 human samples and 85 mink samples have shown the presence of Y453F. As this mutation may escape antibody neutralization (Mallapaty, 2020), the emergence of this variant may increase resistance to monoclonal antibody therapy or convalescent sera therapy.

The evidence for SARS-CoV-2 zoonotic transfer using mutation analysis of human and other associated species that can carry the virus has helped identify how variants can arise. As there are multiple species that can be infected by SARS-CoV-2 to assist in this mutation potential (Dhama et al., 2020), vigilant continual sequencing of the virus in the coming months and years is of high importance. Several species known to be infected by this virus have not been sequenced to the same extent, and although a large number of viruses isolated from minks were sequenced in the Netherlands and Denmark, as of this writing mink sequences from the United States are not publicly available. Tracking viral mutations isolated from minks and other farm animals that might later appear in humans is important. In Oregon, for example, the location of the mink farm with a SARS-CoV-2 outbreak has not been disclosed (Loew, 2020). Similarly, an outbreak in mink farms in Canada (Ugen-Csenge, 2020) has provided only 4 samples for analysis, all collected on December 4th, 2020. The mutation F486L has been found in only one of the four mink sequences available, suggesting that these and other farms need to be carefully watched for the appearance of the second mutation. This would likely be either L452M or Q314K, now thought to enable the jump of SARS-CoV-2 back to humans, as described in Fig. 3. Furthermore, although L452R has become increasingly common in humans with lineages B.1.427/429, this mutation differs from the position 452 mutation described in minks, L452M, indicating that it may have a role in viral transmission or infectivity.

Humans and minks have as much as 92% amino acid sequence similarity between their respective ACE2 receptor proteins (Supplementary Fig. S1 & table 5). Of the residues of ACE2 that interact with the receptor binding domain of spike, mink and human are 83% similar while human and mouse ACE2 are only 70% similar (supplementary table 5). Many different species have homologous ACE2, with a highly conserved binding site for the spike protein, demonstrating a wide range of potential hosts for a viral reservoir where new mutations may acquire new infectivity potential for humans (Damas et al., 2020; Melin et al., 2020). Based upon changes in affinity calculations when the human residues are substituted for mouse ACE2 residues, the affinity of ACE2 for spike decreases in most cases (supplementary table 5). However, when mink residues are substituted for human ACE2 residues, the affinity is minimally affected (except for G354H; supplementary table 5). This may explain why mice do not get infected by SARS-CoV-2 whereas the virus thrives in human and mink hosts.

Indeed, a novel artificial intelligence algorithm has shown that minks, along with bats, could be a reservoir of SARS-CoV-2 (Guo et al., 2020), and samples from cats, dogs, ferrets, hamsters, primates, and tree shrews demonstrate that all of these species have been infected with SARS-CoV-2 (CDC, 2020). Multiple factors contribute to zoonotic transfer, including ACE2 expression level and close contact with the same or other species; sequencing viruses from animals that may be in contact with humans is therefore of great importance.

The consequence of observed mutations can be analyzed in silico, at least initially. Stability (expressed in terms of energy as kcal/mol) impact on RBD ACE2 or RBD antibody binding can be predicted. Additionally, prediction algorithms can determine the extent to which a single mutation or a combination of mutations could positively or negatively alter the function of a protein. These and other computational methods have already been used to assist in rapid vaccine design (Cunha-Neto et al., 2017). Together, these methods can be utilized to identify optimal targets for vaccines and antibody therapeutics (Smatti et al., 2018; Yong et al., 2019), especially as selection pressure from the widespread use of these interventions increases the potential for virus escape (Greaney et al., 2020; Haynes et al., 2020).

## 5. Conclusion

Mutations in the RBD of SARS-CoV-2 have likely enabled zoonotic transfer beginning with the pandemic in China. We are now seeing zoonotic transfer most recently for minks in the Netherlands, Denmark, United States, and Canada. Whether these mutations become widespread and affect the efficacy of vaccines and monoclonal antibody cocktail therapeutics remains to be seen, but the potential exists for future mutations arising due to close contact between humans and a wide range of species. There is significant cause for concern that SARS-CoV-2 may evolve further with zoonotic transfer, which can involve various species, as it did at the origin of the outbreak. The emerging new variants acquired through such zoonotic transfer may facilitate human viral escape and the reduction in the efficacy of antibody-based vaccines and therapies. To help stop the pandemic, worldwide sequencing surveillance of samples from many species and correlating these with samples from humans will need to be performed as an early warning system for potential emergence of more virulent variants.

## Abstract

In December 2019, a novel coronavirus, termed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was identified as the cause of pneumonia with severe respiratory distress and outbreaks in Wuhan, China. The rapid and global spread of SARS-CoV-2 resulted in the coronavirus disease 2019 (COVID-19) pandemic. Early in the pandemic, there were limited genetic viral variations. As millions of people became infected, multiple single amino acid substitutions emerged. Many of these substitutions have no consequences. However, some of the new variants show a greater infection rate, more severe disease, and reduced sensitivity to current prophylaxes and treatments. Of particular importance in SARS-CoV-2 transmission are mutations that occur in the Spike (S) protein, the protein on the viral outer envelope that binds to the human angiotensin-converting enzyme 2 receptor (hACE2). Here, we conducted a comprehensive analysis of 441,168 individual virus sequences isolated from humans throughout the world. From the individual sequences, we identified 3540 unique amino acid substitutions in the S protein. Analysis of these different variants in the S protein pinpointed important functional and structural sites in the protein. This information may guide the development of effective vaccines and therapeutics to help arrest the spread of the COVID-19 pandemic.

## Introduction

To curb the COVID-19 pandemic, many efforts have focused on preventing entry of the virus by inhibiting the interaction of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) with its human receptor, angiotensin-converting enzyme 2 (hACE2) [1]. Interaction of SARS-CoV-2 with hACE2 occurs via the Spike (S) protein on the viral envelope. Proteases cleave the S protein into S1 and S2 subunits [2,3,4] to enable viral binding to hACE2 [5] and viral entry by membrane fusion [6]. The S protein is a homotrimer and the S1 subunit of each of the monomers of the S protein contains the receptor-binding domain (RBD; Fig. 1a,b) in either the ‘open’ (active) or ‘closed’ (inactive) conformations [7,8,9] (Supplementary Fig. S1a).

Fig. 1. Functional regions in S protein and the RBD-hACE2 interaction site. (a) S protein homotrimer with ribbons colored according to legend, bound to hACE2 (red). Black dotted outline shown in (b). (b) RBD-hACE2 interface. (c) RBD-hACE2 interface highlighting residues in RBD within 4.5 Å from hACE2. (d) The number of variants per position across the entire sequence of S protein, highlighting specific functional regions. (e) The number of variants per position across RBD. Black dots indicate invariable positions.

Four main types of prophylactic or therapeutic strategies focusing on the S protein have been employed: (1) preventing proteolysis of the S protein10; (2) competing with S1 binding to hACE2, using S1 or hACE2 protein fragments or peptides1,11,12; (3) generating monoclonal or polyclonal antibodies against SARS-CoV-2 S protein or RBD, to be used as passive vaccines13; and (4) administering active vaccines that generate an immune response, usually to the S1 subunit14,15,16.

Besides the RBD, the S protein of the coronaviruses, including SARS-CoV-2, has several other regions that are predicted to be relatively conserved because of their critical role in S protein function. These regions include the trimer interface of S protein7,9, furin proteolysis cleavage sites5,6, glycosylation sites17,18, neuropilin-binding sites19,20,21 and the linoleic acid (LA)-binding site9,22. These regions may be important for maintaining structural integrity, entry, and transmission of the virus and are therefore potential targets for the development of prophylaxes and therapeutics.

Although SARS-CoV-2 undergoes mutations at a lower frequency than other viruses like influenza and HIV23, the emergence of several common variants of SARS-CoV-2 in human populations may generate resistance to current prophylaxis and therapeutics. Some of these mutations result in gain of fitness for the virus due to mutations in the S protein24,25,26,27. Early in the pandemic, in February 2020, a single missense mutation resulting in a change from aspartate to glycine in position 614 (D614G) emerged in Europe and became the dominant variant of the virus. The D614G variant has spread throughout the world and increased the transmissibility of SARS-CoV-2 by conferring higher viral loads in young hosts without an apparent increase in the severity of the disease28. With the emergence of new variants, such as B.1.1.7 (also known as the UK variant) and B.1.351 (also known as the South African variant) that have greater transmissibility and may escape antibody detection24,25,26,27,29 (Table 1), it is imperative to map other substitutions in the S protein sequence. Such substitutions may contribute to future variants that lead to increased transmissibility or to variants that evade prophylaxis or therapeutics. Particularly, amino acid substitutions in the RBD, including those that interact directly with hACE224,25,26,27,29 (Fig. 1c) may have an impact. Here, we aimed to identify regions on the S protein that are relatively invariant to guide prophylaxis and therapeutic development more efficiently.

## Results

### SARS-CoV-2 Spike protein

The SARS-CoV-2 S protein is 1273 amino acids long; it contains a signal peptide (amino acids 1–13), the S1 subunit (14–685 residues) that mediates receptor binding, and the S2 subunit (686–1273 residues) that mediates membrane fusion30. To identify areas in the S protein that are the least divergent as the virus evolves in humans, we obtained viral sequences from GISAID (Supplementary Table S1) that as of March 1, 2021, included 633,137 individual virus sequences isolated from humans throughout the world. As compared with the index WIV04 (MN996528.1, also known as the Wuhan variant or index virus) sequence of February 202031, the 1273 amino acid S protein8 had 3540 variants. This number of variants only includes filtered sequences (441,168) that are complete and do not contain an abnormal number of mutations (see “Methods”). As there are 3540 variants, on average, each position in the 1273 amino acid protein sequence has approximately three variants (Fig. 1d). However, some regions harbor 9 variants in a single amino acid position whereas others have no variants (Fig. 1d; Supplementary Table S3). Regions in S protein with 2 or fewer variants/position (marked in light blue, Fig. 1d,e) are more prevalent in the structurally critical trimer interface (46% of the amino acids; Fig. 1d, Supplementary Fig. S1b,c, see Supplementary Table S4), and in the RBD (56%, Fig. 1e, Supplementary Fig. S1b,c). There are a total of 123 positions that are entirely invariable (Supplementary Table S3).
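The headline ratios in this paragraph follow directly from the counts it quotes. As a quick arithmetic check (a minimal sketch; the variable names are illustrative and not taken from the authors' code):

```python
# Counts quoted in the text above.
TOTAL_SEQUENCES = 633_137     # S protein sequences retrieved from GISAID
FILTERED_SEQUENCES = 441_168  # sequences that passed the quality filter
PROTEIN_LENGTH = 1273         # amino acids in the S protein
UNIQUE_VARIANTS = 3540        # unique amino acid substitutions observed
INVARIABLE_POSITIONS = 123    # positions with no observed variant

retained_fraction = FILTERED_SEQUENCES / TOTAL_SEQUENCES
mean_variants_per_position = UNIQUE_VARIANTS / PROTEIN_LENGTH
invariable_fraction = INVARIABLE_POSITIONS / PROTEIN_LENGTH

print(f"sequences retained after filtering: {retained_fraction:.1%}")
print(f"mean variants per position: {mean_variants_per_position:.2f}")
print(f"entirely invariable positions: {invariable_fraction:.1%}")
```

The mean of about 2.8 is what the text rounds to "approximately three variants" per position, and roughly one position in ten is entirely invariable.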

### Receptor binding domain

Much of the prophylaxis and therapeutic effort is focused on the RBD (amino acids 331–524). Among the 3540 variant sequences, we found only 22 invariant amino acids in the RBD (Fig. 1e, marked by dots under the position; Supplementary Table S3). Of the amino acid substitutions in the RBD, only 3% are predicted by PROVEAN software32 to be structurally or functionally damaging (Supplementary Table S2). Using PROVEAN, we also examined the predicted impact of the amino acid substitutions in the common, more infective variants (B.1, B.1.1.7, B.1.351, B.1.427/429, B.1.256 and P.1; Table 1) on the RBD structure and function and found that these variants are predicted to have a neutral effect, suggesting that they do not decrease the fitness of the virus.

### Furin proteolysis sites

We next examined other regions in the S protein for which functions have been assigned. Furin proteolysis at the S1-S2 boundary (681–685) and in S2 (811–815) exposes the RBD to enable hACE2 binding, and the S2 domain to initiate membrane fusion5. Recent studies show that these cleavage sites are not necessarily specific for furin-mediated proteolysis and that S protein may be processed by multiple proteases to open the RBD into the active conformation2,3,4,33. Consistent with these observations, both the furin proteolysis consensus sites and the arginines that are critical for proteolysis are not conserved in the S protein (Fig. 2a), in agreement with a prior analysis of furin cleavage site 134.

Fig. 2. Furin cleavage sites, glycosylation sites, and NRP1 interaction site. (a) Furin cleavage sites in the S protein. Amino acids indicated below each bar show the sequence in the WIV04 index isolate. Variants in other SARS-CoV-2 isolates are indicated within the bars of the graphs using one-letter abbreviations for the amino acids. (b) The 22 glycosylation sites in the S protein; the number of variants per position is indicated. (c) Glycosylation asparagine sites (numbered) are highlighted in pink in the S protein. (d) The number of variants in the proposed NRP1-binding site.

### Glycosylation sites

The S protein also has 66 glycosylation sites in each trimer, which facilitate protein folding and may lead to host immune system evasion18, as 40% of the S protein’s surface is shielded by glycans17. Surprisingly, with one exception, none of these glycosylation sites were invariable, suggesting that not all the glycosylation sites are essential for the S protein’s functions (Fig. 2b,c). The only asparagine serving as an invariable glycosylation site is N343 in the RBD, which is located more than 25 Å away from the hACE2-binding site and is therefore unlikely to mediate receptor binding.

### Neuropilin-1 interaction site

Neuropilin-1 (NRP-1) is a transmembrane receptor that regulates angiogenesis35 and immune response36 and is expressed in many cell types36 such as the endothelium37, immune cells38, and neurons39. Interaction between NRP-1 and S protein was proposed to regulate SARS-CoV-2 transmission19,20,21. Proteolysis of furin cleavage site 1 in the S protein of the index variant by furin was found to expose a C terminal motif, RXXR (where R is arginine and X is any amino acid), known to be the binding motif in NRP-119,21. For example, a monoclonal antibody against the RXXR-binding site on NRP-1 reduced SARS-CoV-2 infectivity in culture21. Nevertheless, we found that the NRP-1 interaction-site in S protein is not conserved (Fig. 2d). Although the variants are predicted to have a neutral effect on the S protein structure (using PROVEAN analysis, Supplementary Table S2), 90% of the positions in the NRP1-interaction site have more than 2 variants (or an average of 4.3 variants/position; Fig. 2d).

### Linoleic acid-binding site

A fatty acid-binding pocket has been identified in the inactive conformation of S protein9 (Fig. 3a,b). The amino acids that make this pocket are conserved in other coronaviruses9 and are unchanged (fewer than 2 variants) in 75% of the positions (Fig. 3a,b). Furthermore, among the 20 amino acids that line this pocket, 71% of the identified variants are predicted to have a neutral effect using PROVEAN (Supplementary Table S2). Analysis of the LA-binding site identifies a potential pharmacophore that may fit small molecules (Fig. 3c), perhaps by mimicking ω-3 fatty acids22.

Fig. 3. The LA-binding site in the S protein. (a) Hydrophobic pocket forms the LA-binding site. Residues are colored by the number of observed variants per position using the same color scheme as previous figures. (b) The number of variants per position across the LA-binding site; black outlines indicate the positions that form the LA-pocket. (c) Pharmacophore of the LA-binding pocket. Orange spheres indicate aromatic or pi-rings. The magenta sphere indicates hydrogen bond donors. The cyan sphere indicates hydrogen bond acceptors. White dots represent dummy atoms in the pocket.

### Relatively invariable regions with unidentified function

We also identified another less variable region between residues 541–612 (Fig. 4a–d); 62% of the amino acid positions in this region have 2 or fewer variants and 12 positions are entirely invariable (‘Hot Region’; Figs. 1d and 4a,b). This less variable region is relatively hydrophobic, yet a substantial number of residues remain exposed in the open and closed conformations (Fig. 4c). Six residues, V551, T553, C590, V595, V608, Y612, in this relatively invariable region form a part of the largest hydrophobic patch in the protein measuring 370 Å² (Fig. 4d,e). Five of these residues (excluding T553) along with other residues that make this hydrophobic patch tolerate very few mutations and almost all the mutations that are tolerated change to other hydrophobic amino acids (Fig. 4d). We examined this region using Site Finder in Molecular Operating Environment (MOE)40 and found that there is a binding site with a positive score for the propensity of ligand binding41, which encompasses several residues from this region (i.e. Cys590, Ser591, Phe592, Gly593) (Supplementary Fig. S1e). This hydrophobic region is also 81% identical between SARS-CoV and SARS-CoV-2, but less than 15% identical when comparing the SARS-CoV-2 sequence with that of MERS-CoV (Fig. 4f).
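The identity percentages quoted above come from comparing aligned sequences column by column. The text does not say how gaps were treated or which denominator was used, so the following is only a sketch under stated assumptions: columns where both sequences have a gap are skipped, a gap opposite a residue counts as a mismatch, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns holding the same residue.

    Assumptions (not stated in the source): columns that are gaps in both
    sequences are ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" and res_b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Toy aligned fragments, not real coronavirus sequences.
print(round(percent_identity("ACDE-G", "ACDQFG"), 1))
```

Other conventions, such as dividing by the length of the shorter sequence, give different percentages for the same alignment, so the choice should be reported alongside the number.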

Fig. 4. A relatively invariant (‘hot’) region in the S protein with no known function, identified by analyzing 441,168 individual virus sequences. (a) The number of variants per position across the less-variable, ‘hot’ region with un-assigned function. The red star identifies the proposed ‘latch’, Q564 residue. (b) The hot region identified in the 3-D structure of S protein (open conformation). (c) Invariant ‘hot’ region in S protein with un-assigned function depicted in both the open (left) and closed (right) conformations. Dark blue denotes invariant amino acids and light blue denotes positions with 1–2 observed variants. This region becomes exposed after S protein gets activated by proteases. (d) Number of variants in hydrophobic patch with unassigned function. Positions outlined in black are part of the ‘hotspot’. (e) Some residues in the hotspot (shown in d) are part of the largest hydrophobic patch (green, red ellipsoid) of S protein. Positive patches are highlighted in blue. Negative patches are highlighted in red. (f) Sequence identity between SARS-CoV-2 & SARS-CoV (81% identical), and SARS-CoV-2 & MERS-CoV (15%) in the ‘hotspot’. Dark blue denotes identical amino acid residues. Numbering corresponds to SARS-CoV-2.

## Discussion

While SARS-CoV-2 has a lower mutation rate than other viruses due to proof-reading mechanisms23, factors such as a relatively high R0 of 1.9 to 2.642, comparatively long asymptomatic incubation and infection periods, and zoonotic origins lead to high variability in mutations in specific regions compared to the original reference sequence. This has been illustrated by the divergence of 6 major lineages in the past few months (Table 1). Our analysis of the frequency of variants throughout the S protein of SARS-CoV-2 identified regions of high and low divergence, which may aid in developing effective prophylactic and therapeutic treatments. In this analysis of mutations in the S protein, we did not consider the frequency of a particular mutation or the number of countries in which the mutation was found. Such analysis, as was done for D614G43, may further aid in determining the potential improved viral fitness acquired by a particular mutation.

Protein glycosylation is essential for viral infection44. In SARS-CoV-2 S protein, there are 22 known N-glycosylation sites per monomer (Fig. 2b,c), but only one, asparagine 343, appears to be conserved. Furthermore, we found 156 positions in S protein that mutate to an asparagine residue in the existing 3540 variants that we analyzed, and many of them are exposed on the S protein (Supplementary Fig. S1d). We propose that some of these new asparagine residues may create new glycosylation sites on the S protein that can contribute to immune evasion. Such an impact on immune evasion by changes in the positions of glycosylation sites of viral envelope proteins has been described for influenza viruses; e.g., H3N2 has numerous new N-linked glycans on the viral hemagglutinin that enabled the virus to escape antibody neutralization and evade the host’s immune system45. The formation of new glycosylation positions may also affect viral susceptibility to existing antibodies and to the immune response of infected individuals. A cryo-electron microscopy study has already suggested that coronaviruses mask important immunogenic sites on their surface by glycosylation46. Furthermore, recent work suggests that changes in glycosylation sites on the S protein of the virus may affect recognition of the S protein by other potential human proteins and receptors, including the toll-like receptors, calcitonin-like receptors, and heat shock protein GRP78, thus leading to the more severe inflammation that characterizes a more severe form of COVID-1947.

Additional sites on the S protein have been suggested to be critical for viral infectivity, including the trimer interface, the furin proteolysis sites and the NRP-1 binding site. However, our analysis suggests that not all these sites will be effective targets for prophylaxis and therapeutics. Specifically, the trimer interface is less accessible and therefore unlikely to be druggable. Another issue relates to the furin cleavage sites. As the viral S protein activation appears to require furin proteolysis2,3,4, protease-specific inhibitors are tested as a means to protect from infection48. However, our analysis suggests that this may not be an effective strategy, given the high variability of furin cleavage sites. This suggestion is consistent with previous data showing that other proteinases expressed throughout the body may work synergistically to activate the S protein2,33. Therefore, drugs that focus on inhibiting any single protease may not be effective preventative treatment against all SARS-CoV-2 variants. Similarly, the NRP1-binding site that is generated by proteolysis and the exposure of a C-terminal RXXR motif19,21 may not be a good target for treatment against all SARS-CoV-2 variants, unless such a motif is also created by other proteases.

Are there additional sites on the S protein that can be explored to identify new treatments of COVID-19 or prevention of infections by SARS-CoV-2? There might be a benefit in focusing on the LA-binding site that helps stabilize the S protein in the inactive closed conformer. Small molecules that mimic LA and bind into the LA pocket may stabilize the S protein in the closed/inactive conformation (Fig. 3a–c). Exploring the LA pharmacophore (Fig. 3c) with small molecules that can hold the S protein in the closed conformation, thus preventing the presentation of RBD to hACE2, could therefore be of great interest as a way to reduce viral infectivity. Our data also suggest that it may be beneficial to develop passive and active vaccines that target the RBD, instead of the entire glycosylated S protein; the RBD is less variable relative to the whole S protein (compare Fig. 1e,d). However, as seen in some of the common viral isolates, such as the South African B.1.351, new amino acid substitutions in the RBD may evade such therapeutics; e.g., through loss of immunoreactivity to monoclonal antibodies24.

Finally, our study suggests that drugs and antibodies targeting region 541–612, a relatively conserved and exposed region on the protein’s surface that we identified (Fig. 4a–d), warrant further study. Determining how druggable the pocket encompassing this region is (residues Cys590, Ser591, Phe592, Gly593), given its solvent exposure, and whether modulating S protein by engaging this site will have a biological consequence is a challenge (Supplementary Fig. S1e). Very recently, Q564 within this region (star in Fig. 4a) has been proposed to act as a ‘latch’, stabilizing the closed/inactive conformation of the S protein49. The high degree of conservation of hydrophobicity in this region potentially indicates its role in membrane fusion and/or maintaining structural integrity. The sequence similarity between SARS-CoV-2 and SARS-CoV (Fig. 4f) further supports the importance of this region, especially as both viruses have a similar route of infection. Determining the role of this invariable region warrants further study, as it may be another Achilles heel to target for anti-SARS-CoV-2 treatment.

## Materials and methods

### Database of S protein amino acid variants, the world regions from where the virus was obtained, and whether the sequence is predicted to be deleterious

A FASTA-formatted file containing 633,137 S protein sequences was retrieved on March 1, 2021 from the GISAID database. This file had previously been preprocessed by the database with the individual alignment of genomes to the WIV04 (MN996528.131) reference sequence, using mafft50, via the command `mafft --thread 1 --quiet input.fasta > output.fasta`, with subsequent translation into protein from the S protein-coding region at 21,563 to 25,384.
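The coding-region coordinates can be checked against the 1273-residue protein length quoted in the Results. This is a small sketch that assumes the coordinates are 1-based and inclusive and that the last codon of the region is the stop codon; neither assumption is stated in the text.

```python
CDS_START = 21_563  # first nucleotide of the S protein-coding region
CDS_END = 25_384    # last nucleotide (assumed 1-based, inclusive)

nucleotides = CDS_END - CDS_START + 1
codons, leftover = divmod(nucleotides, 3)
amino_acids = codons - 1  # assumption: the final codon is a stop codon

print(nucleotides, codons, leftover, amino_acids)
```

Under these assumptions the region is a whole number of codons and translates to 1273 amino acids, matching the S protein length used throughout the analysis.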

For the analysis in this paper, only sequences sampled from humans were retrieved with the S protein sequences realigned via mafft50 against the WIV04 (MN996528.1,31) reference utilizing parameters ideal for a large number of highly similar protein sequences as well as using the option to maintain position numbering against the reference.

    grep -i "|Human|" input.fasta -A1 > output.fasta

A Python script (Supplementary Table S5) was written to filter sequences by quality thresholds: (1) 0 ambiguous protein positions; (2) 0 deletions or gaps other than the common deletions at positions 69, 70 and 144/145; (3) a full pre-alignment length of 1273, or as short as 1270 when the specified deletions are present; and (4) a maximum of less than 1% (13) amino acid substitutions from the reference. These strict thresholds yielded 441,168 sequences (Supplementary Table S1) and removed low-quality, potentially error-prone sequences: those that were incomplete, contained uncommon deletions or insertions, or had an unusually high number of mutations.
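The authors' filtering script is in Supplementary Table S5 and is not reproduced here. The sketch below only illustrates how the four thresholds could be applied to one aligned sequence. Everything in it is an assumption: the function name, the set of symbols treated as ambiguous, the use of `-` for deletions, and the reading of threshold (4) as "at most 12 substitutions" (under 1% of 1273).

```python
# Hypothetical sketch of the four quality thresholds; not the authors' script.
REFERENCE_LENGTH = 1273                 # S protein length in the reference
ALLOWED_DELETIONS = {69, 70, 144, 145}  # common deletions named in the text
AMBIGUOUS = set("XBZJ")                 # assumption: symbols treated as ambiguous
MAX_SUBSTITUTIONS = 12                  # assumption: "less than 1%" of 1273
MIN_UNGAPPED_LENGTH = 1270              # shortest length allowed by threshold 3


def passes_quality_filter(aligned: str, reference: str) -> bool:
    """Return True if an aligned S protein sequence meets all four thresholds.

    Both strings are assumed to be aligned column by column with the
    reference numbering preserved, using '-' for a deleted residue.
    """
    if len(reference) != REFERENCE_LENGTH or len(aligned) != len(reference):
        return False  # insertions or truncation would change the aligned length
    if len(aligned.replace("-", "")) < MIN_UNGAPPED_LENGTH:
        return False  # threshold 3: full length, or 1270 with common deletions
    substitutions = 0
    for position, (ref_aa, obs_aa) in enumerate(zip(reference, aligned), start=1):
        if obs_aa in AMBIGUOUS:
            return False  # threshold 1: no ambiguous positions
        if obs_aa == "-":
            if position not in ALLOWED_DELETIONS:
                return False  # threshold 2: only the common deletions
            continue
        if obs_aa != ref_aa:
            substitutions += 1
    return substitutions <= MAX_SUBSTITUTIONS  # threshold 4
```

Returning early on the first failed threshold keeps the check cheap across hundreds of thousands of sequences, at the cost of not recording which threshold a rejected sequence failed.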

### Calculating number of variants

The raw data for variants in the S protein were read into RStudio51 (v. 1.3.1093) and analyzed using the Tidyverse package52 (Supplementary Table S4). The number of unique variants was calculated for each position, excluding insertions. Graphs were created for specific regions and each position was color-coded according to the number of variants present in that position (i.e., 0 – no color, 1–2 is blue, 3–4 is yellow, > 5 is red). See sample code below:

Calculating variants:

    library(tidyverse)

    # Count the rows (unique variants) recorded at each position
    df %>%
      group_by(Position, .drop = FALSE) %>%
      tally()

Graphing example:

    # Graph of the RBD (positions 331-524), one bar per position;
    # `pal` is assumed to be a colour vector defined elsewhere
    ggplot(df) +
      geom_col(data = subset(df, Position > 330 & Position < 525),
               aes(x = Position, y = n, fill = as.factor(n))) +
      ggtitle("RBD") +
      scale_fill_manual(values = pal, name = "Number") +
      labs(y = "Number of Mutations") +
      theme(panel.background = element_blank(), text = element_text(size = 20))

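For the functional regions, the proportion of positions with 2 or fewer observed variants was calculated as:

$$\text{proportion} = \frac{\text{number of positions in the region with} \le 2 \text{ variants}}{\text{total number of positions in the region}}$$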
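The two calculations in this section, unique variants per position and the share of low-variability positions in a region, can be sketched in Python as follows. This is an illustrative re-expression and not the authors' R code; the function names and the `(position, variant amino acid)` input format are assumptions.

```python
from collections import defaultdict


def variants_per_position(substitutions, region):
    """Count unique variant residues at every position of a region.

    substitutions: iterable of (position, variant_amino_acid) pairs.
    region: iterable of positions; positions never observed get a count of 0.
    """
    seen = defaultdict(set)
    for position, variant_aa in substitutions:
        seen[position].add(variant_aa)  # a set ignores repeated observations
    return {position: len(seen[position]) for position in region}


def low_variability_fraction(counts, threshold=2):
    """Fraction of positions with `threshold` or fewer unique variants."""
    return sum(1 for n in counts.values() if n <= threshold) / len(counts)


# Toy data, not taken from the paper.
observed = [(331, "K"), (331, "K"), (331, "S"),
            (333, "A"), (333, "G"), (333, "V")]
counts = variants_per_position(observed, range(331, 335))
print(counts)
print(low_variability_fraction(counts))
```

Including unobserved positions with a count of zero matters here: leaving them out would shrink the denominator and understate how much of a region is invariable.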

### Calculating predicted effect of variants in PROVEAN

The amino acid sequence of S protein from the reference EPI_ISL_402124 (WIV04; Wuhan31) sequence was uploaded to PROVEAN (http://provean.jcvi.org/index.php)32. Every variant observed in S protein was also uploaded to compare against the reference sequence. Each variant was predicted to be either ‘deleterious’ or ‘neutral’. The PROVEAN predictions were also read into RStudio51 (v. 1.3.1093) and analyzed with the Tidyverse52 package for every region analyzed. The proportions of variants predicted to be neutral and deleterious were calculated for the functional regions analyzed in S protein. See Supplementary Table S2. Sample code below:

Calculating PROVEAN ratios:

    table(dfProveanPrediction) %>% prop.table() %>% round(4)

### Protein structures

Molecular Operating Environment (MOE) software40 was used to prepare the figures using PDB ID: 7A987 for Figs. 1a–c, 2c, 4b,c (left), e; Supplementary Fig. S1a (left), d, e, and PDB ID: 6ZB59 was used to prepare Supplementary Fig. S1a (right), Fig. 3a, c, 4c (right).

### Sequence alignment

The Spike protein sequences from SARS-CoV-2, SARS-CoV, and MERS-CoV were uploaded to Jalview53. The Mafft alignment was then performed to align each amino acid sequence.

### Pharmacophore generation

PDB ID: 6ZB59 was opened and prepared using the QuickPrep functionality at the default settings in MOE. Dummy atoms were created at the LA-binding site formed by chains 6ZB5.A and 6ZB5.C. AutoPH4 tool54,55 was used to generate the pharmacophore at the dummy atom site in the Apo generation mode.

## Data availability

SARS-CoV-2 sequences are available from GISAID (Supplementary Table S1). Data used for this analysis are found in Supplementary Table S4 and attached source data file.

Our FLOVID-20 team released a new research preprint showing that Rhesus Monkeys vaccinated with FLOVID-20 were free of pneumonia-like infiltrates characteristic of SARS-CoV-2 (COVID-19) infection and presented with lower viral loads relative to controls. It is submitted for peer review and publication.

## Abstract

Background: Persistent transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has given rise to a COVID-19 pandemic. Several vaccines, evoking protective spike antibody responses, conceived in 2020, are being deployed in mass public health vaccination programs. Recent data suggests, however, that as sequence variation in the spike genome accumulates, some vaccines may lose efficacy.

Methods: Using a macaque model of SARS-CoV-2 infection, we tested the efficacy of a peptide-based vaccine targeting MHC Class I epitopes on the SARS-CoV-2 nucleocapsid protein.
We administered biodegradable microspheres with synthetic peptides and adjuvants to rhesus macaques. Unvaccinated control and vaccinated macaques were challenged with 1 × 10⁸ TCID50 units of SARS-CoV-2, followed by assessment of clinical symptoms, viral load, chest radiographs, sampling of peripheral blood and bronchoalveolar lavage (BAL) fluid for downstream analysis.

Results: Vaccinated animals were free of pneumonia-like infiltrates characteristic of SARS-CoV-2 infection and presented with lower viral loads relative to controls. Gene expression in cells collected from BAL samples of vaccinated macaques revealed a unique signature associated with enhanced development of adaptive immune responses relative to control macaques.

Conclusions: We demonstrate that a room temperature stable peptide vaccine based on known immunogenic HLA Class I bound CTL epitopes from the nucleocapsid protein can provide protection against SARS-CoV-2 infection in non-human primates.

Cytokine release syndrome (CRS) is known to be a factor in morbidity and mortality associated with acute viral infections including those caused by filoviruses and coronaviruses. IL-6 has been implicated as a cytokine negatively associated with survival after filovirus and coronavirus infection. However, IL-6 has also been shown to be an important mediator of innate immunity and important for the host response to an acute viral infection. Clinical studies are now being conducted by various researchers to evaluate the possible role of IL-6 blockers to improve outcomes in critically ill patients with CRS. Most of these studies involve the use of anti-IL-6R monoclonal antibodies (α-IL-6R mAbs). We present data showing that direct neutralization of IL-6 with an α-IL-6 mAb in a BALB/c Ebolavirus (EBOV) challenge model produced a statistically significant improvement in outcome compared with controls when administered within the first 24 h of challenge and repeated every 72 h.
A similar effect was seen in mice treated with the same dose of α-IL-6R mAb when the treatment was delayed 48 h post-challenge. These data suggest that direct neutralization of IL-6, early during the course of infection, may provide additional clinical benefits to IL-6 receptor blockade alone during treatment of patients with virus-induced CRS.

## Introduction

Under normal circumstances, interleukin-6 (IL-6) is secreted transiently by myeloid cells as part of the innate immune response to injury or infections. However, unregulated synthesis and secretion of IL-6 has contributed to a host of pathological effects such as rheumatoid arthritis (Swaak et al., 1988). Furthermore, IL-6 induces differentiation of B cells and promotes CD4+ T cell survival during antigen activation and inhibits TGF-beta differentiation, providing a crucial link between innate and acquired immune responses (Korn et al., 2008; Dienz and Rincon, 2009). These actions place IL-6 in a central role in mediating and amplifying cytokine release syndrome (CRS), commonly associated with Ebola virus disease (EVD) infections (Wauquier et al., 2010). CRS is known to be a factor in morbidity and mortality associated with acute viral infections including those caused by filoviruses and coronaviruses. For example, non-survivors of the West African EBOV epidemics exhibited significantly elevated levels of the overall inflammatory response cytokines and monokines compared to survivors (Ruibal et al., 2016). It is thought that prolonged exposure to elevated inflammatory cytokine levels is toxic to T cells and results in their apoptotic and necrotic cell death (Younan et al., 2018). Both lymphopenia and elevated serum IL-6 levels are found in Ebola virus infection and are known to be inversely correlated with survival in patients post-infection (Wauquier et al., 2010) and in mouse models of Ebola infection (Herst et al., 2020).
However, IL-6 has also been shown to be an important mediator of innate immunity and important for the host recovery from acute viral infection (Yang et al., 2017). Elevated IL-6 levels are also observed in SARS-CoV-2 infections, severe influenza, rhinovirus, RSV infection, as well as in similar respiratory infections (Hayden et al., 1998; Tang et al., 2016; Kerrin et al., 2017; Conti et al., 2020). Originally developed for the treatment of arthritis, α-IL-6R mAbs have been used to treat CRS as a complication of cancer therapy using adaptive T-cell therapies (Lee et al., 2014; Tanaka et al., 2016; Ascierto et al., 2020). Warnings admonishing the use of IL-6 blockers in the context of acute infection are present in the package inserts for tocilizumab (Genentech, 2014), sarilumab (Sanofi, 2017) and siltuximab (EUSA, 2015). Early mixed results of CRS treatment with IL-6 blockers (Herper, 2020; ClinicalTrialsGenetech, 2020; ClinicalTrialsEUSA, 2020; Taylor, 2020; Saha et al., 2020), and our own observations of the role of IL-6 in morbidity and mortality associated with Ebola virus infection (Herst et al., 2020), led us to evaluate the clinical effects of treatment with not only antibody directed against the IL-6 receptor, but also with mAb directed to IL-6 itself. We report here on the observed differences between treatments with α-IL-6R mAbs and α-IL-6 mAbs in a mouse model of EBOV infection and comment on how IL-6 blockade may be relevant to the management and therapy for patients with Ebola infection as well as patients infected with SARS-CoV-2.

## Methods

### Virus Strain

For in-vivo experiments, a well-characterized mouse-adapted Ebola virus (maEBOV) stock (Bray et al., 1998; Lane et al., 2019) (Ebola virus M. musculus/COD/1976/Mayinga-CDC-808012), derived from the 1976 Zaire ebolavirus isolate Yambuku-Mayinga (GenBank accession NC002549), was used for all studies.
All work involving infectious maEBOV was performed in a biosafety level (BSL) 4 laboratory, registered with the Centers for Disease Control and Prevention Select Agent Program for the possession and use of biological select agents.

### Animal Studies

Animal studies were conducted at the University of Texas Medical Branch (UTMB), Galveston, TX in compliance with the Animal Welfare Act and other federal statutes and regulations relating to animal research. UTMB is fully accredited by the Association for the Assessment and Accreditation of Laboratory Animal Care International and has an approved OLAW Assurance. BALB/c mice (Envigo; n = 146) were challenged with 100 plaque forming units (PFU) of maEBOV via intraperitoneal (i.p.) injection as described previously (Hodge et al., 2016; Comer et al., 2019). Experimental groups of 10 mice each were administered rat anti-mouse-IL-6 IgG1 monoclonal antibody (BioXCell, BE0046, Lebanon, NH, RRID AB1107709) or rat anti-mouse-IL-6R IgG2 monoclonal antibody (BioXCell, BE0047, RRID AB1107588) at a dose of 100 μg in sterile saline via intravenous (i.v.) administration via an indwelling central venous catheter, or 400 μg via i.p. injection at 24, 48, or 72 h post-challenge. Antibody dosing was based on amounts previously reported to neutralize IL-6 and IL-6R in mice (Barber et al., 2014; Liang et al., 2015). Antibody dosing was performed once for the i.v. group or continued at 72-h intervals for the i.p. groups resulting in a total of four doses over the 14-day study period as summarized in Figure 1 and Tables S2–S5 (Supplemental Materials). Control mice (n = 36) were challenged with maEBOV in parallel, but were treated with antibody vehicle alone. Serum IL-6 measurements were performed in control rodents at necropsy as previously described (Herst et al., 2020).

### In Vivo Clinical Observations and Scoring

Following maEBOV challenge, mice were examined daily and scored for alterations in clinical appearance and health as previously described (Lane et al., 2019). Briefly, mice were assigned a score of 1 = Healthy; score 2 = Lethargic and/or ruffled fur (triggers a second observation); score 3 = Ruffled fur, lethargic and hunched posture, orbital tightening (triggers a third observation); score 4 = Ruffled fur, lethargic, hunched posture, orbital tightening, reluctance to move when stimulated, paralysis or greater than 20% weight loss (requires immediate euthanasia) and no score = deceased (Table S1, Supplemental Materials).

### Statistical Methods

Descriptive and comparative statistics including arithmetic means, standard errors of the mean (SEM), Survival Kaplan-Meier plots and Log-rank (Mantel-Cox) testing, D’Agostino & Pearson test for normality, Area-Under-The-Curve and Z Statistics were calculated using R with data from GraphPad Prism files. The clinical composite score data used to calculate the AUC measures were normally distributed. The significance of comparisons (p values) of AUC data was calculated using the Z statistic. p values <.05 were considered statistically significant.

## Results

Following maEBOV challenge, mice were dosed i.v. at 24, 48, or 72 h post-challenge with a single dose of α-IL-6R mAb, a single i.p. dose of α-IL-6R mAb 24 h after maEBOV challenge, or an initial i.p. dose of α-IL-6 or α-IL-6R mAb, followed by additional i.p. doses at 72-h intervals for a total of four doses. Mice were observed for up to 14 days as summarized in Figure 1. The average serum IL-6 concentration at necropsy for mice (n = 5) challenged with maEBOV was 1092 ± 505 pg/ml, a concentration similar to that reported in a previous publication for mice challenged with 10 PFU of maEBOV (Chan et al., 2019). In mice not challenged with maEBOV the average serum IL-6 was 31 ± 11 pg/ml.
The survival and average clinical score for mice receiving a single i.v. dose of α-IL-6R mAb are shown in Figure S1A, B (Supplemental Materials). Little to no effects on survival or clinical score were observed following maEBOV challenge and a single i.v. dose of α-IL-6R mAb. The survival patterns for i.v. mAb treated and untreated groups following maEBOV challenge were statistically different and most untreated mice succumbed to maEBOV infection by day seven (Figure S1, Supplemental Materials). Because neither survival alone nor average clinical score alone represented the overall possible clinical benefits of mAb treatment, a secondary composite outcome measure was calculated from the quotient of mouse survival and the average clinical score for each day, similar to that previously reported (Kaempf et al., 2019). We then summed these scores across the last 12 days of observation to create an AUC Survival/Clinical Score (see Figure S1C, Supplemental Materials). The Z statistic and significance level for this metric were calculated for each experimental condition. We found a minor clinical benefit (p < 0.01) when mice were given one 100 μg dose of α-IL-6R mAb via central venous catheter at 72 h after maEBOV challenge, relative to vehicle alone, using the experimental design described in Table S2 (Supplemental Materials). Since the maEBOV challenge was administered intraperitoneally and murine peritoneal macrophages represent a significant depot of cells (Cassado et al., 2015) able to produce IL-6 (Vanoni et al., 2017) following toll-like receptor activation, we next compared the activities of α-IL-6 and α-IL-6R mAbs administered intraperitoneally following maEBOV challenge (Figures 2–5). We observed significant differences in the AUC Survival/Clinical Score when α-IL-6R mAb was administered 48 h post-maEBOV challenge and then repeated three times at 72-h intervals.
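The composite measure described above (a daily quotient of survival and average clinical score, summed over the last 12 days of observation) can be illustrated with a minimal Python sketch. This is not the authors' R code. It assumes that survival is expressed as the fraction of the group alive on each day and that a day with no surviving mice contributes zero; the function names and the example values are hypothetical.

```python
from typing import Optional, Sequence


def daily_quotient(survival_fraction: float,
                   mean_clinical_score: Optional[float]) -> float:
    """Survival divided by mean clinical score for one day.

    Assumption: when no mice survive there is no clinical score
    (None), and the day contributes 0 to the composite.
    """
    if mean_clinical_score is None or survival_fraction == 0:
        return 0.0
    return survival_fraction / mean_clinical_score


def auc_survival_clinical(survival: Sequence[float],
                          scores: Sequence[Optional[float]],
                          last_n_days: int = 12) -> float:
    """Sum the daily quotients over the last `last_n_days` days."""
    if len(survival) != len(scores):
        raise ValueError("survival and scores must cover the same days")
    daily = [daily_quotient(s, c) for s, c in zip(survival, scores)]
    return sum(daily[-last_n_days:])


# Hypothetical 14-day group: every mouse alive but progressively sicker.
example_survival = [1.0] * 14
example_scores = [1.0, 1.0, 1.5, 2.0, 2.0, 2.5, 2.5,
                  2.0, 2.0, 1.5, 1.5, 1.0, 1.0, 1.0]
example_auc = auc_survival_clinical(example_survival, example_scores)
```

A group in which every mouse stays healthy (score 1) through all 14 days would reach the maximum of 12 under these assumptions, so lower values reflect deaths, sicker survivors, or both.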
The most significant beneficial effect on the AUC Survival/Clinical Score (Figure 5) was seen when α-IL-6 mAb was administered beginning at 24 h post-maEBOV challenge, and then repeated three times at 72-h intervals.

## Discussion

While EVD is classified as a viral haemorrhagic fever, there are many similarities between EVD and COVID-19, the disease caused by infection with SARS-CoV-2 that can present as an acute respiratory distress syndrome (ARDS) (Zhou et al., 2020; Chen et al., 2020; Huang et al., 2020a; Lescure et al., 2020). As in EVD, elevated IL-6 was found to be significantly correlated with death in COVID-19 patients (Ruan et al., 2020), suggesting that patients with clinically severe SARS-CoV-2 infection might also have CRS (Huang et al., 2020b). Both EVD and COVID-19 (Younan et al., 2019; Tan et al., 2020) are associated with lymphopenia. Since the severity of SARS-CoV-1 infection has been shown to be associated with increased serum concentrations of IL-6, clinical scientists have proposed non-corticosteroid based immunosuppression using IL-6 blockade as a means to treat the hyperinflammation observed in certain patients with SARS-CoV-2 infections (Wong et al., 2004; Mehta et al., 2020a). The potential value of using IL-6 blockade to treat COVID-19 patients was discussed early during the 2020 SARS-CoV-2 outbreak (Liu et al., 2020; Mehta et al., 2020b). Indeed, a recent (5/24/2020) search of ClinicalTrials.gov revealed at least 62 clinical trials examining the efficacy and safety of α-IL-6R mAbs and α-IL-6 mAbs for management of patients with COVID-19: 45 studies for tocilizumab (an α-IL-6R mAb), 14 for sarilumab (an α-IL-6R mAb) and 3 for siltuximab (an α-IL-6 mAb). Most of the studies involve the use of α-IL-6R mAbs and have shown promising results (summarized in Tables 1 and 2), but there is a clear need for improvement. Using a mouse model of Ebola infection, we found clinical benefit when mice were administered multiple i.p.
doses of α-IL-6R mAb 48 h after maEBOV challenge. At both earlier (24 h) and later (72 h) time points of initiation of administration of α-IL-6R mAb, we observed little to no effects on the clinical benefit score. Similarly, we found clinical benefit when α-IL-6 mAb was administered beginning at 24 h post-maEBOV challenge, and then repeated three times at 72-h intervals, but no benefit was observed if α-IL-6 mAb was initiated at 48 or 72 h post-challenge. These data suggest that α-IL-6 mAb therapy may also have clinical benefits similar to α-IL-6R mAb, particularly when given early during the course of maEBOV infection. Previous experiments in the murine EBOV system (Herst et al., 2020) suggest that some degree of activation of innate immunity and IL-6 release benefits survival post maEBOV challenge. It may be the case that the observed clinical benefits of α-IL-6 mAbs are associated with incomplete blockade of the IL-6 response, particularly later than 24 h post-challenge. Overall, our data suggest that human clinical trials evaluating the benefits of α-IL-6 mAbs versus α-IL-6R mAbs versus combined early α-IL-6 mAb and later α-IL-6R mAb are warranted to evaluate the potential of IL-6 pathway blockade during Ebola or SARS-CoV-2 infection. Although antibody blood levels were not obtained during the mouse studies described here, we present a pharmacokinetic model based on literature values (Medesan et al., 1998; EUSA, 2015; Sanofi, 2017) shown in Table S5 in Supplemental Materials. Simulated PK curves for each of the three experiments described are shown in Figure 6. Dosing α-IL-6 mAb at 24 h after challenge produced a clinical benefit, whereas dosing α-IL-6R mAb beginning at the same time point did not. The shorter terminal half-life of α-IL-6 mAb (T1/2 = 57 h) versus α-IL-6R mAb (T1/2 = 223 h), possibly due to isotype specific differences in glycosylation (Cobb, 2019), may help explain why giving α-IL-6 mAb early after infection provided the most observed clinical benefit.
As can be seen from the simulated PK profile in Figure 6C, repeated dosing every 72 h, beginning 24 h after challenge, is predicted to maintain blood levels peaking at about 200 μg/ml. This is in contrast to blood levels predicted after similar dosing of α-IL-6R where the blood levels continue to increase over the study period. These differences seen in the simulated PK profiles may have allowed α-IL-6 mAb to partially block IL-6, allowing innate immunity to develop, while still providing sufficient blockade to reduce the deleterious clinical effects of IL-6 as the study progressed. In addition, it may be that the stoichiometry of α-IL-6 blockade versus α-IL-6R may favor achieving partial blockade early during the evolution of CRS given that the amount of IL-6 present may exceed the number of IL-6 receptors. It is also possible that IL-6 may act on other sites not blocked by α-IL-6R mAb, and that this may yield a potential advantage of using α-IL-6 mAb to treat CRS brought about by a viral infection. It may be possible to develop a controlled release formulation of α-IL-6 mAb to obtain a clinically beneficial effect from the administration of α-IL-6 mAb, α-IL-6R mAb, or a combination of both, after a single injection early during the course of SARS-CoV-2 infection. For example, Figure 6, bottom-right panel, shows various predicted controlled release PK profiles of α-IL-6 mAb that could be achieved by using delivery systems producing different first order rates of delivery from an injection depot of 20 mg/kg. Correlation of these release profiles with the AUC Survival/Clinical score described here in pre-clinical models could lead to the development of a single dose treatment mitigating the effects of CRS on the host. 
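The qualitative contrast between the two simulated profiles can be illustrated with a simplified Python sketch. It is not the pharmacokinetic model of Table S5: it assumes instantaneous absorption, one-compartment first-order elimination, and simple superposition of doses, and it reports levels relative to the peak of a single dose rather than in μg/ml. Only the half-lives (57 h and 223 h) and the dosing schedule (first dose at 24 h, then three repeats at 72-h intervals) are taken from the text; the function names are hypothetical.

```python
import math
from typing import Sequence


def relative_level(t_h: float, dose_times_h: Sequence[float],
                   half_life_h: float) -> float:
    """Blood level at time t_h, in units of one dose's peak level.

    Each dose is assumed to appear instantly and then decay with
    first-order kinetics; contributions from all doses are summed.
    """
    k = math.log(2) / half_life_h  # elimination rate constant (1/h)
    return sum(math.exp(-k * (t_h - td))
               for td in dose_times_h if t_h >= td)


def accumulation_ratio(half_life_h: float, interval_h: float) -> float:
    """Steady-state peak relative to the first-dose peak."""
    k = math.log(2) / half_life_h
    return 1.0 / (1.0 - math.exp(-k * interval_h))


dose_times = [24, 96, 168, 240]  # hours post-challenge, four doses

# Peak immediately after the fourth dose for each antibody.
peak_anti_il6 = relative_level(240, dose_times, half_life_h=57)
peak_anti_il6r = relative_level(240, dose_times, half_life_h=223)

# How far each antibody could still climb with continued dosing.
plateau_anti_il6 = accumulation_ratio(57, 72)
plateau_anti_il6r = accumulation_ratio(223, 72)
```

Under these assumptions the shorter half-life antibody is already close to its plateau by the fourth dose, while the longer half-life antibody is still well below its plateau and continues to accumulate, which is the same qualitative pattern described for Figure 6.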
## Concluding Remarks

Although the previous reports of use of IL-6 blockers to treat CRS have shown mixed results, recent clinical data for α-IL-6 and α-IL-6R mAbs have shown early promise in human trials for treatment of severe influenza and coronavirus infections (Gritti et al., 2020; Xu et al., 2020). Pre-clinical studies and various ongoing clinical trials evaluating the potential benefit of IL-6 blockers, for example, early α-IL-6 mAb and later α-IL-6R mAb, for the treatment of patients with CRS may provide clinical correlation with the results presented here.

## Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

## Ethics Statement

The animal study was reviewed and approved by UTMB, which operates under OLAW assurance number D16-00202(A3314-01).

## Author Contributions

All authors contributed to the article and approved the submitted version. CM and TB performed the study under BSL-4 conditions and generated the data presented here.

## Funding

This study was funded by Flow Pharma, Inc., which had no influence over the content of this manuscript or the decision to publish.

## 1. Introduction

Development of safe and effective vaccines for some viruses such as HIV and EBOV has been challenging [19]. Although vaccine development has been almost exclusively focused on eliciting a humoral immune response in the host through inoculation with whole protein antigen [51][69][59][29], CTL peptide vaccines producing a T-cell response may offer an important alternative approach [23]. For HIV, EBOV and influenza in particular, the potential of CTL vaccines has been discussed [21][7][56].
Although computational prediction alone has been used for T-cell vaccine design [2][14], we saw a unique opportunity to see if a preventative EBOV T-cell vaccine could be successfully designed based on the specific epitopes targeted by survivors of documented EBOV infection. The notion of HLA restricted HIV control has been described [58]. Pereyra-Heckerman conducted an analysis of virus-specific CD8+ T-cell immunity in individuals living with HIV [43]. They reported that HIV controllers (individuals living with HIV who are not undergoing treatment and do not progress to AIDS) have CD8+ cells targeting different HLA restricted Class I epitopes on HIV compared with progressors (individuals with HIV who progress to AIDS in the absence of therapy). Pereyra-Heckerman suggested that this observation could guide the in silico development of a CTL vaccine for HIV and other diseases. Acquired immunity has been documented after EBOV infection [4]. Antibody as well as T-cell responses have been described [44]. Sakebe et al. have shown that of 30 subjects surviving the 2013–2016 EBOV outbreak in West Africa, CD8+ T-cells from 26 of those survivors responded to at least one EBOV antigen, with 25 of the 26 responders targeting epitopes on EBOV NP [50]. One of the most commonly targeted epitopes on EBOV NP in the survivor group (targeted by CD8+ cells from four survivors) was NP41-60 (IPVYQVNNLEEICQLIIQAF). They also suggested that a CTL vaccine could be designed using epitopes targeted by CD8+ T-cells identified in these EBOV controllers. Human pathogen-derived peptide antigens that are also recognized by C57BL/6 T-cells have been previously described. These include peptides from vesicular stomatitis virus (VSV), RGYVYQGL [68], and human immunodeficiency virus (HIV), RGPGRAFVTI [5]. The existence of such epitopes makes a range of pre-clinical vaccine experiments possible without having to rely on non-human primates or on expensive and complex-to-manage humanized mouse models.
Wilson et al. showed that the EBOV nucleoprotein (NP) is an immunogen that provides protective, CTL-mediated immunity against EBOV in a C57BL/6 mouse model and that this protection was conferred by a peptide sequence within Ebola Zaire: NP43-53 (VYQVNNLEEIC) [73]. Wilson et al. came to this conclusion based on studying splenocytes harvested from mice vaccinated with Ebola Zaire NP using a Venezuelan equine encephalitis (VEE) vector. Their experiments showed that splenocytes from the vaccinated mice re-stimulated with NP43-53 had high levels of cytotoxic activity against target cells loaded with the EBOV NP peptide. Remarkably, NP43-53 also happens to be an 11 amino acid sub-sequence of the epitope identified by Sakebe et al. as most commonly favored for T-cell attack by survivors of the 2013–2016 EBOV outbreak in West Africa. We set out to see if we could drive CTL expansion directed against NP43-53 to occur after vaccinating C57BL/6 mice with Ebola Zaire NP43-53 (VYQVNNLEEIC), and to subsequently conduct an in vivo EBOV challenge study to see if this peptide was protective. We fabricated adjuvanted microspheres for this study as a room temperature stable dry powder using the Flow Focusing process to be 11 μm in diameter so as to prevent more than one microsphere from being phagocytosed by any given antigen presenting cell (APC) at the same time [37]. By loading only one peptide sequence per microsphere, we maximized the peptide payload and mitigated the possibility of multiple, different peptide sequences being delivered to the APC simultaneously, which could result in competitive inhibition at the binding motif and interfere with antigen presentation and subsequent T-cell expansion (Supplementary Material Section 1). We also set out to see if a similar approach to a CTL vaccine design for SARS-CoV-2 would be feasible based on an analysis of the HLA binding characteristics of peptide sequences on SARS-CoV-2 nucleocapsid.

## 2. Results

We used a previously described biodegradable dry powder, PLGA microsphere, synthetic vaccine platform adjuvanted with TLR-4 and TLR-9 agonists for this study [48]. In that article, we showed that the TLR-4 and TLR-9 agonists given together with a peptide in a mouse model did not produce T-cell expansion by ELISPOT and that microencapsulation of the peptide and the TLR-9 ligand, with the TLR-4 ligand in the injectate solution, was required to elicit an immune response to the delivered peptide antigen as determined by ELISPOT. That study also demonstrated that the microencapsulated peptides alone were insufficient to induce an adequate immune response without the presence of the TLR-4 and TLR-9 agonists administered as described. The TLR agonists used for this vaccine formulation are used in FDA approved vaccines and can be sourced as non-GMP or GMP material for pre-clinical and clinical studies. We show here that the H2-Db restricted epitopes VSV (RGYVYQGL) and OVA (SIINFEKL), when administered to C57BL/6 mice, each produce a CD8+ ELISPOT response to the administered peptide antigen with no statistically significant CD4+ response measurable by ELISPOT, as shown in Fig. 2c and d. We used this adjuvanted microsphere peptide vaccine platform to immunize C57BL/6 mice with NP43-53, the CTL Class I peptide antigen from the Ebola Zaire NP protein identified as protective by Wilson et al. [73]. Microspheres containing NP43-53 and CpG were prepared as a dry powder formulation, suspended before use in a PBS injectate solution containing MPLA, and administered intradermally via injection at the base of the tail into mice as described in a previous publication [48]. As illustrated in Fig. 1c, there was no statistically significant difference between the ELISPOT data for the vaccinated mice versus the response seen in the negative ELISPOT controls. Wilson et al. reported that the protection seen in their experiment was due to a peptide sequence within NP43-53.
We hypothesized that the NP43-53 epitope was inefficiently processed into MHC binding sub-sequences during antigen presentation. In order to explore possible H2-Db matches for peptide sequences contained within Ebola Zaire NP43-53 (VYQVNNLEEIC), we prepared three peptide vaccine formulations, each containing one of the three possible 9mer sub-sequences within NP43-53. These sequences are shown in Table 1. We then vaccinated, via intradermal (tail) injection, three groups of mice with microspheres containing one of the three 9mer sub-sequences of NP43-53 (6 per group). ELISPOT analysis was performed, stimulating harvested splenocytes with the three possible 9mer sub-sequences. Splenocytes from mice receiving the NP44-52 sub-sequence had a statistically higher ELISPOT response than mice vaccinated with the other two possible sub-sequence 9mers (P < 0.0001) as shown in Fig. 1a. This is consistent with the predicted H2-Db binding affinity of YQVNNLEEI as shown in Supplementary Material Table 3. Table 1. Class I peptides used in the study. NP43-53 is the Class I 11mer described by Wilson et al. which we found not to produce an immune response in a C57BL/6 mouse model. NP43-51, NP 44-52 and NP 45-53 are the three possible 9mer sub-sequences of NP43-53. We then loaded one population of adjuvanted microspheres with NP44-52 and a second population of adjuvanted microspheres loaded with VG19 from EBOV Zaire NP 273-291 (VKNEVNSFKAALSSLAKHG), a Class II epitope predicted to be relevant to NP43-53 based on the TEPITOPE algorithm using a technique described by Cunha-Neto et al. [14]. This peptide has a predicted favorable H2-Ib binding affinity as shown in Supplementary Material Table 5. We showed that vaccination of 6 mice with the adjuvanted microsphere vaccine loaded with VG19 and NP44-52 showed an ELISPOT response to NP44-52 whereas 6 mice vaccinated with adjuvanted microspheres not loaded with peptide did not (Fig. 1d). 
We also showed that mice vaccinated with VG19 alone did not show an ELISPOT response to NP44-52 (Fig. 2a) and, conversely, mice vaccinated with NP44-52 did not show a response to VG19 (Fig. 2b). We conducted a pilot study demonstrating that intraperitoneal injection of the adjuvanted microsphere vaccine produced a statistically superior immune response by ELISPOT compared with the same dose delivered by intradermal tail or intramuscular injection in C57BL/6 mice (Supplementary Material Section 2). Based on the data from that study, and the fact that the volume of the intraperitoneal space would allow larger amounts of microsphere suspension to be delivered, we chose to proceed with intraperitoneal administration for the challenge portion of this study, delivering 20 mg of microspheres per dose. We dosed three groups of mice, ten mice per group, with the adjuvanted microsphere vaccine formulation containing NP44-52 and VG19, with each peptide in a distinct microsphere population, and challenged these mice 14 days after vaccine administration with escalating IP administered doses of mouse adapted EBOV (maEBOV) (Group 3: 100 PFU, Group 5: 1,000 PFU and Group 7: 10,000 PFU). The composition of the vaccine used for the exposure study is described in Supplementary Material Section 3. A second set of three groups of mice (Groups 2, 4 and 6; mock groups), ten mice per group, received PBS buffer solution alone, served as control animals for the study, and were similarly challenged with maEBOV. Group 1 animals served as study controls and received no PBS buffer, vaccine or maEBOV injections. All mice were sourced from Jackson Labs and were 6–8 weeks of age and 15–25 grams at the time of vaccination. The dosing regimen is outlined in Table 2.

Table 2. C57BL/6 maEBOV challenge study dosing regimen with PBS (buffer) controls. All challenges were done with Ebola virus M. musculus/COD/1976/Mayinga-CDC-808012 (maEBOV) delivered IP. Mice in Group 1 received no injections.
Peak mortality across all groups tested was seen in mice challenged with 1,000 PFU maEBOV versus PBS buffer control as shown in the survival curve in Fig. 3a. Clinical observation data shown in Fig. 3b and c and daily weight data shown in Fig. 3d and e show protection from morbidity in all active vaccinated mice exposed to 1,000 PFU maEBOV. Fig. 3. 1000 PFU post-challenge data (20 mg active adjuvanted microspheres via intraperitoneal injection versus PBS buffer solution) collected beginning 14 days after vaccination. PBS buffer mock-vaccinated mice showed mortality increasing from the 100 PFU to 1,000 PFU as shown in Fig. 4a and Fig. 3a. We saw a paradoxical effect in control animals with survival increasing between 1,000PFU (Fig. 3a) and 10,000 PFU (Fig. 5a). We believe this was caused by innate immunity triggered by the very large maEBOV challenge. All mice in all vaccinated groups across both experiments survived and showed no morbidity by clinical observation scores and weight data. Fig. 4. 100 PFU post-challenge data (20 mg active adjuvanted microspheres via intraperitoneal injection versus PBS buffer solution) collected beginning 14 days after vaccination. For each of the three challenge levels, the difference between the number of survivors in the vaccinated group versus the PBS control group was statistically significant by chi square (100 PFU P = 0.001; 1000 PFU P = 0.0003; 10,000 PFU P = 0.003). We saw what appears to be an innate immune response at the 10,000 PFU EBOV exposure level. It has been suggested that EBOV can mediate an innate immunity response through stimulation of TLR-4 [33]. Because the adjuvanted microsphere vaccine used in this experiment incorporates a TLR-4 agonist, we dosed 10 mice with adjuvanted microspheres without peptides and found the level of protection after exposure to 100 PFU EBOV to be statistically no different from that seen in PBS buffer controls (Supplementary Material Fig. 1). 
We conclude that the level of protection conferred by the adjuvanted vaccine described in this study is dependent on delivering peptides with the microspheres. The data in Supplementary Material Fig. 1 also show, in two separate experiments conducted months apart with the same 100 PFU maEBOV challenge dose and the same (active) vaccine formulation, that the vaccinated animals in both active groups had 100% survival and no morbidity by clinical observation. This provides some evidence that the protective effect of vaccination using this adjuvanted microsphere vaccine is reproducible. Serum samples from sacrificed animals exposed to EBOV who did not receive vaccine were quantitatively assayed for various cytokines using BioPlex plates. Animals having unwitnessed demise did not have serum samples collected. A Pearson Correlation Analysis was performed to assess relationships between specific cytokine levels and survival. The results are shown in Table 3.

Table 3. Cytokine/survival correlations for control groups. Cytokines with statistically significant (positive or negative) correlation with survival in non-vaccinated mice are shown here along with (Pearson Correlation Analysis) p-values.

| Cytokine | p-Value | Correlation |
|---|---|---|
| Mo IL-6 | 0.050 | Decreased with survival |
| Mo MCP-1 | 0.019 | Decreased with survival |
| Mo IL-9 | 0.015 | Increased with survival |
| Mo MIP-1b | 0.009 | Decreased with survival |
| Mo IL-12(p40) | 0.006 | Increased with survival |
| Mo G-CSF | 0.005 | Decreased with survival |
| Mo IL-1b | 0.005 | Increased with survival |
| Mo IFN-g | 0.003 | Increased with survival |
| Mo GM-CSF | 0.002 | Increased with survival |
| Mo IL-12(p70) | 0.001 | Increased with survival |
| Mo TNF-a | 0.001 | Increased with survival |
| Mo IL-17 | <0.001 | Increased with survival |
| Mo IL-10 | <0.001 | Decreased with survival |

We observed low levels of IL-6 in surviving mice. NHPs infected with EBOV have been determined by other researchers to have elevated levels of IL-6 in plasma and serum [27][17].
EBOV infected humans have also shown elevated IL-6 levels and these elevated levels have been associated with increased mortality [71]. Similarly, we observed low levels of MCP-1, IL-9 and GM-CSF in survivors. Increased serum and plasma levels of MCP-1 have been observed in EBOV infected NHPs [22][27][17] and elevated levels of MCP-1 were associated with fatalities in EBOV infected human subjects [71]. Human survivors of EBOV have been found to have very low levels of circulating cytokines IL-9 and elevated levels of GM-CSF have been associated with fatality in humans exposed to EBOV [71]. We saw increased levels of IFN-γ in survivors. Other vaccine studies have associated IFN-γ with protection [70][38]. We achieved protection against maEBOV challenge with a single injection of an adjuvanted microsphere peptide vaccine loaded with a Class I peptide in a region on EBOV nucleocapsid favored for CD8+ attack by survivors of the 2013–2016 West Africa EBOV outbreak. There is evidence that a CTL response could be beneficial in the context of a coronavirus infection. [13][41][64][11][28][36] Peng et al. have found survivors of the SARS-CoV-1 outbreak who had circulating T-cells targeting SARS-CoV-1 nucleocapsid two years after initial infection. [42] We decided to investigate the feasibility of designing a SARS-CoV-2 peptide vaccine targeting SARS-CoV-2 nucleocapsid. All available SARS-CoV-2 protein sequences were obtained from the NCBI viral genomes resource within GenBank, an NIH genetic sequence database [8]. Retrieved sequences were processed using multiple sequence alignment (MSA) via Clustal for the nucleocapsid phosphoprotein [34]. The nucleocapsid phosphoprotein sequences were trimmed down to every possible peptide sequence 9 amino acids in length. 9mers were chosen because they typically represent the optimal length for binding to the vast majority of HLA [1][18][3]. 
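The step of trimming a protein sequence down to every possible 9mer is a sliding window over the sequence. The Python sketch below is only an illustration of that idea, not the pipeline used in the study; the function name and the `first_position` parameter are hypothetical. It is exercised on the EBOV NP43-53 11mer (VYQVNNLEEIC) from earlier in the paper because that sequence and its three 9mers are stated in the text.

```python
from typing import List, Tuple


def sliding_peptides(sequence: str, length: int = 9,
                     first_position: int = 1) -> List[Tuple[int, str]]:
    """Return every contiguous peptide of `length` residues.

    Each peptide is paired with its starting position, numbered from
    `first_position` so that positions can follow the numbering of
    the parent protein.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [(first_position + i, sequence[i:i + length])
            for i in range(len(sequence) - length + 1)]


# EBOV NP43-53 starts at residue 43 of the nucleoprotein.
np43_53_9mers = sliding_peptides("VYQVNNLEEIC", length=9, first_position=43)
for start, peptide in np43_53_9mers:
    print(f"NP{start}-{start + len(peptide) - 1}: {peptide}")
```

A sequence of N residues yields N - 8 overlapping 9mers, so the 11mer gives three, which correspond to NP43-51, NP44-52 and NP45-53.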
The resulting peptides were compared to the MSA to ensure that these sequences are conserved within all of the sequencing samples available and not affected by an amino acid variant that could complicate subsequent analysis, specifically the calculation of population coverage. A set of HLA alleles was selected to encompass the vast majority of the world's population, at over 97% coverage. Peptides were run through two artificial neural network algorithms, netMHC and netMHCpan, which were developed using training data from in vitro binding studies. The pan variant of netMHC is able to integrate in vitro data from a variety of HLA to allow for predictions to be made if limited in vitro data is available for the specified target HLA [30][1]. This in silico analysis utilizes the neural network's ability to learn from the in vitro data and report back predicted values based on the input SARS-CoV-2 nucleocapsid phosphoprotein peptides. Peptides with a predicted HLA IC50 binding affinity of 500 nM or less in either of the algorithms were included in the candidate list of targets for the vaccine [30][1][40]. A subset of these SARS-CoV-2 peptide sequences are present on SARS-CoV-1 nucleocapsid phosphoprotein and as a result had in vitro binding data in the Immune Epitope Database and Analysis Resource (IEDB) collected after a previous outbreak [65]. Predicted values of these peptides were cross referenced with actual in vitro binding measurements from identical 9mer peptides when that data was available.

## 3. Summary and discussion

Most preventative vaccines are designed to elicit a humoral immune response, typically via the administration of whole protein from a pathogen. Antibody vaccines typically do not produce a robust T-cell response. [72] A T-cell vaccine is meant to elicit a cellular immune response directing CD8+ cells to expand and attack cells presenting the HLA Class I restricted pathogen-derived peptide antigen.
[47] Difficulty in obtaining a reliable immune response from peptide antigens and the HLA restricted nature of CTL vaccines have limited their utility to protect individuals from infectious disease [77]. However, observations derived from individuals able to control HIV infection [43] and EBOV infection [50], demonstrating that control may be associated with specific CTL targeting behavior, suggest that there may be an important role for HLA-restricted peptide vaccines for protection against infectious diseases for which development of an effective traditional whole protein vaccine has proved to be difficult. The adjuvanted microsphere peptide vaccine platform described here incorporates unmodified peptides, making possible rapid manufacture and deployment to respond to a new viral threat. NP44-52 is located within the EBOV nucleocapsid protein considered essential for virus replication. This epitope resides in a sequence conserved across multiple EBOV strains as shown in Supplementary Material Fig. 6. A 7.3 Å structure for NP and VP24 is shown for context in Fig. 6 [67]. A 1.8 Å resolution structure rendering for EBOV NP shown in Fig. 6b illustrates that NP44-52 is a buried structural loop, which is likely to be important to the structural integrity of the EBOV NP protein [16]. This structural role of NP44-52 likely explains its conservation across EBOV strains. CTL targeting of the EBOV NP protein has been described [42][64][28][49][24]. Nucleocapsid proteins are essential for EBOV replication [61]. Recent advances in T-cell based vaccines have focused on avoiding all variable viral epitopes and incorporating only conserved regions [7][25]. EBOV NP may be more conserved than nucleocapsid proteins VP35 and VP24, making it more suitable as a CTL vaccine target [9][73]. The nucleocapsid proteins in SARS-CoV-1 are also essential for that virus to function normally [10].
This suggests that a CTL vaccine targeting coronavirus nucleocapsid could be effective against SARS-CoV-1 or SARS-CoV-2. We have shown that an H2-Db restricted Class I peptide exists within the NP41-60 epitope identified by Sakebe et al. as the most commonly favored NP epitope for CD8+ attack by survivors of the 2013–2016 EBOV outbreak in West Africa. We have demonstrated that NP44-52, when delivered in conjunction with a predicted-matched Class II epitope using an adjuvanted microsphere peptide vaccine platform, protects against mortality and morbidity for the maEBOV challenge doses tested in a C57BL/6 mouse model. We accomplished this with an adjuvanted, microsphere-based, synthetic CTL peptide vaccine platform producing a protective immune response 14 days after a single administration. EBOV can cause severe pulmonary problems in exposed subjects [39]. These problems can be especially severe when the virus is delivered by aerosol [15][31]. Interaction of EBOV specific antibody, NHP lung tissue and EBOV delivered to NHPs via aerosol can produce a more lethal effect than in NHPs without circulating anti-EBOV antibody exposed to aerosolized EBOV (unpublished conference presentation). This suggests that a CTL vaccine may be more effective for prophylaxis against filovirus infection than an antibody vaccine if the anticipated route of EBOV exposure is via aerosol. Sakebe et al. identified A∗30:01:01 as the only HLA type common to all four survivors in their study with CD8+ targeting of NP41-60. The A∗30 supertype is relatively common in West Africa: 13.3% for Mali, 15.4% for Kenya, 16.3% for Uganda, and 23.9% for Mozambique [32].
Although peptide vaccines are by their nature HLA restricted, it may be possible to create a CTL vaccine directed against EBOV, for use alone or in conjunction with a whole protein vaccine to produce an antibody response in tandem, by incorporating additional Class I peptides from epitopes targeted by controllers to broaden the HLA coverage of the vaccine. MHC binding algorithms hosted by the IEDB predict that YQVNNLEEI will bind strongly to the MHC of HLA-A∗02:06, HLA-A∗02:03 and HLA-A∗02:01 individuals (Supplementary Material Table 2) [65]. HLA-DR binding database analysis also suggests that VKNEVNSFKAALSSLAKHG demonstrates sufficiently promiscuous binding characteristics to cover that same population (Supplementary Material Table 4) [65]. Taken together, a peptide vaccine based on YQVNNLEEI and VKNEVNSFKAALSSLAKHG could produce a cellular immune response in about 50% of the population of Sudan and about 30% of the population of North America. The internal proteins located within influenza virus, in contrast to the glycoproteins present on the surface, show a high degree of conservation. Epitopes within these internal proteins often stimulate T-cell-mediated immune responses [57]. As a result, vaccines stimulating influenza specific T-cell immunity have been considered as candidates for a universal influenza vaccine [66]. SARS-CoV-1 infection survivors have been found to have a persistent CTL response to SARS-CoV-1 nucleocapsid two years after infection [42]. This suggests that the same approach could be applied to SARS-CoV-2, whose nucleocapsid is located within the virus and contains conserved regions (see multiple sequence alignment in Supplementary Material Fig. 7 and Supplementary Material Fig. 8). Antigenic escape allows a virus to retain fitness despite an immune response to vaccination [20]. Picking conserved regions for vaccine targeting is an important part of mitigating this problem.
Coronavirus spike protein, for example, may be particularly susceptible to mutation, meaning that antigenic escape would be likely if the spike protein were targeted by a coronavirus vaccine, making it difficult to achieve durable protection [74]. A recent paper conducted a population genetic analysis of 103 SARS-CoV-2 genomes showing that the virus has evolved into two major types, L and S, with changes in their relative frequency after the outbreak possibly due to human intervention resulting in selection pressure [62].

We took all possible 424 9mer peptide sequences from the SARS-CoV-2 nucleocapsid protein sequences available and evaluated each peptide for HLA restriction using NetMHC 4.0 and NetMHCpan 4.0 [65][30][1]. We analyzed 9mer peptide sequences because these are often associated with MHC binding properties superior to those of Class I peptides of other lengths [63][18]. We found 53 unique peptides with predicted binding below 500 nM from NetMHC 4.0 and/or NetMHCpan 4.0. These results are shown in Supplementary Material Tables 6, 7, 8 and 9. We proceeded to determine the predicted HLA population coverage of a vaccine incorporating all 53 peptides using median values of the ANN, SMM, NetMHC 4.0 and NetMHCpan 4.0 algorithms hosted by IEDB [65]. These 53 peptides, taken together, had predicted HLA coverage of greater than 97% of the world’s population as shown in Supplementary Material Table 10. We also calculated HLA coverage based on alleles specific to populations in China and found that coverage across those individuals could be expected to be within 3% of the worldwide coverage estimate as shown in Supplementary Material Table 11. This same population coverage could be achieved with 16 of the 53 unique peptides as shown in Table 4.

Table 4. This set of 16 unique peptides represents the minimum number required to achieve >95% worldwide population coverage.
The starting position is within the nucleocapsid. Top binding affinity predictions chosen via NetMHC 4.0 or NetMHCpan 4.0. Peptide sequences colored in red have literature references as known in vitro binders to the predicted allele match (see text).

Seven of the 53 peptides with a predicted HLA match have been tested in vitro for HLA binding affinity by various researchers [65]. These binding affinity assays were originally performed with the SARS virus during a previous outbreak. Specific literature references for these in vitro assays for each peptide sequence are as follows: ASAFFGMSR, LSPRWYFYY, QQQGQTVTK: [53], FPRGQGVPI: [53][26][46][60], GMSRIGMEV: [26][64][13][41][12], KTFPPTEPK: [53][26][45][60][6] and LLLDRLNQL: [41][13][12][64][78]. These seven peptides are shown in red in Supplementary Material Table 6 and Supplementary Material Table 7. The remaining 46 SARS-CoV-2 peptides could also be further qualified as potential vaccine candidates by confirming MHC binding predictions by in vitro binding affinity and/or binding stability studies [54][52][26].

Another approach to evaluating the 53 SARS-CoV-2 candidate vaccine peptides through in vitro testing is also possible. As we have shown in this paper, a peptide targeted by EBOV controllers could form the basis of a preventative vaccine for EBOV. ELISPOT analysis of PBMCs taken from the peripheral blood of COVID-19 controllers and progressors to assess the presence of a differential response to the 53 peptides could lead to a broadly applicable protective CTL vaccine against SARS-CoV-2 by incorporating peptides into the vaccine that are more commonly targeted for CD8+ attack by the controllers versus the progressors. A peptide vaccine for SARS-CoV-2, unlike a typical antibody vaccine, is not limited to virus surface antigen targets. This provides opportunities to attack other targets on SARS-CoV-2 besides spike, which may be prone to mutation [74].
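
The peptide screening steps described above (enumerating 9mers, keeping those predicted to bind below 500 nM, and reducing the list to a smaller covering set) can be sketched in Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the predicted IC50 values are assumed to come from an external tool such as NetMHC 4.0 or NetMHCpan 4.0, and the greedy selection covers alleles rather than computing frequency-weighted population coverage. A greedy pass also does not guarantee a minimum-size set.

```python
from typing import Dict, List, Set


def enumerate_kmers(protein: str, k: int = 9) -> List[str]:
    """Return the unique k-mers of a sequence in order of first appearance."""
    seen: Set[str] = set()
    kmers: List[str] = []
    for i in range(len(protein) - k + 1):
        kmer = protein[i:i + k]
        if kmer not in seen:
            seen.add(kmer)
            kmers.append(kmer)
    return kmers


def binders(predicted_ic50: Dict[str, Dict[str, float]],
            cutoff_nm: float = 500.0) -> Dict[str, Set[str]]:
    """Map each peptide to the alleles it is predicted to bind below the cutoff.

    predicted_ic50 is a hypothetical structure: peptide -> allele -> IC50 (nM).
    """
    hits: Dict[str, Set[str]] = {}
    for peptide, by_allele in predicted_ic50.items():
        alleles = {a for a, ic50 in by_allele.items() if ic50 < cutoff_nm}
        if alleles:
            hits[peptide] = alleles
    return hits


def greedy_cover(hits: Dict[str, Set[str]]) -> List[str]:
    """Greedy set cover: repeatedly pick the peptide adding the most new alleles."""
    uncovered: Set[str] = set().union(*hits.values()) if hits else set()
    chosen: List[str] = []
    while uncovered:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(hits), key=lambda p: len(hits[p] & uncovered))
        chosen.append(best)
        uncovered -= hits[best]
    return chosen
```

A realistic coverage estimate would weight each allele by its frequency in the target population, as the IEDB population coverage tool does; the allele-level cover here only illustrates why a subset of the binders can match the coverage of the full list.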
In addition, a peptide vaccine mitigates the risk of Antibody-Dependent Enhancement (ADE) seen in the context of a non-neutralizing antibody response to a whole protein vaccine [75][55]. Also, neutralizing antibodies directed against spike protein in SARS-CoV-1 patients have been associated with an increased risk of Acute Lung Injury (ALI) [35]. Specifically, patients succumbing to SARS-CoV-1 were found to develop a neutralizing antibody (NAb) response to spike protein faster than survivors after the onset of symptoms, and the NAb titers were higher in the patients who died compared with those who recovered [76]. To the extent that antibody vaccines producing an antibody response against the spike protein in SARS-CoV-2 could increase the risk of ALI, this risk could also be mitigated by using a peptide vaccine as an alternative approach.

The extent of the COVID-19 outbreak should allow many more controllers to be identified than the thirty individuals studied by Sakabe and the seven individuals identified in the Peng study [42][50]. Furthermore, Sakabe and Peng did not report progressor data, perhaps because of the difficulty in obtaining blood samples from those patients. If researchers act now during the COVID-19 outbreak, perhaps controller and progressor blood samples could be collected and prospectively analyzed, quickly creating a database of optimal candidate Class I peptides for inclusion into a CTL vaccine with potentially broad HLA coverage for subsequent rapid manufacture and deployment. It would be interesting to see the extent to which the peptides favored by controllers appear on SARS-CoV-2 nucleocapsid, making SARS-CoV-2 a second example, across two different viruses, of controllers exhibiting CTL attack preferentially on the nucleocapsid protein.

## Declaration of Competing Interest

CV Herst, Scott Burkholz, Lu Wang, Peter Lloyd and Reid Rubsamen are employees of Flow Pharma, Inc., all receiving cash and stock compensation.
Alessandro Sette, Paul Harris, William Chao and Tom Hodge are members of Flow Pharma’s Scientific Advisory Board. Alessandro Sette has received cash and stock compensation as an SAB member. Richard Carback and Serban Ciotlos are consultants to Flow Pharma, both receiving cash and stock compensation. John Sidney works with Alessandro Sette at the La Jolla Institute of Allergy and Immunology. Flow Pharma, Inc. has previously contracted with the La Jolla Institute of Allergy and Immunology to support other research not related to this study funded under STTR contract CBD18-A-002-0016. Reid Rubsamen, CV Herst, Scott Burkholz, Lu Wang, Peter Lloyd, Richard Carback, Serban Ciotlos and Tom Hodge are named inventors on various issued and pending patents relating to Flow Pharma’s technology. All of the rights to all of these patents have been assigned by each of the inventors to Flow Pharma. Shane Massey, Trevor Brasel, Edecio Cunha-Neto and Daniela Rosa have nothing to declare.

## Acknowledgements

All animal handling was done in accordance with NIH and institutional animal care and use guidelines by Aragen Bio-sciences in Morgan Hill, California and the University of Texas Medical Branch, Galveston, Texas, working in conjunction with the Galveston National Laboratory. The research was funded by Flow Pharma, Inc.

The threat posed by severe congenital abnormalities related to Zika virus (ZKV) infection during pregnancy has turned development of a ZKV vaccine into an emergency. Recent work suggests that the cytotoxic T lymphocyte (CTL) response to infection is an important defense mechanism in response to ZKV. Here, we develop the rationale and strategy for a new approach to developing CTL vaccines for ZKV flavivirus infection. The proposed approach is based on recent studies using a protein structure computer model for HIV epitope selection designed to select epitopes for CTL attack optimized for viruses that exhibit antigenic drift.
Because naturally processed and presented human ZKV T cell epitopes have not yet been described, we identified predicted class I peptide sequences on ZKV by matching previously identified DNV (Dengue) class I epitopes and by using a Major Histocompatibility Complex (MHC) binding prediction tool. A subset of those met the criteria for optimal CD8+ attack based on physical chemistry parameters determined by analysis of the ZKV protein structure encoded in open source Protein Data Bank (PDB) format files. We also identified candidate ZKV epitopes predicted to bind promiscuously to multiple HLA class II molecules that could provide help to the CTL responses. This work suggests that a CTL vaccine for ZKV may be possible even if ZKV exhibits significant antigenic drift. We have previously described a microsphere-based CTL vaccine platform capable of eliciting an immune response for class I epitopes in mice and are currently working toward in vivo testing of class I and class II epitope delivery directed against ZKV epitopes using the same microsphere-based vaccine.

### 1. INTRODUCTION

As of Fall 2016, the Zika Virus (ZKV) pandemic continues its northward spread in the Americas. The CDC estimates at least 4,100 cases in the United States and up to 29,000 cases in Puerto Rico. Those cases in Puerto Rico include 672 pregnant women (1). Using a data-driven global stochastic epidemic model to project past and future spread of the ZKV in the Americas, it has been estimated that the large population centers of Florida, New York, and New Jersey will be seeing significant numbers of imported cases (acquired by travel) of ZKV infection (2) by the end of Fall 2016. In South America, the new case rate of ZKV infection is tapering off; however, researchers in Brazil warn that official statistics may significantly underestimate the size of the ZKV epidemic, based on improved serological tools that have recently become available.
In any event, when a significant proportion of the population is infected with a virus and becomes immune, the epidemic can migrate to an area with a larger pool of susceptible individuals. Given the alarming news that significant brain defects were detected in newborns of 42% of women infected with ZKV during pregnancy, including the third trimester (29%) (3), the public health threat of ZKV in pregnant women is even higher than previously expected. Taken together, recent estimates put 1.65 million childbearing women in the Americas at risk of ZKV infection.

As yet no phase II trials of a ZKV vaccine have been initiated. We review critical aspects of the unique pathogenesis of ZKV infection which will need to be considered when evaluating the efficacy of such vaccines and designing next iterations of possible ZKV vaccines to improve vaccine efficacy. In this article, we will also highlight details of the vaccines currently under consideration for Phase I and Phase II clinical trials, develop the argument that vaccines that evoke antibody responses need careful scrutiny, outline the rationale why our group is focusing on developing a “pure” CTL vaccine, and enumerate many of the challenges that will need to be overcome to develop an effective ZKV CTL vaccine.

### 1.1. Genome and protein structure of ZKV

ZKV is a small enveloped plus strand RNA virus belonging to the genus Flavivirus, which includes many human pathogenic viruses, such as Dengue virus (DNV), yellow fever virus (YFV), West Nile Virus (WNV), and hepatitis C virus (HCV). ZKV has a 10.8 kb RNA genome, containing a single open reading frame flanked by a 5′-UTR (106 nt long) and a 3′-UTR (428 nt long). The open reading frame encodes a polyprotein precursor, which is processed into three structural proteins [capsid (C), premembrane (prM), and envelope (E)] and seven non-structural proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5).
The viral E protein is the major surface glycoprotein of flavivirus, and the non-structural NS3 and NS5 encode essential enzyme activities for viral reproduction. The E protein is divided into three discernible domains (Domain I, Domain II, and Domain III). Domain I is involved in the envelope structure organization, and Domain II and Domain III are related to monomer interaction and receptor binding, respectively (4).

### 1.2. Protective Immune Responses to Flaviviruses: Role of T Cells

Significant information is available about the protective role of T cell responses against other flaviviruses of clinical importance. Prevention of infection is achieved primarily by neutralizing antibodies, but T cell responses (both CD4+ and CD8+) are of utmost importance for virus clearance. Cytotoxic CD8+ T cells are critical to eliminate virus-infected cells, while CD4+ T cells provide help to cytotoxic CD8+ T cells and antibody production (5, 6). DNV-specific CD8+ T cells play a protective role in natural DNV infection both in humans and in animal models (7), and polyfunctional CD8+ responses are associated with protection against disease (8). CD8+ T cell immunity has been shown to be protective against WNV infection (9). Vaccination with a tetravalent DNV vaccine elicits CD8+ T cell responses against highly conserved epitopes (10). Similarly, the live-attenuated 17D-based YFV vaccine elicits potent and long-lasting CD8+ T cell responses (11–13). Progress toward understanding the role of CD4+ T cell immunity in flavivirus infection is recent. YFV 17D-204 vaccination and adoptive transfer experiments demonstrate that CD4+ T cells contributed to protection against virulent YFV (14). Similar CD4+ responses have been found to be critical for protection against DNV challenge (15) and for the prevention of encephalitis during WNV infection (16).
More recently, the CTL response in a murine ZKV model has been shown to be crucial for protection against ZKV infection, both in CD8 depletion experiments in mice and in passive transfer of memory CD8+ T cells to naive mice exposed to infection. Furthermore, deletion of the CD8a gene leads to 100% death after infection. This CD8+ T cell response is cytotoxic, polyfunctional, and targeted to several H-2D-restricted epitopes (17).

### 2. SPECIFIC POTENTIAL ADVANTAGES OF CTL VERSUS ANTIBODY VACCINE FOR ZKV

### 2.1. Caveats of Antibody-Inducing ZKV Vaccines

Following the acute phase infection of ZKV (with or without clinical symptoms), the persistence of biomarkers of ZKV infection (e.g., viral RNA in semen) suggests that some cells may be chronically infected. The wide distribution of types and anatomical locations of cells permissive for ZKV infection, sometimes beyond the easy reach of antibodies (e.g., blood–brain barrier), suggests that a cell mediated immune response will be critical for immune surveillance of chronically infected cells. While there can be little doubt that a ZKV vaccine stimulating a neutralizing antibody response will be a key resource in limiting viremia during the acute phase of ZKV infection, there are some concerns regarding the exact nature of the antibody response provoked. The exact pathological mechanism which drives Guillain–Barré syndrome (GBS) remains unknown, although there seems to be a general consensus that anti-glycolipid antibodies play an important role, even though not every GBS patient develops this type of antibody. As discussed earlier, there is an increased incidence of GBS associated with ZKV infection (18, 19), but it is not known whether antiganglioside antibodies have a role in this specific comorbidity of ZKV infection. Each of the four different DNV serotypes (DNV 1–4) provokes cross-reactive antibody responses that may contribute to the increased disease severity observed following subsequent infection with a different serotype.
The first DNV infection is either subclinical or results in a mild disease, and it confers long lasting immunity to the serotype. The next DNV infection, if initiated by a different serotype, can induce severe, potentially lethal disease termed Dengue hemorrhagic fever/Dengue shock syndrome (20, 21). The immunopathogenesis of severe disease is not completely understood. One model, termed antibody-dependent enhancement (ADE), works as follows: anti-DNV antibodies evoked by the primary infection, which were once neutralizing but are not neutralizing against the current serotype, bind the second serotype viral particles and promote antibody-mediated phagocytosis by myeloid antigen-presenting cells, which in turn become infected, serving as a future reservoir for infectious virions with impaired functional activity (22). Of note are recent reports demonstrating that preexisting anti-DNV antibodies can enhance ZKV infection (23, 24). Conversely, preexisting serum anti-ZKV antibodies were able to enhance DNV infection in vitro (25). This is due to the high serological crossreactivity between both flaviviruses, which may not be cross-neutralizing. This crossreactivity is so relevant that it has delayed the development of highly specific, non-DNV crossreactive serodiagnostic tests for ZKV infection. An additional concern for flavivirus vaccination-induced pathogenic antibodies in humans came from the recent reports of severe DNV breakthrough infections requiring hospitalization after vaccination of seronegative volunteers with an antibody-inducing DNV attenuated virus tetravalent vaccine (Dengvaxia®), a phenomenon possibly related to ADE (26). This is a special concern since epidemics of both flaviviruses occur simultaneously in the same regions (27).
Their research using a mouse model exhibiting much of the same symptoms/pathology of Dengue fever in humans concluded “a sub-protective humoral response may, under some circumstances, have pathological consequences.” This group has since shifted their focus to inducing CD8+ T cell-mediated immunity to DNV (7, 28–31). Furthermore, the possibility that preexisting non-neutralizing anti-ZKV antibody-dependent enhancement could facilitate infection of fetal–mother interface tissues and contribute to fetal ZKV infection has not been excluded yet. Of note, the ZKV candidate vaccines currently in the pipeline, either in the preclinical or phase I trial (one ongoing trial) phases, aim to elicit antibodies and are all based on whole envelope proteins, or whole inactivated or live attenuated virus (32). Preclinical studies using vaccines encoding whole ZKV preM/E proteins in DNA form, using adenovirus vectors, or whole inactivated ZKV in non-human primate models have been able to elicit neutralizing antibodies and protection after ZKV challenge (33, 34). Taken together, these findings suggest caution is needed in the development of whole protein ZKV vaccines where evoked antibody responses that are not neutralizing may possibly enhance infection or be pathogenic (i.e., autoimmune) or could facilitate infection of maternal–fetal interface tissue.

### 2.2. Epitope-Based T Cell Vaccines

Given the concerns with antibody-inducing flavivirus vaccines, one possible alternative would be to harness the power of the T cell immune response in protecting against flavivirus infection, as mentioned above. A recent report has shown that CD8+ T cells prevent antigen induced antibody-dependent enhancement of Dengue disease in a murine model, and several studies have identified DNV T cell epitopes appropriate for inclusion in a T cell-based vaccine (31, 35–38).
Another recent study shows the critical role of the CTL response for protection against ZKV infection in a mouse model; this article identifies ZKV H-2D restricted epitopes recognized by CD8+ T cells from infected mice (17). Recent clinical trials have demonstrated the efficacy of T-cell-inducing vaccines against a number of diseases (39), but immunization with whole proteins may favor responses to regions subject to antigenic drift and immune escape. A way to counteract this is to focus the response onto specific desirable epitopes. The T cell epitope-based vaccine approach may target the immune response only to desirable and relevant epitopes, instead of the whole protein. Relevant epitopes include those that come from conserved viral protein regions, and/or where mutations could lead to reduced viral fitness, and those that bind to multiple MHC variant molecules (and are thus potentially recognized by the majority of the target population), while avoiding regions that are poorly immunogenic, variable and subject to antigenic drift, or that could cause a harmful response (40). These targeted immune responses could lead to increased potency, as well as increased safety (41, 42). There are several ongoing clinical trials of T cell epitope-based Influenza vaccines aiming to be universal vaccines (43). Mapping and selection of potential immunogenic T cell epitopes is a crucial step that may be performed either with the aid of bioinformatics tools and experimental confirmation or by an empirical approach using a peptide library spanning the full sequence of the antigen.

### 2.3. Antigenic Drift: Parallels to Chronic HIV Infection and Implications for Vaccine Design

In chronic HIV infection there exists a reservoir of latent, transcriptionally silent viral infection within the resting memory CD4+ T cell compartment and specific myeloid lineage cells (e.g., CD14+/CD16+ monocytes) [reviewed in Ref. (44, 45)].
The resting CD4+ memory cells have long life spans, can remain quiescent, and, similar to some of the ZKV tissue targets such as placental, neuronal, and gonadal tissues as recently described in mice (46), may reside in immune-privileged sites such as the B cell follicle of lymph nodes, allowing escape from existing immune surveillance mechanisms (47). While the mechanism that triggers active replication in HIV+ CD4+ memory cells is poorly understood, interruption of antiretroviral therapy is associated with the resumption of viral replication. Unfortunately, preexisting HIV-1-specific CD8+ T cell responses have been shown to be ineffective [reviewed in Ref. (48)] due to viral evolution of CTL epitopes, resulting in a limited repertoire of effective cytotoxic T cell-mediated immune responses (49) and progression to AIDS. In HIV infection there are selection pressures exerted by the cellular immune system which result in antigenic drift in new virions (50). A recent murine model study has demonstrated the potential importance of the CTL response to ZKV infection where H-2D restricted CTL epitopes were identified (17). Studies of HIV specific CTL responses in a subset of HIV+ individuals may also prove informative. HIV controllers (i.e., individuals who are HIV+ yet maintain low viral loads and do not progress to AIDS) have been carefully studied (51). HIV controller status is associated with the ability to develop CTL responses to regions of HIV proteins critical for maintenance of their structure–function (and viral fitness). Pereyra et al. (51) demonstrated that it may be possible to predict CTL class I epitopes favored by HIV controllers and suggested that CTL vaccines designed to evoke cellular immune responses to MHC class I restricted epitopes found within viral protein regions resistant to antigenic drift could lead to improved efficacy of HIV vaccines, perhaps mimicking what happens naturally in HIV controllers.
Our group has been inspired by these studies and has selected this general approach in the development of a CTL vaccine for ZKV. Flaviviruses mutate in response to immune system pressure, both by antibodies and T cells. It has been reported that HLA class I-binding residues of a CD8+ T cell epitope encompassing the conserved catalytic site of DNV NS3 protease suffer variation that can abrogate HLA class I binding, suggesting evasion of DNV from a specific CD8+ T cell response by antigenic drift (52). Antigenic drift in ZKV has not been thoroughly studied, but a phylogenetic analysis of contemporary human isolates shows a common ancestor and as many as 34 amino acid substitutions relative to the common ancestors, with most of the variation contained within the prM protein (53, 54), suggesting that ZKV does not undergo viral evolution as fast as HIV does. However, a recent phylogenetic study on 17 whole ZKV genomes from human isolates in the present epidemic has shown the mutation rate varies between 12 and 25 bases (0.12–0.25% of the polyprotein) per year since the 2013 Polynesia outbreak. The latest sequence shows 64 mutations; and overall, 62 non-synonymous amino acid changes were observed among all sequences analyzed, demonstrating that ZKV continues to mutate at a rapid rate during the current epidemic (55). The rationale of focusing CTL attack on ZKV protein regions that are “intolerant” to amino acid substitutions thus remains sound.

### 2.4. ZKV HLA Class I Epitope Identification: HLA Binding and Structural Entropy

Human class I epitopes have not yet been formally identified for ZKV. Some authors have published ZKV MHC class II epitope prediction based on MHC binding search engines alone (35, 56, 57). In order to generate a realistic list of MHC-I binding peptides on ZKV E and M proteins, we not only used a binding prediction tool but also matched known DNV class I epitopes to peptides on ZKV.
This is warranted due to the antigenic similarity of DNV and ZKV, which display 44–68% sequence identity, as well as the reported crossreactivity to ZKV of DNV envelope specific antibodies (58). An additional layer of identification was the structural entropy analysis described in the next section.

We generated the predicted ZKV epitope list using the sequence of ZKV Strain H/PF/2013 (GenBank Accession number: KJ776791.2) (59). This strain was isolated from an infected patient during the French Polynesia epidemic in 2013–2014. The E and M protein amino acid sequences were run through the MHC-I Binding Predictions tool available on IEDB (60). This tool combines data from multiple prediction methods, which include artificial neural networks and stabilized matrices. Choosing only those alleles that occur in at least 1% of the human population, we generated a list of predicted epitopes for MHC-A and B alleles. Percentile rank is calculated by comparing a given predicted peptide’s IC50 (concentration of the query peptide which inhibits 50% of a reference peptide binding) against those of a random set drawn from (61), where a smaller rank indicates higher affinity. The highest ranking MHC-A and -B alleles are presented in Tables 1 and 2.

In order to maximize matching known DNV class I epitopes against the ZKV sequences, we were indiscriminate with respect to the DNV strain sequences. We used epitope sequence data from all of DNV strains 1–4, as downloaded from IEDB. Alignments between predicted ZKV epitopes and DNV were calculated using MAFFT (62) and webPRANK (63). A recent study by Stettler (58) indicated that the ZKV/DNV crossrecognition observed for antibodies may not also be present for T-cell epitopes. Because more work is needed on this topic, and in order to analyze a larger set of potential ZKV epitopes, the class I epitopes listed in Tables 1 and 2 are initially predicted, and only afterward aligned to DNV.
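
The percentile rank described above can be illustrated with a short Python sketch. It assumes a simplified definition, namely the percentage of reference peptides whose predicted IC50 is strictly lower (stronger) than that of the query peptide; the exact procedure used by the IEDB tool may differ, and the function name is hypothetical.

```python
from bisect import bisect_left
from typing import Sequence


def percentile_rank(ic50_nm: float, reference_ic50_nm: Sequence[float]) -> float:
    """Percent of reference peptides with a strictly lower predicted IC50.

    A smaller rank means the query binds more strongly than most of the
    reference set (simplified definition, assumed for illustration).
    """
    if not reference_ic50_nm:
        raise ValueError("reference set must not be empty")
    ref = sorted(reference_ic50_nm)
    # bisect_left counts the reference values strictly below the query.
    return 100.0 * bisect_left(ref, ic50_nm) / len(ref)


# Toy reference set of predicted IC50 values (nM) for random peptides.
reference = [10, 50, 100, 500, 1000, 5000, 10000, 20000, 30000, 40000]
rank = percentile_rank(75.0, reference)  # two of ten values are lower
```

Ranking against a reference set makes scores comparable across alleles, because raw IC50 distributions differ from one MHC molecule to another.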
Allowing for sequence divergence between DNV and ZKV, as well as keeping in mind the antigenic divergence between strains of ZKV, we did not require strict conservation between the predicted ZKV epitopes and the DNV epitopes they were compared to. As such, non-homologous but predicted epitopes were included in these tables. There are no table entries for epitopes matched to DNV but not predicted. Reported HLA specificities refer specifically to ZKV epitope predictions.

### 2.4.1. Computing Structural Entropy to Select Class I Epitopes for a CTL Vaccine

X-ray crystallography can be used to generate a PDB file containing a complete mathematical representation of the three-dimensional properties of a protein (64). Software is available which can take a PDB file as input and predict changes in the protein’s three-dimensional structure after specified amino acid substitutions. One example of such a program is FoldX, which computes whole-protein free energy changes resulting from these specified amino acid changes (65). Pereyra-Heckerman described an index they call structural entropy (SE) which codifies the extent to which a free energy change will occur after CTL escape at that epitope (51). A low SE indicates that, for at least one amino acid position in an epitope, a relatively high change in the protein’s free energy is expected to occur after mutations to one or more amino acids in that epitope. They analyzed class I epitope targets preferred by HIV controllers and reported that these individuals have a statistically significant preference to attack class I epitopes associated with a low SE.

much each substitution changes the protein’s energy relative to the wild type. We assume that the wild type is the most likely state, and compute the Boltzmann distribution from the free energy changes relative to the wild type, generating a distribution of probabilities for each substitution. Amino acids that do not cause large energy changes will have a high probability, large-valued entry in the Boltzmann distribution, and mutations which cause large energy changes will have low Boltzmann values. We can think of the values in the Boltzmann distribution as measuring the “naturalness” of each mutation at the given site.

The ZKV structural entropy data was generated using the PDB file (64) uploaded to RCSB by Sirohi et al. in March of 2016 (66). Protein structure data was available for ZKV E and M proteins only. The original DNA sequence used to generate this protein structure was based on the ZKV Strain H/PF/2013 (GenBank Accession number: KJ776791.2) (59).
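
The Boltzmann weighting described above can be sketched for a single amino acid position. This is an illustration under stated assumptions rather than the authors' implementation: the free energy changes are assumed to be in kcal/mol with the wild type at 0, the default `kt` of 0.592 kcal/mol corresponds roughly to room temperature, and the per-site entropy is taken to be the Shannon entropy of the resulting distribution. How per-site values are combined into an epitope-level SE is not specified here, and the function names are hypothetical.

```python
import math
from typing import Dict


def boltzmann_probabilities(ddg: Dict[str, float], kt: float = 0.592) -> Dict[str, float]:
    """Convert free energy changes relative to wild type into probabilities.

    ddg maps each residue to its free energy change (kcal/mol, assumed);
    the wild-type residue has a change of 0 by definition.
    """
    weights = {aa: math.exp(-delta / kt) for aa, delta in ddg.items()}
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}


def site_entropy(probs: Dict[str, float]) -> float:
    """Shannon entropy (in nats) of the substitution distribution at one site."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0.0)


# Toy example: a tolerant site (no substitution costs energy) versus an
# intolerant site (every substitution away from wild-type "L" is costly).
tolerant = boltzmann_probabilities({"L": 0.0, "I": 0.0, "V": 0.0, "M": 0.0})
intolerant = boltzmann_probabilities({"L": 0.0, "I": 5.0, "V": 6.0, "M": 7.0})
```

Under these assumptions the intolerant site has an entropy near zero, because almost all of the probability mass stays on the wild-type residue, which matches the description of low SE regions as positions where substitutions are energetically costly.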
The ZKV E protein has been identified as the main source of H-2D-restricted MHC class I epitopes recognized by CD8+ T cells from ZKV-infected mice (17). SE data for class I epitopes identified on ZKV E are shown in Table 1 and for ZKV M in Table 2. Qualitative heat maps showing SE values computed using moving windows across all amino acids in ZKV E and ZKV M are shown in Figure 1. These heat maps are not based on the specific epitope sequences identified in the tables. They are qualitative and are intended to show the distribution of SE values throughout the proteins. Note that low SE regions, shown in blue, are in the minority. By ranking the epitopes in order of SE in Tables 1 and 2, we list first the epitopes predicted to be the best CTL targets based on Pereyra-Heckerman.

### 2.5. Mapping of Potential Epitopes in ZKV Capable of Binding to Multiple HLA Class II Molecules

The rational selection of CD4+ T cell epitopes in vaccine formulation is crucial for successful application of vaccination strategies that focus on induction of CD8+ T cell immunity, given the role of the CD4+ T cell response in long-term maintenance of CD8+ T cell-dependent protective immunity. Recently, CD4+ T cells with cytotoxic features have been identified in PBMC from patients with chronic viral infections (67–70). Bioinformatics tools for identification of HLA class II epitopes have been reviewed elsewhere (71). The TEPITOPE HLA-DR binding prediction algorithm (72) and the derived ProPred algorithm (73) use the concept that each HLA-DR pocket in the antigen-binding groove can be characterized by “pocket profiles,” a quantitative representation of the interaction of all natural amino-acid residues with a given pocket, creating a matrix incorporated in the TEPITOPE and ProPred software. For each HLA-DR specificity, the algorithms generated a binding score corresponding to the algebraic sum of the strength of interaction between each residue and pocket, which correlated with binding affinity.
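
The pocket-profile idea described above, in which a binding score is the algebraic sum of residue and pocket interaction strengths, can be sketched as follows. This is a toy illustration and not the TEPITOPE or ProPred implementation: the profile values are invented, positions absent from the profile are assumed to contribute zero, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical pocket profile for one HLA-DR specificity:
# core position (1 to 9) -> residue -> interaction score.
PocketProfile = Dict[int, Dict[str, float]]


def score_core(core: str, profile: PocketProfile) -> float:
    """Sum the pocket scores of the residues in a nine-residue core."""
    if len(core) != 9:
        raise ValueError("binding core must be nine residues long")
    return sum(
        profile.get(position, {}).get(residue, 0.0)
        for position, residue in enumerate(core, start=1)
    )


def scan_protein(protein: str, profile: PocketProfile) -> List[Tuple[int, str, float]]:
    """Score every nonamer window; returns (1-based start, core, score)."""
    return [
        (i + 1, protein[i:i + 9], score_core(protein[i:i + 9], profile))
        for i in range(len(protein) - 8)
    ]


# Invented values: pocket 1 favors aromatic residues, pocket 4 penalizes D.
toy_profile: PocketProfile = {
    1: {"F": 1.0, "Y": 0.8},
    4: {"D": -1.0, "L": 0.5},
}
windows = scan_protein("GFAALAAAAA", toy_profile)
```

Repeating the scan with one profile per HLA-DR specificity yields a score for each peptide and allele pair, which is the raw material for judging how many specificities a given peptide is predicted to bind.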
Peptide scores along a scanned protein sequence are normalized for each HLA-DR as the proportion of the best binder peptides (74). Since the software predicts binding to a significant number of HLA-DR specificities (25 in the case of TEPITOPE, 51 for ProPred), it is also capable of predicting promiscuous peptide ligands, each capable of binding to multiple HLA class II variant molecules (58). The TEPITOPE prediction algorithm has been successfully applied to the identification of dozens of promiscuous T cell epitopes frequently recognized in 59 antigenic proteins from several human pathogens including viruses, bacteria, protozoa, fungi, and helminths (HIV, SIV, CMV, M. tuberculosis, P. vivax, P. brasiliensis, S. mansoni), and in silico prediction correlated with promiscuity in HLA-binding assays and frequency of T cell recognition by exposed individuals (75). This has led to several epitope-based vaccines that were shown to be immunogenic in conventional or HLA class II-transgenic mice (71, 76) and protective in mice (77). The incorporation of a promiscuous CD4+ T cell epitope in a recombinant protein-based P. vivax vaccine led to a significant increase in its immunogenicity (41). A recent study from our group in non-human primates showed that an HIV CD4+ T cell epitope-based DNA vaccine was highly immunogenic and induced significant responses to most encoded epitopes in all animals tested (unpublished observations). Vaccines encoding promiscuous peptides able to bind to multiple HLA-DR molecules may thus allow wide population coverage. Here, we used the TEPITOPE and ProPred algorithms to identify potential “promiscuous” CD4+ T cell epitopes (predicted to bind to multiple HLA-DR molecules) derived from conserved regions of ZKV majority/consensus E and M protein sequences from circulating strains in the recent epidemic in Brazil and Polynesia.

### 2.6. Selection of ZKV Sequences and Promiscuous HLA Class II Epitope Prediction

The amino acid sequences derived from the ZKV strains BeH-818995 (GenBank accession number KU365777.1), BeH819015 (GenBank accession number KU365778.1), BeH815744 (GenBank accession number KU365780.1), BeH819966 (GenBank accession number KU365779.1), SPH2015 (GenBank accession number KU321639.1), and SSABR1 (GenBank accession number KU707826.1), isolated in Brazil, and the H/PF/2013 strain (GenBank accession number KJ776791.2), isolated in French Polynesia, were assembled and aligned with Clustal W (MegAlign, DNASTAR, Madison, WI, USA, Figure 2). We scanned the generated consensus sequence with the TEPITOPE and ProPred algorithms. We selected ZKV M and E peptides (Table 3) whose sequences were predicted to bind to at least two-thirds of the 25 or 51 HLA-DR molecules in the TEPITOPE or ProPred matrices, respectively. Each selected peptide corresponds to an inner nonamer core chosen as the HLA binding motif, with flanking amino acids added when possible at either or both the N- and C-terminal ends to increase the efficiency of in vitro peptide presentation to CD4+ T cells.

### 2.7. Potential Synthetic CTL Vaccine Platforms for Class I and Class II Epitope Delivery

H-2D-restricted class I epitopes, when injected intradermally without adjuvants, produce a weak immune response in C57BL/6 mice. Methods have been described for eliciting immune responses to class I epitopes. For example, the target epitopes are linked together as a “string of beads” (78). In another example, the DNA corresponding to the desired string of epitopes is inserted in a modified vaccinia Ankara (MVA) vector. Immune responses have been elicited in mice using this technique (79). A DNA string has also been administered with electroporation (80). Immune responses in macaques have been elicited in this manner (81).
In order to add and subtract epitopes from the formulation used in these types of vaccines, new linker elements must be identified and proper presentation of the desired epitopes after “string-of-beads” processing by antigen-presenting cells confirmed (82). The use of a biodegradable, PLGA microsphere-based vaccine delivery platform allows one or more unmodified peptides to easily be incorporated into the vaccine formulation (83). The limitations of PLGA microsphere-based vaccines have been described in the literature. For example, double-emulsion sphere fabricating technologies may degrade the tertiary structure of the delivered antigen due to exposure to solvents or high temperatures used during spray drying processes (84). In a previous report, we manufactured our microspheres avoiding double-emulsion sphere manufacturing technology using a precision spray drying process that operates at room temperature (85). In contrast to previous studies which incorporated only a single peptide epitope in spheres (86), we showed that it was possible to elicit an immune response from each of two epitopes delivered simultaneously, when the two epitopes were loaded into the same spheres or different spheres. This is an important consideration, especially because the HLA-restricted nature of the class I epitopes being delivered will require the development of a “master vaccine” containing enough different peptide epitopes to cover a target population. The fact that the majority of the epitopes listed in the first four rows of Tables 1 and 2 have HLA-A*02 as the best predicted HLA match suggests that a vaccine directed against these class I epitopes could readily be tested in Brazil, where the frequency of HLA-A*02 varies from 21.7 to 47.5% between states (87).

### 3. CONCLUDING REMARKS

The search for rapid development of safe and effective vaccines against ZKV is a global public health emergency.
Testing multiple vaccine platforms in parallel may speed up and increase the likelihood of finding a good vaccine. We have proposed a rationale for ZKV epitope selection and design of a T cell epitope-based vaccine against ZKV. Selection of candidate ZKV structure-constrained HLA class I epitopes able to bind an array of HLA class I supertypic molecules, and promiscuous class II T cell epitopes capable of binding to multiple HLA class II molecules, could provide wide HLA and population coverage for such a vaccine, which could be delivered using the synthetic, adjuvanted microsphere vaccine as outlined above or other techniques for epitope immunization that we discussed.

### ETHICS STATEMENT

This study was carried out in accordance with the policies and procedures established by the WIRB Institutional Review Board with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the WIRB.

### AUTHOR CONTRIBUTIONS

RR, EC-N, DR, and PH wrote the manuscript. All authors reviewed the manuscript. All authors directly participated in the research described, with EC-N and DR performing modeling predictions of class II epitopes and AM, TO, SC, and RR performing class I epitope identification and SE calculations for class I epitopes.

### FUNDING

The work described was funded in part by Flow Pharma, Inc. and through an in-kind grant of Azure platform cloud computer time by Microsoft Corporation, which was used for class I epitope SE calculations. EC-N is supported by grants from CNPq (Brazilian National Scientific Council) and FAPESP (São Paulo State Research Foundation) 13/50302-3.

### 1. Introduction

To date, an effective vaccine for HIV has yet to be realized [1]. Here, we consider vaccines that fight the virus by inducing responses from cytotoxic T lymphocytes (CTLs).
One key roadblock to an effective vaccine is that CTL-mediated attack of HIV-infected cells is temporarily effective, but only until HIV mutates to escape such attack. Research has suggested that HIV remains fit despite mutations within or near most CTL epitopes, and that escape at only a relatively small number of these locations will result in a less fit virus [2–4]. Consequently, it has been proposed that a successful vaccine would elicit responses exclusively against epitopes that are resistant to mutation or are otherwise characterized by a superior immune response [2–11]. Note that eliciting responses to multiple epitopes in a single individual may be important for effective viral control [2–11]. Unfortunately, CTL epitopes, like other small peptides, do not readily produce an immune response when injected on their own, even when combined with toll-like-receptor (TLR) agonist adjuvants known to boost the immune response to administered antigens [12].

Here, we describe a vaccine delivery mechanism that can elicit interferon gamma ELISPOT responses to multiple specific CTL epitopes. The delivery mechanism is a synthetic, non-living vector consisting of large d,l poly(lactic-co-glycolic) acid (PLGA) microspheres that carry multiple specific CTL epitopes. While PLGA microspheres have been investigated previously (see, e.g., [13,14] and references therein), we improve on this delivery mechanism in several respects. First, we demonstrate the need to include adjuvants positioned both inside and outside the microspheres, in contrast to previous work [13]. Second, we demonstrate in mice that it can be used to elicit substantial CTL responses to more than one epitope in the same individual, whereas previous studies have investigated only the inclusion of a single epitope. Finally, we compare our approach to current ones that elicit responses to specific epitopes, arguing our approach is simpler and more efficient.

### 2. Results

We were able to manufacture the spheres to have specific mean diameters of any size ranging from 1 to 20 μm, with a tight size distribution about the mean, using a precision spray drying technique [15]. The geometric standard deviation (GSD) of diameter was typically 1.3–1.4 throughout the manufacturing process for each of the particle sizes produced in our experiments (Supplementary Fig. 1).

We confirmed that PLGA microspheres were taken up by both mouse and human DCs. Time-lapse videos of human dendrocyte phagocytosis events after incubation with 8 μm and 11 μm diameter PLGA microspheres, respectively, were qualitatively evaluated. Dendrocytes were observed to phagocytose up to three of the 8 μm spheres (Fig. 1a, b, and Supplementary Video 1) and a maximum of one of the 11 μm spheres (Fig. 1c, d, and Supplementary Video 2), consistent with their relative volumes. A time-lapse video of C57BL/6 dendrocytes incubated with 10 μm standard size polystyrene spheres was similarly prepared to ensure that the size of the C57BL/6 dendrocytes was similar to that of the human cells (Fig. 1e, f, and Supplementary Video 3). Qualitative analysis of the C57BL/6 video showed a maximum of one 10 μm polystyrene microsphere phagocytosed by a given C57BL/6 dendrocyte, suggesting that the C57BL/6 dendrocytes were similar in size to their human counterparts.

We performed our studies with 11 μm spheres, the largest to be phagocytosed and thus capable of delivering large doses of epitope. The largest amount of peptide that could be loaded and homogeneously distributed in a sphere was 0.5% by weight. Spheres were loaded with ovalbumin (OVA) peptide (SIINFEKL) and vesicular stomatitis virus (VSV) peptide (RGYVYQGL), known mouse CTL epitopes [12]. C57BL/6 mice were inoculated with a single intradermal injection at the base of the tail and sacrificed after 14 days.
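The statement that uptake of up to three 8 μm spheres versus a single 11 μm sphere is consistent with their relative volumes can be checked with simple arithmetic. The short sketch below assumes perfectly spherical particles at their nominal diameters, which ignores the size distribution described by the GSD.

```python
import math

def sphere_volume(diameter_um):
    """Volume of a sphere in cubic micrometers, from its diameter in micrometers."""
    return math.pi * diameter_um ** 3 / 6.0

small = sphere_volume(8.0)    # nominal 8 um sphere
large = sphere_volume(11.0)   # nominal 11 um sphere

# How many small spheres match the volume of one large sphere?
ratio = large / small
print(f"one 11 um sphere ~ {ratio:.2f} x the volume of one 8 um sphere")
print(f"three 8 um spheres ~ {3 * small / large:.2f} x the volume of one 11 um sphere")
```

Because volume scales with the cube of the diameter, one 11 μm sphere holds about 2.6 times the volume of an 8 μm sphere, so a cell that can take up roughly three of the smaller spheres would be expected to take up about one of the larger ones.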
Fresh splenocytes were harvested and subjected to IFN gamma ELISPOT analysis by strict Streeck, Frahm Walker criteria [16] against the same epitopes used in the inoculation. No inflammation at the injection site of any mouse was noted. We evaluated various adjuvants for use in the spheres themselves and in the solution surrounding the spheres loaded with the OVA epitope. For use in the carrier solution, we considered Monophosphoryl Lipid A (MPLA), a less toxic derivative of lipopolysaccharide that has been approved for use by the US FDA as an adjuvant for a marketed HPV product. MPLA acts as an immune-stimulant by signaling through the Toll-Like Receptor (TLR) pathway, specifically TLR4 [17]. MPLA has been used in commercial vaccine formulations as a viable alternative to LPS, the lipid A portion of Salmonella Minnesota Re595 lipopolysaccharide which is far too toxic for use in a vaccine [18,19]. For use in the spheres, we evaluated a CpG oligodeoxynucleotide (CpG ODN) specific for mouse cells, CpG ODN 1826. The composition of this adjuvant mimics bacterial DNA and so acts to stimulate the immune system through the TLR9 pathway [20–23]. The CpG ODN, is being used in at least one registered FDA monitored clinical trial, but has not yet been approved by the FDA for use in conjunction with a specific vaccine [21]. We found that the presence of CpG inside the spheres had a significant positive effect on the immune response (Fig. 2a, P = 0.0002). In addition, although previously published findings [24,25] showed increased CTL responses when MPLA was placed in the microsphere, we observed strong CTL responses only when MPLA was included in the carrier solution to rehydrate the microspheres for injection (Fig. 2b, P = 0.0002). We believe MPLA in the carrier solution acts to stimulate the tissue macrophages in the area where transformation to dendritic cells takes place, after which phagocytosis and antigen presentation occur. 
We found that the presence of epitope inside the sphere was also critical. In particular, free epitope, even when combined with CpG and MPLA but without the presence of spheres, produced essentially no immune response compared to the formulation using the loaded PLGA microspheres for the OVA epitope (Fig. 2c, P = 0.0015) and for the VSV epitope (Fig. 2d, P = 0.0002). We evaluated the dose response to inoculation with 11 μm microspheres loaded with 1%, 10% and 100% of maximum epitope for the OVA and VSV epitopes. The OVA epitope dose response showed a plateau beginning at the lowest level, with no statistically significant difference between the 1% and 100% loaded levels (Fig. 3a, P = 0.25), whereas the VSV epitope showed a statistically significant increase in immune response with increasing loaded concentration at the loading levels tested (Fig. 3b, P < 0.0001). Also, the difference in immune responses to OVA and VSV, both at 1% loading, was not statistically significant (P = 0.45), whereas the difference in responses to OVA and VSV, both at 100%, was statistically significant (P = 0.0013).

We next evaluated the immune response exhibited from two epitopes delivered simultaneously by putting the two epitopes in the same microsphere, with OVA and VSV both at 1% of maximum concentration. We used these concentrations because, as just mentioned, they produced immune responses of similar strength with single epitope loadings. We administered these spheres in a total amount equal to the amount used previously, with CpG in the spheres and MPLA in the carrier solution. The immune response to OVA in the presence of VSV was not significantly different from the response to OVA in the sphere by itself (Fig. 4a, P = 0.15), whereas the immune response to VSV in the presence of OVA was slightly greater than the response to VSV in the sphere by itself (Fig. 4b, P = 0.045).
We then used an alternative technique to deliver two epitopes simultaneously by inoculating with two different 11 μm microsphere populations, one population containing the OVA epitope and the other the VSV epitope, both at 1% of maximum concentration. We administered these two sphere populations in a total amount equal to the amount used previously, with CpG in the spheres and MPLA in the carrier solution. As in the same-sphere experiments, the immune response to OVA did not depend significantly on whether VSV spheres were present (Fig. 4c, P = 0.10). Also as in the same-sphere experiments, the immune response to VSV in the presence of OVA spheres was greater than the response to VSV in the absence of OVA spheres (Fig. 4d, P = 0.019). These results suggest that vaccination against multiple epitopes can be achieved efficiently by manufacturing single-epitope microspheres, and then mixing the inoculum.

### 3. Summary and discussion

In summary, this work evaluated interferon gamma ELISPOT responses produced by two different C57BL/6 mouse-relevant CTL epitopes. We showed that CpG (TLR9 agonist) inside 11 μm PLGA microspheres significantly increased the immune response compared with spheres not containing CpG. We showed that MPLA (TLR4 agonist) had a statistically significant effect on the immune response when it was in the carrier solution but not when it was inside the sphere, in contrast to work by others [13,14,26]. For both epitopes tested, even with the addition of both CpG and MPLA, the free epitopes alone produced an immune response that was significantly lower than when the microspheres were used for microencapsulation of the epitopes and CpG. Finally, in contrast to previous studies which incorporated only a single epitope in spheres (e.g., [14]), we showed that it was possible to elicit an immune response from each of two epitopes delivered simultaneously, when the two epitopes were loaded into the same spheres or different spheres.
Recently, two methods have been described for eliciting immune responses to multiple specific epitopes. In both approaches, the epitopes to be targeted are linked together with short peptide sequences, sometimes referred to as a “string of beads” [27]. In one approach, the DNA corresponding to the string is inserted in a modified vaccinia Ankara (MVA) vector. Immune responses have been elicited in mice using this technique [10]. In a second approach, the DNA string is administered with electroporation [28]. Immune responses in Macaques have been elicited in this manner [11]. In contrast, we sought to use a biodegradable, microsphere based vaccine delivery platform as a way to allow one or more unmodified epitopes to easily be incorporated into a dosage form. This approach could streamline the development process by allowing epitopes to be added and subtracted from the formulation during the design phase without requiring the identification of appropriate linker peptides, an involved process [29], and subsequent confirmation that the desired individual epitopes would be properly presented. Further, given the need for an HIV vaccine in countries lacking medical infrastructure, we sought to make the administration procedure as simple as possible without relying on special medical devices. PLGA microsphere-based vaccines have been described in the literature and their limitations have been discussed. In particular, it has been pointed out that the tertiary structure of the delivered antigen may degrade due to exposure to solvents used in double-emulsion sphere fabricating technologies, high temperatures used during spray drying processes, or incompatibility with excipients [30]. We manufactured our microspheres avoiding double emulsion sphere manufacturing technology using a precision spray drying process that operates at room temperature [15]. 
In addition, because we are delivering the epitopes themselves and not a large protein antigen, tertiary structure stability in the formulation is not an issue, as our results demonstrate. Kanchan has reported the potential effect of particle size on the immune response, stating that nano-sized particles may be more likely to produce a cellular immune response compared with micron-sized spheres [31]. However, in a review article, Agaki concludes that more studies with precisely sized spheres will be required to fully understand the relationship between the size and activity of vaccine-loaded biodegradable spheres [32]. Here, we sought to use microspheres sized near the diameter of a dendritic cell and found that class I epitopes could indeed elicit a cytotoxic T-lymphocyte response in mice, contradicting the suggestion that large microspheres are not suited for this purpose [31].

Aluminum salts have been widely used as vaccine adjuvants but may not be effective in vaccines relying on T-cell activation [33]. Here we explored the use of other adjuvants and demonstrated that CpG within the microsphere and MPLA in the injectate enhanced T-cell activation. This is an important finding since MPLA has been used within PLGA microspheres for vaccine design previously and others have suggested that placing MPLA within the microsphere is the preferred approach [13,14,26]. The only TLR agonist being used in an FDA-approved vaccine (Cervarix) is MPLA (a TLR4 agonist). TLR9 agonists have been used in FDA-cleared US clinical trials [34]. Because of this clinical history, we evaluated the potential beneficial effect of both of these adjuvants in our vaccine design.

In our experiments, we measured immune responses by interferon gamma release. Additional work should be done to demonstrate cytolytic activity (see, e.g., [14]) and antiviral efficacy.
Further work will be required to study the residence time of the phagocytosed microspheres within the antigen-presenting cells and to characterize the minimum microsphere size at which a substantial immune response is seen. Also, experiments involving the simultaneous delivery of two epitopes both exhibiting a dose–response curve would increase the ability to detect attenuation of the immune response of one epitope by another. Finally, applications of this delivery mechanism to vaccines for other pathogens where CTL targeting is potentially relevant, such as hepatitis C [35–38] and influenza [39,40], should be investigated.

### Acknowledgements

We thank Darrell Irvine of the Ragon Institute for helping us review previous research in the area, Nicole Frahm of the Fred Hutchinson Cancer Research Center for immunochemistry advice, Dan Barouch of the Beth Israel Hospital for his interest and support, Niraj Patil for assistance with illustration preparation, Craig Rouskey for helpful comments and Jonathan Carlson of Microsoft Research who helped review the manuscript. This work was supported in part by a Qualifying Therapeutic Drug Discovery Project Grant from the United States Government and a grant from Microsoft Research.

Study Design. Three noncontiguous spinal implant sites in 1 rabbit were challenged with Staphylococcus aureus and local antibiotic prophylaxis was given with gentamicin in controlled-release microspheres (poly(lactic-co-glycolic-acid) [PLGA]). Postoperative biomaterial-centered infection on and around the titanium rods was assessed using standard bacterial quantification assays. Objective. To assess surgical site and biomaterial-centered infection reduction with controlled-release gentamicin from microspheres against S. aureus. Summary of Background Data.
A postoperative biomaterial-centered infection can be devastating after successful thoracolumbar spinal surgery and puts a high burden on patients, families, surgeons, and hospitals, endangering both our healthcare budget and our ability to perform challenging cases in patients with increasing numbers of comorbidities. Systemic antibiotics often do not reach “dead-space” hematomas where bacteria harbor after surgery, whereas local, controlled release gentamicin prophylaxis through PLGA microspheres showed favorable pharmacokinetics data to achieve local bactericidal concentrations for up to 7 days after surgery. Methods. A well published rabbit spinal implant model with systemic cephalosporin prophylaxis was challenged to create a baseline infection of 70% in control sites. We then challenged 3 noncontiguous titanium rods inside the laminectomy defect with 10e6 colony forming units S. aureus and randomly treated 2 sites with gentamicin PLGA microspheres and 1 site with PLGA carrier only (control). Standard quantification techniques were used to assess biomaterial centered and soft tissue bacterial growth after 7 days. Results. After establishing reliable infection rates in control sites, the therapeutic arm of the study was started. Surgical site infections were found in 75% of control sites, whereas gentamicin microspheres reduced the incidence down to 38% in the same rabbits. Biomaterial-centered infection was reduced from 58% to 23% only in all sites challenged with 10e6 S. aureus. Conclusion. Postoperative, biomaterial-centered infection was reduced at least 50% with intraoperative gentamicin microspheres in the face of systemic cephalosporin prophylaxis and high dose S. aureus in a laminectomy defect in rabbits. The data are statistically and clinically significant, and further animal testing is planned to confirm these results. Key words: postoperative infection, biomaterial centered, PLGA, gentamicin, Staphylococcus aureus, rabbit model. 
Spine 2009;34:479–483

Surgical site infection (SSI) is the most common, potentially preventable adverse outcome of a major operation. The economic impact alone is enormous and is estimated to cost the US healthcare system in excess of \$1.8 billion per year.1
The cost of treating a single implant-associated spinal wound infection can run in excess of \$900,000 and requires substantial resource allocation on the part of hospitals and physicians, who are often poorly reimbursed.2–4 As such, the burden placed on hospitals and physicians to provide care for these patients is substantial and falls disproportionately on high-volume tertiary-care referral centers, where patients with an implant-associated spinal infection are often referred.5,6 Thus the prevention of SSI (prophylaxis) is a first-line defense in the battle against these rising healthcare costs. The cost of orthopedic SSIs to patients, in terms of loss of limb and function, goes beyond the economic impact. Infection often results in the need for multiple operations, prolonged antibiotics, and extensive rehabilitation. For patients who develop an SSI, the consequences can be severe, as the average length of hospital stay and overall mortality risk are doubled.9 Interventions that decrease the risk of SSI stand to benefit the patients, their providers, the healthcare system, and society at large.
Despite improvements in surgical technique, systemic antibiotic prophylaxis, and reduced operating time, implant-associated spinal wound infections remain a serious concern.1,2,7,8 This is especially true in light of the rapid emergence of multidrug-resistant pathogen strains and a large immunocompromised patient population. Though under ideal conditions the incidence of infection has been reported to be less than 1% for patients undergoing elective spinal surgery, conditions are rarely ideal. The incidence of deep infection after spinal surgery may be in excess of 10%, dependent on patient- and procedure-related factors.3,7,8,10–13 Geriatric, immunocompromised, diabetic, obese, cognitively impaired, and trauma patients are all known to have greater risks of infection after spinal surgery.8,11,13,14 The purpose of the study was to investigate the use of a novel local antibiotic delivery vehicle, as an adjunct to routine perioperative systemic antibiotic prophylaxis, using a spinal implant animal model. An FDA-approved biodegradable polymer (poly(lactic-co-glycolic-acid) [PLGA]) was used to create resorbable microspheres to facilitate the controlled local delivery of gentamicin to wounds and hematoma. The efficacy of these microspheres in prevention of implant-associated spinal wound infections was evaluated using a well-published spinal implant model in New Zealand white (NZW) rabbits.

## Materials and Methods

Animals
This investigation was approved by the Institutional Animal Care and Use Committee. Twenty-five NZW female rabbits were obtained weighing between 3.0 and 3.5 kg each. Female rabbits were used because, in the experience of the senior author, they are generally more docile and less prone to territorial marking with sprayed urine, which can potentially serve as a source of surgical site contamination.
Experimental Design
The current investigation was a randomized, prospective, blinded study of the efficacy of a novel local antibiotic delivery vehicle for the prevention of implant-associated spinal wound infections, using a previously described animal infection model in the NZW rabbit.15 This multisite biomaterial-centered animal model is time tested and reliably mimics the human condition of posterior spinal surgery with instrumentation. By using 3 noncontiguous implant sites, a single animal may serve as both a treatment and internal control, thereby minimizing the number of animals needed for the study. Using an FDA-approved biodegradable polymer PLGA slurry containing 20% gentamicin, resorbable microspheres (10 μm, resorption in 3–7 days) were created to facilitate a reliable, controlled-release delivery system to wounds and hematoma. The pharmacokinetics of the release were studied in vitro and in vivo before application in an animal model and have been previously described. In short, the gentamicin microspheres or powdered gentamicin were administered into the rabbit spinal defect in the absence of bacteria (500 μg antibiotic per site). Animals were subsequently killed after 2, 4, 10, 24, 48, 72, 144, 168, and 208 hours. Hematoma was harvested from the implant sites and released gentamicin was determined in the supernatant after homogenization and centrifugation. Representation of release for both the microspheres and powdered gentamicin can be seen in Figure 1. Systemic levels never rose above the detection limit of 0.05 μg/mL in serum.
During the initial phase of the current study, 13 NZW rabbits were challenged at each of 3 surgical sites with varying concentrations of Staphylococcus aureus bacteria to reliably create an SSI in 70% of control sites in the absence of local antibiotics. Once an infectious dose (ID-70) was established, the second phase of the study investigated the efficacy of gentamicin microspheres (2.5 mg per site containing 500 μg of gentamicin) for the prevention of implant-associated spinal wound infection. Twelve rabbits were used for the second phase of the study. Three noncontiguous surgical sites were used in each rabbit: 2 treatment sites and 1 control site, which were assigned in a random fashion. After 7 days, postoperative wound infection was assessed using standard tissue sampling and bacterial quantification techniques to test our hypothesis that the incidence of SSI and of implant-associated wound infection can be reduced using controlled, local delivery of gentamicin using microsphere technology.

Bacterial Inoculum
One day before surgery, S. aureus (ATCC 25923) was suspended in 5 mL trypticase soy broth and incubated at 37°C. After 18 hours, the culture was centrifuged (10,000 RPM) for 10 minutes, and the pellet was diluted in sterile saline. This washing process was repeated twice. Final concentrations of bacteria were obtained by making different dilutions in sterile saline. The final bacterial concentration (colony forming units (CFU) per milliliter) was estimated by using a densometric apparatus and assay (LaMotte 2020e, LaMotte, Chestertown, MD) and final determination was done by plating on Trypticase Soy Agar plates with 5% sheep blood (Fisher Scientific, Boston, MA).
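The plate-count step that fixes the final inoculum concentration follows the usual dilution arithmetic, sketched below. The colony count, dilution, and plated volume are hypothetical values chosen for illustration and are not data from the study; the function name is likewise invented for this example.

```python
def cfu_per_ml(colonies, fold_dilution, plated_volume_ml):
    """Estimate the concentration of the undiluted suspension in CFU/mL.

    colonies: colonies counted on the plate
    fold_dilution: total fold dilution of the plated sample (e.g. 1e4 for 1:10,000)
    plated_volume_ml: volume spread on the plate, in mL
    """
    return colonies * fold_dilution / plated_volume_ml

# Hypothetical plate: 50 colonies from 0.1 mL of a 1:10,000 dilution.
stock = cfu_per_ml(colonies=50, fold_dilution=1e4, plated_volume_ml=0.1)

# CFU delivered by a 100 uL (0.1 mL) aliquot of that suspension.
dose = stock * 0.1
print(f"stock: {stock:.2e} CFU/mL, dose in 100 uL: {dose:.2e} CFU")
```

Working backwards from a target dose per site to the required stock concentration uses the same relation, rearranged.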

Surgical Procedure
Induction of general anesthesia was performed using a combination of ketamine and xylazine, and subsequently maintained using isoflurane inhalation via nose-cone mask. All rabbits were given intravenous prophylactic ceftriaxone (20 mg/kg) before surgery to mimic preoperative prophylaxis in humans. After induction of anesthesia, the rabbits were positioned prone and each back was shaved, prepared, and draped in a sterile fashion. Three noncontiguous sites were created in each rabbit overlying the T13, L3, and L6 vertebrae. The surgical approach was identical for each site, though separate instruments and drapes were used for each surgical site to prevent cross contamination. A 1.5-cm dorsal skin incision was made longitudinally in the midline, followed by a single incision in the fascia to expose the spinous process. Using a small rongeur, the entire spinous process with surrounding musculature and ligaments was excised from the base, creating a self-contained defect, approximating a partial laminectomy defect. The ligamentum flavum was not violated, and the dura was not exposed. A 1-cm Ti90/Al6/V4 rod (2-mm diameter, Item: TI017905, Goodfellow Corporation, Oakdale, PA) was implanted into the defect. Wound hemostasis was achieved with a flowable hemostatic agent (Surgifoam, Johnson and Johnson Wound Management, Somerville, NJ), mixed with either a nonantibiotic PLGA resomer (control group) or gentamicin PLGA microspheres. Bacterial inoculum (100 μL) was placed into the defect using a sterile syringe needle (30 G). The fascia was closed using running sutures with biodegradable Vicryl 2/0 suture (Ethicon Inc., Piscataway, NJ). The skin was closed using a running subcutaneous suture with Vicryl 3/0 (Ethicon Inc.). During the initial phase of the study only the nonantibiotic PLGA resomer was used, and each of the 3 surgical sites was challenged with a randomly assigned bacterial load between $10^4$ and $10^6$ CFU to establish the ID-70. The second phase of the study started once the infectious dose was established. In this phase, 1 control site and 2 treatment sites were assigned randomly to each rabbit (using a random number generator). All wounds were challenged with $10^6$ CFU.
After the procedure, analgesia was provided using a standard protocol, and all rabbits were permitted to drink, eat, and weight bear ad libitum. They were monitored daily, especially in regard to their wound healing, body weight, and signs of systemic infection.
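The phase 2 allocation described above (1 control site and 2 treatment sites per rabbit, chosen with a random number generator) could be reproduced along the following lines. This is a sketch under assumptions: the text does not say which generator or procedure was used, and the seed, labels, and function name here are hypothetical.

```python
import random

SITES = ("T13", "L3", "L6")


def assign_sites(rng):
    """Pick one control site at random; the other two sites receive treatment."""
    control = rng.choice(SITES)
    return {site: ("control" if site == control else "gentamicin") for site in SITES}


# A fixed seed is used only so that the sketch is reproducible.
rng = random.Random(2024)
allocation = {f"rabbit_{i:02d}": assign_sites(rng) for i in range(1, 13)}
for rabbit, plan in allocation.items():
    print(rabbit, plan)
```

Drawing the control site uniformly from the three levels keeps every rabbit at exactly 1 control and 2 treatment sites, so each animal serves as its own control.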

### Evaluation
After 7 days, postoperative wound infections were assessed using standard tissue sampling and bacterial quantification techniques. Rabbits were killed using an intravenous injection of phenobarbital (10 mg/kg). After the skin was removed from the entire back using sterile technique, samples of the fascia, the hematoma, and the vertebral lamina were taken, and the implanted metal rods were removed from all sites. A piece of the right liver lobe and an intravenous blood sample were obtained to monitor for systemic infection. Harvested tissues were weighed, then immediately homogenized (PowerGen 35, Fisher Scientific, Pittsburgh, PA), and implants were sonicated (UBATH, World Precision Instruments, Sarasota, FL) for 15 minutes in cold saline to detach bacteria. Serial dilutions of all samples were created and plated on blood agar plates for 24 hours of incubation at 37°C. The final CFU was determined per gram of tissue sample and per centimeter of titanium rod. Biomaterial-centered infection was defined as the presence of S. aureus on the implanted rod and in at least 1 other tissue sample from the same site. All samples were collected and evaluated by a member of the team blinded to the treatment type at each site. χ² calculations (SigmaStat 3.5; Systat Inc., San Jose, CA) were used to determine whether differences in infection incidence were statistically significant. Student t tests were performed to identify statistical differences in severity of bacterial burden. For both tests, a P value of 5% was set as the threshold for significance.
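The infection definition and the incidence comparison described above can be illustrated with a short sketch. It assumes a plain Pearson chi-square test on a 2×2 table without continuity correction; the text does not state whether the statistical software applied one. The function names are hypothetical, and the counts in the example are invented for illustration and are not the study's data.

```python
import math


def biomaterial_centered(rod_positive, tissue_positive):
    """True if the rod culture is positive AND at least one tissue sample
    (fascia, hematoma, lamina) from the same site is positive."""
    return bool(rod_positive) and any(tissue_positive)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
        [[a, b],    e.g. control:   infected, not infected
         [c, d]]         treatment: infected, not infected
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper tail equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for illustration only (NOT the study's data).
stat, p = chi_square_2x2(8, 2, 3, 17)
print(f"chi2 = {stat:.2f}, p = {p:.4f}, significant at 5%: {p < 0.05}")
```

With cell counts this small, an exact test is often preferred in practice; the chi-square form is shown here only because it is the test named in the text.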

## Results

Two rabbits did not survive to the 7-day endpoint. One rabbit could not be resuscitated after induction of anesthesia, before surgical intervention, in phase 1, and another animal died unexpectedly during recovery in the postanesthesia incubator after surgery in phase 2. None of the remaining 23 animals suffered from any systemic infection, and all started to gain weight again after postoperative day 2. Phase 1 was completed with 12 animals, and the results of increasing the bacterial inoculum to achieve an approximate infection incidence of 70% are listed in Table 1. Based on the results of phase 1, all spinal implant sites in phase 2 were inoculated with 10⁶ CFU S. aureus. Eleven rabbits were evaluated after 7 days, and final results for infection incidence are listed in Table 2. For both SSI and implant-associated infection, the incidence of infection was significantly reduced at sites treated with gentamicin microspheres compared with control sites (P < 0.01, χ² test; SigmaStat 3.5, Systat Software, Inc.). Severity of infection was assessed using serial plating techniques, with average bacterial counts for infected samples shown in Table 3. There was no significant difference between the control and treatment groups in the severity of bacterial growth once a site became infected.

## Discussion

Despite meticulous technique, bacteria end up inside the surgical wound after long procedures.16,17 Though the routine use of systemic antibiotic prophylaxis has revolutionized the care of surgical patients, this modality alone is insufficient for high-risk patients.17 Local hematoma harboring bacteria at the end of the procedure, combined with systemic malnutrition, tissue hypoxia, compromised skin under a stabilizing brace, and poor wound healing while patients are bedridden, are important factors in the progression of these initial bacterial burdens into clinically significant infection. Even the implants themselves conspire against the surgeon by decreasing the body’s ability to eliminate bacteria. The use of implants enhances the formation of a surface-adherent and protective “biofilm” that is difficult to eradicate despite the use of antibiotics that are highly effective in standard in vitro susceptibility tests.18–21 Local delivery of antibiotics to spinal surgical wounds is intuitively attractive as an adjunct to systemic perioperative antibiotics. It allows the local environment that intravenous antibiotics cannot reach (hypoxic, devitalized tissue; dead space; pooled hematoma without vascular supply) to be sterilized. The ability to deliver antibiotics locally to wounds, primarily in the form of antibiotic powder impregnated in bone cement, is well established in the treatment of musculoskeletal infections.22–25 However, the use of bone cement for the local delivery of powdered antibiotics has many drawbacks. Foremost, the pharmacokinetics of antibiotic delivery with bone cement are unpredictable and vary depending on the porosity of the cement used, the type of antibiotic, and the mixing conditions.

In general, the pharmacokinetics are characterized by initial burst levels of antibiotics, which may be cytotoxic and which rapidly decline, often to below therapeutic levels.23,32–35 Because bone cement is not bioabsorbable, the cement itself may serve as a nidus for infection once the antibiotics have been delivered.36,37 Bone cement also allows bacterial adhesion and growth even in the presence of antibiotics, and sustained exposure to subtherapeutic antibiotic levels contributes to the further development of drug-resistant bacteria. Furthermore, the bulk of the bone cement may compromise a surgeon’s ability to close the surgical wound and, because the bone cement is nonabsorbable, additional surgery is often required for its explantation.

The gentamicin microspheres described herein offer many advantages over antibiotic-impregnated bone cement. The reduction of postoperative infection was statistically and clinically significant, although the spheres did not reduce the severity of infection in cases where infection occurred. Once a site became “overcolonized” despite the presence of the gentamicin microspheres, bacterial burdens were similar to those seen in infected control sites. The explanation for this could be an “all-or-nothing” phenomenon. Once S. aureus CFUs overcame the local challenge and started surviving more frequently, overcolonization of the sites occurred. There is most likely a ceiling effect, above which CFUs become nutrient deprived, so severity in control sites cannot “worsen” beyond that in sites treated prophylactically with the microspheres. The pharmacokinetics of the gentamicin microspheres, which provide a controlled and sustained release of therapeutic levels of antibiotics, are clinically superior to those of powdered antibiotics.4 Additionally, because the microspheres are bioabsorbable, there is never a need for patients to undergo an additional surgical procedure for their removal, nor do they serve as a nidus for infection once their antibiotics are delivered. The small size of the microspheres allows them to accommodate any existing surgical defect, even allowing them to be injected by syringe after primary fascial closure, without ever compromising the surgical wound.

Gentamicin microspheres are not just intuitively attractive; they have proven to be effective both in vitro and in vivo. The results of the current study in a well-established animal model are promising and have demonstrated the ability of these microspheres to significantly decrease the incidence of implant-associated postoperative wound infections. This is in agreement with prior efficacy data for the spheres against non-implant-associated infection.4 Most importantly, the use of gentamicin microspheres demonstrated a protective effect against SSI in addition to that provided by systemic perioperative antibiotics, mimicking current clinical standards. Clinical investigation of these gentamicin microspheres in postoperative spine wounds in high-risk patients is imminent.

## Key Points

● Reliable biomaterial-centered infections were established at 3 spinal implant sites in NZW rabbits.
● Gentamicin microspheres delivered locally inside the hematoma of a laminectomy defect significantly reduced the incidence of infection compared with control treatment in the same animal and will be investigated as an adjunct antibacterial prophylaxis in humans.