Katy Cornish & Ehmke Pohl
Edited by Will Stanley
Viruses are ubiquitous, with the number of virus particles vastly exceeding the number of stars in the observable universe. They infect every type of organism, from simple single-cell bacteria and fungi to complex multicellular organisms including plants and humans, where they often cause serious diseases. Viruses can be found in even the most inhospitable environments on the planet, such as the boiling waters of volcanic hot springs, where they infect the specialised thermophilic or ‘heat-loving’ organisms that thrive there. So far, more than 6000 different types of virus (Figure 1) have been identified and characterised to some degree; however, it is clear that many thousands, if not millions, more await discovery.
The importance of viral proteins in biotechnology
With so many yet to be discovered, viruses represent the largest and most diverse genetic resource on the planet. Each virus carries a genetic code containing all of the instructions required for it to replicate and produce new virus particles. However, in order to do so it must hijack a living host, as it is not equipped with all of the machinery needed to multiply unaided (Figure 2). Viruses force the commandeered host cell to make vast quantities of viral proteins and genetic material, which are then assembled into new virus particles that fill the cell.
As they are not able to replicate without a host, there is continuing debate among scientists as to whether viruses should be classed as living organisms or not. Nevertheless, viral enzymes (protein molecules that speed up chemical reactions in living organisms) that have evolved to take control of the host cell’s machinery are powerful tools in the bio-scientist’s toolbox. Viral polymerases (enzymes that copy strands of genetic material such as DNA), proteases (enzymes that degrade other proteins), and lytic enzymes (involved in the bursting open or ‘lysis’ of the host cell to release the newly made virus particles trapped inside) have found numerous applications in today’s biotechnology-focused economy. One important requirement for many of the enzymes utilised in laboratories worldwide is that they function even under extreme conditions, such as at high temperatures and in very acidic or basic environments that would destroy typical enzymes, such as those found in our bodies. This in turn makes the most extreme locations on the planet ideal hunting grounds for new enzymes that can cope, and even thrive, under such harsh conditions.
The Virus-X consortium and the Virus-X database
In 2016, an international team of researchers from 14 academic institutions and biotech companies across eight European countries formed the Virus-X consortium, which, with funding from the European Union, set out to find new tools for biotechnology by harnessing the genetic diversity of viruses. They focussed on two of the most extreme habitats on the planet: (i) deep sea vents in the Atlantic Ocean located more than 2300 m below sea level, featuring temperatures of over 300 °C and pressures several hundred times our atmospheric pressure; and (ii) volcanic hot springs in Iceland with boiling water, high salt content and acidic (low pH) conditions (Figure 3). Since cultivating these viruses is extremely challenging, if possible at all, the team developed and employed methods designed to extract all of the genetic material in samples taken from these environments and sequence it, providing them with the letter-code that makes these organisms tick. In this so-called metagenomic approach, all gene sequences are analysed by powerful bioinformatic algorithms, first to identify the genes that have come from viruses rather than from bacteria or other organisms present in the sample. In the next step, the genomes of individual viruses are assembled and characterised. So far, the team has identified several completely new viral families and assembled an astonishing data set of over 50 million new genes in the Virus-X database.
Treasure hunt in the Virus-X database
With over 50 million genes to work with, the next task was to assess the possible functions of the encoded proteins and select those with the highest potential for further exploitation. One particular challenge is that viruses have a short life cycle and the replication of their genetic material is often error prone, meaning viral genomes change very quickly. This allows them to react quickly to changes in host defence systems, but it also means that viral genomic sequences soon bear little resemblance to those of their ancestors. Far more than half of the sequences in the Virus-X database show no similarity to any known gene in any other database – a true treasure trove of untapped genetic wealth.
Needle in the haystack
In order to identify protein candidates for biotechnology applications, powerful computer algorithms developed by the team were employed. Several hundred proteins predicted to have particularly desirable functions, such as enzymes that can cut, stick or copy DNA (nucleases, DNA ligases and DNA polymerases, respectively), were selected. The underlying genes were then inserted into specialised laboratory strains of E. coli bacteria, which synthesise the desired proteins so they can be harvested. Several of these proteins are now being evaluated by the industrial partners A&A Biotechnology and ArcticZymes to see how they can best be put to use in the lab.
Virus-X – exploring the unknown
The final and key aim of the team was to explore the unknown, to go boldly where no one has gone before. Once more, powerful algorithms were used to identify genes encoding new enzymes with potentially novel functions. To analyse these enzymes, the Virus-X team utilised X-ray crystallography to elucidate their three-dimensional structures. For this technique, the proteins are first highly purified and prompted to grow into crystals (Figure 4), before being shipped to a powerful X-ray source: the Diamond Light Source synchrotron in Oxfordshire. Here the protein crystals are bombarded with X-rays, and the data collected are analysed to determine the protein structure in exquisite detail. These structures give insights into what these totally new enzymes are able to do and how they might function.
[Figure 4]: Under very precise conditions, proteins in solution will form crystals such as these. The biggest crystals here are still only a few thousandths of a millimetre across and must be scooped up by hand using a tiny loop under a microscope so they can be sent to the Diamond Light Source.
Novel lytic enzymes
As mentioned previously, lytic enzymes are used to burst open or ‘lyse’ bacterial cells by disrupting the protective cell wall that surrounds them. Due to this important function, lytic enzymes are used in a wide range of applications, from releasing desired proteins from bacterial cells in research laboratories to controlling bacterial pathogens in the food industry. Importantly, the use of lytic enzymes in these applications needs to be extremely well tailored to the specific conditions. While most lytic enzymes degrade certain components in the bacterial cell wall, the team was able to identify a previously unknown component of the lytic system and suggest a novel mechanism for how it works. The protein, named XepA, forms a unique structure made of five identical protein chains, called a pentamer (Figure 5).
The structure reveals that the top half of the pentamer resembles proteins involved in binding phospholipids: components of the bacterial membrane that lies beneath the cell wall. The surface charge representation shows that this part of the protein is largely positively charged and well suited to binding the negatively charged phosphate head groups of the phospholipids in the cell membrane, disrupting the cell wall and causing the cell to burst open.
The Virus-X project may have reached its initial goals, but the research continues. As well as the success of finding several commercially valuable enzymes, our knowledge of ‘the virosphere’ – the world of viruses around us – has increased dramatically as a result of the team’s work. Though none of the proteins discovered through the project are from viruses that infect humans, a better understanding of how viral proteins function in general can shed light on how pathogenic viruses cause human disease. Members of the project are now turning their focus to how they can apply this newly acquired knowledge to help during the global COVID-19 pandemic. For example, several DNA-binding proteins discovered through the project are now being assessed for their ability to improve the efficiency of COVID-19 testing.
The project has reiterated the need to continue exploring the variety of the world around us, and demonstrated the importance of collaborative research across borders for pushing the boundaries of understanding.
Katy Cornish is in the second year of her PhD working as part of the Virus-X team at Durham University. Her focus is on structural biology, using techniques including X-ray crystallography to reveal the structures of novel enzymes.
Ehmke Pohl is a Reader in the Departments of Biosciences and of Chemistry at Durham University. His main research focus is on protein structures and their interactions with other molecules, with the aim of exploring new biomedical and biotechnological applications.