The Covid-19 Gene-Sequence From A South African Patient: Now What?


Recently, researchers from the University of the Western Cape and the National Institute for Communicable Diseases of the National Health Laboratory Service sequenced the genome of SARS-CoV-2, the virus that causes Covid-19, from a South African patient (http://virological.org/t/whole-genome-sequence-of-the-severe-acute-respiratory-syndrome-coronavirus-2-sars-cov-2-obtained-from-a-south-african-coronavirus-disease-2019-covid-19-patient/452). My thoughts here are not about the technical aspects of the science but about its implications (if any) and what it means for ordinary people; anything resembling technical detail is offered only in that context.


The need for genetic information


A central theme in molecular biology is the storage of information and its subsequent transfer. Genetic information is stored in macromolecules called nucleic acids, and it exerts its influence on an organism through other molecules called proteins. If nucleic acids are the driver, then proteins are the wheels that dictate where the vehicle goes. Ultimately, to understand the behaviour of an organism (the vehicle) we must have its genetic (the driver's) information. Sequencing is the technique that allows us to extract that information, and once it is extracted we can begin to understand how the organism behaves. This is why sequencing the genetic information of pathogens is so central to drug and therapy design. Strictly speaking, viruses are not organisms but organic molecules with a fancy organisation, which means we cannot ascribe "motive" to them. Now that we have this genetic information, what do we do with it?


Populations and statistics


A key motif in evolutionary biology is how genetic information affects populations. The recently sequenced genome from the South African patient was compared to a reference genome. A reference genome is any genetic sequence treated as the "normal" or "standard" against which other sequences are compared in order to find differences; the choice of reference is somewhat arbitrary. The comparison matters because, to determine whether the coronavirus in South Africa differs from the one in Wuhan (for example), the South African sequence has to be compared to the reference Wuhan sequence. As a corollary, the changes in the South African sequence must be established as a polymorphism rather than just an individual mutation. A polymorphism is a particular genetic sequence that occurs in at least one percent of the population. This means that whatever genetic changes were observed in that particular patient should not be an "isolated" mutation in one person but must occur at a frequency of at least one percent across the total global patient population. It is this population that is shaped by evolutionary forces such as natural selection and genetic drift. The genome sequenced from the South African patient had about three point mutations (changes) of interest: Aspartic acid to Glycine, Proline to Leucine, and a point mutation in the 5' untranslated region. Knowing whether these changes occur in at least one percent of the global patient population, and whether they are the predominant form in South Africa, would inform the development of alternative (better) therapies. Currently, only one patient in South Africa is known to carry this variation, and more patients' viral genomes need to be sequenced to determine the extent of this potential polymorphism in the South African landscape.
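For readers who like to see an idea in code, the two steps described above (comparing a patient sequence to a reference, then asking whether a change reaches the one percent mark) can be sketched roughly as follows. This is only an illustration: the sequences are short made-up strings rather than real viral genomes, the function names are hypothetical, and the sketch assumes the sequences have already been aligned to the same length, which real pipelines must do with dedicated tools.

```python
from typing import List, Tuple


def point_differences(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def variant_frequency(samples: List[str], position: int, base: str) -> float:
    """Fraction of samples that carry `base` at the 1-based `position`."""
    if not samples:
        raise ValueError("need at least one sample")
    carriers = sum(1 for s in samples if s[position - 1] == base)
    return carriers / len(samples)


def is_polymorphism(frequency: float, threshold: float = 0.01) -> bool:
    """Apply the 'at least one percent of the population' definition."""
    return frequency >= threshold


# Toy data (made up for illustration, not real SARS-CoV-2 sequence).
reference = "ATGGATCCTA"
patient = "ATGGGTCTTA"

# Step 1: where does the patient sequence differ from the reference?
diffs = point_differences(reference, patient)

# Step 2: one carrier among 200 sequenced patients is below the threshold...
rare_pool = [reference] * 199 + [patient]
rare_freq = variant_frequency(rare_pool, position=5, base="G")

# ...while three carriers among 200 would cross it.
common_pool = [reference] * 197 + [patient] * 3
common_freq = variant_frequency(common_pool, position=5, base="G")

print(diffs)
print(rare_freq, is_polymorphism(rare_freq))
print(common_freq, is_polymorphism(common_freq))
```

The point of the toy example is the same as the point of the paragraph: a single sequenced patient can show us which changes exist, but only many sequenced patients can tell us how common those changes are.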


Drug/vaccine design


Most drugs are designed to target protein molecules (proteins are the workhorses through which genes exert their influence, the wheels of the vehicle in my earlier analogy). The changes observed in the South African patient, with Aspartic acid changing to Glycine, Proline changing to Leucine and a point mutation in the 5' untranslated region, are significant and may have an impact on the behaviour of the virus. In protein biochemistry, Aspartic acid is polar and negatively charged while Glycine is small and non-polar; Proline is a helix breaker while Leucine is hydrophobic. This is just a fancy way of saying that the tyre brands of the vehicle (virus) are so different that substituting one for the other will affect how the vehicle runs. The next logical step is to investigate how these changes affect the kinetics and stability of the virus in South Africa compared to the Wuhan reference. Many tools in structural biology exist for this purpose, but their details are outside the scope and intention of this article. Without this information we would be unprepared to respond to any eventualities that may arise from these changes in the African context.
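To put rough numbers on how different these "tyre brands" are, here is a minimal sketch that compares the two amino acid swaps using the Kyte-Doolittle hydropathy scale (a widely used measure of how water-avoiding an amino acid is; higher means more hydrophobic) and the approximate side-chain charge at neutral pH. The function name is hypothetical, only the four amino acids mentioned above are included, and a comparison like this is a crude first look, not a substitute for the structural and kinetic studies described in the text.

```python
# Kyte-Doolittle hydropathy values for the four amino acids discussed here.
KYTE_DOOLITTLE = {"Asp": -3.5, "Gly": -0.4, "Pro": -1.6, "Leu": 3.8}

# Approximate side-chain charge at neutral pH.
SIDE_CHAIN_CHARGE = {"Asp": -1, "Gly": 0, "Pro": 0, "Leu": 0}


def describe_substitution(old: str, new: str) -> dict:
    """Summarise how a swap from `old` to `new` shifts basic properties."""
    return {
        "substitution": f"{old}->{new}",
        "hydropathy_change": round(KYTE_DOOLITTLE[new] - KYTE_DOOLITTLE[old], 1),
        "charge_change": SIDE_CHAIN_CHARGE[new] - SIDE_CHAIN_CHARGE[old],
    }


asp_to_gly = describe_substitution("Asp", "Gly")
pro_to_leu = describe_substitution("Pro", "Leu")

print(asp_to_gly)
print(pro_to_leu)
```

Both swaps move towards a more hydrophobic residue, and the first also removes a negative charge. Whether that matters in practice depends on where in the protein the change sits, which is exactly what the follow-up structural work would need to establish.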


In conclusion, what is evident is that more work needs to be done to fully understand the scope and implications of what we may be faced with. Two key questions must be investigated: whether the mutations found in the genome sequenced from the South African patient are polymorphic, that is, common enough to be found in at least one percent of the global patient pool, and what the distribution of this variant looks like in South Africa. Sequencing as many patients as possible in parallel, using high-throughput technologies such as Next Generation Sequencing (NGS), can answer both. Lastly, the kinetics, thermodynamics and structures of the proteins encoded by this mutated genome should be analysed to comprehend the full extent of what these changes mean for South Africa. After that, we must develop the full capacity of our biotechnology start-ups to use this information and offer solutions in drug, therapy and test design.