Cracking the genome mystery

  • Published at 07:14 pm May 15th, 2020
Photo: Bigstock

Ultimately, biology will provide the answer

SARS-CoV-2 has so far infected more than 4,100,000 people in 187 countries and claimed over 300,000 lives, yet no drug or vaccine is available. In Bangladesh, over 16,500 people have been infected, of whom over 250 have died. Lockdown can provide temporary relief, but we need a sustainable solution.

Although three SARS-CoV-2 variants (A, B, and C) have been described, we still do not know which one prevails in our country, how and by which route it was transmitted here, whether it has acquired any mutations by now, and how deadly it has become. Nor do we know why some people are affected more severely, showing serious symptoms, while others remain asymptomatic.

We do not clearly know why and how this coronavirus has created havoc in some countries while others are only mildly affected. In the modern era, problems in the biological sciences are tackled by a bottom-up approach: we sequence the genome of the relevant organism and associate it with other metadata to address the problem and find solutions.

For this reason, 80 countries have so far deposited more than 24,000 genome sequences of this virus, including countries such as Nepal and Vietnam where the coronavirus problem is comparatively less severe. Since the first cases were reported on March 7, 2020 by the country’s epidemiology institute IEDCR, we have repeatedly advocated the need for genome sequencing of this virus. We have also given assurance that Bangladesh has made substantial advances in science and technology -- especially with the special attention of the prime minister -- and that we are now able to do genome sequencing by Next Generation Sequencing (NGS) in our country.

Several institutes and private organizations have NGS machines and can sequence virus genomes, and we also have expert and experienced bioinformaticians who can perform complete genome sequence analysis.

Finally, however, the icebreaking work has been done by the Child Health Research Foundation (CHRF). Dr Senjuti Saha and Dr Samir Kumar Saha, along with their team from CHRF, collected samples from a 22-year-old female coronavirus patient and arranged whole genome sequencing of the virus on the Illumina iSeq 100 NGS platform.

As soon as news of the deposition of the genome sequence data became available on the afternoon of Tuesday, May 12, we retrieved the sequence and associated information from the public repositories GISAID and CNCB and started to explore it. Led by me, the Epigenetic and Bioinformatics team on nCoV research at the Department of Genetic Engineering and Biotechnology, University of Dhaka, has done a basic analysis of the genome.

My team member Mr Abdullah Al Kamran Khan worked with me on this analysis. We compared the sequence with the first reported coronavirus genome sequence from Wuhan, China, which is globally considered the “reference.” Strikingly, we found that this genome is very similar (99.7% similarity) to the reference SARS-CoV-2 isolated from Wuhan.

There are changes at only nine positions, and each is a single nucleotide change (SNP). There is no deletion or insertion of any large sequence compared to the original reference.
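A comparison of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not the pipeline our team used: the function names `list_snps` and `percent_identity` are hypothetical, the two sequences are invented toy strings rather than viral data, and the code assumes both sequences have already been aligned to equal length.

```python
def list_snps(reference, sample):
    """Return (position, ref_base, sample_base) for each single-base mismatch.

    Both inputs must already be aligned to the same length; positions are
    1-based. Gaps ('-') and ambiguous bases such as 'N' are skipped rather
    than counted as SNPs.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to equal length")
    snps = []
    pairs = zip(reference.upper(), sample.upper())
    for position, (ref_base, sample_base) in enumerate(pairs, start=1):
        if ref_base != sample_base and ref_base in "ACGT" and sample_base in "ACGT":
            snps.append((position, ref_base, sample_base))
    return snps


def percent_identity(reference, sample):
    """Percentage of aligned columns where both sequences carry the same character."""
    matches = sum(r == s for r, s in zip(reference.upper(), sample.upper()))
    return 100.0 * matches / len(reference)


# Toy sequences, invented for illustration only (not real viral data).
ref = "ATGGCTAGCTAGGATCCA"
smp = "ATGGCTTGCTAGGGTCCA"

for position, ref_base, sample_base in list_snps(ref, smp):
    print(f"{position}: {ref_base}>{sample_base}")
print(f"identity: {percent_identity(ref, smp):.2f}%")
```

On real genomes the reported similarity depends on the alignment length and on how gaps and ambiguous bases are treated, so figures from different tools may differ slightly.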

However, to our surprise, we found that this genome has acquired two new mutations which have not been seen among the viruses reported so far. Hence, we examined them closely. At position 1163 (gene orf1ab), a new mutation from A to T has been detected. Previously, at the same position, an A to C change was reported in one virus and an A to G change in another genome.

A brand new mutation, not reported so far, was also detected at position 17019 in our Bangladeshi virus isolate. This means that these are new changes that the virus has acquired after entering Bangladesh. The other seven of the nine mutations are very common among the viruses sequenced so far. Further studies are required to know what trouble or benefit these new mutations have brought to us.
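Locating a genome position within a gene is a simple interval lookup. The sketch below is illustrative only: the gene coordinates are an assumption taken from the NCBI reference annotation NC_045512.2, not from our analysis, only a handful of genes are listed, and `gene_at` is a hypothetical helper name.

```python
# Gene coordinates (1-based, inclusive). ASSUMPTION: taken from the NCBI
# reference annotation NC_045512.2; they are not stated in this article,
# and only a subset of genes is listed here.
GENES = {
    "ORF1ab": (266, 21555),
    "S": (21563, 25384),
    "E": (26245, 26472),
    "M": (26523, 27191),
    "N": (28274, 29533),
}


def gene_at(position):
    """Return the name of the listed gene containing `position`, or None."""
    for name, (start, end) in GENES.items():
        if start <= position <= end:
            return name
    return None


# The two novel positions discussed above.
for position in (1163, 17019):
    print(position, gene_at(position))
```

Under these assumed coordinates both positions fall within ORF1ab, which agrees with the gene named above for position 1163.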

Interestingly, one of these nine mutations is a single nucleotide change (SNP) in the spike protein gene. It is a non-silent (non-synonymous) mutation that changes the amino acid at the 614th position of the spike protein from aspartate to glycine (D614G).
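How a single nucleotide change becomes an amino acid change can be shown with the standard genetic code. The sketch below rests on assumptions that are not stated in this article: that the spike gene starts at genome position 21563 (reference annotation NC_045512.2), that D614G corresponds to an A to G change at genome position 23403, and that the affected codon is GAT. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_number(genome_position, gene_start):
    """1-based codon index within a gene for a 1-based genome position."""
    return (genome_position - gene_start) // 3 + 1


def mutation_effect(codon, offset, new_base):
    """Classify a single-base change at 0-based `offset` within `codon`."""
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "non-synonymous"
    return before, after, kind


# ASSUMPTIONS (not from the article): spike starts at 21563, the D614G
# change sits at genome position 23403, and the reference codon is GAT.
print(codon_number(23403, 21563))      # codon index within the spike gene
print(mutation_effect("GAT", 1, "G"))  # middle base A -> G
print(mutation_effect("GAT", 2, "C"))  # third base T -> C, for contrast
```

The contrast case shows why not every SNP matters at the protein level: a change in the third base of this codon still encodes aspartate, whereas the middle-base change yields glycine.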

This is of particular interest because this mutation is probably what allowed the virus to spread quickly in the European and American populations and out-compete the original virus from China. It creates an additional serine protease (elastase) cleavage site near the junction of the S1 and S2 subunits of the spike protein.

The interesting aspect is that in humans, a single nucleotide deletion (deletion of a C nucleotide, delC, at the rs35074065 variant site) in the TMPRSS2 gene facilitates very effective cell entry by SARS-CoV-2 carrying the D614G mutation. Dr Hemayet Ullah from Howard University, USA, also informed us that this delC mutation is very common in the American and European populations but very rare in the East Asian/Asian population. Hence, the aspartic acid to glycine change in the S protein may make the virus more infective in the American and European populations while doing less harm in Asian countries. We do see a less severe effect in Asian countries than in Europe and America. Any mutation that is deleterious from the perspective of an organism is lost through natural selection, and we hope that a more virulent mutation does not appear later in Asian countries. Several research papers are also available on this mutation.

To understand the origin, we have constructed phylogenetic trees (UPGMA and neighbour-joining) in MEGA with default parameters, using representative sequences from 60 other countries and the reference sequence, totaling 350 sequences. The trees show that this Bangladeshi SARS-CoV-2 genome isolate is closer to the European cluster. Most likely the patient was infected by someone who had returned from a European country, or she herself may have returned from one. We are fine-tuning the phylogenetic tree, and are also in the process of building a tree from 10,000 selected high-quality sequences from 80 countries to better explain the origin and route of transmission of this particular virus.
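The idea behind UPGMA clustering can be sketched in plain Python. This is a minimal teaching sketch, not MEGA's implementation and not the analysis described above: the sample names and pairwise distances are invented, branch lengths are not computed, and `upgma` is a hypothetical function name.

```python
def upgma(names, matrix):
    """Cluster labels with UPGMA and return the tree as nested tuples.

    names:  list of labels
    matrix: symmetric list-of-lists of pairwise distances
    """
    n = len(names)
    clusters = {i: (names[i], 1) for i in range(n)}  # id -> (subtree, leaf count)
    dist = {(i, j): matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        i, j = min(dist, key=dist.get)
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        new_dist = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_dist[(k, next_id)] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        dist = {p: v for p, v in dist.items() if i not in p and j not in p}
        dist.update(new_dist)
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Invented toy data: pairwise counts of differing sites (not real results).
names = ["Wuhan_ref", "Europe_1", "Europe_2", "Bangladesh"]
matrix = [
    [0, 4, 5, 6],
    [4, 0, 1, 2],
    [5, 1, 0, 3],
    [6, 2, 3, 0],
]
print(upgma(names, matrix))
```

With these toy distances the hypothetical Bangladeshi sample joins the two European samples before the reference does, which is the kind of grouping a tree is read for. UPGMA assumes a constant rate of change across lineages, which is one reason to compare it with neighbour-joining.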

It is important to understand that a single genome sequence is not enough to comprehend the pattern of infection in Bangladesh. We need sequences of at least 100 isolates. We have submitted a proposal to the ICT ministry in response to its “Call for Nation (Hackathon).”

In this study proposal, we aim to create a dataset combining 100 coronavirus genomes from Bangladeshi patients and to integrate this genome information with the patients’ personal, clinical, treatment, and diagnostic information. This information will be analyzed extensively by computational methods for clustering, phylogenetic, and pharmacogenomic studies, and compared with data available worldwide to build a concrete information base that will help pharmaceutical industries produce appropriate drugs and vaccines for our population.

The ICT Ministry will also be able to announce that Bangladesh has uncovered the genome mystery of the coronavirus circulating in the country and traced back its transmission. This project will be a multicentre research effort in which essential help from the ICT ministry and the Bangladesh government, and help from IEDCR through the government, will be required to obtain patients’ samples and relevant clinical data. We will carry out Next Generation Sequencing of the viral genome and other analyses with our own resources in Bangladesh.

If the ICT ministry and the government support us, further research will also be possible in the future in which, in addition to the viral genome, we sequence the genomes of some individuals who were infected and developed the disease as well as healthy individuals who did not develop it.

This may also reveal the factors, if any, that conferred resistance on them. Our team consists of relevant experts who are well experienced in similar projects at home and abroad, and every member has a young, energetic, and well-trained working group.

ABMM Khademul Islam is Associate Professor, Genetic Engineering and Biotechnology, University of Dhaka.
