Origin of Covid-19: What We Now Know

An overview of the WHO-China investigation report, an update on current knowledge, and some next steps

Shin Jie Yong, MSc (Res)
16 min read · Apr 18, 2021
Image: DrAfter123/iStock

The causative agent of Covid-19, SARS-CoV-2, is a highly successful coronavirus that has caused a global pandemic. How it emerged in humans, however, remains a crucial question that nobody has fully answered. After all, knowing how a clinically important virus came into existence is key to anticipating and stopping the next one.

When the World Health Organization (WHO) conducted a field visit in China between January 14 and February 10, 2021, and published its report on March 30, 2021, we thought we might finally have the answer to the origin of Covid-19. Unfortunately, we don’t. But at least we have some hints and have moved a step closer.

The WHO-China report in brief

First, let’s begin with the WHO-China report that spans 120 pages.

The report introduces the members involved (17 Chinese experts and 17 experts from other countries), followed by their agenda. The international experts spent weeks one and two in quarantine while the Chinese experts presented their data. In weeks three and four, the international experts visited several important sites in China, including the Wuhan Institute of Virology (WIV) and the Huanan wet market, and interviewed the workers there.

Source: WHO-China Joint Report

Next, pages 16–110 of the report expound on the main findings, which involve:

  • Epidemiological data on respiratory infection cases (including Covid-19), purchases of medications (for fever, cold, and cough), pneumonia death rates, and blood donor screening in Wuhan and surrounding provinces. Results revealed, “no evidence to suggest substantial SARS-CoV-2 transmission in the months preceding the outbreak in December [2019].” But it “does not exclude, however, the possibility that some SARS-CoV-2 circulation was occurring in the population at a low level.” Only in December 2019 was there a spike in Covid-19 cases in Wuhan, some of which (but not all) were linked to the Huanan wet market therein.
  • Molecular data analyses from multiple genetic databases to create an evolutionary tree (or map) of coronaviruses, including SARS-CoV-2. This section concludes that the closest relative of SARS-CoV-2 is the RaTG13 bat coronavirus in Yunnan, and that “Estimates of the time to most recent common ancestor (from literature and re-analysis) suggest that virus transmission or circulation date might be recent, in late 2019.”
  • Environmental data from samples of floors, doors, object surfaces, sewage, animal products, and stray cats in the Huanan market, collected from January 1 to March 2, 2020, and tested for SARS-CoV-2 via RT-PCR. Of the 923 samples, 73 nonanimal samples were positive. But none of the 457 animal-related samples (involving 188 animals of 18 species) were positive (table four, page 99; shown below). They even tested 2,480 samples from 37 wild animal species in neighboring provinces, and none were positive (table five; page 100). They also did antibody testing on 11,708 blood serum samples from livestock and poultry animals and RT-PCR testing on 26,807 stored animal samples across China, and none were positive (tables six and seven; pages 101–102). Several thousand samples from wildlife (including captive and farmed) animals in China (74 species post-outbreak and 69 species pre-outbreak, including >1,100 bats) were also tested, and again nothing was found (tables eight to 10; pages 103–106).
  • Environmental data from the sampling of 440 imported cold-chain goods from 37 countries into Wuhan (table 11; page 107). Results found that the rate of Covid-19 cases linked to cold-chain goods was 3.3x the rate linked to goods without cold-chain (5.6% vs. 1.7%). Besides, recall from the above point that 73 nonanimal samples tested positive for SARS-CoV-2, and these samples are linked to stalls that handled cold-chain goods. But all of the imported cold-chain samples (that is, animal and package surface samples) tested negative for SARS-CoV-2 by RT-PCR.
Source: WHO-China Joint Report. Table four; page 99. None of the 457 samples from 188 animals of at least 18 different species were positive for SARS-CoV-2 by RT-PCR.
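
As a quick check, the 3.3x figure in the cold-chain point above follows directly from the two quoted percentages. The short Python sketch below only redoes that division; the function and variable names are illustrative assumptions, not anything taken from the report.

```python
# Percentages as quoted in the cold-chain point above.
cold_chain_rate = 5.6      # cases linked to cold-chain goods (%)
non_cold_chain_rate = 1.7  # cases linked to goods without cold-chain (%)


def rate_ratio(exposed: float, unexposed: float) -> float:
    """Return how many times larger the exposed rate is than the unexposed rate."""
    if unexposed <= 0:
        raise ValueError("unexposed rate must be positive")
    return exposed / unexposed


ratio = rate_ratio(cold_chain_rate, non_cold_chain_rate)
print(f"Rate ratio: {ratio:.1f}x")  # prints "Rate ratio: 3.3x"
```

Note that a ratio like this describes an association only; it says nothing on its own about whether cold-chain goods introduced the virus.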

The report then proposed and ranked, based on theoretical arguments and an opinion scale, the possible hypotheses for the origin of SARS-CoV-2 (pages 111–120). In order from most to least likely, the hypotheses are:

  1. Introduction via an intermediate host, followed by zoonotic transmission (likely to very likely).
  2. Direct zoonotic transmission, that is, animal-to-human spillover (possible to likely).
  3. Introduction through cold-chain animal products (possible).
  4. Introduction through a lab incident (extremely unlikely).

The report concludes that more research is needed on the food supply chain of the markets in Wuhan to identify the potential hosts of SARS-CoV-2. “Given the geographic range of the animal species in which closest relatives of SARS-CoV-2 have been found,” the report wrote, which includes parts of Southeast Asia such as Thailand and Cambodia. “Such surveys should be expanded to include other countries, guided by knowledge on ecology and smuggling routes.” Further research into the fourth hypothesis, however, was not recommended as much as the first three.

What others think of the WHO-China report

The WHO Director-General Tedros Adhanom Ghebreyesus has admitted that the investigation did not collect enough evidence and that it was difficult for the international experts to access and assess the raw data. Other countries (the U.S., Australia, Britain, Canada, Czechia, Denmark, Estonia, Israel, Japan, Latvia, Lithuania, Norway, Slovenia, and South Korea) agree that the study was “significantly delayed and lacked access to complete, original data and samples.”

Looking at the report’s main findings spanning pages 16–110, there was little actual investigation of the lab leak hypothesis besides a one-day visit to the WIV. Tedros acknowledges the same. “The team also visited several laboratories in Wuhan and considered the possibility that the virus entered the human population as a result of a laboratory incident,” he said in a briefing to the UN health agency on March 30, 2021. “However, I do not believe that this assessment was extensive enough… Although the team has concluded that a laboratory leak is the least likely hypothesis, this requires further investigation, potentially with additional missions involving specialist experts, which I am ready to deploy.”

Therefore, the criticisms of the report mainly concern the inability to access raw data and the prompt dismissal of the lab leak hypothesis without any actual investigation. So, at present, “all hypotheses remain on the table,” Tedros said.

Source: Text excerpt from the WHO director-general’s remarks at the Member State Briefing on the report of the international team studying the origins of SARS-CoV-2 on March 30, 2021.

Other science journalists and experts agree. For example, Matt Ridley, PhD, a renowned science journalist and author, criticized the WHO for conducting a “superficial two-week investigation,” writing that:

The event turned into a 2-hour Chinese propaganda exercise, entertaining the implausible and evidence-free suggestion that Covid was imported on frozen fish or meat while ruling out even investigating the possibility that it might have leaked from the world’s leading bat coronavirus laboratory, which happens to be in Wuhan. Afterwards, members of the WHO team backtracked, saying they were still open-minded about the laboratory, that they had only gone along with the frozen-fish theory “to respect, a bit, the findings” of their Chinese colleagues and that the visit had not been an “investigation” after all.

Nature, a world-leading science journal, has reported that “A small group of scientists have sent letters to the media saying that they wouldn’t trust the outcome of the investigation because it was closely overseen by China’s government.” About 25 international researchers have also signed an open letter calling for a full, unrestricted, and independent (free of unresolved conflict of interest) investigation into the Covid-19 origin in China.

There may also be others who kept their concerns quiet. Any mention of a possible lab leak of SARS-CoV-2 was once quickly dismissed as a conspiracy theory, alongside claims such as Bill Gates using the virus to push for a new world order or 5G networks triggering its emergence, and could be career suicide for scientists. The rising hate against Asian Americans and Donald Trump, the former U.S. president, calling SARS-CoV-2 the “Chinese virus” during the pandemic further discouraged any mention of a lab leak.

So, this hypothesis has never been subjected to a “fair and dispassionate discussion of the facts as we know them,” remarked David Relman, MD, a microbiologist and immunologist at Stanford University.

But there are other experts who believe otherwise. “I’m sure people will say that the Chinese researchers are lying, but it strikes me as honest,” Eddie Holmes, a virologist at the University of Sydney, told Nature. “But the sceptics are going to want a deeper investigation than the Chinese government allowed.” Peter Daszak, president of EcoHealth Alliance, agrees that there were no signs of dishonesty when the WIV researchers were interviewed.

Arguments against the lab leak hypothesis

One strong argument is simply that the animal spillover hypothesis — from contact with bats, intermediate hosts, or cold-chain goods — is much more probable, in line with the WHO-China report’s conclusions.

Regarding the probability of SARS-CoV-2 introduction into humans, the lab leak hypothesis relies on a single or a few events — namely, a lab mistake (setting aside the deliberate creation and release of SARS-CoV-2).

In contrast, there are millions of events in which the wildlife spillover of SARS-CoV-2 into humans can occur, argued Josh Fischman, the senior editor at Scientific American, offering this analogy: “If you had to bet on a particular card turning up in your poker hand, would you put your money on the card that only has one chance? Or the card that has a million chances to show up? Both scenarios are possible. One is a lot more probable.”

Therefore, although there’s no evidence to prove either the animal spillover or the lab leak hypothesis, the former is more probable.

Besides, the WIV claims that they have not worked on any viruses very similar to SARS-CoV-2 before the outbreak. To imply that there’s a possible lab leak means that WIV must have been working with live SARS-CoV-2 virions, which there’s no evidence of, said Daszak, who has worked closely with the WIV on coronavirus research for 15 years.

There are suspicions that samples sent to the WIV from miners who fell sick with Covid-19-like pneumonia after working in a bat cave in Yunnan in 2012 might have contained live SARS-CoV-2, which might have then leaked, per the Mojiang Miners Passage (MMP) hypothesis. However, researchers at the WIV have retested the serum samples they collected from the miners and found no signs of SARS-CoV-2 via RT-PCR and antibody tests.

Image from the author: The Mojiang Miners Passage (MMP) hypothesis in brief (refs a, b, c, and d).

Ram Samudrala, PhD, a professor of computational biology and bioinformatics at the University at Buffalo in New York, once commented that “In my opinion, in terms of a conspiracy that involves humans, the best way to evaluate its correctness, is to ask: ‘How many people would have to keep quiet in order for this to work.’”

Many Chinese doctors quickly warned the public about SARS-CoV-2 during the early outbreak in Wuhan despite restrictions and threats by the Chinese Communist Party, Samudrala mentioned. So, it’s highly implausible that nobody at the WIV has confessed by now if SARS-CoV-2-related experiments were really going on at the WIV before the outbreak.

Lastly, as mentioned, researchers at WIV seemed honest when interviewed by the WHO international team. So, if the lab leak hypothesis is true, it would be a lab mistake that everyone missed. It would also be a lab mistake that could have happened in other labs outside of China.

Arguments for the accidental lab leak hypothesis

The presumption here is that if the lab leak hypothesis is true, the leak was most likely accidental. A deliberate release would be unlikely to happen in China, let alone in Wuhan, where the WIV is located. There are at least three ways in which an accidental lab leak could have happened:

  • Some of the Wuhan researchers might have gotten infected in the bat cave in Yunnan and carried and seeded the outbreak in Wuhan.
  • Some of the samples taken from the bat cave in Yunnan to Wuhan might have carried live SARS-CoV-2 that seeded the outbreak in Wuhan.
  • Gain-of-function experiments might have been performed to create SARS-CoV-2. The necessary technology for this is already available (and even published) and can be performed without leaving any genetic traces.

As there’s no evidence for any of these speculative scenarios, this article will just focus on the accidental lab leaks, regardless of how they happen.

Robert R. Redfield, MD, MPH, a former director of the U.S. CDC, told CNN in an interview on March 28, 2021, that “I still think the most likely etiology of this pathogen… was from a laboratory, you know, escaped. Other people don’t believe that, that’s fine… It’s not unusual for respiratory pathogens that are being worked on in a laboratory to infect a laboratory worker.” But Redfield cautioned, “That’s my own view. It’s only opinion. I’m allowed to have opinions now.”

As Redfield pointed out, accidental lab leaks of microbes can happen, even in the top laboratories in the world. Over 1,100 cases of accidental leaks of bacteria, viruses, or toxins with potential public health risks happened between 2008 and 2012, USA Today reported. Other notable examples of accidental lab leaks of microbes include:

  1. The 2014 CDC anthrax lab incident, where workers handled live anthrax bacterium that was thought to be dead. About 70 workers possibly exposed to anthrax were treated with antibiotics and vaccines, and nobody fell ill.
  2. The 2014 CDC influenza lab incident, where the shipment of a harmless H9N2 influenza strain to the USDA poultry lab was contaminated with a dangerous H5N1 influenza strain, but nobody got sick.
  3. The 2014 National Institutes of Health (NIH) smallpox lab incident, where six vials of the lethal variola virus were found in the lab for low-risk research. The vials might have been there since the 1960s and were immediately transferred to a stricter lab before any disaster happened.
  4. Even SARS-CoV-1 has accidentally leaked from the lab on six separate occasions after the 2003 epidemic — one in Singapore, one in Taiwan, and four in Beijing, China. For example, in 2004, the third SARS lab leak happened in less than a year at the Beijing CDC, resulting in hundreds of people being quarantined, nine infections, and one death.

There are unpublished reports of one or two researchers in the WIV who got sick with flu-like symptoms in the fall of 2019, right before the pneumonia outbreak in Wuhan, although they tested negative for SARS-CoV-2 in March/April 2020. But this contradicts the initial claims of Shi Zhengli, head of the WIV, that there were zero infections among their staff and students. As a result, the U.S. has called for more transparency on this matter.

The WIV also has a history of questionable lab practices. The WIV received two official warnings from American embassy officials in 2018 on inadequate lab safety measures. There were also reports that lab workers got wounded from bat attacks and were exposed to bat urine by accident.

Plus, there have been suspicious behaviors at the WIV surrounding the Mojiang incident, as pointed out by a paper published in Frontiers in Public Health, as well as other sources:

  1. In 2012–2013, the WIV researchers went to the Mojiang mine in Yunnan, right after the incident in which miners working there got severe Covid-19-like pneumonia. The researchers sampled and discovered several bat coronaviruses, including BtCoV/4991. Oddly, they published its full genome only in 2020, renaming it RaTG13 without explanation and making no reference to the Mojiang mine. (RaTG13 is the closest known relative of SARS-CoV-2, sharing about 96% genomic identity.)
  2. The head of the WIV, Shi Zhengli, mentioned that a fungus caused pneumonia in the miners, contrary to what molecular analyses showed.
  3. The database that houses genomic data of coronaviruses, including those sampled from the Mojiang mine, was locked in September 2019 and taken down during the spring of 2020 to allegedly protect against hackers. This database was not even examined during the WHO-China joint investigation. And many now demand that the database’s contents be made public again.
  4. Many journalists from the BBC and Associated Press wanted to investigate the Mojiang mine but were denied access and tailed by the Chinese authorities.
  5. According to the U.S. embassy in Georgia, the WIV has also performed secret and classified research for China’s military, which the WHO investigators were most probably not allowed to see.

Besides, there’s the furin cleavage site (FCS) in the spike protein of SARS-CoV-2 that endows it with the ability to infect human cells with high efficiency. The FCS is arguably the most peculiar thing about SARS-CoV-2 because it is not found in its close relatives (that is, the beta-coronavirus family). This led a few scientists to suspect that the FCS was inserted into SARS-CoV-2 via gain-of-function lab experiments that can leave no genetic traces. But this is mere speculation for now, as the FCS may exist in unsampled beta-coronaviruses in the wild, or SARS-CoV-2 might have obtained its FCS from nonbeta-coronaviruses.

Another point in favor of the lab leak hypothesis lies in the widespread animal testing China conducted. They tested tens of thousands (~45,000) of animal samples in Wuhan and surrounding Chinese provinces, mainly via RT-PCR, and found nothing (as detailed above). RT-PCR is a very sensitive test that can pick up genetic fragments of SARS-CoV-2. This means that the entire virion does not have to be present for a positive RT-PCR result.
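
To see where the ~45,000 figure comes from, the sample counts stated explicitly in the report summary earlier can simply be added up. The Python sketch below does only that; the dictionary labels are my own shorthand, and the wildlife samples described as "several thousand" are left out because no exact count is given.

```python
# Sample counts stated explicitly in the report summary earlier in this article.
# The labels are shorthand assumptions; the wildlife samples ("several thousand")
# are excluded because the article gives no exact number for them.
explicit_counts = {
    "Huanan market animal-related samples (RT-PCR)": 457,
    "wild animal samples from neighboring provinces (RT-PCR)": 2_480,
    "livestock and poultry serum samples (antibody tests)": 11_708,
    "stored animal samples across China (RT-PCR)": 26_807,
}

explicit_total = sum(explicit_counts.values())
approx_total = 45_000  # the rounded figure quoted above
remainder = approx_total - explicit_total

print(f"Explicitly counted samples: {explicit_total:,}")  # 41,452
print(f"Gap to the ~45,000 figure:  {remainder:,}")       # 3,548
```

The gap of roughly 3,500 is consistent with the "several thousand" wildlife samples, so the ~45,000 total is a reasonable rounding rather than an exact count.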

Therefore, it appears that the WHO-China report has evidence that SARS-CoV-2 was not found in animals, not even fragments of it. But samples at the WIV were not tested. Yet, the report still concludes that wildlife spillover is the most likely hypothesis, recommending more research in this area. Are tens of thousands of animal samples, all of which tested negative for SARS-CoV-2, not enough?

The next step forward

Yes, apparently, it may not be enough. “The amount of testing that’s been done is not sufficient to say, in any way, that wildlife farms were not carrying the virus,” Daszak remarked. The WHO-China report recommends testing more animal samples from possible smuggling and supplier trading routes, as well as neighboring regions in Southeast Asia. Holmes agrees that the research priority is to “follow the animals.”

The ultimate goal is to find the coronavirus progenitor (or precursor) that gave rise to SARS-CoV-2. This progenitor should share about 99% genomic identity with SARS-CoV-2.

The bat species that carries the RaTG13 coronavirus (the closest known relative of SARS-CoV-2) is Rhinolophus affinis, which is also suspected to harbor the SARS-CoV-2 progenitor. Importantly, R. affinis lives not only in Yunnan but also in other regions of Southeast Asia, such as Thailand and Cambodia. This means that Yunnan, where much of the focus has been, isn’t the only region where the Covid-19 spillover could have occurred.

In a preprint study released last month, researchers in China collected hundreds of samples from various bat species in Yunnan between May 2019 and November 2020. They managed to assemble 24 new coronavirus genomes, including four similar to SARS-CoV-2. But none of them surpasses RaTG13, with its 96% genomic identity, as the closest known relative of SARS-CoV-2.

This preprint tells us an important thing — that the SARS-CoV-2 progenitor was not found in Yunnan during this round of sampling. Given the extensive bat sampling performed in Yunnan over the past decade, it appears that Yunnan may not be the right place to look for the SARS-CoV-2 progenitor. “If you gave me a billion dollars, I would not sample in Mojiang cave [in Yunnan],” said Linfa Wang, a world-leading expert in zoonotic diseases, bat immunology, and pathogen discovery. “I would sample in southeast Asia,” such as Thailand and Cambodia, where sampling is relatively lacking.

So, at this point, we can only hope for more research into other geographical regions, and perhaps the WIV too, to find the progenitor. Yes, even the WIV, if possible. Recall that Tedros said that “all hypotheses remain on the table.” At least, after the WHO-China joint investigation, discussion of the lab leak hypothesis has become more acceptable and transparent compared to last year, when it was considered taboo and a conspiracy theory.

We should also consider the possibility that the SARS-CoV-2 progenitor may have gone extinct. Variants or strains of viruses can disappear as live viruses capable of infecting a host, even though their genetic sequences can still be preserved in virtual databases. For instance, the flu strain that caused the 1918 pandemic only exists as a virtual genetic sequence now, unless it is reconstructed using reverse genetics technology. Thus, we might have missed the window of opportunity to identify and sequence the coronavirus progenitor that gave rise to SARS-CoV-2. This suggests that the longer we wait, the less likely we are to find the progenitor.

Over a year has passed since the pneumonia outbreak in Wuhan in December 2019. And the origin of SARS-CoV-2 or Covid-19 may forever remain obscure. “I seriously doubt we’ll find a smoking gun,” Filippa Lentzos, PhD, a biosecurity scientist at King’s College London, told Nature this month. “There won’t be an undisputable origins answer. All we’ll have are likelihoods and probabilities.” But at least this is better than being clueless.

If Covid-19 emerged from wildlife, we should tighten surveillance and regulations there, as we have always done, but more seriously. If the lab leak hypothesis is true, or at least has a sufficiently high probability of being true, we should rethink how and what kind of research should be done.

Labs like the WIV that house the largest collection of coronaviruses are meant to study and understand how much of a threat these coronaviruses are in order to prevent a pandemic like Covid-19. “But if instead of doing that, it caused the pandemic, we need to know that,” Ridley said. If not, then more funding should go into the WIV for them to do a better job at preventing the next coronavirus outbreak or pandemic.



