Life originated in water, which means it has been evolving in the oceans for much longer than on land. This long history has produced a huge variety of organisms, especially microorganisms such as bacteria and archaea. Ocean microorganisms play key roles in the biochemical and energy processes that affect the state of the ocean and, ultimately, the Earth’s climate. That is why learning about the diversity and functions of the organisms that populate the oceans is so crucial.
Scientists used an advanced technique called metagenomics to examine the DNA of all organisms present in water samples taken from different locations and depth zones of oceans around the world. On this basis, they assembled the so-called global ocean genome: the complete set of genes from all marine organisms, from bacteria and archaea, to fungi and plants, to animals, along with the information those genes encode. It is the foundation of marine biodiversity, of the functioning of marine ecosystems and of all the biogeochemical processes within them.
The long history of the evolution of life in the ocean
The ocean is the largest habitat in the world. It covers more than 70 percent of the globe’s surface and holds more than 1.3 billion km³ of water, or about 97 percent of Earth’s total water resources. The ocean is also where the first forms of life on Earth originated, about 3.9 billion years ago. The long evolutionary history of life in the oceans is illustrated by the fact that, of the 34 known animal phyla, only one (Onychophora, the velvet worms) is found exclusively on land, while all the others also have representatives in aquatic environments. Considering how long life has been evolving in the ocean, it should come as no surprise that the ocean is characterized by great biodiversity. Most of the organisms inhabiting it remain unknown to us.
By far the most abundant organisms in the ocean are prokaryotes, that is, unicellular organisms without cell nuclei or specialized organelles. They comprise two groups: bacteria (bacteria proper, eubacteria) and archaea. Although the two groups have much in common, an important distinguishing feature is the presence of peptidoglycan (murein) in the cell wall of many bacteria and its absence in archaea. More than two million species of bacteria are estimated to live in the global ocean, but little is yet known about them or about other types of ocean microbes. These organisms are extremely difficult to study, and more than 99 percent of them have never been grown in a laboratory.
The power of metagenomics, or the answer to the question of who lives in the ocean
One method of studying microorganisms is sequencing their DNA, which means reading the genomes (the complete sets of genes) of the organisms. For many years, sequencing was an extremely time-consuming and error-prone process, and individual experiments could read only short stretches of DNA. For example, it took scientists from 20 institutions around the world 13 years (1990-2003) to sequence the human genome as part of the international Human Genome Project, and the work was not fully completed until 2021.
Remarkable technical advances in DNA sequencing have not only made it faster, easier and cheaper to read an entire genome, but have also enabled the rapid development of a field called metagenomics. This is the study of the collective genetic information of all organisms contained in an environmental sample, such as water or soil. This information reveals what types of organisms are present and what ecological functions they perform in the habitats studied.
The first ocean metagenomic survey was conducted in 2003-2004 as part of the Sorcerer II Global Ocean Sampling Expedition, which analyzed the marine plankton community. Further expeditions have taken place since then, including the Tara Oceans expedition, which between 2009 and 2013 collected more than 200 samples from 68 sites, mostly in the upper (epipelagic) part of the ocean. Metagenomic analysis of these samples identified 33.3 million genes (for comparison, the human genome contains about 30,000).
The global ocean genome, or how many genes the ocean holds
If you can sequence the DNA of a piece of the ocean, why not try to do it for the whole? The initiative to create a complete “catalog” of the global ocean genome, dubbed the KMAP Global Ocean Gene Catalog 1.0, was undertaken by researchers at King Abdullah University of Science and Technology (KAUST) in Saudi Arabia, who published their results in January 2024 in Frontiers in Science [1, 2].
Using the European Nucleotide Archive (ENA) genetic data repository, they collected metagenomic data from more than 2,000 samples taken as part of earlier studies. Most of them came from the Pacific (41 percent) and the Atlantic (28 percent), with others from the Indian Ocean, the Mediterranean, the Arctic and the Southern Ocean. The vast majority of samples (78.5 percent) were taken from the upper zone of the ocean (depths up to 200 m), while the rest came from the mesopelagic zone (depths of 200-1000 m), the dark ocean (below 1000 m) and, in about 4 percent of cases, the sediments of the benthic zone, or ocean bottom. From the full DNA sequence data of each sample, the researchers identified individual genes and grouped them into more than 300 million gene clusters (groups of genes with similar function, encoding closely related proteins) present in the ocean’s metagenomes.
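The depth zones described above amount to a simple lookup. The following is a minimal Python sketch, assuming only the boundaries quoted in the text (200 m and 1000 m). The function name `depth_zone` and the `from_sediment` flag are hypothetical names chosen for illustration and are not taken from the KMAP analysis itself.

```python
def depth_zone(depth_m: float, from_sediment: bool = False) -> str:
    """Assign a sample to one of the ocean zones named in the article.

    Boundaries (200 m, 1000 m) are the ones quoted in the text; the
    benthic zone is defined by sample type (sediment), not by depth.
    """
    if depth_m < 0:
        raise ValueError("depth must be non-negative")
    if from_sediment:
        return "benthic"
    if depth_m <= 200:
        return "epipelagic"
    if depth_m <= 1000:
        return "mesopelagic"
    return "dark ocean"


if __name__ == "__main__":
    # Illustrative depths only, not real sample records.
    for depth in (50, 500, 3000):
        print(depth, "m:", depth_zone(depth))
```

Treating the benthic zone as a flag rather than a depth range reflects the fact that sediment samples can come from the sea floor at any depth.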
The final step was to determine what the genes were responsible for and what organisms they came from. This is done by comparing the identified gene sequences, and the proteins they encode, with existing databases (gene repositories). The researchers were able to identify 52 percent of the gene clusters, which is a very good result, but it means that almost half remain unknown to science.
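As a rough consistency check on these figures, the sketch below combines the 52 percent annotation rate with the roughly 163 million annotated clusters that this article gives for the KMAP 1.0 catalog. It assumes that both numbers refer to the same set of clusters. The variable names are illustrative, and both inputs are rounded values, so the result is only an estimate.

```python
# Rounded values quoted in the article (assumed to describe the same clusters).
annotated_clusters = 163e6   # annotated clusters in KMAP 1.0
annotated_fraction = 0.52    # share of clusters that could be identified

# Total number of clusters implied by the two figures above.
implied_total = annotated_clusters / annotated_fraction

# Clusters that remain uncharacterized under this estimate.
unannotated = implied_total - annotated_clusters

print(f"implied total: {implied_total / 1e6:.0f} million clusters")
print(f"still unannotated: {unannotated / 1e6:.0f} million clusters")
```

The implied total of a little over 300 million is in line with the “more than 300 million gene clusters” reported for the catalog.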
The main inhabitants of the ocean are bacteria
Anyone who thinks that the main inhabitants of the oceans are fish is mistaken. By analyzing those gene clusters that could be linked to a specific type of organism and for which functional information was available, the researchers found that more than 78 percent of the genes in all analyzed samples belonged to bacteria, 12 percent to eukaryotes (organisms whose cells contain a nucleus with chromosomes, such as animals, plants and fungi), and the remaining 10 percent to archaea and viruses.
However, the genes of these four main groups of oceanic organisms are not evenly distributed across the ocean depth zones, although bacteria were dominant in all of them (accounting for 77 to 88 percent). For example, genes of eukaryotic organisms and viruses were identified more often in the epipelagic zone (the upper layers of the ocean) than in the deep, dark ocean, and the opposite was true for archaea. This is not surprising, since conditions vary between depth zones (especially in terms of temperature and access to light), creating ecological niches for different types of organisms.
Interestingly, in the mesopelagic zone, more than half of the identified eukaryotic gene clusters belonged to fungi, suggesting that these organisms play a more important role in oceanic processes than previously thought.
Microbial metabolism can affect Earth’s climate
The researchers also looked in detail at genes related to microbial metabolism, which keeps the oceans healthy by controlling the flow of nutrients and energy. Some of these processes are essential for the cycling of elements such as carbon, nitrogen and sulfur, and can thus affect the Earth’s climate. Half of all metabolism-related genes were involved in the processing of carbon compounds, namely carbon dioxide (CO2) or methane (CH4), as energy sources. Both are greenhouse gases and contribute to global warming.
As with taxonomic composition, large differences were found between ocean depth zones. More than 40 percent of the annotated gene clusters found in samples from the bottom (benthic) zone were involved in metabolic processes, compared with only 25 percent in the pelagic zone.
Some bacteria and algae use photosynthesis to convert CO2 into carbohydrates in the presence of sunlight, thus absorbing CO2 from the atmosphere. But photosynthesis is not the only pathway that ocean organisms use to metabolize carbon. Methane metabolism pathways, for example, do not require light and can operate in the ocean depths and bottom zones. The large proportion of gene clusters involved in methane metabolism in the bottom zone testifies to the great importance of this poorly understood area of the ocean for the carbon cycle.
Why do we need an ocean gene catalog?
The KMAP 1.0 ocean gene catalog consists of about 163 million annotated clusters, providing information on the types of organisms that live at different ocean depths and the functions they perform. The global ocean genome is much more than a simple catalog of the organisms living there and their functions. It also has important applications in various fields of research and industry.
The catalog contains information about protein-coding genes that can be useful in drug development, agriculture and other industries. It also contributes to a better understanding of ocean biodiversity and improves knowledge of where various microorganisms occur and what role they play in the biogeochemical processes that shape the state of ecosystems and are relevant to climate change. It makes it possible to track the impact of human activities on marine life, and it can serve as a reference for monitoring the effects of global warming, pollution and other anthropogenic changes in the marine environment. Finally, it can be used to set directions for future research and to formulate research hypotheses about specific habitats, groups of organisms or other areas of marine biology.
Next steps in building the global ocean genome
The authors of the cited studies point to several areas that require further work in order to characterize the global ocean genome more fully. One of the priorities is to increase the number of samples from deep-sea zones and the ocean floor, which are highly diverse and poorly explored environments and are likely to hide many as yet undiscovered genes and functions. Another important task is to extend the analysis to viral RNA.
The authors also cite several technological challenges, including the need for more computing power to analyze metagenomes as new genes are added to existing repositories, and the need to improve techniques for identifying genes. Progress here could help to classify the 48 percent of gene clusters that the authors were unable to characterize, as well as the stretches of metagenome sequence in which no genes were identified.
The global ocean genome, even after the sequencing work is completed, will require regular updates as the ocean is constantly changing. This means that ongoing scientific cooperation on a global scale is needed to fully understand, monitor and exploit the complexity of its biodiversity.
In this article, I drew on, among others, the following works:
[1] Laiolo E., Alam I., Uludag M., Jamil T., Agusti S., Gojobori T. et al. (2024). Metagenomic probing toward an atlas of the taxonomic and metabolic foundations of the global ocean genome. Front. Sci. 1:1038696. doi: 10.3389/fsci.2023.1038696
[2] Laiolo E., Alam I., Uludag M., Jamil T., Agusti S., Gojobori T. et al. (2024). The Global Ocean Genome: A “Catalog” of Ocean Life. Front. Young Minds. 12:1052361. doi: 10.3389/frym.2023.1052361