Academic Thesis

Basic information

Name Kojima Isshu
Affiliated department
Occupation name
researchmap researcher code R000055330
researchmap agency Okayama University of Science

Title

A protein language model for exploring viral fitness landscapes

Bibliography Type

Joint Author

Author

Ito J, Strange A, Liu W, Joas G, Lytras S; Genotype to Phenotype Japan (G2P-Japan) Consortium; Sato K.

Summary

Successively emerging SARS-CoV-2 variants lead to repeated epidemic surges through escalated fitness (i.e., relative effective reproduction number between variants). Modeling the genotype-fitness relationship enables us to pinpoint the mutations boosting viral fitness and flag high-risk variants immediately after their detection. Here, we present CoVFit, a protein language model adapted from ESM-2, designed to predict variant fitness based solely on spike protein sequences. CoVFit was trained on genotype-fitness data derived from viral genome surveillance and functional mutation assays related to immune evasion. CoVFit successfully ranked the fitness of unknown future variants harboring nearly 15 mutations with informative accuracy. CoVFit identified 959 fitness elevation events throughout SARS-CoV-2 evolution until late 2023. Furthermore, we show that CoVFit is applicable for predicting viral evolution through single amino acid mutations. Our study gives insight into the SARS-CoV-2 fitness landscape and provides a tool for efficiently identifying SARS-CoV-2 variants with higher epidemic risk.

Journal (name)

Nature Communications

Publisher

Volume

16

Number Of Pages

16

StartingPage

1

EndingPage

Date of Issue

2025/05

Peer-reviewed

Yes

Invited

No

Language

English

Thesis Type

Research papers (academic journals)

ISSN

DOI

10.1038/s41467-025-59422-w

NAID

PMID

40360496

URL

J-GLOBAL ID

arXiv ID

ORCID Put Code

DBLP ID