ActiveTCR is a unified data optimization framework that integrates active learning with TCR-epitope binding affinity prediction models. In two distinct use cases, ActiveTCR outperformed passive learning, cutting annotation costs by approximately half and reducing annotation redundancy by over 40%, all without compromising model performance. ActiveTCR is the first systematic exploration of data optimization for TCR-epitope binding affinity prediction.
https://github.com/Lee-CBG/ActiveTCR
Pengfei Zhang, Seojin Bang, Heewook Lee
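The annotation-cost idea above can be sketched as pool-based uncertainty sampling, the generic active-learning query strategy: label only the unlabeled TCR-epitope pairs the current model is least sure about. The function name, toy probabilities, and selection rule below are illustrative assumptions, not ActiveTCR's actual implementation.

```python
import numpy as np

def uncertainty_sampling(probs, k):
    """Select the k pool examples whose predicted binding
    probability is closest to 0.5 (least confident)."""
    uncertainty = -np.abs(probs - 0.5)  # higher = less confident
    return np.argsort(uncertainty)[-k:]

# Toy predicted binding probabilities for an unlabeled pool of
# TCR-epitope pairs (values are made up for illustration).
pool_probs = np.array([0.95, 0.51, 0.10, 0.48, 0.99, 0.55])
chosen = uncertainty_sampling(pool_probs, k=2)
# selects the pairs with probabilities nearest 0.5 for annotation
```

In a full loop, the newly annotated pairs are added to the training set and the model is retrained before the next query round.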
Introducing catELMo, an efficient amino acid embedding model tailored for T cell receptors. catELMo yields an improvement of over 20% in absolute AUC when predicting binding affinity for unseen or novel epitopes, outperforming the conventional BLOSUM62 embedding. Moreover, catELMo matches the performance of BLOSUM62 while using about 93% less training data.
https://elifesciences.org/reviewed-preprints/88837v1
Pengfei Zhang, Seojin Bang, Michael Cai, Heewook Lee
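For intuition only, the sketch below contrasts a static per-residue lookup (as with BLOSUM62 rows) against a higher-dimensional learned embedding table at the shape level. catELMo itself produces context-dependent embeddings rather than a fixed lookup; the table values, dimensions, and example CDR3 sequence here are invented for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
aa_index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

rng = np.random.default_rng(0)
# Static baseline: each residue maps to a fixed 20-d vector,
# analogous to its BLOSUM62 row (random stand-in values here).
static_table = rng.normal(size=(20, 20))
# Learned alternative: a wider table standing in for a trained
# embedding model's per-residue output dimension.
learned_table = rng.normal(size=(20, 1024))

def embed(seq, table):
    """Map an amino acid sequence to a (len, dim) embedding matrix."""
    return np.stack([table[aa_index[aa]] for aa in seq])

cdr3 = "CASSLGTDTQYF"  # example TCR CDR3 sequence
static_emb = embed(cdr3, static_table)    # shape (12, 20)
learned_emb = embed(cdr3, learned_table)  # shape (12, 1024)
```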
How can multiple amino-acid-level embeddings be summarized into a single sequence-level embedding more effectively than with average pooling? We build sequence encoders using various architectures, including Transformer, BiLSTM, and ByteNet, and propose PiTE, a state-of-the-art two-step pipeline for TCR-epitope binding affinity prediction.
https://www.worldscientific.com/doi/pdf/10.1142/9789811270611_0032
Pengfei Zhang, Seojin Bang, Heewook Lee
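A minimal sketch of the pooling question PiTE addresses: average pooling treats every residue position equally, while a weighted (attention-style) sum can emphasize informative positions. The weights below are random stand-ins for trained parameters, not PiTE's actual encoders.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy per-residue embeddings for a TCR sequence: length 5, dim 4.
token_embs = rng.normal(size=(5, 4))

# Baseline: average pooling collapses all positions equally.
avg_pooled = token_embs.mean(axis=0)  # shape (4,)

# Alternative: attention-style pooling with a learned scoring vector
# (here a random stand-in for a trained parameter).
scores = token_embs @ rng.normal(size=4)          # one score per residue
weights = np.exp(scores) / np.exp(scores).sum()   # softmax over positions
attn_pooled = weights @ token_embs                # weighted sum, shape (4,)
```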
ATM-TCR leverages multi-head self-attention to capture biological contextual information and improve the generalization of TCR-epitope binding affinity prediction models. We also present a novel application of the attention map to improve out-of-sample performance, demonstrated on recent SARS-CoV-2 data.
https://www.frontiersin.org/articles/10.3389/fimmu.2022.893247/full
Michael Cai, Seojin Bang, Pengfei Zhang, Heewook Lee
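A generic sketch of the multi-head self-attention mechanism that ATM-TCR builds on: each head computes scaled dot-product attention over the sequence, and head outputs are concatenated. Projection weights are random stand-ins; this is not the paper's implementation.

```python
import numpy as np

def multi_head_self_attention(x, num_heads, rng):
    """Minimal multi-head self-attention over a sequence of embeddings.
    x: (seq_len, d_model); d_model must be divisible by num_heads."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    outputs = []
    for _ in range(num_heads):
        # Random projection matrices stand in for trained weights.
        Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(d_head)  # scaled dot-product scores
        attn = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
        outputs.append(attn @ v)            # (seq_len, d_head)
    return np.concatenate(outputs, axis=-1)  # (seq_len, d_model)

rng = np.random.default_rng(0)
x = rng.normal(size=(7, 8))  # e.g. 7 residues, embedding dim 8
out = multi_head_self_attention(x, num_heads=2, rng=rng)
```

The attention matrix `attn` is the per-head "attention map" that can be inspected to see which residue pairs the model relates.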