Yuhui Hong

Postdoctoral Scholar at Noble Lab (2025 - Present)
University of Washington
Advised by Professor William Stafford Noble

Ph.D. in Computer Science (2020-2025)
Indiana University Bloomington
Advised by Professor Haixu Tang

I specialize in computational mass spectrometry, developing methods to analyze complex biological data acquired with analytical instruments (e.g., LC-MS, GC-MS). Currently working in mass spectrometry-based proteomics, I focus on advancing de novo peptide sequencing methods to better understand biological systems.

During my Ph.D. studies, I developed computational approaches for small molecule identification, including predicting LC-MS/MS spectra and molecular properties from 3D conformations, and identifying chemical formulas directly from spectra to move beyond traditional database-dependent methods.

Prior to graduate school, I received my B.S. in Computer Science from Xidian University, China, in 2019, and subsequently worked as a research assistant in computer vision at Xi'an Jiaotong University from 2019 to 2020 under the guidance of Professor Yaochen Li.

Please feel free to get in touch!


News 📰

- [09/19/2025] Awarded the UW Data Science Fellowship at the eScience Institute, University of Washington.

- [07/07/2025] I will join the Noble Lab at the University of Washington as a Postdoctoral Scholar in August 2025.

- [05/09/2025] Recipient of the Luddy Outstanding Research Award.

- [03/04/2025] Our work, A Task-Specific Transfer Learning Approach to Enhancing Small Molecule Retention Time Prediction with Limited Data, has been selected for an oral presentation at ASMS 2025.


Selected Publications ✨

Preprints

A Task-Specific Transfer Learning Approach to Enhancing Small Molecule Retention Time Prediction with Limited Data
Yuhui Hong, & Haixu Tang (2025).
bioRxiv 2025.06.26.661631. [Preprint] [Code]
TSTL (Task-Specific Transfer Learning) is introduced as a training strategy for predicting retention times in various LC systems with limited training data. Evaluated across 6 benchmark datasets from different LC systems using 5 deep neural network architectures, TSTL achieved significant improvements in prediction accuracy, increasing average R² from 0.587 to 0.825 with superior data efficiency.

Peer-reviewed Articles

FIDDLE: a Deep Learning Method for Chemical Formulas Prediction from Tandem Mass Spectra
Yuhui Hong, Sujun Li, Yuzhen Ye, & Haixu Tang (2024).
bioRxiv 2024.11.25.625316. (accepted by Nature Communications). [Preprint] [Code] [PyPI package - msfiddle]
FIDDLE (Formula IDentification by Deep LEarning) is introduced as a deep learning-based method for identifying chemical formulas from MS/MS data. It is trained on over 38,000 molecules and 1 million MS/MS spectra collected under various conditions, including collision energy and precursor types, using Quadrupole Time-of-Flight (QTOF) and Orbitrap instruments.

Enhanced Structure-Based Prediction of Chiral Stationary Phases for Chromatographic Enantioseparation from 3D Molecular Conformations
Yuhui Hong, Christopher J Welch, Patrick Piras, & Haixu Tang (2024).
Analytical Chemistry, 96(6), 2351-2359. [Paper] [Code]
3DMolCSP leverages a 3D molecular conformation representation algorithm, alongside a dataset of over 300k enantioseparation records. This approach significantly improves enantioselectivity predictions, enabling more efficient and informed decisions in chiral chromatography.

3DMolMS: Prediction of Tandem Mass Spectra from Three Dimensional Molecular Conformations
Yuhui Hong, Sujun Li, Christopher J Welch, Shane Tichy, Yuzhen Ye, & Haixu Tang (2023).
Bioinformatics, btad354. [Paper] [Code] [Documentation] [PyPI package - molnetpack] [Workflow on Konia]
3DMolMS is a deep neural network model that predicts MS/MS spectra from 3D conformations. The learned molecular representation also enhances predictions of chemical properties, such as elution time and collisional cross section, aiding compound identification.

Presentations and Talks 💡

Professional Services 🙌

Teaching 👩🏽‍🏫