I am a postdoctoral researcher in the Department of Statistics & Data Science at Carnegie Mellon University, hosted by Drs. Kathryn Roeder and Jing Lei.

I received my PhD (2022) from the Department of Biostatistics at the University of Washington, where my dissertation advisor was Professor Noah Simon. Before graduate school, I double-majored in biology and mathematics at Peking University (2013-17). I have also worked on business-motivated problems at Amazon and FOXO Technologies.

You can find my CV here.

Research Interests

My research focuses on scalable statistical learning methods that are mathematically rigorous and computationally efficient. Some research keywords that catch my eye: nonparametric methods (basis expansion, reproducing kernels, and shape constraints); model selection (cross-validation and its variants); online estimation with streaming data; single-cell RNA sequencing data analysis (new questions and new methods); polygenic risk scores (and related health disparity issues).

Software

Most of my methodological work comes with a companion R package. I also mix in some C++ to improve computational efficiency when needed. It is hard to keep package information up to date in peer-reviewed publications, so I list the packages here for ease of reference.

Sieve on R CRAN:

Tianyu Zhang and Jing Lei. “Online Estimation with Rolling Validation: Adaptive Nonparametric Estimation with Streaming Data.”
Tianyu Zhang and Noah Simon. “A Sieve Stochastic Gradient Descent Estimator for Online Nonparametric Regression in Sobolev Ellipsoids.”

HMC on R CRAN:

Tianyu Zhang, Jing Lei, and Kathryn Roeder. “Debiased Projected Two-Sample Comparisons for Single-Cell Expression Data.”

Joint-Lassosum on GitHub:

Tianyu Zhang, Geyu Zhou, Lambertus Klei, Peng Liu, Alexandra Chouldechova, Hongyu Zhao, Kathryn Roeder, Max G’Sell, and Bernie Devlin. “Evaluating and Improving Health Equity and Fairness of Polygenic Scores.”