I am a postdoctoral researcher in the Department of Statistics & Data Science at Carnegie Mellon University, hosted by Dr. Kathryn Roeder and Dr. Jing Lei.

I received my PhD (2022) from the Department of Biostatistics, University of Washington. My dissertation advisor was Dr. Noah Simon. Before graduate school, I double-majored in biology and math at Peking University (2013-17). I have also worked on business-motivated problems at Amazon and FOXO Technologies.

You can find my CV with a complete list of publications here.

Research Interests

My research focuses on scalable statistical learning methods that are mathematically rigorous and computationally efficient. Some research keywords that catch my eye: nonparametric methods (basis expansion, reproducing kernels, and shape constraints); model selection (cross-validation and its variants); online estimation with streaming data; single-cell RNA sequencing data analysis (new questions and new methods); polygenic risk scores (and related health disparity issues).

Software

Most of my methodological work has a companion R package. I also mix in some C++ to improve computational efficiency when needed. It is hard to keep package information up to date in peer-reviewed publications, so I list the packages here for ease of reference.

Sieve on R CRAN:

Tianyu Zhang and Jing Lei. “Online Estimation with Rolling Validation: Adaptive Nonparametric Estimation with Streaming Data.”
Tianyu Zhang and Noah Simon. “A Sieve Stochastic Gradient Descent Estimator for Online Nonparametric Regression in Sobolev Ellipsoids.”

HMC on R CRAN:

Tianyu Zhang, Jing Lei, and Kathryn Roeder. “Debiased Projected Two-Sample Comparisons for Single-Cell Expression Data.”

Joint-Lassosum on Github:

Tianyu Zhang, Geyu Zhou, Lambertus Klei, Peng Liu, Alexandra Chouldechova, Hongyu Zhao, Kathryn Roeder, Max G’Sell, and Bernie Devlin. “Evaluating and Improving Health Equity and Fairness of Polygenic Scores.”