Projects
Personal and professional projects (6 total)
A hands-on journey through the complete LLM pipeline (tokenization, pretraining, distributed training, and fine-tuning), inspired by Andrej Karpathy's nanochat. The project trains a 561M-parameter conversational model on 11.2B tokens to understand how each stage of a modern language model works.
An evaluation of 8 self-supervised learning (SSL) methods with 3 GNN architectures on large-scale mouse embryo spatial transcriptomics data, highlighting the performance of reconstruction-based approaches
A systematic evaluation of 8 SSL methods with 3 GNN architectures on mouse brain spatial transcriptomics data
An exploration of modern self-supervised learning methods, including SimCLR, BYOL, SimSiam, Barlow Twins, and DINO, along with their mathematical foundations and mutual information theory
A comprehensive guide to understanding and applying GNNs to spatial transcriptomics data, from foundational concepts to cutting-edge research implementations
A modern personal website with AI-powered content search and interactive knowledge graphs