Mingqian Ma 马鸣谦
I am a Master's student at Carnegie Mellon University, advised by Prof. Jian Ma. My work centers on foundation models for science, with a particular focus on DNA and genomics.
Previously, I was a research intern at Microsoft Research AI for Science with Dr. Guoqing Liu, where I worked on pretraining DNA foundation models (HybriDNA, NatureLM). I received my undergraduate degrees through a dual-degree program — a BSE in Computer Science & Engineering from the University of Michigan, advised by Prof. L. Jay Guo, and a BE in Electrical & Computer Engineering from Shanghai Jiao Tong University, with Prof. Xiaofeng Gao.
News
- Apr 2026 New preprint: SkillFoundry — building self-evolving agent skill libraries from heterogeneous scientific resources.
- Mar 2025 Our optical multilayer thin-film inverse design survey was accepted by iScience.
- Feb 2025 HybriDNA and the NatureLM project I contributed to at MSR are now on arXiv.
- May 2024 Joined Microsoft Research AI for Science as a research intern, working on pretraining large-scale DNA foundation models with Dr. Guoqing Liu.
Research
I build foundation models for sequential scientific data — biology in particular. Current questions: how do we pretrain DNA models that scale to ultra-long genomic context, generalize across organisms, and respect the symmetries of biology? How do we wire foundation models into agentic systems that can drive real scientific workflows?
Publications
* denotes equal contribution. Full list on Google Scholar.
-
SkillFoundry: Building Self-Evolving Agent Skill Libraries from Heterogeneous Scientific Resources
ArXiv, 2026
SkillFoundry converts heterogeneous scientific resources — repositories, APIs, scripts, and documentation — into validated, reusable agent skills. The framework organizes scientific domains as knowledge trees, extracts operational contracts, compiles executable skill packages, and refines them through iterative validation. 71.1% of mined skills are novel relative to existing libraries, and they meaningfully improve agent performance on genomics workflows including cell-type annotation and scDRS.
-
Reverse-Complement Consistency for DNA Language Models
ArXiv, 2025
A simple fine-tuning recipe that improves reverse-complement consistency in DNA language models.
-
HybriDNA: A Hybrid Transformer-Mamba2 Long-Range DNA Language Model
ArXiv, 2025
Advances in natural language processing have inspired new approaches to modeling DNA, often called the “language of life.” However, DNA modeling requires handling ultra-long sequences with single-nucleotide precision and excelling in both generative and understanding tasks. We introduce HybriDNA, a decoder-only DNA language model that combines Transformer and Mamba2 architectures to efficiently process sequences up to 131kb. HybriDNA achieves state-of-the-art performance across 33 DNA understanding benchmarks and excels in generating synthetic regulatory elements. Our findings highlight its scalability from 300M to 7B parameters, demonstrating its potential to drive new discoveries in DNA research and applications.
-
NatureLM: Deciphering the Language of Nature for Scientific Discovery
ArXiv, 2025
NatureLM, developed by Microsoft Research AI for Science, is a groundbreaking sequence-based science foundation model designed to unify multiple scientific domains, including small molecules, materials, proteins, DNA and RNA. This innovative model leverages the “language of nature” to enable scientific discovery through text-based instructions.
-
Optical Multilayer Thin Film Structure Inverse Design: From Optimization to Deep Learning
iScience, Volume 28, Issue 4, 112222, 2025
A survey of inverse design for optical multilayer thin-film structures. It covers the field from traditional optimization-based methods to state-of-the-art deep-learning-enabled inverse design algorithms.
Education & Experience
- 2025 – Present Carnegie Mellon University · M.S., advised by Prof. Jian Ma
- 2024 – 2025 Microsoft Research · Research Intern, AI for Science, with Dr. Guoqing Liu
- 2023 – 2025 University of Michigan · BSE in Computer Science & Engineering, advised by Prof. L. Jay Guo
- 2021 – 2025 Shanghai Jiao Tong University · BE in Electrical & Computer Engineering, advised by Prof. Xiaofeng Gao