Naveen Arunachalam
Senior ML Scientist, Nosis Bio

Email: naveen DOT t DOT arun AT gmail DOT com
GitHub: @naveenarun

Hi, I'm Naveen! I am a Senior ML Scientist at Nosis Bio, where I develop receptor-targeted RNA medicines. Previously, I completed my PhD at MIT in the Kulik Group, where I worked on ML-guided chemical discovery informed by HPC quantum chemistry calculations. My PhD research was supported by the NSF, DARPA, and the Office of Naval Research.
Prior to MIT, I majored in Chemical Engineering at Caltech, where I developed polymer electrolyte MM simulations (advised by Prof. Tom Miller; curr. CEO, Iambic) and created multiscale modeling workflows for GPCR-ligand interactions (advised by Prof. Bill Goddard; prev. cofounder, Schrodinger).
I am interested in advancing AI- and ML-accelerated virtual screening of novel therapeutics, so that safe and effective medicines can be rapidly discovered on demand for any disease.
In my free time, I enjoy hiking, learning new recipes, and speedrunning video games.

Personal Information

Ph.D. MIT, 2023

B.S. Caltech, 2018 (top 5% of graduating class)

Research
I am interested in creating positive feedback loops for computational discovery of real-world therapeutics. My current research is focused on interpretability, improvability, and tractability.

Interpretability: Some of the strongest insights about how to improve ML models can be gained from analyzing high-confidence errors and systematic blind spots in model representations. Addressing these issues closes the generalization gap between in silico performance and in vitro/in vivo validation.

Improvability: Most experimental processes create far more data than they capture. I build workflows for capturing and training on intermediate/hard-to-acquire experimental data that would normally be discarded or not collected at all. This data then informs model architecture decisions to drive performance beyond what is possible with public datasets.

Tractability: Chemical design space is intractably large, which requires the use of surrogate models for optimization. I am interested in developing algorithmic and architectural improvements to ML surrogate models to make discovery campaigns faster and more effective on currently available hardware.

These three areas mutually reinforce each other: interpretability reveals how to improve AI/ML models, better models lead to higher-quality and more frequent experimental data, and more data means better opportunities to probe model reasoning.

Teaching

Fall, 2020, 10.585: Engineering Nanotechnology (TA; Prof. Michael Strano)

Spring, 2018, ChE 105: Dynamics and Control of Chemical Systems (TA; Prof. John Seinfeld)

Winter, 2018, ChE 103b: Transport Phenomena (TA; Prof. Mikhail Shapiro; curr. cofounder, Merge Labs)

Fall, 2017, ChE 15: Introduction to Chemical Engineering Computation (TA; Prof. Richard Flagan)

Spring, 2017, ACM 95b/100b: Applied and Computational Mathematics (TA; Prof. Dan Meiron)

Fall, 2016, ChE 62: Separation Processes (TA; Prof. John Seinfeld)

Winter, 2016, Ch1b: General Chemistry (TA; Prof. Sarah Reisman)