About me
I am a second-year PhD student at Mila and the Université de Montréal, supervised by Damien Scieur.
My research lies at the intersection of Self-Supervised Learning (SSL), Applied Mathematics, and Systems Design. I primarily focus on optimizing the training efficiency and scalability of SSL methods (see predoc report), with an emphasis on Joint Embedding Predictive Architectures (JEPA). I am drawn to JEPA, proposed by Yann LeCun, because I believe it represents a promising direction for the future of AI systems.
My work is motivated by the observation that current AI systems lack robust physical understanding and real-world representations, capabilities that could be acquired through sophisticated world models. I hypothesize that video, with its rich informational content, is an ideal modality for learning such world models. Currently, I am investigating V-JEPA architectures, working to reduce their training costs and improve their scalability in order to accelerate the development of effective world-modeling agents.
Recommended Readings (Winter 2025)
I’d like to share some recent papers that I find particularly compelling and relevant to today’s SSL research landscape. Whether you’re interested in theoretical foundations or practical implementations, these papers provide valuable insights.
DINO-WM: World Models On Pre-Trained Visual Features Enable Zero-Shot Planning - A groundbreaking demonstration of a task-agnostic SSL world model. The authors achieve remarkable results across diverse environments through test-time planning and action sequence optimization. This work highlights how inference and planning deserve greater attention in the field.
A Cookbook of Self-Supervised Learning - An essential resource for anyone working in self-supervised learning. While the literature review may not cover the most recent developments, the paper’s true value lies in its comprehensive collection of practical implementation techniques used by practitioners. It reveals how the seemingly simple concept of self-supervised learning requires careful attention to numerous implementation details for successful deployment.
The Llama 3 Herd of Models - A masterclass in the intricacies of LLM training at scale, offering invaluable insights into the practical challenges of training modern models in realistic settings and the solutions to them.
On the Geometry of Deep Learning - An insightful synthesis of Randall Balestriero’s contributions to understanding deep learning through the lens of spline theory. This comprehensive review illuminates the geometric principles underlying neural networks and their behavior.
On the Limitations of Elo: Real-World Games are Transitive, not Additive - A crucial contribution given the rising prominence of agentic AI. This work tackles the limits of the Elo rating system, opening a path toward better comparisons between AI agents.
Academic Service
I am deeply committed to knowledge sharing in ML through co-organizing MTLMLOpt (a biweekly seminar series on deep learning and optimization), co-creating the Mila Optimization CrashCourse for graduate students, and reviewing for the Montreal AI Symposium.
Theme from Alfredo Canziani. Last update: 03 Jan 2025.