Louis Tiao

Machine Learning Research Scientist (PhD Candidate)

University of Sydney

Bio

Hi. I am Louis Tiao, a machine learning (ML) research scientist with a broad interest in probabilistic ML and a particular focus on approximate Bayesian inference and Gaussian processes, and their applications to Bayesian optimization and graph representation learning.

I recently submitted my PhD thesis at the University of Sydney, where I worked with Edwin Bonilla and Fabio Ramos. Our research has been recognized at premier conferences such as NeurIPS and ICML, where it has repeatedly been selected for Oral and Spotlight presentations.

Before pursuing my doctorate, I obtained a BSc in Computer Science with First Class Honours from the University of New South Wales (UNSW Sydney), where I majored in artificial intelligence (AI) and minored in mathematics. I began my professional career in ML as a software engineer at National ICT Australia (NICTA), which later merged into CSIRO’s Data61—the AI research division of Australia’s national science agency.

During my PhD, I’ve had the privilege of collaborating with exceptional people through multiple rewarding industrial research internships. In the Summer and Fall of 2019, I completed a research internship at Amazon in Berlin, Germany, where I worked with Matthias Seeger, Cédric Archambeau, and Aaron Klein. Between Fall 2021 and Spring 2022, I worked with Vincent Dutordoir and Victor Picheny at Secondmind Labs in Cambridge, UK. More recently, in Summer 2022, I extended my stay in Cambridge and returned to Amazon, reuniting virtually with my former Berlin team.

Download my resumé.

Interests
  • Probabilistic Machine Learning
  • Approximate Bayesian Inference
  • Gaussian Processes
  • Bayesian Optimization
Education
  • Ph.D. in Computer Science (Machine Learning), 2023

    University of Sydney

  • B.Sc. (Honours Class 1) in Computer Science (Artificial Intelligence and Mathematics), 2015

    University of New South Wales

News

Employment

Research Experience

Amazon Web Services
Applied Scientist Intern
Jun 2022 – Oct 2022 · Cambridge, United Kingdom

As an applied scientist intern at Amazon Web Services (AWS), I led an exploratory research project on hyperparameter optimization for large language models (LLMs). Our primary objective was to understand the scaling behavior of LLMs and investigate the feasibility of extrapolating optimal hyperparameters from smaller LLMs to their massive counterparts. This hands-on work involved orchestrating the parallel training of multiple LLMs from scratch across numerous GPU cloud instances.

During this internship, I was fortunate to be reunited with Aaron Klein, Matthias Seeger, and Cédric Archambeau, with whom I had previously collaborated during an earlier internship at AWS Berlin.

Secondmind
Doctoral Placement Researcher
Oct 2021 – May 2022 · Cambridge, United Kingdom

As a student researcher at Secondmind (formerly Prowler.io), a research-intensive AI startup renowned for its innovations in Bayesian optimization (BO) and Gaussian processes (GPs), I contributed research and open-source code aligned with their focus on advancing probabilistic ML. Specifically, I developed open-source software for sampling efficiently from GPs, making these techniques substantially more accessible. Additionally, I led a research initiative to improve the integration of neural networks (NNs) with GP approximations, bridging a critical gap between probabilistic methods and deep learning. These efforts culminated in a research paper that was selected for an oral presentation at the International Conference on Machine Learning (ICML).

I had the privilege of collaborating closely with Vincent Dutordoir and Victor Picheny during this period.

Amazon Web Services
Applied Scientist Intern
Jun 2019 – Dec 2019 · Berlin, Germany

As an applied scientist intern at Amazon Web Services (AWS), I contributed to the development of the Automatic Model Tuning functionality in AWS SageMaker. My primary focus was on advancing AutoML and hyperparameter optimization, particularly Bayesian optimization (BO) methods. I spearheaded a research project aimed at integrating multi-fidelity BO with asynchronous parallelism to significantly improve the efficiency and scalability of model tuning. This initiative led to a research paper and the release of open-source code within the AutoGluon library, which subsequently formed the basis of the SyneTune library.

I had the privilege of working closely with Matthias Seeger, Cédric Archambeau, and Aaron Klein during this internship.

CSIRO's Data61
Software Engineer
Jul 2016 – Apr 2019 · Sydney, Australia
As a machine learning (ML) software engineer at CSIRO’s Data61, the AI research division of Australia’s national science agency, I was an integral part of the Inference Systems Engineering Team, specializing in probabilistic ML for diverse problem domains. Our focus encompassed areas such as spatial inference and Bayesian experimental design, with a primary emphasis on scalability. I led the development of new microservices and contributed to the development of open-source libraries for large-scale Bayesian deep learning. I also had a stint with the Graph Analytics Engineering Team, where my contributions to research on graph representation learning led to a research paper selected for a spotlight presentation at the Conference on Neural Information Processing Systems (NeurIPS).
National ICT Australia (NICTA)
Software Engineer
May 2015 – Jun 2016 · Sydney, Australia
As a machine learning (ML) software engineer at NICTA, I was part of an interdisciplinary ML research team contributing to the Big Data Knowledge Discovery initiative, which engaged with leading scientists across various natural sciences domains to develop Bayesian ML software frameworks to support Australia’s evolving scientific research landscape. During this time, I led the development and release of numerous open-source libraries for applying Bayesian ML at scale.
Commonwealth Scientific and Industrial Research Organisation (CSIRO)
Research Intern
Nov 2013 – Feb 2014 · Sydney, Australia
As a summer vacation scholar at CSIRO’s Language and Social Computing team, I applied cutting-edge machine learning (ML) and natural language processing (NLP) techniques to build a robust text classification system for automated sentiment analysis.

Recent & Upcoming Talks

Recent Posts

Teaching

COMP9418: Advanced Topics in Statistical Machine Learning (UNSW Sydney)

The course focuses primarily on probabilistic machine learning methods, covering exact and approximate inference in directed and undirected probabilistic graphical models, continuous latent variable models, structured prediction models, and non-parametric models based on Gaussian processes.

Lab exercise on Gaussian Process Regression, running in JupyterLab.

The course places a strong emphasis on balancing theory and practice. As the teaching assistant (TA), my primary responsibility was to create lab exercises that help students gain hands-on experience with these methods, applying them to real-world data using current tools and libraries. The labs were Python-based and relied heavily on the Python scientific computing and data analysis stack (NumPy, SciPy, Matplotlib, Seaborn, Pandas, IPython/Jupyter notebooks), along with the popular machine learning libraries scikit-learn and TensorFlow.

Students were given the chance to experiment with a broad range of methods on various problems, such as Markov chain Monte Carlo (MCMC) for Bayesian logistic regression, probabilistic PCA (PPCA), factor analysis (FA) and independent component analysis (ICA) for dimensionality reduction, hidden Markov models (HMMs) for speech recognition, conditional random fields (CRFs) for named-entity recognition, and Gaussian processes (GPs) for regression and classification.
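To give a flavor of these exercises, the following is a minimal sketch of a Gaussian process regression lab in scikit-learn. The toy sine-function dataset and kernel choices here are illustrative, not taken from the actual course materials:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy 1-D regression problem: noisy observations of a sine function.
rng = np.random.RandomState(42)
X_train = rng.uniform(0.0, 5.0, size=(20, 1))
y_train = np.sin(X_train).ravel() + 0.1 * rng.randn(20)

# RBF kernel with a learned noise term; the kernel hyperparameters are
# fit by maximizing the log marginal likelihood during `fit`.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0)
gpr.fit(X_train, y_train)

# Posterior predictive mean and standard deviation on a test grid;
# the standard deviation quantifies the model's predictive uncertainty.
X_test = np.linspace(0.0, 5.0, 100).reshape(-1, 1)
mean, std = gpr.predict(X_test, return_std=True)
```

In a lab setting, students would typically go on to plot the predictive mean with a shaded two-standard-deviation band and observe how the uncertainty grows away from the training data.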

Publications

(2023). Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes. In ICML2023. Accepted as Oral Presentation.


(2022). Batch Bayesian Optimisation via Density-ratio Estimation with Guarantees. In NeurIPS2022.


(2021). BORE: Bayesian Optimization by Density-Ratio Estimation. In ICML2021. Accepted as Long Presentation (Awarded to Top 3% of Papers).


(2020). Variational Inference for Graph Convolutional Networks in the Absence of Graph Data and Adversarial Settings. In NeurIPS2020. Accepted as Spotlight Presentation (Awarded to Top 3% of Papers).


(2020). Model-based Asynchronous Hyperparameter and Neural Architecture Search.


(2018). Cycle-Consistent Adversarial Learning as Approximate Bayesian Inference. In ICML2018 Theoretical Foundations and Applications of Deep Generative Models. Accepted as Contributed Talk.


Contact

Get in touch