2020
 October 12, Artificial Stupidity: The New AI and the Future of Fintech, Andrew W. Lo (Massachusetts Institute of Technology)
 September 28, LSTM is dead. Long Live Transformers!, Leo Dirac (Amazon). References: LSTM paper, LSTM Diagrams, Understanding LSTM, Attention Is All You Need, Illustrated Attention, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Deep contextualized word representations, huggingface/transformers
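 The core operation behind the Transformer talk, scaled dot-product attention from Attention Is All You Need, can be sketched in a few lines of NumPy (a minimal illustration, not the speaker's code; the shapes and variable names below are our own):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows are probability vectors
    return weights @ V                              # convex combinations of V's rows

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 queries of dimension 4
K = rng.normal(size=(5, 4))   # 5 keys of dimension 4
V = rng.normal(size=(5, 2))   # 5 values of dimension 2
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 2)
```

 Each output row is a softmax-weighted average of the value rows, so every output coordinate stays within the range spanned by the corresponding value column.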
 September 14, Machine Learning Projects Against COVID19, Yoshua Bengio, Université de Montréal
 No Lunch Seminar during the summer break
 June 22, Kernel and Deep Regimes in Overparameterized Learning, Suriya Gunasekar (TTI-Chicago, Microsoft Research)
 June 15, Energy-based Approaches to Representation Learning, Yann LeCun (New York University, Facebook)
 June 8, Learning Probability Distributions: What Can, What Can’t Be Done, Shai Ben-David (University of Waterloo)
 May 25, Generalized Resilience and Robust Statistics, Jacob Steinhardt (UC Berkeley).
 May 18, From Classical Statistics to Modern Machine Learning, Mikhail Belkin (The Ohio State University).
 References: Mikhail Belkin, Siyuan Ma, Soumik Mandal, To Understand Deep Learning We Need to Understand Kernel Learning, Luc Devroye et al., The Hilbert Kernel Regression Estimate, Cover and Hart, Nearest Neighbor Pattern Classification, Mikhail Belkin et al., Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate, Adityanarayanan Radhakrishnan et al., Overparameterized Neural Networks Can Implement Associative Memory, Mikhail Belkin et al., Reconciling modern machine-learning practice and the classical bias–variance trade-off, Madhu S. Advani, Andrew M. Saxe, High-dimensional dynamics of generalization error in neural networks
 May 11, Automatic Machine Learning, part 3 (from minute 90), Frank Hutter (University of Freiburg) and Joaquin Vanschoren (Eindhoven University of Technology). Slides part 3.
 May 4, Automatic Machine Learning, part 2 (minutes 46–90), Frank Hutter (University of Freiburg) and Joaquin Vanschoren (Eindhoven University of Technology). Slides parts 1–2, Slides part 3.
 References: Thomas Elsken et al., Neural Architecture Search: A Survey, J. Bergstra et al., Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures, Hector Mendoza et al., Towards Automatically-Tuned Neural Networks, Barret Zoph et al., Neural Architecture Search with Reinforcement Learning, Peter J. Angeline et al., An Evolutionary Algorithm that Constructs Recurrent Neural Networks, Kenneth O. Stanley et al., Evolving Neural Networks through Augmenting Topologies, Risto Miikkulainen et al., Evolving Deep Neural Networks, Esteban Real et al., Regularized Evolution for Image Classifier Architecture Search, Kevin Swersky et al., Raiders of the Lost Architecture: Kernels for Bayesian Optimization in Conditional Parameter Spaces, Kirthevasan Kandasamy et al., Neural Architecture Search with Bayesian Optimisation and Optimal Transport, Chenxi Liu et al., Progressive Neural Architecture Search, Arber Zela et al., Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search, Catherine Wong et al., Transfer Learning with Neural AutoML, Tianqi Chen et al., Net2Net: Accelerating Learning via Knowledge Transfer, Tao Wei et al., Network Morphism, Han Cai et al., Path-Level Network Transformation for Efficient Architecture Search, Han Cai et al., Efficient Architecture Search by Network Transformation, Thomas Elsken et al., Simple and Efficient Architecture Search for CNNs, Corinna Cortes et al., AdaNet: Adaptive Structural Learning of Artificial Neural Networks, Shreyas Saxena, Convolutional Neural Fabrics, Gabriel Bender et al., Understanding and Simplifying One-Shot Architecture Search, Hieu Pham et al., Efficient Neural Architecture Search via Parameter Sharing, Andrew Brock et al., SMASH: One-Shot Model Architecture Search through HyperNetworks, Hanxiao Liu et al., DARTS: Differentiable Architecture Search, Mingxing Tan, MnasNet: Platform-Aware Neural Architecture Search for Mobile, Rui Leite et al., Selecting Classification Algorithms with Active Testing, Salisu Mamman Abdulrahman et al., Speeding up algorithm selection using average ranking and active testing by introducing runtime, Martin Wistuba et al., Learning Hyperparameter Optimization Initializations, J. N. van Rijn et al., Hyperparameter Importance Across Datasets, Philipp Probst et al., Tunability: Importance of Hyperparameters of Machine Learning Algorithms, Martin Wistuba et al., Hyperparameter Search Space Pruning – A New Component for Sequential Model-Based Hyperparameter Optimization, C. E. Rasmussen et al., Gaussian Processes for Machine Learning, Martin Wistuba et al., Scalable Gaussian process-based transfer surrogates for hyperparameter optimization, Matthias Feurer et al., Scalable Meta-Learning for Bayesian Optimization
 April 27, Automatic Machine Learning, part 1, Frank Hutter (University of Freiburg) and Joaquin Vanschoren (Eindhoven University of Technology). Slides parts 1–2, Slides part 3.
 References: Part 1: Book, chapter 1, J. Močkus, On Bayesian methods for seeking the extremum, Nando de Freitas et al., Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations, Kenji Kawaguchi et al., Bayesian Optimization with Exponential Convergence, Ziyu Wang et al., Bayesian Optimization in High Dimensions via Random Embeddings, Frank Hutter et al., Sequential Model-Based Optimization for General Algorithm Configuration, Kevin Swersky et al., Raiders of the Lost Architecture: Kernels for Bayesian Optimization in Conditional Parameter Spaces, Leo Breiman, Random Forests, Jasper Snoek et al., Scalable Bayesian Optimization Using Deep Neural Networks, Jost Tobias Springenberg, Bayesian Optimization with Robust Bayesian Neural Networks, James Bergstra, Algorithms for Hyper-Parameter Optimization, Hans-Georg Beyer, Hans-Paul Schwefel, Evolution strategies – A comprehensive introduction, Nikolaus Hansen, The CMA Evolution Strategy: A Tutorial, Ilya Loshchilov et al., CMA-ES for hyperparameter optimization of neural networks, Tobias Domhan et al., Speeding Up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves, Luca Franceschi et al., Bilevel Programming for Hyperparameter Optimization and Meta-Learning, Jelena Luketina et al., Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters, Aaron Klein et al., Learning curve prediction with Bayesian neural networks, Kevin Swersky, Multi-Task Bayesian Optimization, Kevin Swersky, Freeze-Thaw Bayesian optimization, Kirthevasan Kandasamy, Multi-fidelity Bayesian Optimisation with Continuous Approximations, Stefan Falkner et al., BOHB: Robust and Efficient Hyperparameter Optimization at Scale, GitHub link, Lisha Li et al., Hyperband: Bandit-based configuration evaluation for hyperparameter optimization, Kevin Jamieson, Non-stochastic Best Arm Identification and Hyperparameter Optimization, Chris Thornton et al., Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms, Brent Komer et al., Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn, Matthias Feurer et al., Efficient and Robust Automated Machine Learning, Auto-sklearn, GitHub link, Randal S. Olson et al., Automating Biomedical Data Science Through Tree-Based Pipeline Optimization
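 The simplest hyperparameter-search baseline against which the methods in the AutoML tutorial are compared is random search. A minimal sketch (our own toy example, not code from the tutorial; the objective `val_loss` and the search space are invented):

```python
import math
import random

def random_search(objective, space, n_trials=50, seed=0):
    """Sample configurations at random and keep the one with lowest loss.

    `space` maps a hyperparameter name to a (low, high) range sampled
    log-uniformly — a common choice for learning rates and regularizers.
    """
    rng = random.Random(seed)
    best_cfg, best_loss = None, math.inf
    for _ in range(n_trials):
        cfg = {name: math.exp(rng.uniform(math.log(lo), math.log(hi)))
               for name, (lo, hi) in space.items()}
        loss = objective(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

# Toy objective standing in for a validation loss (hypothetical):
# minimized near lr = 1e-2 with a tiny weight-decay penalty.
def val_loss(cfg):
    return (math.log10(cfg["lr"]) + 2) ** 2 + 0.1 * cfg["weight_decay"]

best, loss = random_search(val_loss, {"lr": (1e-5, 1e-1),
                                      "weight_decay": (1e-6, 1e-2)})
```

 Model-based methods such as Bayesian optimization, BOHB, or Hyperband refine this loop by choosing where to sample next and by stopping poor configurations early.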
 April 6, Efficient Deep Learning with Humans in the Loop, Zachary Lipton (Carnegie Mellon University)
 References: Davis Liang et al., Learning Noise-Invariant Representations for Robust Speech Recognition, Zachary C. Lipton et al., BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems, Yanyao Shen et al., Deep Active Learning for Named Entity Recognition, Aditya Siddhant et al., Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study, David Lowell et al., Practical Obstacles to Deploying Active Learning, Peiyun Hu et al., Active Learning with Partial Feedback, Ashish Khetan et al., Learning From Noisy Singly-labeled Data, Jonathon Byrd et al., What is the Effect of Importance Weighting in Deep Learning?, Jason Yosinski et al., Understanding Neural Networks Through Deep Visualization
 March 30, Studying Generalization in Deep Learning via PAC-Bayes, Gintare Karolina Dziugaite (Element AI)

 A few references: G. K. Dziugaite, D. Roy, Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data, Huang et al., Stochastic Neural Network with Kronecker Flow, Zhou et al., Non-vacuous Generalization Bounds at the ImageNet Scale: a PAC-Bayesian Compression Approach, Abadi et al., Deep Learning with Differential Privacy, R. Herbrich, T. Graepel, C. Campbell, Bayes point machines, Neyshabur et al., The role of over-parametrization in generalization of neural networks, K. Miyaguchi, PAC-Bayesian Transportation Bound
 A little bit of background on probably approximately correct (PAC) learning: Probably Approximately Correct Learning, A primer on PAC-Bayesian learning
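 For orientation, one standard (McAllester-style) form of the PAC-Bayes bound, in our notation: with probability at least 1 − δ over an i.i.d. sample of size n, simultaneously for every posterior Q over hypotheses,

```latex
L(Q) \;\le\; \hat{L}(Q) \;+\; \sqrt{\frac{\mathrm{KL}\left(Q \,\|\, P\right) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```

 where L(Q) and L̂(Q) are the expected and empirical risks of the Gibbs classifier drawn from Q, and P is a prior fixed before seeing the data. Dziugaite and Roy's non-vacuous bounds come from optimizing Q directly against a bound of this type.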
 March 23, Integrating Constraints into Deep Learning Architectures with Structured Layers, J. Zico Kolter (Carnegie Mellon University)

 References: Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng, Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations. Brandon Amos, J. Zico Kolter, OptNet: Differentiable Optimization as a Layer in Neural Networks. Po-Wei Wang, Priya L. Donti, Bryan Wilder, Zico Kolter, SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver. Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud, Neural Ordinary Differential Equations. Shaojie Bai, J. Zico Kolter, Vladlen Koltun, Trellis Networks for Sequence Modeling. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Attention Is All You Need
 March 2, Rebooting AI, Gary Marcus (Robust AI)
 February 24, Is Optimization the Right Language to Understand Deep Learning? Sanjeev Arora (Princeton University)
 February 17, Adversarial Machine Learning, Ian Goodfellow (Google)
 February 10, Our Mathematical Universe, Max Tegmark (MIT)
 February 3, Nobel Lecture: Michel Mayor, Nobel Prize in Physics 2019
 January 29, How to Successfully Harness Machine Learning to Combat Fraud and Abuse, Elie Bursztein, Anti-Abuse Research Lead (Google)
2019
 December 16 & January 13, 2019–20, Variational Inference: Foundations and Innovations (Part 2, from minute 46), David Blei (Columbia University)
 December 2 & 9, 2019, Variational Inference: Foundations and Innovations (Part 1), David Blei (Columbia University)
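 The quantity at the heart of both Blei lectures is the evidence lower bound (ELBO); in our notation, for data x, latent variables z, and a variational family q,

```latex
\log p(x) \;=\; \underbrace{\mathbb{E}_{q(z)}\!\left[\log p(x, z) - \log q(z)\right]}_{\mathrm{ELBO}(q)} \;+\; \mathrm{KL}\!\left(q(z) \,\|\, p(z \mid x)\right)
```

 Since the KL term is nonnegative, maximizing the ELBO over q simultaneously tightens a lower bound on log p(x) and pulls q toward the true posterior p(z | x).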
 November 18, On Large Deviation Principles for Large Neural Networks, Joan Bruna (Courant Institute of Mathematical Sciences, NYU)
 November 11, 2019, Anomaly Detection using Neural Networks, Dean Langsam (BlueVine)
 October 28 & November 4, 2019, Extreme Value Theory. Paul Embrechts (ETH)
 October 7, 2019, On the Optimization Landscape of Matrix and Tensor Decomposition Problems, Tengyu Ma (Princeton University)
 September 30, 2019, Recurrent Neural Networks, Ava Soleimany (MIT)
 September 23, 2019, When deep learning does not learn, Emmanuel Abbe (EPFL and Princeton)
 July 15, 2019, Optimality in Locally Private Estimation and Learning, John Duchi (Stanford)
 July 1, 2019, Capsule Networks, Geoffrey Hinton (University of Toronto – Google Brain – Vector Institute)
 June 24, 2019, A multi-perspective introduction to the EM algorithm, William M. Wells III.
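 As a concrete companion to the EM talk, here is a minimal EM loop for a two-component 1-D Gaussian mixture (our own toy sketch, not material from the lecture; the data and initial values are made up):

```python
import math
import random

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(xs, mu_init=(-1.0, 1.0), n_iter=100):
    """EM for a two-component 1-D Gaussian mixture.

    E-step: responsibilities r[i][k] = P(component k | x_i) under current params.
    M-step: re-estimate mixture weights, means and variances from those r.
    """
    pi, mu, var = [0.5, 0.5], list(mu_init), [1.0, 1.0]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        r = []
        for x in xs:
            p = [pi[k] * normal_pdf(x, mu[k], var[k]) for k in (0, 1)]
            s = p[0] + p[1]
            r.append([p[0] / s, p[1] / s])
        # M-step: responsibility-weighted maximum-likelihood updates.
        for k in (0, 1):
            nk = sum(ri[k] for ri in r)
            pi[k] = nk / len(xs)
            mu[k] = sum(ri[k] * x for ri, x in zip(r, xs)) / nk
            var[k] = max(sum(ri[k] * (x - mu[k]) ** 2
                             for ri, x in zip(r, xs)) / nk, 1e-6)
    return pi, mu, var

# Toy data: two well-separated clusters at -3 and +3 (synthetic).
rng = random.Random(0)
xs = ([rng.gauss(-3, 0.5) for _ in range(200)]
      + [rng.gauss(3, 0.5) for _ in range(200)])
pi, mu, var = em_gmm_1d(xs)
```

 Each iteration is guaranteed not to decrease the data log-likelihood, which is the property the lecture examines from several perspectives.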
 June 17, 2019, Theoretical Perspectives on Deep Learning, Nati Srebro (TTI Chicago)
 May 27, 2019. 2018 ACM Turing Award. Stanford Seminar – Human in the Loop Reinforcement Learning. Emma Brunskill (Stanford)
 May 20, 2019. How Graph Technology Is Changing Artificial Intelligence and Machine Learning. Amy E. Hodler (Neo4j), Jake Graham (Neo4j).
 May 13, 2019, 2017 Nobel Lectures in Physics. Awarded « for decisive contributions to the LIGO detector and the observation of gravitational waves ». Rainer Weiss (MIT), Barry C. Barish (Caltech) and Kip S. Thorne (Caltech)
 May 6, 2019, Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition, Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer (Carnegie Mellon University) and Michael K. Reiter (University of North Carolina at Chapel Hill), paper
 April 29, 2019, Build Intelligent Fraud Prevention with ML and Graphs, Nav Mathur, Graham Ganssle
 April 15, 2019, Active Learning: Why Smart Labeling is the Future of Data Annotation, Jennifer Prendki (Figure Eight)
 April 8, 2019, Generalization, Interpolation, and Neural Nets, Alexander Rakhlin (MIT)
 April 1, 2019, Similarity learning using deep neural networks – Jacek Komorowski (Warsaw University of Technology)
 March 18 & 25, 2019, Deep Reinforcement Learning (first lecture of MIT course 6.S091), Lex Fridman (MIT)
 March 11, 2019, Ensembles: Boosting, Alexander Ihler (University of California, Irvine)
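 The boosting procedure from this lecture can be made concrete with a minimal AdaBoost over 1-D decision stumps (our own synthetic illustration, not the lecture's code; the dataset and helper names are invented):

```python
import math

def stump_predict(x, threshold, polarity):
    """A decision stump: predict `polarity` when x > threshold, else -polarity."""
    return polarity if x > threshold else -polarity

def adaboost(xs, ys, n_rounds):
    """AdaBoost: each round picks the stump with lowest weighted error,
    then up-weights the examples that stump got wrong."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(n_rounds):
        best = None  # (weighted_error, threshold, polarity)
        for t in sorted(set(xs)):
            for pol in (1, -1):
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if stump_predict(x, t, pol) != y)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        err, t, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # this stump's vote weight
        ensemble.append((alpha, t, pol))
        # Reweight: mistakes go up, correct examples go down, then renormalize.
        w = [wi * math.exp(-alpha * y * stump_predict(x, t, pol))
             for wi, x, y in zip(w, xs, ys)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    return 1 if sum(a * stump_predict(x, t, p) for a, t, p in ensemble) > 0 else -1

# Synthetic data with the positives in the middle: no single stump classifies
# it perfectly, but a weighted committee of three stumps does.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-1, 1, 1, -1, -1]
model = adaboost(xs, ys, n_rounds=3)
```

 The key idea illustrated here is that each weak learner only needs to beat chance on the reweighted data; the exponential reweighting forces successive stumps to specialize on earlier mistakes.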
 March 4, 2019, Dataset shift in machine learning, Peter Prettenhofer (DataRobot)
 February 25, 2019, Could Machine Learning Ever Cure Alzheimer’s Disease? – Winston Hide (Sheffield University)
 February 18, 2019, 2015 IAAI Winner: Intelligent Surgical Scheduling System
 February 11, 2019, Artificial Intelligence, Machine Learning, Big Data (Exponential Finance), Neil Jacobstein (Singularity University)
 February 4, 2019, Bayesian Deep Learning with Edward (and a trick using Dropout) – Andrew Rowan (PrismFP)
 January 28, 2019, Ouroboros, Aggelos Kiayias (University of Edinburgh)
 January 21, 2019, Cosmos Proof of Stake – Sunny Aggarwal
 January 14, 2019, Geometric Deep Learning – Michael Bronstein (University of Lugano and Tel Aviv University)
 January 7, 2019, Deep Generative Networks as Inverse Problems – Stéphane Mallat (École Normale Supérieure)
2018