Speakers
Keynote Speakers

Joan
BRUNA
Keynote 3
Mathematics of Neural Networks in the Billion-parameter Age
The pace of progress of large-scale machine learning keeps increasing, towards ever bigger models and datasets, producing astonishing results along the way in data-heavy domains such as text or images. Such rapid progress also leaves our mathematical understanding further behind, to the extent that one wonders whether it will ever catch up.
In this talk, we will raise salient questions about this trend while zooming in on technical snippets, covering approximation properties of transformers, mathematical aspects of score-based diffusion generative models, and optimization aspects of learning semi-parametric models.
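To fix notation for one of the technical snippets mentioned above, here is a standard formulation of score-based diffusion models from the literature (a hedged summary, not necessarily the exact setting of the talk). A network s_theta is trained by denoising score matching,

\[
\min_{\theta}\; \mathbb{E}_{t}\,\mathbb{E}_{x_0 \sim p_{\mathrm{data}}}\,\mathbb{E}_{x_t \mid x_0}\Big[\lambda(t)\,\big\| s_{\theta}(x_t, t) - \nabla_{x_t} \log p_t(x_t \mid x_0) \big\|_2^2\Big],
\]

where the forward noising process follows dx = f(x,t)\,dt + g(t)\,dw_t, and samples are then generated by integrating the reverse-time dynamics

\[
dx = \big[f(x,t) - g(t)^2\, s_{\theta}(x,t)\big]\,dt + g(t)\,d\bar{w}_t .
\]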
Joan Bruna is an Associate Professor at the Courant Institute, New York University (NYU), in the Department of Computer Science, the Department of Mathematics (affiliated) and the Center for Data Science. He belongs to the CILVR group and to the Math and Data groups. From 2015 to 2016, he was Assistant Professor of Statistics at UC Berkeley and part of BAIR (Berkeley AI Research). Before that, he worked at FAIR (Facebook AI Research) in New York. Prior to that, he was a postdoctoral researcher at the Courant Institute, NYU. He completed his PhD in 2013 at Ecole Polytechnique, France. Before his PhD, he was a Research Engineer at a semiconductor company, developing real-time video processing algorithms. Even before that, he completed an MSc in Applied Mathematics (MVA) at Ecole Normale Superieure de Cachan and a BA and MS at UPC (Universitat Politecnica de Catalunya, Barcelona) in both Mathematics and Telecommunication Engineering. For his research contributions, he has been awarded a Sloan Research Fellowship (2018), an NSF CAREER Award (2019), a best paper award at ICMLA (2018) and the IAA Outstanding Paper Award.

Aristides
GIONIS
Keynote 1
Opinion dynamics in online social networks: models and computational methods
Online social networks are widely used nowadays by people to engage in conversations about a variety of topics. Over time, these discussions can have a significant impact on people's opinions. In this talk we present an overview of models that have been proposed in the literature to capture how information spreads and how opinions form in online social media. One of our objectives is to obtain a better understanding of adverse social phenomena, such as increased polarization and the creation of filter bubbles. We then present some of the computational challenges that have arisen recently in this domain. In particular, we discuss mediation strategies for maximizing the diversity of the content users are exposed to, via recommendations and feed prioritization, in order to reduce polarization. Finally, we study the question of whether an adversary can sow disagreement in a social network by influencing the opinions of a small set of users.
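For concreteness, one widely studied opinion-formation model in this literature (given here as background; the talk surveys several models) is the Friedkin-Johnsen model: each user i holds a fixed innate opinion s_i and an expressed opinion z_i that is repeatedly averaged with the opinions of neighbors,

\[
z_i^{(t+1)} \;=\; \frac{s_i + \sum_{j \in N(i)} w_{ij}\, z_j^{(t)}}{1 + \sum_{j \in N(i)} w_{ij}},
\]

whose equilibrium is z^{*} = (I + L)^{-1} s, with L the weighted graph Laplacian. Polarization and disagreement measures, and the effect of interventions such as feed re-ranking or adversarial opinion shifts, can then be expressed in terms of z^{*}.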
Aristides Gionis is a WASP professor at KTH Royal Institute of Technology and an adjunct professor at Aalto University. He works in algorithms, data mining, graph mining, and social-network analysis.

Helen
MARGETTS
Keynote 4
Developing Artificial Intelligence for the Public Good
Most Artificial Intelligence is developed by and for the private sector. This talk will focus on what can happen when we think about AI from a public sector perspective. How can AI be used to improve policymaking, public services and governance? What are the 'wicked' public policy problems that AI might help to solve? Drawing on research underway at the Public Policy Programme at The Alan Turing Institute for Data Science and AI in the UK, the talk will explain the tasks for which data science and AI are particularly suited. It will show how the use of these data-driven technologies can foster government innovation, optimise resource allocation and highlight longstanding injustices in public decision-making. Developing and using AI in the public sector might help to make governments more efficient, effective, fair and resilient than ever before.
Helen Margetts is Professor of Society and the Internet and Professorial Fellow at Mansfield College. She is a political scientist specialising in the relationship between digital technology and government, politics and public policy. She is an advocate for the potential of multi-disciplinarity and computational social science for our understanding of political behaviour and development of public policy in a digital world. She has published over a hundred books, articles and policy reports in this area, including Political Turbulence: How Social Media Shape Collective Action (with Peter John, Scott Hale and Taha Yasseri, 2015); Paradoxes of Modernization (with Perri 6 and Christopher Hood, 2010); Digital Era Governance (with Patrick Dunleavy, 2006, 2008); and The Tools of Government in the Digital Age (with Christopher Hood, 2007).
Since 2018, Helen has been Director of the Public Policy Programme at The Alan Turing Institute, the UK’s national institute for data science and artificial intelligence. The programme works with policy-makers to research and develop ways of using data science and AI to improve policy-making and service provision, foster government innovation and establish an ethical framework for the use of data science in government. The programme comprises over 25 research projects involving 60 researchers across 10 universities. As well as being programme director, Helen is theme lead for criminal justice in the AI for Science and Government programme and principal investigator on the research projects Hate Speech: Measures and Counter-measures, Social Information and Public Opinion, and Political Volatility.

Balaji
PADMANABHAN
Keynote 2
From Artificial Intelligence to Augmented Intelligence
This talk will first present an overview of Artificial Intelligence over the years leading to where we are today, and use that historical context to discuss some of the successes and failures seen along the way. With this backdrop, we will set the stage for designing for “augmented intelligence”, not just artificial intelligence, where the aim is to tackle more complex business and societal problems (than, say, image recognition) using a combination of data, algorithms and people. In addition to providing an overview of some important recent work in this context, we will present a complex-systems perspective on this issue, and show how such a perspective can be useful to design, develop, evaluate and refine newer augmented intelligence methods going forward.
Balaji Padmanabhan is the Anderson Professor of Global Management at USF’s Muma College of Business, where he is also the Director of the Center for Analytics & Creativity. He has a Bachelor's degree in Computer Science from Indian Institute of Technology (IIT) Madras and a PhD from New York University (NYU)’s Stern School of Business. He has worked in the data science, AI/machine learning and business analytics areas for 25 years. He has published in data science and related areas at premier journals and conferences in the field and has served on the editorial board of leading journals including Management Science, MIS Quarterly, INFORMS Journal on Computing, Information Systems Research, Big Data, ACM Transactions on MIS and the Journal of Business Analytics. He also works extensively with businesses on data science problems, and has advised over twenty firms in a variety of industries through consulting, executive teaching and research partnerships.
Tutorial Speakers

Mitali
BANERJEE
Tutorial 3A
Image Recognition Using Deep-Learning: Implementation and Application
This three-hour module will offer a hands-on introduction to deep-learning-based image recognition tools. Participants will gain familiarity with preparing and importing images into software (Python) and applying one of the foundational deep learning architectures to classify the images and create vector representations. We will discuss different applications of the output of deep learning tools to extract managerial and scientific insights. In particular, the course will discuss applications of these tools to creating large-scale measures of quantities that have otherwise proven elusive to measure or susceptible to measurement bias.
Pre-requisites:
- Basic knowledge of linear algebra is helpful but not required.
- Basic knowledge of Python (e.g. libraries such as pandas and numpy) is helpful but not required.
- Basic familiarity with standard OLS regression models. You should be familiar with what it means to estimate relationships between variables using OLS models.
- A Gmail account is required to open the Google Colab notebooks, which will be shared before the class.
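As a taste of what the hands-on part covers, here is a minimal sketch of such a pipeline: classifying an image with a pretrained convolutional network and extracting a vector representation. The use of PyTorch/torchvision, the ResNet-18 architecture and the file name are illustrative assumptions, not necessarily the exact tools used in the tutorial.

# Minimal sketch: classify one image and extract an embedding with a pretrained CNN.
# Assumes a recent torchvision and an image file "photo.jpg" (both illustrative choices).
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

img = preprocess(Image.open("photo.jpg").convert("RGB")).unsqueeze(0)  # shape (1, 3, 224, 224)

with torch.no_grad():
    logits = model(img)                       # scores over the 1000 ImageNet classes
    predicted_class = logits.argmax(dim=1)    # top-1 label index

# Vector representation: activations just before the final classification layer.
embedding_net = torch.nn.Sequential(*list(model.children())[:-1])
with torch.no_grad():
    embedding = embedding_net(img).flatten(1)  # shape (1, 512)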

Isabelle
BLOCH
Tutorial 3B
Hybrid Artificial Intelligence and Image Understanding
The tutorial will review a few methods for symbolic AI, for knowledge representation and reasoning, and show how they can be combined with learning approaches for image understanding. Examples in medical image understanding will illustrate the talk.

Rémi
FLAMARY
Tutorial 1B
Optimal Transport for Machine Learning
This tutorial aims at presenting the mathematical theory of optimal transport (OT) and providing a global view of the potential applications of this theory in machine learning, signal and image processing and biomedical data processing.
The first part of the tutorial will present the theory of optimal transport and the associated optimization problems, through the original formulation of Monge and the Kantorovich formulation in its primal and dual forms. The algorithms used to solve these problems will be discussed and illustrated on simple examples. We will also introduce the OT-based Wasserstein distance and Wasserstein barycenters, which are fundamental tools in the processing of histograms. Finally, we will present recent developments in regularized OT that bring efficient solvers and more robust solutions.
The second part of the tutorial will present numerous recent applications of OT in the fields of machine learning, signal processing and biomedical imaging. We will see how the mapping inherent to optimal transport can be used to perform domain adaptation and transfer learning. Finally, we will discuss the use of OT on empirical datasets, with applications in generative adversarial networks, unsupervised learning and the processing of structured data such as graphs.
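For reference, here is a compact and standard statement of the objects mentioned above, with illustrative notation. Monge's problem seeks a map T pushing \mu onto \nu that minimizes \int c(x, T(x))\,d\mu(x); Kantorovich relaxes it to couplings,

\[
\mathrm{OT}_c(\mu,\nu) \;=\; \min_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\, d\pi(x,y),
\qquad
W_p(\mu,\nu) \;=\; \mathrm{OT}_{\|x-y\|^p}(\mu,\nu)^{1/p},
\]

and the entropically regularized problem, solved efficiently by Sinkhorn iterations, is

\[
\min_{\pi \in \Pi(\mu,\nu)} \int c\, d\pi \;+\; \varepsilon\, \mathrm{KL}\big(\pi \,\|\, \mu \otimes \nu\big).
\]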

Alexandre
GRAMFORT
Tutorial 2B
Supervised learning on multivariate brain signals
Understanding how the brain works in healthy and pathological conditions is considered one of the major challenges of the 21st century. After the first electroencephalography (EEG) measurements in 1929, the 1990s saw the birth of modern functional brain imaging with the first functional MRI (fMRI) and full-head magnetoencephalography (MEG) systems. Today, new tech companies are developing consumer-grade devices for at-home recordings of neural activity. By offering unique noninvasive insights into the living brain, these technologies have started to revolutionize both clinical and cognitive neuroscience.
The availability of such new devices, made possible by pioneering breakthroughs in physics and engineering, now poses major computational and statistical challenges, for which machine learning currently plays a major role. In this course you will discover hands-on the types of data one can collect to record the living brain. Then you will learn about state-of-the-art supervised machine learning approaches for EEG signals in the clinical context of sleep stage classification as well as brain-computer interfaces. The ML techniques explored are based on deep learning as well as Riemannian geometry, which has proven very powerful for classifying EEG data. You will do so with MNE-Python (https://mne.tools), which has become a reference tool to process MEG/EEG/sEEG/ECoG data in Python, as well as the scikit-learn library (https://scikit-learn.org). For the deep learning part you will use the Braindecode package (https://braindecode.org), based on PyTorch. The teaching will be done hands-on using Jupyter notebooks and public datasets, which you will be able to run on Google Colab.
Finally, this tutorial will be a unique opportunity to see what ML can offer beyond standard applications such as computer vision, speech or NLP.
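As a small taste of the Riemannian approach mentioned above, here is a minimal sketch of an EEG classification pipeline built with scikit-learn. The pyriemann package and the synthetic data are illustrative assumptions (the tutorial itself relies on MNE-Python, scikit-learn and Braindecode).

# Minimal sketch: Riemannian-geometry-based EEG classification with scikit-learn.
# Synthetic data stands in for real band-passed EEG epochs.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from pyriemann.estimation import Covariances
from pyriemann.tangentspace import TangentSpace

rng = np.random.default_rng(0)
n_epochs, n_channels, n_times = 200, 8, 256
X = rng.standard_normal((n_epochs, n_channels, n_times))  # stand-in for EEG epochs
y = rng.integers(0, 2, size=n_epochs)                      # e.g., sleep stage or BCI class labels

# Covariance matrices live on the SPD manifold; project them to its tangent space,
# then apply an ordinary linear classifier.
clf = make_pipeline(Covariances(estimator="oas"), TangentSpace(), LogisticRegression(max_iter=1000))
print(cross_val_score(clf, X, y, cv=5).mean())  # ~0.5 on random data, higher on real EEG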

Julien
GRAND-CLEMENT
Tutorial 5A
Decision-making Under Uncertainty
The goal of this tutorial is to understand how uncertainty impacts classical decision-making models and the operational and business consequences. Any decision model that is data-driven may face uncertainty due to errors in the data, in the modeling assumptions, or due to the inherent randomness of the decision process. Overlooking this uncertainty may lead to decisions that are suboptimal, unreliable, or, in some crucial applications, practically infeasible and dangerous for the users. In this tutorial, we will learn to (1) estimate the uncertainty given a decision problem and a dataset, and (2) mitigate the impact of uncertainty with a robust approach. As an application, a robust portfolio management problem will be investigated in detail, though we will see that the problem of uncertainty arises in many (if not most) real decision settings.
This tutorial is structured as follows:
1. How to estimate the uncertainty in a decision model?
   1.a. Motivating examples: what is the practical impact of uncertainty?
      1.a.i. Incorrect image classification, variability in demand for supply chains, artificial intelligence in healthcare, Tesla autonomous driving, robotics, maintenance, inventory optimization, facility location, project management, etc.
      1.a.ii. Introduction of the running example: portfolio management.
   1.b. Understanding the origin of the uncertainty: poor data, little data, uncertainty inherent to the application. When do we need to take it into account?
   1.c. Risk-sensitive decisions vs. parameter uncertainty.
   1.d. How to estimate the uncertainty? Examples with Colab simulations and synthetic data for the portfolio management problem.
2. How to mitigate the impact of uncertainty in practice? Robust portfolio management.
   2.a. Deterministic approach: pessimism in parameter estimates (a toy illustration appears after the prerequisites below).
   2.b. Robust and distributionally robust approaches: how to obtain decisions with performance guarantees.
   2.c. Evidence from Colab simulations: trade-offs between nominal and worst-case performance for the portfolio management problem. How to deal with variability?
   2.d. (Time permitting) Two-stage decision-making: how to act when uncertainty is revealed over time?
Prerequisites:
- Basic knowledge of statistics (means, confidence intervals, quantiles). Knowing linear programming is a plus. For the simulations, all code will be in Python, and a Colab notebook will be available for the participants, with some pre-coded examples.
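The toy illustration referenced in item 2.a above, as a minimal sketch: mean-variance weights computed from noisy estimates versus a pessimistic variant that shrinks the estimated means by a confidence margin. All numbers and the shrinkage rule are illustrative, not the tutorial's exact model.

# Minimal sketch: nominal vs. pessimistic ("robust") mean-variance portfolio choice.
import numpy as np

rng = np.random.default_rng(1)
true_mu = np.array([0.05, 0.06, 0.04, 0.07])    # true mean returns (illustrative)
true_cov = 0.04 * np.eye(4) + 0.01              # true return covariance (illustrative)
n_obs = 60                                      # small sample, hence noisy estimates
gamma = 5.0                                     # risk-aversion coefficient

returns = rng.multivariate_normal(true_mu, true_cov, size=n_obs)
mu_hat = returns.mean(axis=0)
cov_hat = np.cov(returns, rowvar=False)

def mv_weights(mu, cov):
    # Unconstrained mean-variance optimum: w = (1/gamma) * cov^{-1} mu.
    return np.linalg.solve(gamma * cov, mu)

def true_utility(w):
    # Mean-variance utility evaluated under the *true* parameters.
    return w @ true_mu - 0.5 * gamma * w @ true_cov @ w

# Nominal decision: trust the point estimates.
w_nominal = mv_weights(mu_hat, cov_hat)

# Pessimistic decision (item 2.a): shrink each estimated mean toward zero by its
# confidence margin before optimizing (the worst case of a box uncertainty set for long positions).
margin = 1.96 * returns.std(axis=0, ddof=1) / np.sqrt(n_obs)
mu_pess = np.sign(mu_hat) * np.maximum(np.abs(mu_hat) - margin, 0.0)
w_robust = mv_weights(mu_pess, cov_hat)

print("true utility, nominal:", true_utility(w_nominal))
print("true utility, robust: ", true_utility(w_robust))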

Johan
HOMBERT
Tutorial 1A
Data in Finance: FinTech Lending
This tutorial includes a short lecture followed by an interactive game in which participants play the role of a FinTech lender. Context: Banks and insurers increasingly use alternative data and machine learning to screen consumers and price products. For example, a FinTech using digital footprints to predict default will have a competitive edge over traditional banks. However, there are important pitfalls to avoid when using alternative data and machine learning to score consumers, such as the winner’s curse, the risk of discrimination and the Lucas critique. This tutorial and its interactive game provide an introduction to these issues.
Pre-requisites:
- Multivariate statistical analysis, in particular OLS / logit regressions, and/or machine learning methods
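For orientation, here is a minimal sketch of the kind of scoring model the game revolves around: a logit regression mapping simple "digital footprint" features to a predicted default probability. Features, coefficients and data are synthetic and purely illustrative.

# Minimal sketch: a logit default-scoring model on synthetic "digital footprint" features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    rng.integers(0, 2, n),          # e.g., mobile vs. desktop application
    rng.integers(0, 2, n),          # e.g., email from a paid provider
    rng.normal(0, 1, n),            # e.g., standardized time-of-day of application
])
logit = -2.0 + 0.8 * X[:, 0] - 0.6 * X[:, 1] + 0.3 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))      # 1 = default (synthetic ground truth)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]          # predicted default probabilities
print("AUC:", roc_auc_score(y_test, scores))        # discriminatory power of the score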

Winston
MAXWELL
Tutorial 2A
Operationalizing AI Regulation
How will Europe’s future AI regulation impact the design, testing and use of AI applications such as credit scoring, recruitment algorithms, anti-fraud algorithms and facial recognition? We will explore how AI concepts such as explainability, fairness, accuracy, robustness and human oversight will be implemented into the future regulation, and how the regulation compares to other international standards on trustworthy AI. The course will focus on two concrete use cases, facial recognition and credit scoring, to see how the European regulatory framework would apply throughout the lifecycle of the project, walking students through the process of creating a risk management system, including an impact assessment on potential risks for safety and fundamental rights, developing a list of requirements, testing, performance parameters, documentation, and human oversight mechanisms. We’ll explore the potential friction between the European AI Act and other regulatory frameworks such as the European General Data Protection Regulation (GDPR), and lead a debate on how the future regulation will impact AI innovation and research in Europe.

Klaus
MILLER
Tutorial 4A
Impact of Privacy Regulation on Online Advertising Market: GDPR in Europe
We will discuss the impact of privacy regulation on the online advertising market, focusing on the case of the European Union’s General Data Protection Regulation (GDPR). Specifically, participants of this tutorial will learn: (1) Why and how the European General Data Protection Regulation (GDPR) impacts the online advertising market, particularly advertisers, publishers and users. (2) How advertisers and publishers leverage users’ personal data to pursue their goals. (3) Which aspects of the GDPR are most relevant for advertisers, publishers and users. (4) How complex it is to go through the process of obtaining user permission for personal data processing, and how IAB’s Transparency and Consent Framework (TCF) intends to help. (5) How many firms a publisher provides with access to its users’ data, and how long it takes a user to respond to all permission requests. (6) Which developments are taking place with regard to personal data processing among players in the online advertising industry, as well as among regulators and consumer protection agencies. Anyone interested in learning how and why the online advertising industry benefits from using personal data, and how the GDPR impacts this practice, should attend this tutorial. The tutorial is based on the book “The Impact of the General Data Protection Regulation (GDPR) on the Online Advertising Market”, available free of charge at www.gdpr-impact.com.
Pre-requisites:
- Read Chapters 1 and 2 of the referenced book, available at gdpr-impact.com.
- An installed version of base R and RStudio, for the empirical analysis of cookie data.

Krikamol
MUANDET
Tutorial 6B
Reliable Decision Making and Causal Inference with Kernels
Data-driven decision-making tools have become increasingly prevalent in society today, with applications in critical areas like health care, economics, education, and the justice system. To ensure reliable decisions, it is essential that the models learn from data the genuine (i.e., causal) relationships between the outcomes and the decision variables, rather than mere correlations. In this tutorial, I will first give an introduction to the causal inference problem from a machine learning perspective, including causal discovery, treatment effect estimation, instrumental variables (IV), and proxy variables. Then, I will review recent developments in how we can leverage machine learning (ML) based methods, especially modern kernel methods, to tackle some of these problems.
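As one simple, hedged illustration of kernels in this setting (an outcome-model approach to treatment effect estimation, not necessarily one of the estimators covered in the tutorial), one can fit a kernel ridge regression per treatment arm and average their difference:

# Minimal sketch: kernel-based treatment effect estimation on synthetic, randomized data.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(-2, 2, size=(n, 1))                      # covariates
T = rng.binomial(1, 0.5, size=n)                         # randomized binary treatment
tau = 1.0 + 0.5 * X[:, 0]                                # true heterogeneous effect
Y = np.sin(X[:, 0]) + tau * T + rng.normal(0, 0.3, n)    # outcome

# One kernel ridge regression per treatment arm (a "two-model" outcome approach).
f1 = KernelRidge(kernel="rbf", alpha=0.1, gamma=1.0).fit(X[T == 1], Y[T == 1])
f0 = KernelRidge(kernel="rbf", alpha=0.1, gamma=1.0).fit(X[T == 0], Y[T == 0])

ate_hat = np.mean(f1.predict(X) - f0.predict(X))          # average treatment effect estimate
print("estimated ATE:", ate_hat, " true ATE:", tau.mean())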

Geoffroy
PEETERS
Tutorial 5B
Learning for audio signals
As in many fields, deep neural networks have allowed important advances in the processing of audio signals. In this tutorial, we review the specificities of these signals, elements of audio signal processing (as used in the traditional machine-learning approach) and how deep neural networks (in particular convolutional ones) can be used to perform feature learning, either without prior knowledge (1D convolutions, TCN) or using prior knowledge (source/filter models, auto-regressive models, HCQT, SincNet, DDSP).
We then review the dominant DL architectures, meta-architectures and training paradigms (classification, metric learning, supervised, unsupervised, self-supervised, semi-supervised) used in audio.
We illustrate their use in key applications in music and environmental sound processing: sound event detection, localization, auto-tagging, source separation and generation.
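To make the "without prior knowledge" branch concrete, here is a minimal PyTorch sketch of a 1D convolutional tagger operating directly on the waveform; the architecture sizes and the multi-label head are illustrative choices, not the tutorial's reference model.

# Minimal sketch: a tiny 1D CNN mapping a raw waveform to tag probabilities.
import torch
import torch.nn as nn

class TinyAudioTagger(nn.Module):
    def __init__(self, n_tags=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=8), nn.ReLU(),   # learned filterbank on the waveform
            nn.Conv1d(16, 32, kernel_size=16, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                                  # global temporal pooling
        )
        self.head = nn.Linear(32, n_tags)

    def forward(self, wav):                  # wav: (batch, 1, n_samples)
        z = self.encoder(wav).squeeze(-1)    # (batch, 32)
        return torch.sigmoid(self.head(z))   # independent tag probabilities (multi-label)

model = TinyAudioTagger()
wav = torch.randn(4, 1, 16000)               # four one-second clips at 16 kHz (random stand-in)
print(model(wav).shape)                      # torch.Size([4, 10])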

Bilal
PIOT
Tutorial 4B
Introduction to deep reinforcement learning
Be it on Atari games, Go, Chess, StarCraft II or Dota, Deep Reinforcement Learning (DRL) has opened up Reinforcement Learning to a variety of large-scale applications. While it could formally appear as a straightforward extension of reinforcement learning to deep-learning-based function approximation, DRL often involves more than simply plugging the newest deep learning architecture into the best theoretical reinforcement learning method. In this tutorial, we will journey through the recent history of DRL, from the now-seminal Neural Fitted Q-iteration to the most popular Deep Q-Network (DQN). Alongside the lecture, the practical session will revolve around implementing and testing DRL algorithms in JAX and Haiku on simple environments.
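As a preview of the practical session, here is a minimal sketch of the core of DQN in JAX and Haiku: a Q-network and its temporal-difference loss on a batch of transitions. Shapes, hyperparameters and the toy batch are placeholders; the replay buffer, target-network updates and the environment loop are omitted.

# Minimal sketch: a Haiku Q-network and the DQN temporal-difference loss.
import jax
import jax.numpy as jnp
import haiku as hk

N_ACTIONS = 4
GAMMA = 0.99

def q_fn(obs):
    return hk.nets.MLP([64, 64, N_ACTIONS])(obs)      # Q(s, .) for every action

q = hk.without_apply_rng(hk.transform(q_fn))

def dqn_loss(params, target_params, obs, actions, rewards, next_obs, dones):
    q_values = q.apply(params, obs)                                   # (batch, N_ACTIONS)
    q_taken = jnp.take_along_axis(q_values, actions[:, None], axis=1)[:, 0]
    next_q = q.apply(target_params, next_obs).max(axis=1)             # bootstrapped value
    target = rewards + GAMMA * (1.0 - dones) * jax.lax.stop_gradient(next_q)
    return jnp.mean((q_taken - target) ** 2)                           # squared TD error

# Toy batch of transitions, just to show the shapes.
rng = jax.random.PRNGKey(0)
obs = jax.random.normal(rng, (32, 8))
params = q.init(rng, obs)
target_params = params
actions = jnp.zeros(32, dtype=jnp.int32)
rewards = jnp.ones(32)
next_obs = obs
dones = jnp.zeros(32)
grads = jax.grad(dqn_loss)(params, target_params, obs, actions, rewards, next_obs, dones)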

David
RESTREPO AMARILES
Tutorial 2A
Operationalizing AI Regulation
How will Europe’s future AI regulation impact the design, testing and use of AI applications such as credit scoring, recruitment algorithms, anti-fraud algorithms and facial recognition? We will explore how AI concepts such as explainability, fairness, accuracy, robustness and human oversight will be implemented into the future regulation, and how the regulation compares to other international standards on trustworthy AI. The course will focus on two concrete use cases, facial recognition and credit scoring, to see how the European regulatory framework would apply throughout the lifecycle of the project, walking students through the process of creating a risk management system, including an impact assessment on potential risks for safety and fundamental rights, developing a list of requirements, testing, performance parameters, documentation, and human oversight mechanisms. We’ll explore the potential friction between the European AI Act and other regulatory frameworks such as the European General Data Protection Regulation (GDPR), and lead a debate on how the future regulation will impact AI innovation and research in Europe.

Corentin
TALLEC
Tutorial 4B
Introduction to deep reinforcement learning
Be it on Atari games, Go, Chess, StarCraft II or Dota, Deep Reinforcement Learning (DRL) has opened up Reinforcement Learning to a variety of large-scale applications. While it could formally appear as a straightforward extension of reinforcement learning to deep-learning-based function approximation, DRL often involves more than simply plugging the newest deep learning architecture into the best theoretical reinforcement learning method. In this tutorial, we will journey through the recent history of DRL, from the now-seminal Neural Fitted Q-iteration to the most popular Deep Q-Network (DQN). Alongside the lecture, the practical session will revolve around implementing and testing DRL algorithms in JAX and Haiku on simple environments.

Aluna
WANG
Tutorial 6A
Intelligent Risk Management: Graph-Based Anomaly Detection Using the MDL Principle
Risk management encompasses the identification, analysis, and response to risk factors arising over the life of a business. Recognizing patterns and detecting anomalies in big data can be critical to effective risk management. While numerous technologies for spotting anomalies in collections of multi-dimensional data points have been developed in the past years, anomaly detection techniques for structured graph data have lately become a focus. Why do we need to use graph-based approaches to anomaly detection? What are some of the high-impact applications of graph-based anomaly detection in risk management? How can we develop and deploy graph-based anomaly detection techniques for financial transaction data? This short course answers the above questions by introducing two general, scalable, and explainable anomaly detection models, with a focus on the use of graphs and the minimum description length (MDL) principle. The course also discusses how to deploy these techniques and use them for risk management.
Prerequisites:
- Basic knowledge of Python and Jupyter Notebook
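As a heavily simplified, hedged illustration of the MDL idea (not the course's models), one can score each node's egonet by its encoding cost in bits under a single global random-graph model and flag the most "expensive" neighborhoods; networkx and the example graph are illustrative choices.

# Toy illustration: encoding cost of each node's egonet under a global Erdos-Renyi model.
import math
import networkx as nx

G = nx.karate_club_graph()                                 # small public example graph
n = G.number_of_nodes()
p = G.number_of_edges() / (n * (n - 1) / 2)                # global edge probability (the "model")

def egonet_code_length(G, v, p):
    """Bits needed to encode the presence/absence of every possible edge in v's egonet under p."""
    nodes = list(G.neighbors(v)) + [v]
    k = len(nodes)
    possible = k * (k - 1) / 2
    present = G.subgraph(nodes).number_of_edges()
    absent = possible - present
    return -(present * math.log2(p) + absent * math.log2(1 - p))

scores = {v: egonet_code_length(G, v, p) for v in G.nodes}
# Nodes whose neighborhoods are most "expensive" to describe under the global model.
print(sorted(scores, key=scores.get, reverse=True)[:3])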
Panelists

Grégory
BOUTTE
HI! PARIS Corporate Donor Representative of KERING
Industry Panel
Chief Client & Digital Officer, KERING

Nathalie
BRUNELLE
HI! PARIS Corporate Donor Representative of TOTALEnergies
Industry Panel
Director of the TotalEnergies Paris-Saclay project, TotalEnergies

David
CRESSEY
HI! PARIS Corporate Donor Representative of L'Oréal
Industry Panel
Global Head of BeautyTech Accelerators - AI & Data - Solutions at L'Oréal

Guillaume
DUBRULE
HI! PARIS Corporate Donor Representative of REXEL
Industry Panel
Group Purchasing and Supplier Relationship Director, Rexel

François
LEMAISTRE
HI! PARIS Corporate Donor Representative of Vinci
Industry Panel
Managing Director of VINCI Energies

Valérie
PERIRHIN
HI! PARIS Corporate Donor Representative of Capgemini
Industry Panel
Global Head of Insights-Driven Enterprise at Capgemini Invent France