Incentive Compatible and Incentive-Aware Learning

Tutorial at EC2020

Pre-recording dates: June 17th and June 18th

Organizers: Nika Haghtalab, Chara Podimata





Motivation

Machine Learning has revolutionized our everyday lives. All of us rely on complex algorithms that learn our preferences over products, services, and even news articles. The deployed algorithms are typically carefully vetted, with theoretical performance guarantees provided against both stochastic and adversarial corruptions of the data. But what if these models of attacks fail to fully capture the goals and desires of real-life attackers? Consider the spread of fake news, for example. The agents spreading fake news try to take advantage of the way online news-serving algorithms work: bots strategically boost a fake news article so that more users are exposed to it. These carefully crafted strategic manipulations should be handled differently from standard adversarial attacks, whose goal is simply to degrade the news algorithms. Similar strategic incentives arise in many other scenarios, such as classification algorithms used for evaluation and algorithms for Stackelberg Security Games.

Such problems, situated at the interface of Algorithmic Game Theory and Machine Learning, present a unique challenge, but also a unique opportunity, for modern researchers. The goal of this tutorial is to bring this vibrant research agenda, which bridges Economics and Machine Learning, to the forefront of the EC community by presenting a number of results published in Machine Learning and Algorithmic Game Theory venues.

Schedule

Pre-recording Part I:    Wednesday, June 17, 1:30-2:15pm and 2:30-3:15pm Eastern
Exercise Session:        Wednesday, June 17, 3:30-4:30pm Eastern
Pre-recording Part II:   Thursday, June 18, 1:30-2:15pm and 2:30-3:15pm Eastern

Watch-Party Part I (password on EC20 registration):    Monday, July 13, 10-10:45am and 11-11:45am Eastern
Watch-Party Part II (password on EC20 registration):   Monday, July 13, 3-3:45pm and 4-4:45pm Eastern

Tutorial Materials

Part I

Part IA:     slides video

Part IB:     slides video

Part II

Part IIA:     slides video

Part IIB:     slides video

Overview -- Part I

  • Introduction.
    We introduce several important real-life settings where strategic manipulation by the agents affects the deployment of machine learning algorithms. We discuss how the two types of noise normally considered in Machine Learning (namely, stochastic and adversarial noise) fail to capture the incentives of human data sources. Drawing intuition from the Algorithmic Game Theory literature, we introduce three axes for learning in economic settings: the learning task, the agents' utility functions, and the agents' behavior. These axes help map which areas have already received substantial attention and where space for future work remains.
  • Primer on Machine Learning.
    We start this primer by explaining learning in offline settings and providing the key definitions of PAC-learnability and VC-dimension. Offline settings are centered around the assumption that there exists a static dataset on which we want to train and test our machine learning algorithms. We next introduce the online learning paradigm, which makes no such assumption and for that reason may be better suited to dynamic, real-life economic settings. In the next part of the primer, we provide an overview of some key tools from online learning. We start by describing the notions of external and Stackelberg regret, which are the performance measures used in strategic online learning settings; the two notions capture different levels of strategic thinking on the part of the agents (both are written out formally right after this list). Our goal here is to give examples of settings where external and Stackelberg regret are used, to explain their differences, and to stress that they can be strongly incompatible in strategic settings (i.e., there are settings in which any sequence of actions with sublinear external regret suffers linear Stackelberg regret, and vice versa). For the final part of the primer, we focus on the differences between full- and partial-information settings and the algorithms that can tackle each. We present two well-known full-information algorithms, namely Hedge and Follow-the-Perturbed-Leader, and then explain how a partial-information algorithm (namely, EXP3) can be derived from Hedge; a minimal code sketch of this derivation also follows this list. We conclude by outlining information models that lie in between full and partial information and play a central role in online learning in economic settings.
  • The role of truthfulness in Machine Learning.
    We present two approaches to regret minimization with expert advice and highlight the role of truthfulness. First, we consider an online learning-from-expert-advice setting where the "experts" are the actions of a truthful agent and discuss how to learn from these truthful actions. Examples of this include works on learning the parameters of truthful auctions. We then consider settings where the experts themselves are agents with incentives to misreport their losses in order to increase their leverage over the learning algorithm. Examples of this include works on forecasting and general multi-armed bandit problems. In this setting, we show how to design incentive-compatible online algorithms that achieve the same regret rate as online learning with no strategic considerations. In this part of the tutorial, we present techniques and ideas from our works (Dudik, Haghtalab, Luo, Schapire, Syrgkanis, Vaughan [6] and Freeman, Pennock, Podimata, Vaughan [7]), while drawing connections to the seminal works of Kalai and Vempala '05 ([1]) on regret minimization and Lambert, Langford, Vaughan, Chen, Reeves, Shoham, Pennock '08 ([2]) regarding strict incentive-compatibility of weighted score wagering mechanisms.
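
For reference, the two regret notions contrasted above can be written as follows; the exact formalization varies across the papers covered in the tutorial, so treat this as one standard way to state them rather than the tutorial's formal definitions. Here $x_t$ is the learner's committed action at round $t$, $v_t$ is the round-$t$ agent's type, $b(x, v_t)$ is that agent's best response to a committed action $x$, and $a_t = b(x_t, v_t)$ is the realized response.

    External regret (benchmark evaluated against the realized responses):
    \[ R_{\mathrm{ext}}(T) = \sum_{t=1}^{T} \ell(x_t, a_t) - \min_{x \in \mathcal{X}} \sum_{t=1}^{T} \ell(x, a_t) \]

    Stackelberg regret (benchmark evaluated against the responses agents would have played to the fixed action $x$):
    \[ R_{\mathrm{Stack}}(T) = \sum_{t=1}^{T} \ell\big(x_t, b(x_t, v_t)\big) - \min_{x \in \mathcal{X}} \sum_{t=1}^{T} \ell\big(x, b(x, v_t)\big) \]

The incompatibility mentioned above stems from the fact that the two benchmarks face different response sequences: a fixed action $x$ may look poor against the responses triggered by the learner's actual play, yet excellent against the responses it would itself have induced, and vice versa.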

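The primer's path from Hedge to EXP3 can also be seen directly in code. The sketch below is our own minimal illustration in Python (the Bernoulli losses and the learning-rate choices are illustrative assumptions, not tutorial material): Hedge performs a multiplicative-weights update with the full loss vector, while EXP3 runs the same update on an importance-weighted estimate built from the single observed loss, mixed with a little uniform exploration.

import numpy as np

rng = np.random.default_rng(0)

def hedge(loss_matrix, eta):
    """Full-information Hedge: observes the entire loss vector each round."""
    T, K = loss_matrix.shape
    weights = np.ones(K)
    total_loss = 0.0
    for t in range(T):
        probs = weights / weights.sum()
        action = rng.choice(K, p=probs)
        total_loss += loss_matrix[t, action]
        # Multiplicative-weights update with the full loss vector.
        weights *= np.exp(-eta * loss_matrix[t])
    return total_loss

def exp3(loss_matrix, eta, gamma):
    """Bandit EXP3: observes only the chosen action's loss and feeds the same
    multiplicative update an unbiased, importance-weighted loss estimate."""
    T, K = loss_matrix.shape
    weights = np.ones(K)
    total_loss = 0.0
    for t in range(T):
        probs = (1 - gamma) * weights / weights.sum() + gamma / K
        action = rng.choice(K, p=probs)
        observed = loss_matrix[t, action]
        total_loss += observed
        estimate = np.zeros(K)
        estimate[action] = observed / probs[action]  # unbiased estimate of the loss vector
        weights *= np.exp(-eta * estimate)
    return total_loss

# Toy experiment: i.i.d. Bernoulli losses, arm 0 slightly better than the rest.
T, K = 5000, 5
means = np.array([0.3, 0.5, 0.5, 0.5, 0.5])
losses = (rng.random((T, K)) < means).astype(float)
best_fixed = losses.sum(axis=0).min()
print("Hedge regret:", hedge(losses, eta=np.sqrt(np.log(K) / T)) - best_fixed)
print("EXP3  regret:", exp3(losses, eta=np.sqrt(np.log(K) / (K * T)), gamma=0.05) - best_fixed)

With losses in [0, 1], the only substantive change between the two routines is the loss estimate that feeds the multiplicative update; this mirrors the derivation of EXP3 from Hedge described above.
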
Overview -- Part II

  • Incentive-Aware Learning.
    We next relax the strict desideratum of truthfulness and discuss dynamic settings modeled as repeated Stackelberg games. We begin with a brief overview of Stackelberg Security Games and explain why machine learning algorithms are of great importance for real-life instances of these games. We discuss the works of Blum, Haghtalab, Procaccia '14 ([3]) and Balcan, Blum, Haghtalab, Procaccia '15 ([4]), which showcase how the incentive structure of the attackers induces a nice geometric interpretation, on which the authors base their proposed no-Stackelberg-regret algorithms. We next discuss how strategic classification can be modeled as a repeated Stackelberg game. We discuss the work of Chen, Liu, Podimata '20 ([5]) who, in the spirit of Balcan, Blum, Haghtalab, Procaccia '15 ([4]), design no-Stackelberg-regret algorithms based on the specific geometric structure of the strategic learning problem. We showcase their technique of adversarial adaptive domain discretization, whose usefulness stems from this geometric structure and which can be of interest even beyond strategic settings (a toy sketch combining discretization with a bandit algorithm appears after this list). We next highlight a relatively underexplored area in learning in economic settings: learning against agents that do not always behave rationally. We take two perspectives on this. First, based on Haghtalab, Immorlica, Lucier, Wang '20 ([8]), we discuss learning against boundedly rational agents in Stackelberg Security Games. Second, we discuss a new framework for modeling learning in settings where a small number of arbitrarily irrational agents exist, based on the work of Krishnamurthy, Lykouris, Podimata '20 ([9]). Lastly, we outline recent approaches to mitigating the negative effects of strategic classification by identifying social-welfare-improving classification mechanisms that incentivize particular types of behavior from the agents.
  • Future Directions.
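
To make the combination of geometric structure and discretization concrete, here is a small toy model of strategic classification written under our own simplifying assumptions (one-dimensional features, a known fixed manipulation budget, a uniform rather than adaptive grid); it illustrates the general recipe and is not the algorithm of [5]. The learner repeatedly commits to a threshold classifier, each agent best responds by moving just past the threshold whenever that is affordable, and the learner runs EXP3 over a grid of candidate thresholds. Stackelberg regret is measured against the best fixed threshold, with agents best responding to that threshold.

import numpy as np

rng = np.random.default_rng(1)

DELTA = 0.1                          # agents' manipulation budget (illustrative)
GRID = np.linspace(0.0, 1.0, 21)     # uniform discretization of the threshold space

def best_response(x, threshold):
    """An agent with true feature x moves just past the threshold whenever
    doing so is affordable and flips the classification to positive."""
    if x >= threshold:
        return x                     # already classified positive
    if threshold - x <= DELTA:
        return threshold             # cheapest manipulation that flips the label
    return x                         # manipulation too costly; stay put

def learner_loss(threshold, x, y):
    """0-1 loss of a committed threshold against the agent's best response."""
    prediction = 1 if best_response(x, threshold) >= threshold else 0
    return float(prediction != y)

def exp3_over_grid(agents, eta, gamma):
    """Bandit learning over the discretized thresholds (as in the EXP3 sketch above)."""
    K = len(GRID)
    weights = np.ones(K)
    total = 0.0
    for x, y in agents:
        probs = (1 - gamma) * weights / weights.sum() + gamma / K
        k = rng.choice(K, p=probs)
        loss = learner_loss(GRID[k], x, y)
        total += loss
        estimate = np.zeros(K)
        estimate[k] = loss / probs[k]  # importance-weighted loss estimate
        weights *= np.exp(-eta * estimate)
    return total

# Toy population: the true label is 1 iff the unmanipulated feature exceeds 0.6.
T = 3000
features = rng.random(T)
agents = [(x, int(x >= 0.6)) for x in features]

algorithm_loss = exp3_over_grid(agents, eta=0.05, gamma=0.05)
# Stackelberg benchmark: best fixed threshold, with agents best responding to it.
benchmark = min(sum(learner_loss(th, x, y) for x, y in agents) for th in GRID)
print("Stackelberg regret of EXP3 over the grid:", algorithm_loss - benchmark)

In this toy model the Stackelberg-optimal threshold is not the non-strategic one (0.6) but one shifted upward by the manipulation budget (0.7); accounting for best responses in this way is the kind of geometric structure that the algorithms discussed above exploit.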

References

  1. Efficient algorithms for online decision problems, Journal of Computer and System Sciences '05
    Adam Kalai, Santosh Vempala
  2. Self-Financed Wagering Mechanisms for Forecasting, EC '08
    Nicolas Lambert, John Langford, Jennifer Wortman Vaughan, Yiling Chen, Daniel Reeves, Yoav Shoham, David M. Pennock
  3. Learning Optimal Commitment to Overcome Insecurity, NeurIPS '14
    Avrim Blum, Nika Haghtalab, Ariel D. Procaccia
  4. Commitment Without Regrets: Online Learning in Stackelberg Security Games, EC '15
    Maria-Florina Balcan, Avrim Blum, Nika Haghtalab, Ariel D. Procaccia
  5. Learning Strategy-Aware Linear Classifiers, preprint '20
    Yiling Chen, Yang Liu, Chara Podimata
  6. Oracle-Efficient Learning and Auction Design, FOCS '17
    Miroslav Dudik, Nika Haghtalab, Haipeng Luo, Robert E. Schapire, Vasilis Syrgkanis, Jennifer Wortman Vaughan
  7. No-Regret and Incentive-Compatible Online Learning, ICML '20
    Rupert Freeman, David M. Pennock, Chara Podimata, Jennifer Wortman Vaughan
  8. Maximizing Welfare with Incentive-Aware Evaluation Mechanisms, IJCAI '20
    Nika Haghtalab, Nicole Immorlica, Brendan Lucier, Jack Wang
  9. Corrupted Multidimensional Binary Search: Learning in the Presence of Irrational Agents, preprint '20
    Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata