Title – Data-driven optimization: test score algorithms and distributionally robust approach
Abstract
The ever-increasing availability of data has motivated novel optimization models that incorporate uncertainty in problem parameters. Data-driven optimization frameworks integrate sophisticated statistical estimation components into optimization models, which often leads to computationally challenging formulations that demand new techniques to address scalability. For combinatorial optimization, algorithms must be adapted to the parameter estimation scheme while still providing near-optimal performance guarantees under uncertain parameters. For mathematical programming models, efficient solution methods are required to handle the added complexity of applying data-driven frameworks. In this talk, we discuss test score algorithms for stochastic utility maximization and distributionally robust chance-constrained programming.
Test score algorithms are based on carefully designed score metrics for evaluating individual items, called test scores, each defined as a statistic of observed individual item performance data. Algorithms based on individual item scores are practical when evaluating different combinations of items is difficult. We show that a natural greedy algorithm that selects items solely based on their test scores outputs solutions within a constant factor of the optimum for a broad class of utility functions. Our algorithms and approximation guarantees only require test scores to be noisy estimates of certain expected values with respect to the marginal distributions of individual item values, which makes our algorithms practical.

For the second part of the talk, we consider distributionally robust optimization (DRO) frameworks, which interpolate between traditional robust optimization and stochastic optimization, thereby providing a systematic way of hedging against ambiguity in the underlying probability distributions. In particular, we apply the DRO framework defined with the Wasserstein distance to chance-constrained programming (CCP), an optimization paradigm involving constraints that must be satisfied with high probability. We develop formulations by revealing hidden connections between the Wasserstein DRO framework and its nominal counterpart (the sample average approximation), and propose integer programming based solution methods. Our formulations significantly scale up the problem sizes that can be handled, reducing solution times from hours to seconds compared to existing formulations.
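To illustrate the first part of the talk, here is a minimal, hypothetical sketch of a test-score greedy selection: each item's test score is taken to be a simple statistic (the empirical mean) of its observed performance samples, and the algorithm picks the top-k items by score without ever evaluating a combination of items. The item names, data, and choice of statistic are illustrative assumptions, not the talk's specific construction.

```python
# Hypothetical sketch of test-score-based greedy selection.
# Assumption: the test score is the empirical mean of observed samples;
# the actual metrics in the talk are more carefully designed statistics.

def test_score(samples):
    """Test score of an item: a statistic of its observed performance data."""
    return sum(samples) / len(samples)

def greedy_select(item_samples, k):
    """Select the k items with the highest test scores.

    Note that no combination of items is ever evaluated jointly; items are
    ranked purely by their individual scores.
    """
    scores = {item: test_score(s) for item, s in item_samples.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy usage: three items with noisy observed performance values.
data = {
    "a": [0.9, 1.1, 1.0],
    "b": [0.2, 0.3, 0.1],
    "c": [0.7, 0.6, 0.8],
}
print(greedy_select(data, 2))  # prints ['a', 'c']
```

The point of the sketch is the practicality claim: because scores are per-item statistics, the algorithm sidesteps the combinatorial cost of evaluating subsets.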
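For the second part, the nominal counterpart mentioned above, the sample average approximation (SAA) of a chance constraint, can be sketched as follows: a candidate solution is accepted if the constraint holds on at least a (1 - eps) fraction of the observed scenarios. The capacity/demand setup and function names here are illustrative assumptions, not the talk's formulations.

```python
import random

# Hypothetical sketch of the sample average approximation (SAA) of a
# chance constraint P[constraint(x, xi) <= 0] >= 1 - eps: we count
# violations over observed scenarios and allow at most an eps fraction.

def saa_feasible(x, scenarios, constraint, eps=0.05):
    """Empirically check the chance constraint over the given scenarios."""
    violations = sum(1 for xi in scenarios if constraint(x, xi) > 0)
    return violations <= eps * len(scenarios)

# Toy usage: require capacity x to cover a random demand xi with
# probability at least 0.9 (eps = 0.1).
random.seed(0)
demands = [random.uniform(0, 1) for _ in range(1000)]
constraint = lambda x, xi: xi - x  # violated when demand exceeds capacity x
print(saa_feasible(0.95, demands, constraint, eps=0.1))
print(saa_feasible(0.01, demands, constraint, eps=0.1))
```

In the talk's setting, enforcing such a constraint inside an optimization model yields an integer programming formulation (one binary variable per scenario), which is where the proposed solution methods and the connection to the Wasserstein DRO formulation come in.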
This talk is based on joint work with Nam Ho-Nguyen, Fatma Kılınç-Karzan, Simge Küçükyavuz, Milan Vojnovic, and Se-young Yun.
For zoom information email Meg Tully – [email protected]