# Automatic synthesis of mathematical programming models for business processes

A research project funded by the National Science Centre, Poland, grant no. 2016/23/D/ST6/03735, 07.2017-07.2020. Principal investigator: Tomasz Pawlak.

## Research project objectives

The project is aimed at reducing costs of building Mathematical Programming (MP) models for business processes by automating this task. The common practice is to prepare MP models manually using expert knowledge, however, this is a laborious task that requires deep insight into both, the business process and the modeling techniques. We propose algorithms that automate model building using examples of the process states acquired using monitoring process execution. We decompose the problem of MP model synthesis into subproblems of constraint synthesis and objective function synthesis and tackle them separately using methods from Machine Learning, Computational Intelligence, and Operational Research. To handle the constraint synthesis problem we propose algorithms based on one-class classifiers (e.g., Support Vector Data Description, POSC4.5), Genetic Programming, local search heuristics (e.g., Simulated Annealing, Tabu Search), and Linear Programming. To solve the objective function synthesis problem, we employ least-squares regression, including Gauss-Newton algorithm for non-linear regression, and symbolic regression using Genetic Programming. All the developed algorithms are analyzed theoretically and compared in computational experiments on their properties and the MP models that they synthesize. For objective assessment, we use benchmark problems of controlled complexity as well as problem instances of real-world processes.

## Research project methodology

We formulate three research tasks, related to constraint synthesis, objective function synthesis, and evaluation of the developed algorithms on modeling of real-world processes. In the first two tasks, we use synthetic benchmarks formulated as MP models and sampled to acquire the desired number of training examples. These examples are used by the developed algorithms to synthesize MP models. We assess both the algorithms and the MP models that they produce. The former ones are assessed on properties like computational complexity, scaling with problem size and mean square training error, and the latter ones on syntactic and semantic measures of fidelity to benchmark MP models, particularly on angles between constraints, test-set accuracy, Jaccard index of feasible regions, etc.

In the third task, we verify the applicability of the developed algorithms in real-world scenarios. We use data acquired by monitoring various processes to synthesize MP models for them. For this scope, we use public data sets from UCI Machine Learning Repository on e.g., production of energy at a power plant and production of cement mixture at a concrete plant. For a synthesized MP model, we verify its conformance with available domain knowledge of the modeled process. We apply the MP model to process simulation under different working conditions, i.e., with some variables set to specific values, and analyze how these settings influence the MP model’s feasible region and objective function. We also apply the MP model to process optimization and verify the feasibility of the optimal solution in reality and improvement w.r.t. the current process outcome.

The significance of quantitative differences between properties of algorithms and MP models is verified statistically, particularly using ANOVA and Friedman’s test.

## Expected impact of the research project on the development of science, civilization, and society

Nowadays, most MP models are created manually by domain experts offering specialized consulting services. Only a limited number of organizations afford to employ these services due to significant costs, however, process optimization using MP models would bring them substantial profits and savings that increase their advantage in a competitive business environment. We automate model building and expect that the developed algorithms decrease the amount of expert knowledge needed to build correct MP models for real-world processes, reducing thus human-related effort and cost of modeling tasks. The fidelity of the MP model to reality is crucial in optimization, where the optimal solutions usually lie at the boundary of the feasible region determined by the constraints, thus we expect that the synthesized MP models reflect reality to the extent that allows their adoption to optimize, simulate and explain process behavior. Achievement of that goal would lead to the availability of modeling and optimization services to a broader range of entities than the contemporary methods. The improved efficiency of processes should directly contribute to increasing of income and competitiveness of the organizations employing these processes and in the effect to increase of gross domestic product and wealth of society.

## Publications about project achievements

Tomasz P. Pawlak, Michael O'Neill, Grammatical evolution for constraint synthesis for mixed-integer linear programming, Swarm and Evolutionary Computation 64(100896):1-16, Elsevier, July 2021.

Tomasz P. Pawlak, Bartosz Litwiniuk, Ellipsoidal one-class constraint acquisition for quadratically constrained programming, European Journal of Operational Research 293(1):36-49, Elsevier, 16 August 2021.

Marcin Karmelita, Tomasz P. Pawlak, CMA-ES for One-Class Constraint Synthesis, Genetic and Evolutionary Computation Conference (GECCO ’20), July 8–12, 2020, Cancún, Mexico. ACM, New York, NY, USA.

Tomasz P. Pawlak, Synthesis of Mathematical Programming models with one-class evolutionary strategies, Swarm and Evolutionary Computation 44:335-348, Elsevier, 2019.

Tomasz P. Pawlak, Performance Improvements for Evolutionary Strategy-based One-Class Constraint Synthesis, GECCO '18, pp. 873-880, ACM, 2018.

Daniel Sroka, Tomasz P. Pawlak, One-Class Constraint Acquisition with Local Search, GECCO '18, pp. 363-370, ACM, 2018.

Patryk Kudła, Tomasz P. Pawlak, One-class synthesis of constraints for Mixed-Integer Linear Programming with C4.5 decision trees, Applied Soft Computing 68:1-12, Elsevier, 2018.