Overview of the AEOLUS 2021-2022 research highlights:
  1. Information-theoretic reduction of parameters and data in Bayesian inverse problems
  2. Inadequacy characterization in minimization-based models
  3. Bayesian model calibration for diblock copolymer self-assembly with azimuthally-averaged power spectrum of microscopy image data
  4. Performance bounds for PDE-constrained optimization under uncertainty
  5. Data-driven learning of nonlinear manifolds for model reduction
  6. Optimal experimental design for hyperpolarized MRI measurements
  7. Multifidelity uncertainty quantification methods for nonlocal problems
  8. Optimal design under uncertainty for the directed self-assembly of block copolymers
  9. Robust importance sampling for error estimation in the context of optimal Bayesian transfer learning
  10. Topology optimization via component-based reduced modeling
  11. Reduced-order model for polycrystalline grain growth in additive manufacturing
  12. Scalable algorithms for optimal experimental design for large-scale Bayesian inverse problems

Research

High-fidelity phase-field simulations are an indispensable tool for modeling microstructure evolution in a wide range of physical systems, including additive manufacturing of metal alloys. However, the phase-field PDEs are computationally expensive even at spatial resolutions at the grain scale. This cost prevents the ensemble simulations needed to estimate statistical averages over the initial grain distribution and makes downstream tasks such as optimal control and uncertainty quantification extremely challenging.

In this work, we present an efficient and accurate reduced-order model (ROM) for phase-field simulations using deep neural networks. First, we extract three spatio-temporal features from the phase-field simulations: the interface height, width, and excess area of each grain as a function of time. Then we use a sequence-to-sequence LSTM to learn the evolution of these features. The LSTM encodes the spatial interactions between grains with attention mechanisms using a phase-field-specific transformer architecture. The network is designed and trained to enable predictions for physical parameters, grain sizes and distributions, and numbers of grains not in the training set. The network predicts phase fields accurately, with a DICE error (similar to a relative L-2 norm mismatch) typically within 10%. More importantly, it accurately reconstructs the quantities of interest in ensemble simulations. The ROM is several orders of magnitude faster than solving the PDEs, and this speed-up accounts for the cost of the high-fidelity simulations used to generate the training set.
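
To make the architecture concrete, the following is a minimal PyTorch sketch of a sequence-to-sequence LSTM that rolls out per-grain feature trajectories (interface height, width, and excess area). It is our simplified illustration only: the feature dimension, hidden size, and prediction horizon are placeholder choices, and the actual AEOLUS ROM additionally encodes grain-grain interactions with a phase-field-specific transformer/attention module.

```python
# Minimal sketch of a sequence-to-sequence LSTM for per-grain feature evolution
# (illustrative only; hyperparameters are placeholders).
import torch
import torch.nn as nn

class GrainFeatureSeq2Seq(nn.Module):
    def __init__(self, n_features=3, hidden=64, horizon=20):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.decoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, n_features)

    def forward(self, past):                # past: (batch, T_obs, n_features)
        _, state = self.encoder(past)       # summarize the observed trajectory
        step = past[:, -1:, :]              # seed the rollout with the last observation
        preds = []
        for _ in range(self.horizon):       # autoregressive rollout of future features
            out, state = self.decoder(step, state)
            step = self.readout(out)
            preds.append(step)
        return torch.cat(preds, dim=1)      # (batch, horizon, n_features)

model = GrainFeatureSeq2Seq()
future = model(torch.randn(8, 50, 3))       # e.g., 8 grains observed for 50 time steps
```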

Comparison of network predictions and high-fidelity simulations.

Network architectures for spatial encoding and time prediction.

We study the epitaxial, columnar growth of (multiply oriented) dendrites/cells for a spot melt in a polycrystalline Al-Cu substrate using two-dimensional, phase-field, direct numerical simulations (DNS) at the full-melt-pool scale. Our main objective is to compare the computationally expensive DNS model to a much cheaper but approximate “line” model, in which a single-crystal phase-field simulation is confined to a narrow rectangular geometry. This line model amounts to a physics-based reduced-order model for microstructure evolution. To perform this comparison, we develop algorithms that automatically extract quantities of interest (QoIs) from both the DNS and line models. These QoIs allow us to quantitatively assess the assumptions in the line model and help us analyze its discrepancy with the DNS model. We consider four sets of heat source parameters, mimicking welding and additive manufacturing (AM) conditions, that create a combination of shallow and deep melt pools. Our largest DNS simulation used 16K × 14K grid points in space.

Our main findings can be summarized as follows. Under AM conditions, the QoIs of the line models are in excellent agreement with the full DNS results for both shallow and deep melt pools. Under welding conditions, the primary spacing of the DNS model is smaller than the prediction of the line model. We identify a geometric crowding effect that accounts for the discrepancies between the DNS and line models, and we propose two potential mechanisms that determine the response of the microstructure to geometric crowding.

Relevant AEOLUS publications
  1. Qin, Y., Bao, Y., DeWitt, S., Radhakrishnan, R. and Biros, G., 2022. Dendrite-resolved, full-melt-pool phase-field simulations to reveal non-steady-state effects and to test an approximate model. Computational Materials Science, 207, p.111262.

Comparison of dendrite morphologies for the different models.

Evolution of the concentration field in a deep melt pool under welding conditions.

The AEOLUS team is interested in solving the inference and optimal design problems that are crucial for enabling the nanolithography application of block copolymer (BCP) self-assembly. To facilitate this, we first develop fast and robust solvers for the continuum models of BCP self-assembly in order to accelerate the computational screening of the space of model and design parameters. These continuum models characterize the separation of monomer phases with Ginzburg-Landau-type free energies. Conventionally, the free energies are minimized through the H-1 gradient flow. However, this approach involves solving a stiff time-dependent partial differential equation until a steady-state solution is reached; it typically requires a large number of small time steps and exhibits slow linear convergence.
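
For reference, for a free energy E[u] the mass-conserving H-1 (Cahn-Hilliard-type) gradient flow evolves the order parameter u as

\[
\frac{\partial u}{\partial t} \;=\; \Delta\, \frac{\delta E}{\delta u}[u],
\]

so reaching an energy minimizer amounts to integrating this stiff PDE to a steady state. (Schematic form only; the specific Ohta-Kawasaki energy, domain, and boundary conditions are those described below and in [1].) This is exactly the cost that the direct minimization approach of the next paragraph avoids.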

In a published work [1], we proposed a fast and robust algorithm for the direct minimization of the Ohta-Kawasaki (OK) energy functional that greatly reduces the computational cost of using the OK model of BCP self-assembly. We developed a globally convergent modified Newton method, with inexact line search and adaptive Gauss-Newton convexification of the Hessian operator, for the minimization of the OK energy. The method is proven to generate iterates that are monotonically energy decreasing, mass conservative, and quadratically convergent. We showed numerically that the proposed scheme is typically three orders of magnitude faster at finding local minimizers than the conventional gradient flow approach. The proposed Newton method and its mathematical analysis set a solid foundation for the AEOLUS team's investigation of the inference and optimal design problems associated with BCP self-assembly.
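
The sketch below illustrates the general pattern of a modified Newton iteration with Hessian convexification and a backtracking (Armijo) line search for direct energy minimization. It is a generic illustration under our own simplifying assumptions (dense symmetric Hessian, eigenvalue shift for convexification) and does not reproduce the inexact Newton-Krylov solves, adaptive Gauss-Newton convexification, or mass-conservation handling developed in [1].

```python
# Generic modified Newton iteration with convexification and Armijo line search
# (illustrative sketch only, not the algorithm of [1]).
import numpy as np

def modified_newton(energy, grad, hess, u0, tol=1e-8, max_iter=50):
    u = u0.copy()
    for _ in range(max_iter):
        g = grad(u)
        if np.linalg.norm(g) < tol:
            break
        H = hess(u)                                   # symmetric Hessian matrix
        lam_min = np.linalg.eigvalsh(H).min()
        if lam_min <= 0.0:                            # convexify: shift spectrum to be SPD
            H = H + (1e-8 - lam_min) * np.eye(H.shape[0])
        step = np.linalg.solve(H, -g)
        t, E0, slope = 1.0, energy(u), g @ step       # slope < 0 for a descent direction
        while energy(u + t * step) > E0 + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5                                  # backtrack until the energy decreases
        u = u + t * step
    return u
```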

Relevant AEOLUS publications
  1. Cao, L., Ghattas, O. and Oden, J.T., 2022. A Globally Convergent Modified Newton Method for the Direct Minimization of the Ohta--Kawasaki Energy with Application to the Directed Self-Assembly of Diblock Copolymers. SIAM Journal on Scientific Computing, 44(1), pp.B51-B79.

The equilibrium patterns of diblock copolymer thin film self-assembly generated by the conventional H-1 gradient flow approach and the proposed Newton approach at two sets of model parameters, along with the number of time steps or Newton iterations taken. The proposed Newton scheme is seen to have the same asymptotic complexity as the gradient flow approach per time step or iteration.

Top: The chemical pattern placed on the substrate of a diblock copolymer film. Bottom: The resulting equilibrium structures of the diblock copolymer film at different parameters of the polymer--substrate interaction model.

Phase-field models are the state-of-the-art approach for high-fidelity modeling of microstructure formation during solidification, which is critical for understanding the properties of additively manufactured components. Standard phase-field models exhibit a “diffuse interface”, where an order parameter smoothly varies from one phase (e.g. the liquid) to another (e.g. the solid). The diffuse interface approach has many benefits including that the interface doesn’t need to be explicitly tracked. However, it has a major drawback – the finite thickness of the interface can introduce artifacts in the evolution of coupled fields (e.g. temperature, composition). A heavy focus of phase-field model development over the last 20 years has been to correct for these artifacts. These corrections have been effective for many cases, but the rapid solidification regime relevant to additive manufacturing remains a challenge.

As an alternative to the correction schemes used historically, in AEOLUS we are taking a very different approach: developing nonlocal phase-field models of solidification that permit computationally sharp interfaces. This approach sidesteps the artifacts induced by the finite thickness of a diffuse interface while still retaining the benefit of not explicitly tracking the motion of the interface. This research direction builds on previous work in AEOLUS demonstrating that a nonlocal Cahn-Hilliard model (a prototypical phase-field model) can admit sharp interfaces [1]. Work on the nonlocal solidification model began with a model for pure-material solidification, for which we analyzed the problem and the properties of the solution, including the conditions under which sharp interfaces can be obtained [2]. Furthermore, we have developed numerical methods and observed, through mathematical analysis and numerical illustrations, that the nonlocal model can accurately represent the phase-field evolution with a much sharper interface than a local equivalent on the same grid. A detailed investigation, a comparative study with existing models, and extensions to alloy solidification models are ongoing.
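
For orientation, a prototypical nonlocal diffusion operator with interaction radius (horizon) delta replaces the Laplacian by an integral over nearby points:

\[
\mathcal{L}_{\delta} u(x) \;=\; \int_{B_{\delta}(x)} \big( u(y) - u(x) \big)\, \gamma(x, y)\, \mathrm{d}y,
\]

where \gamma \ge 0 is an interaction kernel supported on |x - y| \le \delta. This is a schematic form only; the specific kernels, potentials, and conditions under which the equilibrium profiles become sharp rather than diffuse are those analyzed in [1,2].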

Relevant AEOLUS publications
  1. Burkovska, O. and Gunzburger, M., 2021. On a nonlocal Cahn–Hilliard model permitting sharp interfaces. Mathematical Models and Methods in Applied Sciences, 31(09), pp.1749-1786.
  2. Burkovska, O., 2022. Nonlocal phase-field models for solidification: analysis, sharp interfaces and discretization. (in preparation)

Comparison of the local and nonlocal solutions in one and two dimensions. The local and nonlocal solutions are initialized with the same initial condition and plotted at the same time step.

Seaweed structure that forms during a simulation using the nonlocal solidification model for pure materials.

Nonlocal models feature a finite length scale, referred to as the horizon, such that points separated by a distance smaller than the horizon interact with each other. Due to the reduced sparsity resulting from these distant interactions, nonlocal models are generally more expensive computationally than their local PDE counterparts. This drawback becomes even more significant for outer-loop applications where numerous model evaluations are required. Multifidelity methods aim to reduce the computational cost of outer-loop applications by splitting the budget between high-fidelity model evaluations (used to retain accuracy and unbiasedness) and a set of low-fidelity model evaluations (used for speedup).

We developed a multifidelity method for uncertainty quantification of nonlocal problems [1] and tested it on a nonlocal diffusion problem. We measured the efficacy of the proposed multifidelity method by comparing the case of a single high-fidelity model with multifidelity cases that use surrogate models with smaller horizons, coarser grids, or both. The number of model evaluations in the multifidelity Monte Carlo estimator is determined from an optimization problem such that, for a given computational budget, the variance (and hence the mean-squared error) of the estimator is minimized. The optimization problem reveals conditions that prevent the use of surrogate models that are both inaccurate and costly to evaluate. The multifidelity method achieves its speedup in Monte Carlo estimation of the quantity of interest by allocating most model evaluations to the cheap surrogate models while keeping the number of high-fidelity model evaluations relatively low. The multifidelity method achieves more than two orders of magnitude speedup compared to using the high-fidelity model alone. As a next step, we are testing multifidelity methods for the nonlocal Cahn-Hilliard model to capture sharp interfaces at a lower computational cost (refer to Research Highlight 1 for more information).
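
For reference, a multifidelity Monte Carlo estimator of this type combines the high-fidelity model with surrogate models of decreasing fidelity; the following is our sketch of the standard construction, with the sample allocation m_1 <= ... <= m_K and weights alpha_k chosen by the budget-constrained variance minimization described above:

\[
\hat{s}^{\mathrm{MF}} \;=\; \bar{y}^{(1)}_{m_1} \;+\; \sum_{k=2}^{K} \alpha_k \left( \bar{y}^{(k)}_{m_k} - \bar{y}^{(k)}_{m_{k-1}} \right),
\]

where \bar{y}^{(k)}_{m} denotes the Monte Carlo average of the k-th model over the first m shared input samples, and k = 1 is the high-fidelity model. The estimator remains unbiased for the high-fidelity mean because each surrogate enters only through a difference of its own averages.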

Relevant AEOLUS publications
  1. Khodabakhshi, P., Willcox, K. and Gunzburger, M., 2021. A multifidelity method for a nonlocal diffusion model, Applied Mathematics Letters, 121:107361.
  2. Khodabakhshi, P., Burkovska, O., Willcox, K. and Gunzburger, M. Multifidelity uncertainty quantification methods for nonlocal Cahn-Hilliard models. (in preparation)

The figure shows the share of samples for the different multifidelity cases. The bottom box in each column represents the share of the high-fidelity model. For the multifidelity cases, most model evaluations are allocated to the cheap surrogate models.

The figure shows the decay of the estimated mean-squared error with computational budget. The multifidelity method reaches a desired mean-squared error at about two orders of magnitude lower cost than using only the high-fidelity model; equivalently, for the same computational budget, it achieves a two-orders-of-magnitude reduction in the estimated mean-squared error.

The derivation of low-dimensional models for high-dimensional dynamical systems from data is a ubiquitous task in many scientific and engineering settings. The vast majority of model reduction methods are intrusive in nature: they require access to the source code that implements the high-dimensional operators of the original equations (or access to their actions on a vector). Given the complexity of the AEOLUS control and optimization target applications, there is a growing recognition of the need for non-intrusive reduction methods. To this end, we employ the operator inference framework for reducing these complex PDE systems. Operator inference approaches are fully data-driven: they exploit knowledge of the underlying high-fidelity problem and of the full-order model structure, even without access to the full-order operators that produced the simulation data.
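
To make the idea concrete, the following minimal NumPy sketch performs the basic (unregularized) operator inference regression for a system with linear and quadratic structure: it builds a POD basis from snapshots and fits the reduced operators by least squares. This is a schematic of the general framework, not the specific formulations of [1,2], and all variable names are ours.

```python
# Basic operator inference: infer reduced operators A_hat, H_hat from snapshot data so
# that d/dt qhat ~ A_hat qhat + H_hat (qhat kron qhat). Illustrative sketch only.
import numpy as np

def operator_inference(Q, Qdot, r):
    # Q: state snapshots (n x K); Qdot: their time derivatives (n x K); r: reduced dimension
    V, _, _ = np.linalg.svd(Q, full_matrices=False)
    V = V[:, :r]                                      # POD basis
    Qhat, Qhat_dot = V.T @ Q, V.T @ Qdot              # reduced states and derivatives (r x K)
    Q2 = np.einsum('ik,jk->ijk', Qhat, Qhat).reshape(r * r, -1)   # quadratic (Kronecker) terms
    D = np.vstack([Qhat, Q2]).T                       # least-squares data matrix, K x (r + r^2)
    O, *_ = np.linalg.lstsq(D, Qhat_dot.T, rcond=None)
    A_hat = O[:r, :].T                                # r x r     linear reduced operator
    H_hat = O[r:, :].T                                # r x r^2   quadratic reduced operator
    return V, A_hat, H_hat
```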

The task of deriving efficient reduced models for materials science applications poses many challenges. For instance, the governing equations include Arrhenius reaction terms (e.g., in reacting flow models) and thermodynamic terms (e.g., the Helmholtz free energy terms in a phase-field solidification model); when lifted to polynomial form, these terms yield differential-algebraic structure that is not directly amenable to standard low-dimensional approximation. To address this, AEOLUS researchers developed a model reduction method tailored for differential-algebraic equations derived from lifting transformations [1]. Other work has focused on developing localized operator inference approaches that use local reduced bases whose dimension is smaller than that of a global linear trial manifold [2]. This approach is particularly tailored to problems that exhibit rich dynamics across multiple spatial and temporal scales, such as problems with evolving interfaces and microstructures.

Relevant AEOLUS publications
  1. Khodabakhshi, P. and Willcox, K.E., 2022. Non-intrusive data-driven model reduction for differential–algebraic equations derived from lifting transformations. Computer Methods in Applied Mechanics and Engineering, 389, p.114296.
  2. Geelen, R. and Willcox, K., 2022. Localized non-intrusive reduced-order modeling in the operator inference framework. Philosophical Transactions of the Royal Society A, accepted for publication.

A comparison of the ROM and FOM solutions in solidification problems. The dashed line represents the location of the interface. The ROM solutions are shown for the retained POD energy of 99.9%.

Workflow of the proposed localized operator inference approach.

Bayesian inversion, optimal control and design under uncertainty, and optimal experimental design all require running expensive computational simulations numerous times within an outer loop. In scientific problems involving partial differential equations (PDEs), each simulation may require minutes, hours, or even days to run on a powerful computer, making these outer loop problems prohibitive, particularly when the parameter spaces are high (or infinite) dimensional. To address the twin curses of dimensionality and complexity, we are developing, analyzing, and applying general-purpose algorithms that (1) reduce the required number of outer loop iterations, and (2) find and exploit hidden low-dimensional structure in the problem at hand. Typically this is achieved by using gradient, Hessian, and even higher order derivative information along with randomized algorithms. Below we highlight one instantiation of this theme, namely Stein variational methods for solving Bayesian inverse problems.

Stein variational methods are one of the most promising methods for highly nonlinear Bayesian inference problems and offer a transport theory-based alternative to MCMC sampling methods. However, these methods typically suffer from the curse of dimensionality. This is a problem because the parameter of interest in scientific applications is typically high- or infinite-dimensional (e.g., initial or boundary conditions, forcing function, or heterogeneous coefficient). To address this problem, we proposed the projected Stein variational gradient descent and projected Stein variational Newton methods, which project parameter samples onto a lower-dimensional subspace in which the parameters are informed by the data. This subspace is defined by the dominant eigenvectors of the expected Hessian, efficiently computed using randomized algorithms. Stein variational methods are then used to transport particles in this lower-dimensional subspace. We observe that these projected methods converge much more rapidly than their full-space counterparts. The projected Stein variational Newton method, in particular, exhibits a cost that is independent of the parameter and sample dimensions. Applications include inference of high-dimensional, highly heterogeneous COVID-19 spread models.
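
As an illustration of the building blocks, the sketch below performs one Stein variational gradient descent update on particle coefficients restricted to a given low-dimensional basis. It is our simplified illustration: the basis Psi, the kernel bandwidth, and the neglect of the complementary (uninformed) directions are placeholder simplifications, whereas the projected SVGD/Newton methods of [1,2] construct the subspace adaptively from Hessian information and treat the complement rigorously.

```python
# One projected-SVGD-style update (simplified illustration, not the authors' algorithm).
import numpy as np

def rbf_kernel(W, h):
    diff = W[:, None, :] - W[None, :, :]              # pairwise differences, (n, n, r)
    K = np.exp(-np.sum(diff**2, axis=-1) / (2.0 * h**2))
    gradK = -diff / h**2 * K[:, :, None]              # grad of k(w_j, w_i) w.r.t. w_j
    return K, gradK

def svgd_step_in_subspace(W, Psi, grad_log_post, step=1e-2, h=1.0):
    # W: particle coefficients (n x r); Psi: subspace basis (d x r);
    # grad_log_post(x): gradient of the log-posterior at a full-space point x (length d)
    G = np.array([Psi.T @ grad_log_post(Psi @ w) for w in W])   # projected gradients (n x r)
    K, gradK = rbf_kernel(W, h)
    phi = (K @ G + gradK.sum(axis=0)) / W.shape[0]    # Stein variational direction
    return W + step * phi
```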

Relevant AEOLUS publications
  1. Chen, P., Wu, K., Chen, J., O'Leary-Roseberry, T. and Ghattas, O., 2019. Projected Stein variational Newton: A fast and scalable Bayesian inference method in high dimensions. Advances in Neural Information Processing Systems, 32.
  2. Chen, P. and Ghattas, O., 2020. Projected Stein variational gradient descent. Advances in Neural Information Processing Systems, 33, pp.1947-1958.
  3. Chen, P. and Ghattas, O., 2021. Stein variational reduced basis Bayesian inversion. SIAM Journal on Scientific Computing, 43(2), pp.A1163-A1193.
  4. Chen, P., Wu, K. and Ghattas, O., 2021. Bayesian inference of heterogeneous epidemic models: Application to COVID-19 spread accounting for long-term care facilities. Computer Methods in Applied Mechanics and Engineering, 385, p.114020.
High-dimensional parameters are projected onto data-informed low-dimensional subspaces that are adaptively constructed using the gradient or Hessian of the parameter-to-observable map at the current samples. SVGD is then performed in each of the low-dimensional subspaces.

pSVGD (bottom) is more accurate than SVGD (top) for both the parameter samples and their corresponding solutions of a high-dimensional conditional-diffusion model. The reported results are the true sample, sample mean, 90% credible interval, and noisy data.

One way to overcome the often prohibitive computational cost of outer-loop problems, such as inverse, optimization, and optimal control problems, is to employ more efficient but lower-accuracy models in place of high-fidelity ones. Evidently, this leads to additional errors, and, to maximize the utility of such low-fidelity modeling, it is necessary to have reliable means of assessing these errors. Establishing tight and accurate bounds on the discrepancies between high- and low-fidelity models is typically a non-trivial and highly intrusive task. Our work in AEOLUS in this direction aims to develop probabilistic methodologies for characterizing such errors for the class of models that can be represented mathematically as minimization problems. This is motivated by the AEOLUS center's target applications in block copolymer self-assembly, in which energy-minimization models such as self-consistent field theory (high fidelity) and the Ohta-Kawasaki phase-field model (low fidelity) are extensively used.

Treating the discrepancies between low-fidelity models and their high-fidelity counterparts as a source of uncertainty, we have developed a methodology to characterize such errors. The approach is based on representing the mismatch between high-fidelity and low-fidelity models in terms of the energy error. This error, as well as the corresponding errors in any given quantities of interest (QoIs), is evaluated through a formal, non-intrusive a posteriori error analysis. The key ingredient is a carefully constructed convergent expansion for the linear part of the error with readily computable terms. The methodology thus simultaneously improves the low-fidelity predictions of QoIs and estimates the uncertainty of those predictions, without requiring an in-depth physical analysis of the modeling approximations or assumptions relating the high- and low-fidelity models.

Relevant AEOLUS publications
  1. Bochkov, D., Oliver, T. and Moser, R., 2022. Stochastic a posteriori inadequacy characterization in minimization-based models. (in preparation)

Application of the developed methodology to a mass-spring toy model: comparison of predictions by the high-fidelity model, low-fidelity model, and proposed technique for the chain's shape (left), total energy (center), and total length (right).

Statistical inference is a ubiquitous task in engineering and science applications. High-dimensional inference problems pose significant challenges to current methodologies, all of which suffer from the curse of dimensionality in some form. Transportation of probability measures is an increasingly common approach to inference, wherein one constructs a deterministic transformation, called a transport map, to couple two probability distributions. For example, in the Bayesian setting, one can approximate the (intractable) posterior distribution as a transformation of a tractable distribution such as a standard Gaussian. With an accurate map, one can perform tasks like sampling or computing summary statistics directly. In practice, one parameterizes the map within an approximating class of functions (e.g., polynomials or neural network-based normalizing flows) and then optimizes the map's parameters so that the approximation matches the target distribution. While transport methods have shown success across many applications, high-dimensional problems often require transport maps with prohibitively many parameters, making the representation and optimization of the maps too costly. One way to bypass this curse of dimensionality is to exploit intrinsic low dimensionality or sparsity in the problem.

Building on the theory and computational diagnostics proposed in [1], our "lazy map" framework [2] identifies the directions most essential to approximating the target probability distribution. These directions are determined by solving an eigenvalue problem that minimizes an upper bound on the Kullback-Leibler (KL) divergence between the target distribution and its approximation. We then parametrize the map to capture only the most critical directions, significantly reducing the dimension of the problem. We extend this one-step lazy framework with a greedy algorithm for building deep compositions of lazy maps ("deeply lazy" maps) that can iteratively approximate general high-dimensional target distributions. This sequential framework enables efficient layer-wise training of high-dimensional maps and controls the curse of dimensionality for certain transport classes. Empirically, these methods improve the accuracy of inference and manage the complexity of transport maps, improving the tractability of transport methods. Overall, the greedy lazy framework focuses the expressiveness of the map where it matters and creates target-informed architectures for any class of transport maps or normalizing flows.
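
In schematic form (our paraphrase; the diagnostic matrix, the certified KL bound, and the precise construction follow [1,2]), a rank-r lazy map acts nontrivially only along the r dominant directions:

\[
T(z) \;=\; U_r\, \tau\!\left(U_r^{\top} z\right) \;+\; U_{\perp} U_{\perp}^{\top} z,
\]

where the columns of U_r are the dominant eigenvectors identified by the diagnostic, \tau : \mathbb{R}^r \to \mathbb{R}^r is the only component of the map that is learned, and the complementary directions spanned by U_\perp are left unchanged. A deeply lazy map is a composition T_1 \circ T_2 \circ \cdots \circ T_\ell of such maps, with the eigenvalue problem re-solved after each layer.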

Relevant AEOLUS publications
  1. Zahm, O., Cui, T., Law, K., Spantini, A. and Marzouk, Y., 2022. Certified dimension reduction in nonlinear Bayesian inverse problems. Mathematics of Computation. (in press)
  2. Brennan, M., Bigoni, D., Zahm, O., Spantini, A. and Marzouk, Y., 2020. Greedy inference with structure-exploiting lazy maps. Advances in Neural Information Processing Systems, 33, pp.8330-8342.

The leading eigenvalues of the diagnostic matrix that reveals low-dimensional structure for a Bayesian logistic regression problem throughout the training of a deeply lazy map. The spectrum of the diagnostic flattens and falls as we add additional lazy layers to the transport map, and as the approximation to the posterior distribution improves.

The progressive Gaussianization of a two-dimensional distribution to visualize the deeply lazy training process. Each map only acts on the single direction determined to be most essential to approximating the posterior after previous steps.

A wide variety of problems in engineering, scientific discovery, and biomedicine can be formulated as building a mathematical model (including surrogate models learned by machine learning) to describe the system of interest and then finding an optimal operator that minimizes a cost function tied to an operational objective, for example discovering novel materials or effective vaccines as countermeasures for pandemics. However, the underlying real-world systems are often complex and cannot be perfectly modeled or accurately identified. The resulting model uncertainty affects the estimated operational objective and, in turn, the operational efficacy, requiring an operator that is robust to the uncertainty. To reduce the model uncertainty and thereby facilitate efficient attainment of the final operational objective, observations or experiments related to the complex system are required. The experiments may directly probe the underlying system states or collect measurements from the system; from the experimental results, one gathers information about the system and thereby reduces the model uncertainty. However, these experiments can be both resource- and time-consuming in many areas, such as materials discovery, cell signaling pathway identification, and customer preference understanding. Efficient experimental design methods are therefore needed.

We have been studying the performance of uncertainty quantification and optimal experimental design strategies applied to machine learning. The potential performance bottleneck of sequential experimental design, in particular its inherent myopic behavior, has been identified and analyzed rigorously for the first time in the context of Bayesian experimental design. More importantly, we have proposed several novel Bayesian experimental design strategies that have been shown to overcome this myopia, thanks to our objective-driven uncertainty quantification, a thorough theoretical analysis of the mathematical properties of the expected uncertainty reduction tied to the operational objectives that guide experimental design, and efficient computational algorithms. Our comprehensive evaluation experiments demonstrate the desired long-term optimality, scalability, and sample and computational efficiency. The proposed experimental design methods can have significant impact in many engineering, scientific discovery, and biomedicine problems involving complex systems, for which data are difficult to acquire and for which machine learning efforts that require big data may not work.
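
For context, the acquisition functions behind these strategies build on the mean objective cost of uncertainty (MOCU); in our paraphrase of the standard definition (see the publications below for the exact forms used),

\[
\mathrm{MOCU}(\Theta) \;=\; \mathbb{E}_{\theta}\!\left[\, C_{\theta}\big(\psi^{\mathrm{IBR}}_{\Theta}\big) - C_{\theta}\big(\psi^{\mathrm{opt}}_{\theta}\big) \right],
\qquad
\psi^{\mathrm{IBR}}_{\Theta} \;=\; \arg\min_{\psi \in \Psi}\; \mathbb{E}_{\theta}\!\left[ C_{\theta}(\psi) \right],
\]

where \theta ranges over the uncertainty class of models, C_\theta(\psi) is the operational cost of applying operator \psi under model \theta, and \psi^{\mathrm{opt}}_{\theta} minimizes C_\theta when \theta is known. An experiment is then selected to maximize the expected reduction in MOCU; soft MOCU (SMOCU) replaces the hard minimum defining the robust operator with a smoothed surrogate to mitigate the myopia caused by the piecewise-linear MOCU acquisition function.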

Relevant AEOLUS publications
  1. Zhao, G., Dougherty, E., Yoon, B.J., Alexander, F. and Qian, X., 2021. Efficient active learning for Gaussian process classification by error reduction. Advances in Neural Information Processing Systems, 34.
  2. Zhao, G., Dougherty, E., Yoon, B.J., Alexander, F. and Qian, X., 2021, January. Uncertainty-aware active learning for optimal Bayesian classifier. In International Conference on Learning Representations (ICLR 2021).
  3. Zhao, G., Dougherty, E., Yoon, B.J., Alexander, F.J. and Qian, X., 2021, March. Bayesian active learning by soft mean objective cost of uncertainty. In International Conference on Artificial Intelligence and Statistics (pp. 3970-3978). PMLR.
  4. Zhao, G., Qian, X., Yoon, B.J., Alexander, F.J. and Dougherty, E.R., 2020. Model-based robust filtering and experimental design for stochastic differential equation systems. IEEE Transactions on Signal Processing, 68, pp.3849-3859.
Soft-MOCU (SMOCU) is proposed to overcome the myopic performance limitations of MOCU-based Bayesian active learning that stem from the piecewise linearity of the original MOCU acquisition function. Bayesian active learning with SMOCU has shown superior sample complexity compared to state-of-the-art active learning strategies, which can help efficiently identify phase transitions when studying materials properties.

In many scientific or clinical settings, training data are typically limited, which impedes the design and evaluation of accurate classifiers. While transfer learning (TL) can improve the learning in the target domain by incorporating data from relevant source domains, it has received little attention for error estimation.

In recent work, we investigated the transferability of knowledge in the context of error estimation within a Bayesian paradigm. We introduced a novel class of Bayesian minimum mean-square error (MMSE) estimators for optimal Bayesian transfer learning (OBTL), which enables rigorous evaluation of classification error under uncertainty in a small-sample setting. In our method, the relatedness between the target and source domains is represented mathematically through a joint prior on the model parameters, based on which useful knowledge and data can be transferred across domains. A key property of the proposed TL-based Bayesian error estimator (BEE) is its inherent ability to handle the uncertainty about the model parameters in a Bayesian paradigm by integrating the prior with the data, producing robust estimates that account for all possible parameter values. Except in very simple cases, the error estimates based on the TL-based posterior probabilities cannot be computed analytically, so we proposed an efficient and robust importance sampling strategy that makes TL-based Bayesian error estimates practical to obtain.
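
For illustration, the sketch below shows generic self-normalized importance sampling of an expectation under a target density using a wider proposal. This is a toy example of the underlying idea only; the estimator, proposal construction, and weighting scheme of [1] are specific to the OBTL posterior and the classification error.

```python
# Generic self-normalized importance sampling (toy illustration only).
import numpy as np

def snis_expectation(f, log_target, log_proposal, draws):
    log_w = log_target(draws) - log_proposal(draws)   # unnormalized log-weights
    log_w -= log_w.max()                              # stabilize before exponentiating
    w = np.exp(log_w)
    w /= w.sum()                                      # self-normalized weights
    return np.sum(w * f(draws))                       # estimate of E_target[f(X)]

# Toy usage: estimate E[X^2] under N(0, 1) using a wider N(0, 3^2) proposal.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 3.0, size=100_000)
est = snis_expectation(
    f=lambda v: v**2,
    log_target=lambda v: -0.5 * v**2,             # log N(0, 1), up to an additive constant
    log_proposal=lambda v: -0.5 * (v / 3.0)**2,   # log N(0, 9), up to an additive constant
    draws=x,
)                                                 # est is close to 1
```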

Through extensive experiments based on both synthetic data and real-world RNA sequencing (RNA-seq) data, we investigated the performance of the proposed estimator for a broad family of classifiers that span diverse learning capabilities. The results show that our TL-based error estimation scheme clearly outperforms standard error estimators, especially in small-sample settings, by tapping into data from other relevant domains. Technical details of our method, experimental results, and further discussion can be found in [1]. The ideas proposed in [1] can be further extended to enable optimal experimental design (OED) [2,3] across multiple domains through optimal Bayesian transfer learning. Furthermore, we can build on the multi-objective uncertainty quantification scheme proposed in [4] to enable OED for multiple objectives as well as across multiple domains.

Relevant AEOLUS publications
  1. Maddouri, O., Qian, X., Alexander, F.J., Dougherty, E.R. and Yoon, B.J., 2022. Robust importance sampling for error estimation in the context of optimal Bayesian transfer learning. Patterns, p.100428.
  2. Hong, Y., Kwon, B. and Yoon, B.J., 2021. Optimal experimental design for uncertain systems based on coupled differential equations. IEEE Access, 9, pp.53804-53810.
  3. Woo, H.M., Hong, Y., Kwon, B. and Yoon, B.J., 2021. Accelerating optimal experimental design for robust synchronization of uncertain Kuramoto oscillator model using machine learning. IEEE Transactions on Signal Processing, 69, pp.6473-6487.
  4. Yoon, B.J., Qian, X. and Dougherty, E.R., 2021. Quantifying the multi-objective cost of uncertainty. IEEE Access, 9, pp.80351-80359.

Graphical abstract of robust importance sampling for error estimation in the context of optimal Bayesian transfer learning.