Validation Metrics For Turbulent Plasma Transport

Cite as: Phys. Plasmas 23, 060901 (2016); https://doi.org/10.1063/1.4954151
Submitted: 07 March 2016 . Accepted: 05 June 2016 . Published Online: 22 June 2016
C. Holland
© 2016 Author(s).

C. Holland a)
Center for Energy Research, University of California, San Diego, La Jolla, California 92093-0417, USA
(Received 7 March 2016; accepted 5 June 2016; published online 22 June 2016)
Developing accurate models of plasma dynamics is essential for confident predictive modeling of
current and future fusion devices. In modern computer science and engineering, formal verification
and validation processes are used to assess model accuracy and establish confidence in the
predictive capabilities of a given model. This paper provides an overview of the key guiding
principles and best practices for the development of validation metrics, illustrated using examples
from investigations of turbulent transport in magnetically confined plasmas. Particular emphasis is
given to the importance of uncertainty quantification and its inclusion within the metrics, and the
need for utilizing synthetic diagnostics to enable quantitatively meaningful comparisons between
simulation and experiment. As a starting point, the structure of commonly used global transport
model metrics and their limitations is reviewed. An alternate approach is then presented, which
focuses upon comparisons of predicted local fluxes, fluctuations, and equilibrium gradients against
observation. The utility of metrics based upon these comparisons is demonstrated by applying them
to gyrokinetic predictions of turbulent transport in a variety of discharges performed on the DIII-D
tokamak [J. L. Luxon, Nucl. Fusion 42, 614 (2002)], as part of a multi-year transport model validation activity.
© 2016 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
[http://dx.doi.org/10.1063/1.4954151]
a) Electronic mail: [email protected]

I. INTRODUCTION

All models are wrong, but some are useful.
- George E. P. Box

The concept of model validation, defined as1

    The process of determining the degree to which a model is an accurate representation of the real world from the perspective of the intended uses of the model

has become an increasingly important part of research and engineering efforts across many disciplines, as numerical simulation results play an ever larger role in diverse areas, such as the design of buildings, aircraft, and automobiles, weather forecasting and climate modeling, and epidemiology. One such discipline is fusion energy research, where development of a validated predictive modeling capability is now commonly identified as a top priority for the US fusion energy program.2,3 In the nearest term, such a capability is desired to maximize the scientific returns from future burning plasma experiments such as ITER,4 while minimizing the risks of operating in regimes or configurations likely to damage the extremely expensive device infrastructure. On a longer timescale, the hope is that a sufficiently accurate predictive modeling capability can be applied to confidently design future experimental devices and reactors, reducing the cost and time to deploy fusion energy as a viable commercial energy source. More broadly, the conceptual models and computational codes that constitute a predictive modeling capability represent the most tangible realization of our understanding of the fundamental nonlinear plasma dynamics that govern fusion plasmas. As such, validation studies which assess the accuracy and fidelity of these models in predicting (or postdicting in many cases) experimental observations represent one of the clearest and most direct means of quantifying our progress in developing a true understanding of fusion-relevant systems. More importantly, through these assessments, validation allows us to rigorously identify those parameter regimes where current models do not provide sufficiently accurate descriptions of reality, and where improvements to the models are needed.

Given the importance of rigorous validation studies in many fields, it is no surprise that there is now a broad and well-established literature in the field, ranging from guidelines and best practices prescribed by professional societies1,5 to journal reviews6–11 and textbooks,12,13 in addition to the numerous articles and reports detailing the outcomes of individual studies. In the area of plasma physics and fusion energy research, key review articles include those by Terry et al.14 and Greenwald.15 These articles lay out many of the key fundamental ideas and concepts for validation research in the context of magnetic confinement based fusion energy (MFE) research. A common feature of both these reviews and the broader literature is the identification of validation metrics as key components of any serious, robust validation study. While both the Terry et al. and Greenwald reviews discuss these metrics in some detail, including potential mathematical formulations, both also emphasize the need for further work. This paper aims to build upon those discussions, by focusing on the practical issues often encountered in formulating validation metrics for plasma physics studies. In order to provide a “hands-on” illustration
of these issues, this paper will use the formulation of validation metrics relevant for turbulent transport in MFE plasmas as a type of worked example. However, the underlying concepts and challenges are broadly relevant to a wide variety of different phenomena. The most important of these concepts are the need to incorporate both experimental and model uncertainties into the validation metrics, as a fundamental part of any assessment of model performance, and the need to use synthetic diagnostics for meaningful model–experiment comparisons.

The remainder of this paper is structured as follows. In Sec. II, an overview of key validation concepts is presented, including both formal and “plain English” definitions. In Sec. III, a brief review of turbulence and transport modeling in MFE plasmas is presented, including a review of the widely used global transport validation metrics defined in the ITER physics basis16 in Sec. III A. Sec. IV then presents an alternate approach based on local transport metrics, which builds upon the pioneering work by Ross and co-workers.17,18 Extensions of this approach to local turbulence characteristic validation metrics, including the role of the synthetic diagnostics, are detailed in Sec. V. In Sec. VI, construction of composite metrics is discussed, along with the related concept of the primacy hierarchy, while conclusions and future research directions are presented in Sec. VII.

II. OVERVIEW OF KEY VALIDATION CONCEPTS

A useful entry point to the study of model validation is to begin by reviewing and defining the terminology commonly used in the literature. In this paper, the definitions given by Terry et al.14 will be used. As noted above, alternative definitions and terminology discussions can be found in professional association guidelines,1,5 journal reviews,6–11 and textbooks.12,13 A summary of simple “plain English” descriptions (not definitions) of the key concepts is provided in Table I.

TABLE I. “Plain English” descriptions of common V&V terms.

Term                 Description
Model                The set of equations being tested against experiment
Code                 The numerical implementation of the model
Verification         “Does the code solve the model equations correctly?”
Validation           “Does the model have the right equations?”
Validation metric    A measure used to quantify the experimental fidelity of the model

As a starting point, one must specify what the term model means in model verification and validation. The AIAA guidelines define a model as “A representation of a physical system or process intended to enhance our ability to understand, predict, or control its behavior.” Because this definition is so broad, a more practical starting point is to follow Terry et al. and distinguish between a conceptual model and a computational model. A conceptual model is defined as “The observations, mathematical modeling data, and mathematical (partial differential) equations that describe the physical system, including initial and boundary conditions,” while a computational model is “The program or code that implements the conceptual model.” The key here is the separation of the system of equations under consideration, along with specifications of domain geometry and initial and boundary conditions, from their specific computational implementation. Thus, what is often referred to as a model in the plasma physics literature corresponds to the conceptual model, regardless of whether the equations under consideration are obtained directly from fundamental physics relationships such as the Boltzmann equation or Maxwell’s equations (sometimes termed “first-principles modeling” or even just “theory”), or incorporate some number of free parameters whose values are determined via analytic (e.g., moment closures used in gyrofluid theory19) or empirical (e.g., confinement scaling laws derived from database regressions16,20) methods. In this paper, this shorthand of using the term model to refer to the conceptual model will be used frequently, while the term code will be used to refer to a particular computational model. Also implicit throughout this discussion is that the conceptual model(s) being tested are sufficiently complex as to require numerical computation to obtain solutions sufficiently accurate for meaningful quantitative comparison against experiment. In some cases the numerical implementation may be fairly trivial, but nonetheless some level of computation is always needed in validation studies.

As almost every set of physics-based model equations is based upon some set of simplifying assumptions and mathematical orderings, it is important to understand the limits these assumptions place on when and where the model is expected to be valid, or at least useful. Identifying the domain in parameter or operational space where a model is expected to perform acceptably is termed model qualification, formally defined as the “theoretical specification of the expected domain of a conceptual model and/or approximations made in its derivation.”14 Experimentally assessing whether a given model performs acceptably (as quantified by validation metrics) over this theoretically specified domain is one of the most important functions careful model validation studies provide.

Given the distinction between conceptual and computational models, the ideas of verification and validation naturally arise. Verification is formally defined as “The process of determining that a model implementation accurately represents the developer’s conceptual description of the model and the solution to the model.” It is distinct from validation, defined as “The process of determining the degree to which a model is an accurate representation of the real world from the perspective of the intended uses of the model.” However, the two are clearly related and often discussed together, leading to the commonly used acronym “V&V” to refer to verification and validation. Thus, one might describe (but should not define) verification as asking “Does a code solve the model equations correctly?” and validation as “Does a model have the right equations?”12 Although a detailed discussion of verification is outside the scope of this paper, it is important to emphasize that verification encompasses far more than elimination of coding errors and other software quality assurance activities (both of which are themselves intensive efforts). Verification also includes quantification of inherent
numerical errors due to sources such as discrete spatiotemporal grids, accuracy of time-integration algorithms, and noise accumulation in Monte Carlo simulations, all of which must be known to assess whether a given code yields sufficiently accurate solutions of the model equations for their intended use. It is also clear that verification must precede validation. One cannot assess the experimental fidelity of the conceptual model without first knowing the accuracy of the computational model it is implemented in, for the specific parameters or regime being considered. The process of ensuring that the model equations and computational algorithms are correctly implemented is known as code verification, while testing that a particular code result has been correctly calculated to the desired accuracy is known as solution verification. Note that one must verify all computational tools used in a validation study, including synthetic diagnostics (discussed further in Sec. V A).

Finally comes the concept of validation metrics, defined by Terry et al. as “A formula for objectively quantifying a comparison between a simulation result and experimental data.” Similarly, Oberkampf and Roy define a validation metric as “A mathematical operator that measures the difference between a system response quantity (SRQ) obtained from a simulation result and one obtained from experimental measurements.”13 Thus, validation metrics are the mathematical relationships used to quantify how closely the model predictions reproduce the experimental observations of interest. As such, they provide the basis for assessing the fidelity of a given model, and establishing confidence in using that model predictively. This formal quantification of model accuracy, as opposed to more subjective assessments of model fidelity often found in the literature such as “good agreement,” “qualitatively consistent,” or “clearly differs,” is what distinguishes validation as an activity distinct from many earlier model–experiment comparison studies. Indeed, Oberkampf et al.9 state that “[t]he specification and use of validation metrics comprises the most important practice in validation studies.”

Examination of the broader validation literature indicates a number of characteristics well-designed validation metrics should incorporate,10,11 which will be discussed in detail in Sec. IV. In the author’s opinion, the most important of these characteristics is the incorporation of experimental, modeling, and computational uncertainties and errors into the metric’s assessment of model accuracy. Uncertainties and errors are a fundamental reality of both experiment and simulation, and without accounting for them no meaningful assessment of model validity can be made. Without such an assessment, one cannot build confidence in the predictive capabilities of a model, undermining the entire goal of a validation study. As a practical illustration, consider the common case where one has a set of model predictions with associated uncertainties $y_M(x) \pm \sigma_M(x)$ (e.g., predictions of energy confinement time in a tokamak as a function of plasma current) to be compared against a corresponding set of experimental data $y_E(x) \pm \sigma_E(x)$. One can only assess the level of model accuracy (i.e., the difference between $y_M(x)$ and $y_E(x)$) to a level of precision set by the combination of uncertainties $\sigma_E$ and $\sigma_M$. Put another way, without knowledge of $\sigma_M$ and $\sigma_E$, one cannot assess whether differences (or agreement) between model predictions and experiment are truly meaningful or relevant for building confidence in the predictive capabilities of the model. The processes used to quantify experimental and model uncertainties are often referred to in the literature as uncertainty quantification (UQ). Research into sophisticated UQ methods, particularly into efficient and robust approaches for propagating uncertainties in model inputs into uncertainties in model outputs, is currently a rapidly growing field in its own right with entire journals dedicated to its study,21 and some aspects of this work will be touched upon in Secs. IV and V of this paper.

To illustrate the importance of incorporating uncertainties into validation metrics, consider the progression of metric visualizations discussed in Oberkampf et al.,9 reproduced here in Fig. 1. At one end of the spectrum lies the purely qualitative “viewgraph norm” comparison (Fig. 1(a)), which consists of two figures separately visualizing (presumably comparable) experimental data and simulation results. While such an approach may provide useful insights into the gross performance of the model, most often for experts in the problem under consideration who have already developed an intuition for where the model is most likely to perform well or poorly, it does not provide a quantitative assessment of model fidelity. Judgments based on such visualizations are always subjective, and easily influenced by the presentation (e.g., due to choice of colors used in visualizing the data22). Moreover, in many cases such as the comparisons of contour plots as shown in Fig. 1(a), no information is (or can be) included as to the experimental and model uncertainties, which prevents one from determining whether any suggested (dis)agreement is (un)fortuitous. A significantly improved comparison is shown in Fig. 1(b), which directly visualizes magnitudes and trends in both the measured and simulated quantity of interest (the system response) as a function of a given parameter (the system input). However, without inclusion of uncertainties, the significance of differences between model and experiment still cannot be assessed. Figs. 1(c)–1(e) address this issue through increasingly comprehensive inclusion of uncertainties in both the input variable and the measured and predicted system responses. This progression culminates in Fig. 1(f), which represents a robust quantitative comparison between the experiment and simulation. Note that it transitions from separate plotting of the simulated and measured responses as a function of the input variable to plotting their difference, and that it includes visualization of the total uncertainty in both the difference of the predicted and measured SRQ and system input parameter (here one and two standard deviation contours are visualized). Although Fig. 1(f) contains the same information as Fig. 1(e), it provides a clearer visualization of both the discrepancy between model and experiment and the statistical significance of this difference as a function of input parameter, and as such forms the most robust basis for a validation metric. Developing metrics (and their corresponding visualizations) of this form suitable for plasma turbulence modeling is the focus of Secs. IV–VI of this paper.
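
The difference-based comparison of Fig. 1(f) can be illustrated with a minimal Python sketch. It is not taken from the paper: it assumes that the model and experimental uncertainties are independent, so that they combine in quadrature, and it ignores uncertainty in the input parameter x, which Fig. 1(f) also displays. The function name and the sample numbers are hypothetical.

```python
import math


def discrepancy_with_uncertainty(y_model, sigma_model, y_exp, sigma_exp):
    """Return a list of (difference, total_sigma, z) tuples, one per point.

    ASSUMPTION (not from the source text): model and experimental
    uncertainties are independent, so the uncertainty of the difference
    is their sum in quadrature.
    """
    results = []
    for ym, sm, ye, se in zip(y_model, sigma_model, y_exp, sigma_exp):
        diff = ym - ye                      # model minus experiment
        total = math.sqrt(sm**2 + se**2)    # combined 1-sigma uncertainty
        if total <= 0.0:
            raise ValueError("combined uncertainty must be positive")
        results.append((diff, total, diff / total))
    return results


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the function.
    y_m, s_m = [1.0, 2.0, 3.0], [0.3, 0.3, 0.3]
    y_e, s_e = [1.2, 2.0, 4.0], [0.4, 0.4, 0.4]
    for diff, total, z in discrepancy_with_uncertainty(y_m, s_m, y_e, s_e):
        if abs(z) <= 1.0:
            label = "within 1 sigma"
        elif abs(z) <= 2.0:
            label = "within 2 sigma"
        else:
            label = "beyond 2 sigma"
        print(f"difference = {diff:+.2f} +/- {total:.2f} ({label})")
```

Reporting the difference normalized by its combined uncertainty, rather than the two curves separately, makes the statistical significance of each discrepancy explicit. A fuller treatment would also propagate the uncertainty in x.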
FIG. 1. Illustration of validation metrics of increasing quality and utility. Reprinted with permission from Oberkampf et al., Appl. Mech. Rev. 57, 345 (2004). Copyright 2004 ASME Publishing.9
As the above discussion makes clear, the process of validation rests upon comparisons of model predictions to experimental data. The astute reader might therefore note a contradiction between the necessity of experimental data for validation, and the goal of much fusion-related validation work, which is to build confidence in models used to predict future experiments that lie outside the parameter space spanned by current-day experiments. In the strictest and most rigorous sense, validating a model for one experimental condition or regime does not necessarily imply anything about its predictive capability for regimes where it has not been validated. Thus, demonstrating that a model works well in describing current-day experiments does not necessarily mean that it will perform as well in future burning plasma experiments such as ITER. This tension is a specific instance of a general challenge for validation research, namely, understanding to what degree validation of a model for a given area of parameter space enables meaningful extrapolation of the model to other parameter regimes.

Although the tension between validation and extrapolation has not been explicitly discussed much within the fusion community to date, the implicit approach taken to resolving it has been to focus on understanding the relevant underlying physical processes in great detail, rather than investing heavily in empirical models calibrated to current experiments. This approach is based upon the idea that while the absolute values of many plasma quantities (such as plasma volume, density, temperature, etc.) will be different in future experiments from current ones, many of the fundamental physical processes governing the dynamics of those future plasmas will be the same. Thus the focus of much current work is on developing models of these processes built upon physical understanding, and eliminating free parameters and dependencies determined by calibration of the models to current experiments. This approach is most clearly seen in the widespread use of various dimensionless parameters (such as the ion-electron mass ratio $m_i/m_e$ and ratio of plasma thermal to magnetic pressure $\beta = 2\mu_0 nT/B^2$) to characterize plasma regimes, and formulation of model behavior in terms of scalings with these dimensionless parameters. When combined with application of dimensional analysis and scale invariance techniques to the model equations, understanding gained from the study of current-day experiments can be extrapolated to future devices with increased confidence, in a manner analogous to how wind tunnel experiments are used in the design of aircraft and automobiles. This approach has been employed with great success to advance our understanding of transport physics by combining data from multiple experimental devices with different physical parameters (size, density, temperature, etc.) but matched dimensionless parameters.23–25 In this context, basic plasma physics and smaller-scale experiments focused on specific plasma dynamics play a key role in MFE plasma model validation, by enabling access to parameter regimes significantly different than those obtainable in large high-power fusion experiments. Such experiments also often allow use of diagnostic approaches and measurement techniques not available on larger devices, enabling testing of predictions of phenomena which cannot otherwise be measured easily, if at all. While the physics-based approach to model extrapolation offers many advantages, it is important to keep in mind its key limitation, which is that it presumes that no significant new physics or dynamics arise in future devices that are not captured in the model equations. Thus, while confidence in extrapolation to future regimes can be increased through better physical understanding, there is no substitute for actual experimental data for truly validating a model in the parameter regime of interest.

III. BASICS OF TURBULENCE AND TRANSPORT MODELING IN MAGNETICALLY CONFINED FUSION PLASMAS

One of the greatest challenges for plasma theory is developing an accurate understanding of plasma confinement: the
physics which relates the equilibrium density $n$ and temperature $T$ of magnetically confined plasmas to externally applied fueling and heating sources. Since plasma confinement in most MFE devices is believed to be determined by plasma turbulence, building, validating, and qualifying models of this turbulence is essential for establishing an accurate predictive capability for designing and optimizing future MFE devices. Given the importance of this topic to predicting future performance, as well as the extensive active research efforts in MFE plasma turbulence measurements, simulation, and validation, we have chosen to use it as the example problem for this tutorial. In this section, we briefly review the fundamental physics of plasma confinement and approaches taken to modeling it, to inform the development of turbulent transport validation metrics in Secs. IV and V. This discussion is intended to provide a complete and concise introduction to turbulent transport in magnetically confined plasmas for those who are not already expert in the area. Those who are experts can likely skip ahead to Sec. III A for a discussion of global transport metrics or Sec. IV for discussion of local turbulence metrics.

There are a number of different approaches taken to quantifying and understanding confinement in MFE-relevant plasmas discussed in the literature, including the transport and confinement chapters of the original16 and updated20 ITER physics basis reports, which form the basis of the discussion here. The most basic measure of plasma confinement is the energy confinement time $\tau_E$, which is equal to the ratio of the volume-integrated stored energy of the plasma $W_p = \int dV\, 3nT$ to the injected power $P_{inj}$. Since the total fusion power produced in a given experiment is proportional to the square of the stored energy, models which describe the dependence of $\tau_E$ on various global system parameters such as toroidal magnetic field and total plasma current provide a compact way of roughly predicting the net power production (and thus performance) expected for a given device configuration and heating scenario. A variety of empirical and analytic scaling laws for $\tau_E$ have been derived, which are often used as measures of performance in current experiments. For instance, the $H_{89}$ and $H_{98(y,2)}$ measures are the ratios of the energy confinement time in a given plasma to the empirical ITER89P16 or IPB98(y,2)20 scaling laws for $\tau_E$ which have been extensively used in the design of ITER target scenarios.26 Each of these measures provides a basis for assessing the level of confinement achieved in a given experiment relative to what is required for a given ITER operational scenario, and a significant fraction of current experimental work is focused on demonstrating plasmas which can maintain these normalized confinement levels in different operational scenarios (such as non-inductive steady-state operation,27 edge localized mode (ELM)28 suppression via application of resonant magnetic perturbations,29 or operation in the presence of “ITER-like” metal walls30).

While the energy confinement time provides a useful global measure of plasma confinement, it is not sufficient for accurate predictions of many important aspects of reactor performance such as plasma stability, because it does not contain information about the spatial variations or dynamics of the equilibrium density and temperature profiles. In a toroidally axisymmetric plasma (to which we will restrict our attention from here onwards), and absent large-scale (i.e., device-size) instabilities such as sawteeth or tearing modes, these profiles are approximately constant on closed magnetic flux surfaces, essentially as a consequence of the plasma equilibrium satisfying the force balance equation $\vec{J} \times \vec{B} = \vec{\nabla} P$, which implies that $\vec{B} \cdot \vec{\nabla} P = 0$. However, as will be seen below, small deviations and fluctuations about this equilibrium result in particle, energy, and momentum fluxes which transport the plasma across flux surfaces, limiting the level of confinement achieved. Describing the relationship between the dynamics of these kinetic profiles (i.e., the equilibrium density, temperature, and rotation profiles) and their associated sources and sinks is the goal of transport modeling, and the equations used to describe this relationship are called the transport equations. These equations are generally formulated as a set of one-dimensional (1D) fluid balance equations of the form31

$$\frac{d\langle n_j \rangle}{dt} = -\frac{1}{V'}\frac{d}{dx}\left[ V' \Gamma_j \right] + S_{n,j}, \qquad (1)$$

$$\frac{d\langle W_j \rangle}{dt} = -\frac{1}{V'}\frac{d}{dx}\left[ V' Q_j \right] + \Pi_j \frac{d\Omega_{tor}}{dx} + S_{int,j} + S_{aux,j}, \qquad (2)$$

$$\frac{d}{dt}\left( \langle R^2 \rangle \Omega_{tor} \sum_j n_j M_j \right) = -\frac{1}{V'}\frac{d}{dx}\left[ V' \sum_j \Pi_j \right] + \sum_j \mathcal{T}_j, \qquad (3)$$

for each species (ions or electrons) $j$. Here, the brackets denote an average over a magnetic flux surface, $W_j = (3/2)\, n_j T_j$ is the energy of species $j$, $\Omega_{tor}(x)$ is a common toroidal rotation frequency of all species, $S_{n,j}$ corresponds to the particle source and sink terms, $S_{aux}$ to auxiliary (external) heating and fueling systems such as neutral beams or various wave-based systems such as ion or electron cyclotron resonance heating, $S_{int}$ to internal heating and cooling processes such as ionization, recombination, radiation, and collisional energy exchange, and $\mathcal{T}_j$ to auxiliary momentum sources. The quantities $\Gamma_j$, $Q_j$, and $\Pi_j$ correspond to the magnetic flux-surface-averaged radial fluxes of particle, energy, and toroidal angular momentum, respectively. There are a variety of possible choices for radial coordinate $x$, each of which provides a unique label for a given closed magnetic flux surface. Two of the most common choices are the square root of normalized toroidal magnetic flux $\rho_{tor} = \sqrt{\chi_{tor}(x)/\chi_{tor}(a)}$ or the normalized poloidal magnetic flux $\psi_N = \{\psi - \psi(0)\}/\{\psi(a) - \psi(0)\}$, where $\chi_{tor}$ and $\psi$ are the unnormalized toroidal and poloidal magnetic fluxes $\int \vec{B} \cdot d\vec{A}$ passing through the surface labeled by $x$, respectively, and $a$ is the label of the last closed flux surface (LCFS).

Given these equations, and specifications of the vacuum toroidal magnetic field, assorted initial and boundary conditions, and auxiliary sources, one can then predict the evolution (or steady-state profiles) of the plasma equilibrium density, temperature, rotation, and current, for a given model of the plasma fluxes $\Gamma_j$, $Q_j$, and $\Pi_j$. The level of plasma confinement obviously depends directly upon these fluxes: the larger the fluxes (in particular, the energy flux $Q$), the faster the energy leaves the system, and thus the poorer the
confinement. The irreducible minimum for these fluxes is determined by collisional diffusion, the details of which can be derived via use of generalized Chapman–Enskog theory in analogy to neutral fluids; this approach is detailed in the classic review paper by Braginskii.32 The key difference from neutral fluid collisional transport is that in a well-magnetized plasma, the typical diffusive step-size for cross-field (i.e., radial) transport is given by the gyroradius $\rho = v_T/\Omega_c$ of the species in question (where $v_T = \sqrt{k_B T/m}$ and $\Omega_c = qB/mc$ are the thermal velocity and cyclotron frequency, respectively) rather than the mean free path $\lambda_{mfp} = v_T/\nu_{coll}$ (where $\nu_{coll}$ is a collisional scattering rate), leading to radial energy fluxes of the form $Q = -n\chi\,dT/dx$, with $\chi$ scaling as $\rho^2 \nu_{coll}$. In toroidal plasmas, additional complications arise due to the $1/R_{maj}$ dependence of the toroidal magnetic field (where $R_{maj}$ is the major radius) that leads to trapped particles and larger values of $\chi$. Collisional transport in this case is described by neoclassical theory.33,34 The practical implication for toroidal MFE devices is that while this neoclassical transport is often an order of magnitude larger than collisional processes in cylindrical geometry, it is still sufficiently small that if it were the only process acting, the needed confinement for net energy gain could be obtained in relatively small devices.

Unfortunately, many tokamaks (as well as other MFE devices) observe thermal confinement levels approximately 10–100 times worse than what is expected from neoclassical transport theory. In most of these devices, this difference is now commonly ascribed to the presence of “microturbulence”: small scale (i.e., correlation lengths much less than the plasma minor radius), small amplitude ($\tilde{n}/n_0 \ll 1$ where $\tilde{n}$ is the density fluctuation and $n_0$ the equilibrium density) fluctuations driven by the inherent cross-field gradients of the equilibrium plasma density, temperature, and rotation.35,36 These fluctuations nonlinearly couple and exhibit collective behavior which manifests as cross-field fluxes of the form $Q_{turb} = (3/2)\langle \tilde{p}\,\tilde{v}_r \rangle$ (where $p = nT$, $\tilde{p}$ is the pressure fluctuation, and $\tilde{v}_r$ the radial velocity fluctuation), which act to reduce the driving equilibrium gradient(s) and thereby limit the confinement achieved. Given the number of potential free energy sources (the cross-field gradients of density, temperature, and toroidal rotation of multiple ions as well as electrons), it is perhaps not surprising that a correspondingly wide array of instabilities driven by these gradients has been identified. These instabilities are often classified by their dominant driving mechanism. In current tokamak plasmas, the dominant instabilities are generally observed to be the ion temperature gradient (ITG) mode which operates on ion gyroradius $\rho_i$ scales and the corresponding electron temperature gradient (ETG) mode which operates on electron gyroradius $\rho_e$ scales, as well as the trapped electron mode (TEM) driven by both the electron density and temperature gradients, and which spans the ion and electron gyroradius scales (where it smoothly transitions to the ETG mode). Beyond these modes, both resistive and kinetic ballooning modes may be unstable (most often in the near-edge and pedestal regions close to the LCFS of the plasma), while microtearing modes may be unstable at sufficiently high normalized plasma pressure $\beta = 2\mu_0 nT/B^2$ and collisionality. In many (if not all) experiments, multiple instabilities are simultaneously present, operative, and nonlinearly coupling with each other throughout the plasma. Thus, in order to carry out accurate predictive transport modeling, we must develop microturbulence models which can correctly describe the dependence of the cross-field turbulent particle, energy, and momentum fluxes on various driving gradients and parameters as the mix of instabilities present changes. Additional information on the physics of these instabilities can be found in a variety of review articles and textbooks on the subject, such as those by Horton35 and Weiland.36

Because these instabilities generally have small amplitudes, characteristic cross-field correlation lengths on the order of 1–10 $\rho_i$ or smaller, much smaller than the plasma minor radius $a$, and characteristic frequencies small relative to the ion cyclotron frequency, they are most accurately described via the coupled gyrokinetic-Maxwell equations.31,37 In their most common form, these equations describe the self-consistent gyromotion-averaged dynamics of small fluctuations $\tilde{f}(\vec{X}, \vec{v}, t)$ in the ion and electron distribution functions and their associated electromagnetic fields, for a specified set of equilibrium electric and magnetic fields and equilibrium kinetic distribution functions $f_0(\vec{X}, \vec{v}, t)$ which include the radially varying equilibrium density, temperature, and rotation profiles that drive the turbulence. This model is often referred to as the “$\delta f$” formulation since it is based upon the assumption that $\delta f = \tilde{f}/f_0$ is a small parameter on the same order as $\rho/L$, where $\rho$ is the gyroradius of the species in question, and $L$ a characteristic equilibrium scale length such as the major radius $R_{maj}$, minor radius $a$, or an equilibrium gradient scale length. A kinetic rather than fluid approach is generally required for accurate description of microturbulence in order to correctly capture important velocity-space dynamics such as particle trapping and Landau damping which play significant roles in determining mode growth rates. However, sophisticated generalized fluid equations have been developed which can approximately capture the dominant gyroaveraging and velocity-space effects through combinations of higher-order moments and closure formulations.38–45 The most sophisticated of these gyrofluid models can accurately reproduce many characteristics of the full gyrokinetic model (including electromagnetic, geometric shaping, and collisional effects) at significantly reduced computational cost.46

Another important aspect of these equations is that in their conventional “$\delta f$” formulation (and corresponding gyrofluid reductions), they can be derived as part of a formal expansion theory in $\rho^* = \rho_i/a$, starting from the Fokker–Planck equation. This hierarchy can be summarized as requiring (in the limit of small rotation) that:

1. to order unity, each equilibrium guiding-center distribution is a time-stationary Maxwellian with constant density and temperature on a given magnetic flux surface,
2. to 1st-order in $\rho^*$ fast, small-scale fluctuations in the distribution function are described via the gyrokinetic equations, while slowly varying, large-scale distribution function fluctuations and corrections are described by the drift-kinetic equation,
3. to 2nd-order in $\rho^*$ the equilibrium profiles slowly evolve due to the fluxes arising from the turbulent and neoclassical processes, as well as internal and external sources, as described by the transport equations Eqs. (1)–(3).

From the perspective of validation, the greatest advantage of this approach is that it provides a rigorous definition of the conceptual model to be validated against experiment, which provides clarity in formulating the desired comparisons. The primary drawback is that the idealized physical situation this ordering describes (the slow evolution of equilibrium profiles on perfectly nested axisymmetric flux surfaces due only to neoclassical and small-amplitude fluctuations) is in practice almost never realized experimentally. For example, large-amplitude fluctuations near the LCFS, non-axisymmetric equilibria arising from both internal (e.g., sawteeth47 and tearing modes48) and external (e.g., error fields49 and resonant magnetic perturbations29) processes,50 and rapidly varying external heating sources operated in feedback to maintain plasma performance51 all lead to violations of the formal ordering outlined above in different regions of the plasma. However, in many cases, these violations are weak, or localized to certain regions of the plasma, and the underlying physical picture embodied in this model is believed to represent a useful practical paradigm for understanding and predicting plasma confinement.

As might be expected, there are very few (if any) analytic solutions to the gyrokinetic-Maxwell equations useful for quantitative predictions of turbulent transport, and so numerical solution (i.e., a code implementing the model) is required. The most accurate solutions are obtained through nonlinear, initial value simulations of either the gyrokinetic-Maxwell equations or their fluid variants, analogous to direct numerical simulation (DNS) of neutral fluid turbulence. A variety of such codes have been developed, which can generally be divided between those taking a continuum approach52–56 and those that use particle-in-cell methods.57–62 As in neutral fluid DNS, both approaches initialize the simulation with some very small amplitude fluctuations, which first grow exponentially at the linear growth rate(s) of the instabilities being considered (the “linear” phase), and then saturate at a finite amplitude set by the balance of these linear drives and nonlinear couplings between different wavenumbers (the “saturated” phase). The statistics of various quantities (such as mean energy flux or fluctuation power) from this saturated phase are then used for transport modeling predictions,63,64 as well as comparisons with other models and experiments in V&V studies. Implicit in this approach is the assumption that the saturated phase represents a “statistical steady-state” from which well-converged estimates of the quantities of interest can be made, and that the results are independent of the initial conditions. Appropriate calculation of these statistics and their related uncertainties is discussed further in Secs. IV and V. Here, it is important only to emphasize that for virtually any V&V effort, it is only comparisons of these converged, initial-condition independent quantities that are meaningful. Claims based upon small averaging windows (or sample sizes) and early simulation times have no physical value or relevance, regardless of the computational cost required to obtain those results. Specifically for validation of plasma turbulence models, the appropriate comparison is between predictions of assorted statistical quantities derived from simulation and experimental data, and not specific time traces or “snapshot” visualizations (i.e., viewgraph norm comparisons).

Since these nonlinear gyrokinetic simulations can require $10^3$ processor-hours or more (even $10^7$ and beyond in some cases65–68) to calculate the statistics at a single location in the plasma, reduced models of the turbulence have been developed which can make predictions on the processor-minute or less timescale, to facilitate practical transport modeling with feasible resource requirements. The general approach of most such models is to decompose the turbulent fluxes into two components along the lines of (using the ion energy flux $Q_i$ as an example) $Q_i = (3/2)\,\mathrm{Re}\langle \sum_k \tilde{p}^*_{i,k}\,\tilde{v}_{r,k} \rangle = (3/2)\,\mathrm{Re}\sum_k R^{p_i}_k \langle |\tilde{v}_{r,k}|^2 \rangle$ where the angular brackets denote averaging over both magnetic flux surfaces and fast turbulent timescales, and the sum is a sum over wavenumbers $k$. In these models, the wavenumber-dependent ion pressure response function $R^{p_i}_k = \tilde{p}^*_{i,k}/\tilde{v}_{r,k}$ is generally calculated via direct solution of linearized (gyro)fluid equations. It is then convolved with a second model for the fluctuation spectral intensity $\langle |\tilde{v}_{r,k}|^2 \rangle$ which may come from calibration of the model against experimental data,69 analytic theory arguments,36,70,71 or fits to databases of nonlinear simulations.46,72,73 The underlying physical motivation for this approach is that unlike many neutral fluid turbulence systems, in the core region of MFE-relevant plasmas, the fluctuations saturate at small amplitude and retain many of the linear wave characteristics of the underlying instabilities. As discussed above, this assumption of small amplitude levels often breaks down toward the LCFS, limiting the region of plasma to which these models can be appropriately applied. Obviously such “quasilinear” approaches contain many more approximations than the DNS approach, and as such effectively constitute conceptual models separate from the DNS approach. Nonetheless, they also often provide sufficient experimental fidelity at such greatly reduced computational cost as to currently be the only practical models for predictive and interpretive transport modeling. Finally, one should note that both approaches implicitly assume that there is a single, unique nonlinear solution for the specified input parameters. While to the author’s knowledge there are no known counterexamples that disprove this assumption for numerically converged gyrofluid or gyrokinetic simulations, the possibility of multiple physical solutions cannot be formally ruled out.

More extensive discussions of the underlying physics of the various microturbulence-relevant instabilities are available in a wide variety of review articles and textbooks such as Refs. 35 and 36, and further details on their dynamics can be found therein. For the purposes of this paper, we need to consider only one particular defining feature of their dynamics: a hypersensitivity to the primary driving gradient. In many cases (most notably the ITG and ETG modes), there is a critical value of the driving gradient that must be exceeded for the mode to become unstable. Moreover, once this gradient is exceeded, the turbulent fluxes are observed to increase superlinearly, whereas the neoclassical collisional fluxes
scale linearly with driving gradients (Fig. 2). This phenomenon is often referred to as transport stiffness in the literature (see, e.g., Refs. 74–78), and has two important implications. Experimentally, stiff transport means that once the critical gradient has been exceeded, it becomes increasingly hard to increase the core pressure by increased heating (hence the term stiff). From a modeling standpoint, first note that most microturbulence models predict the turbulent fluxes and fluctuation levels for a given (i.e., input) set of local parameters and gradients. The inherent stiffness of the turbulence magnifies any uncertainty in the driving gradient(s) into larger uncertainties in the predicted quantities, which must be included in any comparison against measured values of the predicted quantities. Confronting this property of the turbulence is therefore essential for any useful turbulent transport validation metric, and is discussed in detail in Sec. IV B. Note that since experimental determinations of these gradients are obtained from derivatives of profile measurements, the gradients can have non-negligible uncertainties even if the profile measurements themselves have small uncertainties.

FIG. 2. Schematic illustration of transport stiffness, showing the scaling of the total energy flux (solid line) as a function of driving gradient. Once the critical gradient (dashed line) for a given microturbulence instability is exceeded, the turbulent flux (red line) rapidly increases and quickly exceeds the neoclassical flux (blue line) to become the dominant component of the total flux.

A. Historical transport modeling metrics

Historically, the first widely utilized validation metrics in MFE transport and turbulence modeling were the six “figures of merit” detailed in the ITER physics basis,16 listed in Table II. Most of these metrics are focused only upon core confinement, often defined as the region inside of some boundary radius $\rho_{BC}$ (= 0.9 in the ITER physics basis analysis). Of these six metrics, the most commonly used are metric 1, the ratio of predicted to measured incremental stored energy $W_{inc}$ (Fig. 3), and metric 6, the normalized mean offset and root-mean-square (RMS) error between individual predicted and measured profiles.

TABLE II. ITER physics basis figures of merit for evaluating transport models. Adapted with permission from ITER Physics Basis Expert Groups on Confinement and Transport and Confinement Modelling and Database and ITER Physics Basis Editors, Nucl. Fusion 39, 2175 (1999). Copyright 1999 IAEA.16

1: Ratio of incremental total stored energy $W_{inc}^{sim}/W_{inc}^{exp}$, where $W_{inc} = \frac{3}{2}\int dV\,[n_e \hat{T}_e + n_i \hat{T}_i]$ and $\hat{T} = T(\rho) - T(\rho_{BC})$
2: $(W_{inc}^{sim}/W_{inc}^{exp})_e$ and $(W_{inc}^{sim}/W_{inc}^{exp})_i$ (separate e and i)
3: $(n_{i,\rho=0.3}\,T_{i,\rho=0.3}\,W)^{sim}/(n_{i,\rho=0.3}\,T_{i,\rho=0.3}\,W)^{exp}$
4: $\chi^2 = [\sum (T_{sim} - T_{exp})^2]/N\sigma^2$, where $\sigma$ is the expt. error and $N$ the number of observations
5: $\beta^{*2}_{sim}/\beta^{*2}_{exp}$ where $\beta^{*2} = \int dV\, n_i^2 T_i^2$
6a: $\mathrm{STD} = \sqrt{\int_0^{\rho_{BC}} dx\,(T_{sim} - T_{exp})^2} \Big/ \sqrt{\int_0^{\rho_{BC}} dx\, T_{exp}^2}$
6b: $\mathrm{OFF} = \left[\int_0^{\rho_{BC}} dx\,(T_{sim} - T_{exp})\right] \Big/ \sqrt{\int_0^{\rho_{BC}} dx\, T_{exp}^2}$

FIG. 3. Schematic illustration of the fraction of plasma pressure which contributes to the incremental stored energy $W_{inc}$, using the $\rho_{BC} = 0.9$ boundary condition of Ref. 16.

These metrics have several appealing features. Foremost, they all have fairly simple mathematical forms corresponding to basic statistical measures (e.g., means and standard deviations). They are therefore easily accessible to a broad audience, particularly non-experts. The value of such simplicity and accessibility should not be underestimated, particularly for communicating the results of validation studies to management and decision-makers that may be removed from details of day-to-day research. These metrics also focus on quantifying the ability of transport models to accurately predict the key global measures of reactor performance, such as stored energy and $\langle \beta^2 \rangle \propto Q_{fusion}$, which are the bottom-line quantities such modeling is intended to predict. However, the metrics also have several drawbacks. First, only figure of merit 4, which to the author’s knowledge has not been widely used in published MFE transport modeling validation studies, incorporates any estimate of experimental or model uncertainties. Previous studies using other figures of merit implicitly address this issue by assessing model performance using databases of experimental discharges (including those assembled by the ITER Topical Physics Activity Transport and Confinement working group79), and looking at ensemble statistics of the metrics (illustrated in Fig. 4), but this approach is not a completely satisfying substitute for explicitly confronting the
FIG. 4. Calculations of ensemble-averaged (a) incremental stored energy ratio (Figure of Merit 1) and (b) STD error (Figure of Merit 6a) in predicted $T_i$ profiles for a variety of transport models and confinement scenarios. Reprinted with permission from ITER Physics Basis Expert Groups on Confinement and Transport and Confinement Modelling and Database and ITER Physics Basis Editors, Nucl. Fusion 39, 2175 (1999). Copyright 1999 IAEA.16
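The profile-based figures of merit in Table II (4, 6a, and 6b) reduce to a few lines of array arithmetic. The following is a minimal sketch, assuming the predicted and measured temperature profiles have already been interpolated onto a common radial grid running from 0 to the boundary radius; the function name, the argument layout, the single scalar experimental error, and the trapezoidal quadrature are illustrative choices and are not taken from Ref. 16.

```python
import numpy as np


def _trapz(y, x):
    """Trapezoidal integral of y(x) on a possibly non-uniform grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def profile_metrics(x, t_sim, t_exp, sigma=None):
    """Figures of merit 6a (STD) and 6b (OFF); also 4 (chi^2) if sigma is given.

    x     : radial grid from 0 to rho_BC (hypothetical input layout)
    t_sim : predicted temperature profile on x
    t_exp : measured (fitted) temperature profile on x
    sigma : single experimental error used in figure of merit 4
    """
    x = np.asarray(x, dtype=float)
    t_sim = np.asarray(t_sim, dtype=float)
    t_exp = np.asarray(t_exp, dtype=float)

    diff = t_sim - t_exp
    norm = np.sqrt(_trapz(t_exp**2, x))  # common denominator of 6a and 6b

    metrics = {
        "STD": np.sqrt(_trapz(diff**2, x)) / norm,  # RMS-type error
        "OFF": _trapz(diff, x) / norm,              # signed mean offset
    }
    if sigma is not None:
        # chi^2 = sum (T_sim - T_exp)^2 / (N sigma^2)
        metrics["chi2"] = float(np.sum(diff**2) / (diff.size * sigma**2))
    return metrics


# Synthetic example: a model that overpredicts a flat profile by 10% everywhere.
x = np.linspace(0.0, 0.9, 10)
t_exp = np.full_like(x, 2.0)
print(profile_metrics(x, 1.1 * t_exp, t_exp, sigma=0.2))
```

Because each quantity is a single integral over the whole core, a profile that is wrong only in a narrow radial band and a profile that is slightly off everywhere can return similar values.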
uncertainties in the measurements of each discharge modeled. Second, because all of these metrics are global (in that they average over the entire simulation domain), one cannot easily determine where (or why) the model disagrees with the experiment. Namely, from these metrics alone one could not discriminate between a case where the model had only a significant error in a small fraction of the domain, or was off by a uniform amount everywhere. Finally, and most importantly for MFE transport modeling where the implicit assumption is that validation against current experiments will enable more confident extrapolation to the prediction of future regimes, there is no clear connection to the underlying turbulence physics which is presumably determining the confinement. Thus, it is hard to determine whether model performance (particularly good performance) is due to a sufficient underlying understanding and encapsulation of the relevant physics, or is simply (un)fortuitous. Addressing these issues requires local turbulence-focused (rather than global transport and confinement) metrics.

IV. BUILDING ROBUST LOCAL TURBULENT TRANSPORT VALIDATION METRICS

In order to go beyond the global transport metrics discussed in Sec. III A, we must construct metrics that will allow us to systematically quantify and compare the experimental fidelity of different turbulence models at multiple radial locations in multiple tokamak discharges. Building upon the discussions of Secs. II and III, and drawing from experience gained by the community in performing gyrokinetic validation exercises over roughly the last decade, we can identify a set of criteria that these metrics should meet, beyond those outlined in Sec. II:

1. The metrics should utilize simple mathematical formulas to make the results as transparent as possible to non-experts.
2. The initial set of metrics should be easily extensible to incorporate comparisons of additional quantities.
3. Calculation of the metrics should be practical for use with nonlinear gyrokinetic simulations, which can individually require 1000 or more core-hours to obtain converged results.
4. The fundamental challenge for validating plasma microturbulence models (the interplay between stiff model responses and experimental uncertainties in equilibrium profiles and gradients) is explicitly incorporated into the metric design.

In order to address these issues, the plasma microturbulence community has begun to converge on a common approach of using local sensitivity plots for presenting verification80,81 and validation17,18 results. In this approach, one identifies a single input parameter (often the driving gradient of the dominant microinstability), and performs a discrete set of simulations in which this parameter is systematically varied about the experimental value, holding all other model inputs fixed. A typical example is shown in Fig. 5(a), where results from (effectively) three different models of ITG dominant transport are compared against an independent power balance flux calculation as a function of the normalized ion temperature gradient $R/L_{Ti} = -(R/T_i)\,dT_i/dx$, also known as the normalized inverse ion temperature gradient scale length. Here we have used the major radius $R$ to normalize the ion temperature scale length since we are most often concerned with the curvature-driven ITG instability in tokamak plasmas. As $R$ serves to parameterize the strength of the toroidal curvature and corresponding drift velocity, $R/L_{Ti}$ is the dimensionless control parameter that appears in analytic calculations of the mode growth rate, rather than simply $dT_i/dx$ or $1/L_{Ti}$. Other common normalization choices include the plasma minor radius $a$ and density scale length $L_n$ (particularly for the slab ITG instability). It is important to note that this discussion implicitly assumes a local model of the turbulence, in which the local turbulence properties (including cross-field fluxes) depend only upon the local gradients and other plasma parameters. A full discussion of local versus nonlocal turbulence models is beyond the scope of this paper; we focus on the local approach here since it is widely used in both turbulence and transport modeling, and note that the conceptual approach to validation described in this paper can be readily adapted to nonlocal turbulence and transport modeling.

This sensitivity plot approach was first published as a means of plasma microturbulence validation studies in a pair of seminal papers by Ross and co-workers,17,18 and then utilized in a number of subsequent studies.66,82–94 Many other validation studies have assessed the sensitivity of
FIG. 5. Local sensitivity plots comparing predictions of (a) total energy flow $P_i = Q_i V'$ and (b) $|\tilde{n}_e/n_{e0}|^2$ from GS2 gyrokinetic simulations using different $E \times B$ shear suppression “quench rules” to experiment, as a function of normalized inverse gradient scale length $R/L_{Ti}$. Reprinted with permission from Phys. Plasmas 9, 5031 (2002). Copyright 2002 AIP Publishing LLC.18
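A local sensitivity scan of the kind shown in Fig. 5(a) can be condensed into a single number by locating the value of $R/L_{Ti}$ at which the simulated flux crosses the power balance flux, which can then be set against the measured gradient. The sketch below is one hedged way to do this, assuming a small set of scan points ordered by increasing $R/L_{Ti}$ and linear interpolation between them; the function name and the numbers in the example are synthetic and are not taken from the figure.

```python
import numpy as np


def flux_matching_gradient(rlt_scan, q_sim, q_pb):
    """Return the R/L_Ti at which the interpolated simulated flux equals q_pb.

    rlt_scan : increasing R/L_Ti values used in the sensitivity scan
    q_sim    : simulated flux at each scan point
    q_pb     : power balance flux (a single number)

    Linear interpolation between scan points is an assumption of this sketch.
    Returns None when the scan does not bracket q_pb.
    """
    rlt = np.asarray(rlt_scan, dtype=float)
    q = np.asarray(q_sim, dtype=float)
    for i in range(len(rlt) - 1):
        lo, hi = q[i], q[i + 1]
        # A sign change of (flux - q_pb) means the crossing lies in this interval.
        if (lo - q_pb) * (hi - q_pb) <= 0.0 and lo != hi:
            frac = (q_pb - lo) / (hi - lo)
            return float(rlt[i] + frac * (rlt[i + 1] - rlt[i]))
    return None


# Synthetic stiff response: zero flux below a critical gradient, rapid rise above it.
scan = [4.0, 5.0, 6.0, 7.0]
flux = [0.0, 1.0, 4.0, 9.0]
rlt_sim = flux_matching_gradient(scan, flux, q_pb=2.5)
rlt_expt = 6.0  # hypothetical measured value
print(rlt_sim, rlt_sim - rlt_expt)
```

Returning None rather than extrapolating keeps the sketch from reporting a flux-matching gradient outside the range actually simulated; with a stiff response, extrapolation beyond the scan could be badly wrong.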
inputs but not explicitly presented the results in the form shown in Fig. 5. For the purposes of this paper, we note that this approach meets all of the desired criteria for a validation metric except for explicit inclusion of experimental and modeling uncertainties. In fact, there are three specific sets of uncertainties that must be quantified and incorporated into the metric:

1. The uncertainty in the power balance flux calculation (or more generally the measured system response quantity in the language of Oberkampf et al.8–10,13).
2. The uncertainty in the measurement of the local normalized driving gradient (the system input quantity).
3. And the uncertainty in the simulation predictions of the flux (the predicted system response quantity).

Quantifying the uncertainty in each of the quantities is a challenging process that requires careful consideration, and is discussed further below in Secs. IV A–IV C.

As will be discussed further in Sec. IV A, direct measurements of core tokamak cross-field fluxes are virtually non-existent, and what are commonly termed the “experimental fluxes” are in fact the results of power balance calculations performed independently of the turbulence modeling. An important implication of this lack of direct flux measurements is that the comparison between power balance and turbulence model fluxes presented in Fig. 5(a) cannot rigorously be interpreted as a validation of the turbulence model, since it involves the comparison of two computational model outputs rather than a computational model to experiment. However, one can interpret Fig. 5(a) as a validation of a joint power balance and turbulence model calculation. In this view, the joint model calculation yields the prediction of a local inverse gradient scale length $R/L_{Ti}^{sim}$ (or whichever other input quantity was varied) at which the separate power balance and turbulence model fluxes are equal to each other, which can be compared with the corresponding measured gradient scale length $R/L_{Ti}^{expt}$. To the extent one has confidence in the power balance analysis, such a comparison becomes primarily (but never entirely) an assessment of turbulence model fidelity. This interpretation of the local sensitivity plot approach is used to formulate a quantitative metric based upon the difference of $R/L_{Ti}^{sim}$ and $R/L_{Ti}^{expt}$ in Sec. IV E. In order to remove the dependence upon the power balance analysis, sensitivity plots comparing predictions for directly measured turbulence characteristics can also be made, such as is shown in Fig. 5(b). These calculations and their associated challenges are discussed in Sec. V. Finally, we note that a key advantage of the local approach and local simulations is that they allow one to decouple variations in the local value of $dT_i/dx$ from variations in $T_i$ itself, and thus one can isolate the impact of variations in the relevant control parameter $R/L_{Ti}$ while holding $T_i$ and its associated dimensionless quantities that appear in the gyrokinetic equations such as $T_i/T_e$ fixed. Therefore, in the discussions that follow which focus upon the local approach, variations in $dT_i/dx$ are implicitly assumed to be performed at fixed $T_i$, and we will treat discussions of variations or uncertainties in the local value of $dT_i/dx$ as equivalent to those in $R/L_{Ti}$ unless otherwise noted.

A. Quantifying uncertainties in power balance fluxes

The most common assessments in plasma microturbulence validation studies are comparisons of the predicted magnetic flux-surface averaged particle, energy, and momentum fluxes to what are often referred to as the experimental values. However, this label is incorrect, as in the core of MFE-relevant plasmas, such fluxes are never directly measured: there is no diagnostic capable of measuring these quantities that could survive at or access the multi-keV core plasma temperatures. Instead, the turbulence model predictions are compared with independent power balance calculations, which first calculate internal $S_{int}(x)$ and auxiliary source $S_{aux}(x)$ terms, and then calculate fluxes from the transport equations, e.g.,

$$Q_{PB}(x) = \frac{1}{V'} \int_0^x dx'\, V' \left\{ S_{int} + S_{aux} - \frac{dW}{dt} \right\}, \tag{4}$$

where $V(x)$ is the plasma volume enclosed within the flux surface labeled by $x$ and $V'(x) = dV/dx$ is the surface area of that flux surface. The $dW/dt$ terms on the right-hand side related to the temporal evolution of the kinetic quantities are often small relative to the internal and auxiliary source terms and so are frequently neglected in the calculation of $Q_{PB}$. Verification and validation of the various codes and models used in these power balance calculations are challenging exercises in their own right, and it is essential to always bear in mind that the lack of direct measurements of local fluxes is a significant constraint on the validation of these tools. However, through combinations of extensive verification exercises,95 global cross-checks (e.g., comparisons of predicted and measured neutron production rates to constrain fast ion densities injected by neutral beams96–98 or hard X-ray emissions associated with interactions between fast electrons and lower hybrid, electron cyclotron, and ion cyclotron waves99–101) and comparisons with both measured
changes in equilibrium profiles102–106 and core fluctuations,107,108 quantitative confidence in these models to a fairly high level has been established for many operating conditions of interest. Nonetheless, power balance analyses are still subject to significant aleatory (i.e., statistical) and systematic uncertainties.

One can identify two main sources of uncertainties in power balance calculations. The first is the inherent statistical uncertainty due to uncertainties in the magnetic and kinetic equilibrium profiles input to the power balance analysis. For instance, both the ion and electron energy transport equations (as defined for each species by Eq. (2)) have an internal source term related to collisional inter-species energy transfer $S^{exch}_{e/i} = n_e (T_{i/e} - T_{e/i})/\tau_{ei}$, where $\tau_{ei}$ is the electron-ion collision time. Obviously, any uncertainties in the plasma density or temperatures will translate directly into an uncertainty in this exchange term, and its corresponding contributions to the energy fluxes. In dense, highly coupled plasmas (such as is expected for ITER or a future fusion reactor), this exchange term can be a dominant component in the total power balance analysis of the individual ion and electron energy channels. Therefore, if there is a large uncertainty in this term, it can be impossible to determine the relative amounts of power transported through the ion and electron channels with confidence, even though the total energy flow is well known since the ion and electron exchange terms exactly cancel when Eq. (2) is summed over all species. If this ratio of ion to electron energy fluxes cannot be determined with confidence, it is unlikely that comparisons with turbulence models (which make specific predictions of this ratio based upon the mixture of underlying instabilities) will be of significant validation utility.

In order to formally determine these power balance uncertainties, one must first construct probability distribution functions (PDFs) for the various magnetic and kinetic profiles, and then propagate these PDFs through the power balance model to yield PDFs of source and flux profiles. In practice, what is commonly done is to create an ensemble of equilibrium profiles, from which an ensemble of power balance calculations can be made, the statistics of which are used to calculate the (mean) source terms and their uncertainties (generally quantified as the standard deviation of the ensemble). Obviously the key to obtaining meaningful results through this method lies in appropriate construction of the profile ensembles. For most experimental MFE situations, there are two (non-exclusive) approaches to generating these ensembles. First, recognizing that the equilibrium profiles to be used are almost always parameterized fits (usually via various splines or polynomials) to experimental point measurements, one can transform the uncertainties in the measured data points into uncertainties in the fits. This propagation can be achieved in principle through the computational fitting routines themselves, or more commonly by a

emission (ECE), charge exchange recombination (CER), X-ray crystal spectroscopy (XCIS), and motional Stark effect polarimetry (MSE). Each such technique has its own challenges and uncertainties that must be understood and incorporated for this uncertainty quantification approach to be meaningful. An example of this approach can be seen in White et al.111 in which a 100-element ensemble of profiles “sets” was propagated through the ONETWO power balance code112 to generate a corresponding ensemble of power balance fluxes and thermal diffusivities.

In the second approach, the temporal variation of the measurements is used to generate the ensemble. The full experimental time-averaging window is broken up into a series of subwindows (whose length is often set by the sampling rate of the slowest relevant profile diagnostic), and profile fits to the measurements in each subwindow are generated to create an ensemble of time slices. An example of this approach is shown in Fig. 6, where an 11-element profile fit ensemble is generated by decomposing temperature profile point data from a typical DIII-D181 L-mode discharge113,114 collected over 220 ms into eleven 20 ms subwindows, and fitting the profile point data in each subwindow (Fig. 6(a)) via splines. The ONETWO code is then used to calculate a power balance analysis for each subwindow, and the statistics of this temporal ensemble are utilized to calculate the uncertainty in the fluxes (Fig. 6(b)). Here, a 95% confidence interval is calculated via

$$\sigma_{95} = t^*(0.025, N)\,\sqrt{\frac{\sigma^2_{R/L_{Ti}}}{N}}, \tag{5}$$

where $t^*$ is Student’s t-statistic, $\sigma_{R/L_{Ti}}$ is the standard deviation of the ensemble of $R/L_{Ti}$ profiles derived from the fits to the experimental data, and $N$ the number of elements in the ensemble ($N = 11$ here). For the case shown in Fig. 6, note that although there is quite small statistical uncertainty in the mean temperature profiles, the variations are still large enough that the power balance 95% confidence intervals correspond to an approximately 10% uncertainty in the energy fluxes at larger radii ($\rho_{tor} > 0.5$). One could also apply this approach to comparable time windows from multiple “repeat” discharges which hold equilibrium parameters and profiles constant across each discharge.

In addition to these inherent statistical uncertainties, one must also consider potential systematic uncertainties and errors in the power balance analysis. To illustrate the challenges in quantifying these uncertainties and errors more concretely, consider the various individual source and sink terms of the electron energy transport equation in the same DIII-D discharge, illustrated in Fig. 7. The net electron heating source (and thus energy flux) is composed of four terms: direct heating of the electrons by collisions with injected fast
Monte Carlo approach in which ensembles of data points are neutral deuterium beams, resistive Ohmic heating, collision
generated by randomly varying each point in proportion to energy exchange with the plasma thermal ion populations,
the quoted uncertainty, and then generating corresponding and radiation. The calculation of these individual terms will
ensembles of fits. Measurement of different kinetic profiles only be as good as the individual models and assumptions
requires use of different diagnostic techniques,109,110 such used. In these calculations, propagation of the ensemble of
as Thomson scattering, reflectometry, electron cyclotron profiles shown in Fig. 6 enables calculation of the
FIG. 6. Ensemble of fits to measured (plotted as symbols) (a) $T_i$ and (c) $T_e$ profiles in a DIII-D L-mode discharge, derived from decomposing a 220 ms collection window into eleven 20 ms subwindows. The fits are then propagated through the ONETWO power balance code to generate ensembles of (b) $Q_i$ and (d) $Q_e$. Individual ensemble members are plotted as dashed lines, the ensemble mean as a solid line, and the 95% confidence interval for each ensemble mean is denoted by the shaded region.
corresponding statistical uncertainties of each term, denoted by the shaded bands. However, potential systematic errors in the calculations could arise from a variety of sources, such as:
1. Choice of fast beam ion slowing down and equilibration model.
2. Anomalous fast ion transport, most often due to Alfvén eigenmodes, which would broaden the beam heating profile.115
3. The availability and quality of measured, calibrated radiation profiles to be used as inputs.
4. Correct reconstruction of the current profile in the magnetic equilibrium calculation.
5. Choice of resistivity model and code used (e.g., Chang–Hinton analytic theory,116 NCLASS,117 or NEO118).
6. Uncertainties or errors in the effective charge $Z_{\mathrm{eff}} = \sum_i q_i^2 n_i / \sum_i q_i n_i$ (and resulting uncertainties or errors in $\tau_{ei}$) which impact both the Ohmic heating calculation and collisional exchange term.
7. Systematic errors or biases in the profile fits due to choice of fitting form or algorithm (discussed further in Sec. IV B).
Similar issues arise in the calculation of other source terms of other profiles; particularly important are charge-exchange and prompt fast ion losses in the calculation of ion thermal and momentum sources, and edge neutral penetration and ionization in determining the total particle flux. As with most systematic errors, one can only try to minimize these errors through judicious choice of appropriate models and experimental design, and maintain awareness of them in interpreting results.
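The ensemble-based confidence interval of Eq. (5) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis used for Fig. 6: the function name `ci95_halfwidth`, the sample values, and the use of the sample (N − 1 normalized) variance are all hypothetical choices, and the Student's t critical value is passed in by the caller because the degrees-of-freedom convention behind $t^{*}(0.025, N)$ is not spelled out in the text.

```python
import math
import statistics


def ci95_halfwidth(samples, t_crit):
    """Half-width of the confidence interval on an ensemble mean, Eq. (5) style.

    samples: ensemble values of one quantity at one radius (e.g. R/L_Ti).
    t_crit:  two-sided Student's t critical value chosen by the caller.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two ensemble members")
    # Assumption: sample variance with N-1 normalization.
    variance = statistics.variance(samples)
    return t_crit * math.sqrt(variance / n)


# Hypothetical 11-member ensemble of R/L_Ti at a single radius.
r_over_lti = [5.1, 5.4, 4.9, 5.6, 5.2, 5.0, 5.3, 5.5, 4.8, 5.2, 5.1]
T_CRIT = 2.228  # two-sided 95% value for 10 degrees of freedom (N - 1)

mean = statistics.fmean(r_over_lti)
half = ci95_halfwidth(r_over_lti, T_CRIT)
print(f"R/L_Ti = {mean:.2f} +/- {half:.2f} (95% confidence)")
```

Applied at every radial grid point of the fit ensemble, the same function would yield a shaded confidence band of the kind described for Fig. 6.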
FIG. 7. Illustration of various components of total electron energy source term $S_e$ and their uncertainties, for the ensemble of fits and corresponding power balance analysis shown in Fig. 6.
B. Quantifying the uncertainty in the local driving gradient
Equally important to quantifying the uncertainty in the power balance calculations is quantifying the uncertainty in the driving gradient (or other model input parameter under consideration) a particular local sensitivity analysis is focusing upon. Knowledge of this uncertainty is essential regardless of whether one is approaching the problem in terms of
predicting fluxes and fluctuations, in which case the gradient uncertainty translates into the dominant turbulence model input uncertainty, or as a joint power balance-turbulence model prediction of the local gradients, whose sensitivity is obviously limited by the uncertainty in the measured local gradient. For most MFE experiments and conditions, the key challenge for quantifying this uncertainty arises from the fact that the equilibrium kinetic profiles used in turbulence and transport modeling are generally smoothly varying fits to assorted point measurements, as discussed in Sec. IV A. These fits are made not just for convenience of analysis, but also reflect an underlying assumption (embodied in the gyrokinetic formulation described above in Sec. III) that the equilibrium profiles vary slowly (relative to turbulent fluctuations) spatially as well as temporally.
Fig. 8 shows $R/L_{Ti}$ calculated from both the ensemble of spline fits and direct finite difference of the CER point measurements shown in Fig. 6(a). The horizontal error bars on the point values of $R/L_{Ti}$ equal the radial separation of the channels used in the finite differencing calculation. One can immediately see that while the spline fit $R/L_{Ti}$ profile captures the bulk trend of the point measurements, it by design does not capture the rapid radial variation and scatter of the point measurements. At the same time, one can see that the 95% confidence interval of the spline fits is both much smaller than the variations of the point measurements, and itself has non-negligible radial variations arising from the locations of the spline knots. Thus, neither approach is fully satisfactory for quantifying the uncertainty in either the local gradients or inverse gradient scale lengths. In practice, to the extent these uncertainties are considered in transport and modeling analysis, the fit ensemble approach is more commonly used, most frequently simply using the standard deviation of the ensemble to estimate the uncertainty.
Obviously, this issue is one that could benefit from further study and work toward a common, community-accepted approach to quantifying these crucial uncertainties. If the current ensemble statistics approach is not deemed sufficient, one might consider utilizing a uniform fractional uncertainty in the gradients based on the radially averaged ratio of the 95% confidence interval to the mean spline $R/L_{Ti}$
$$\delta^{\mathrm{avg95}}_{L_{Ti}} = \frac{1}{a}\int_0^a dr\, \frac{\sigma_{95}}{R/L_{Ti}^{\mathrm{fit}}}, \qquad (6)$$
or an RMS fractional difference between the point measurements and spline $R/L_{Ti}$ profile fits
$$\delta^{\mathrm{RMS}}_{L_{Ti}} = \sqrt{\frac{1}{a}\int_0^a dr \left(\frac{R/L_{Ti}^{\mathrm{data}} - R/L_{Ti}^{\mathrm{fit}}}{R/L_{Ti}^{\mathrm{fit}}}\right)^{2}}. \qquad (7)$$
Both of these suggested measures are based upon the idea of using radial averaging to smooth out the fast variations of $R/L_{Ti}$ and its associated confidence interval due to either the spline knot locations or inherent scatter of the point measurements, at the cost of sacrificing information about physically meaningful radial variations in the uncertainty of the local $R/L_{Ti}$ value. The results of applying Eqs. (6) and (7) to the data shown in Fig. 8 yield values of $\delta^{\mathrm{avg95}}_{L_{Ti}} = 19\%$ and $\delta^{\mathrm{RMS}}_{L_{Ti}} = 43\%$; these results are also illustrated by additional shaded regions in Fig. 8. Alternatively, a more modest radial smoothing could be utilized, although this would require identification of some objective method for selecting the smoothing function and width. More generally, it would be desirable to increase utilization of more sophisticated fitting and uncertainty quantification techniques that have been developed in other communities. One promising technique in this vein is Gaussian process regression (GPR),119 which takes a Bayesian approach to determining the probability distribution function of the profile and its gradient. Chilenski et al.120 have recently applied this technique to modeling of impurity transport in the Alcator C-Mod tokamak, and adaptation of the approach to other studies appears tractable.
Certainly further study of GPR, and more detailed assess-
ments of its potential benefits and costs relative to traditional
MFE profile-fitting algorithms, is warranted. It should also
be noted that for some profiles, multiple independent diag-
nostics are often combined (such as Thomson scattering and
electron cyclotron emission for measurements of Te profiles)
to increase confidence in the final fit. In such cases, the use
of integrated data assessment (IDA) techniques121–124 can
enable significantly improved estimates of profile and gradi-
ent uncertainties. Finally, in some experiments, it is possible
to use small “jogs” or scans of the plasma through the diag-
nostic viewing locations to obtain more smoothly varying
profile (and thus gradient) measurements;125 similar results
for diagnostics such as ECE can be obtained through small
variations in toroidal field strength.126
FIG. 8. Calculation of $R/L_{Ti}$ from data and fits shown in Fig. 6(a). The calculation of $R/L_{Ti}$ from direct finite differencing of the measurements is plotted as point symbols, with associated horizontal bars indicating the separation of CER channels differenced to obtain that point. Calculations of $R/L_{Ti}$ from individual member profile fits are plotted as (red dashed line), and the ensemble mean as (red line). The shaded region bounded by solid lines denotes the 95% confidence interval on the ensemble mean $R/L_{Ti}$. Additional bounded shaded regions indicate the uncertainty intervals associated with $\delta^{\mathrm{avg95}}_{L_{Ti}} = 19\%$ (blue dashed line) and $\delta^{\mathrm{RMS}}_{L_{Ti}} = 43\%$ (gray dashed line).
C. Quantifying the uncertainty in the simulation predictions
The third group of uncertainties that we must quantify is the simulation output uncertainties. For nonlinear initial-value simulations, one source of such uncertainty arises from the
time-averaging of the simulation results, which as described above is necessary for any meaningful V&V study. To see how one should address and quantify these particular uncertainties, consider the time trace of the magnetic flux-surface averaged ion energy flux $Q_i$ from a nonlinear gyrokinetic simulation of a DIII-D discharge shown in Fig. 9(a). $Q_i$ is output every $\Delta t = 1\,a/c_s$ (where $c_s = \sqrt{k_B T_e/m_i}$ is the ion sound speed), and the thick bar indicates the mean value of $Q_i = 1.26$ W/cm$^2$ averaged over the time window from $200\,a/c_s$ to $600\,a/c_s$. Note that the averaging does not begin until after the initial transients have damped away and the early linear physics (exponential growth) phase has ended. Although there is no easy formal rule for defining when this linear phase ends and the nonlinear phase of physical interest begins, the practical rule of thumb to use is that any quoted results should be insensitive to the choice of averaging window endpoints. More broadly, any nonlinear initial value simulations used in V&V studies should be run for sufficiently long times that the uncertainty associated with the time-averaging is subdominant relative to all of the other sources of input and output uncertainty, since it is (relatively speaking) one of the most straightforward to minimize. Toward this end, average values of $Q_i$ for all possible choices of averaging window, parameterized in terms of starting time $t_{\mathrm{start}}$ and averaging window size $t_{\mathrm{avg}}$, are shown in Fig. 9(b). So long as the initial transient linear phase ($t \lesssim 150\,a/c_s$) is not included, and a sufficiently long averaging window ($t_{\mathrm{avg}} > 100\,a/c_s$) is utilized, the exact values of the time-averaged ion energy flux are insensitive to the choices made.
Although Fig. 9(b) suggests that the time-averaging uncertainty in the mean value of $Q_i$ is “small” for appropriately chosen averaging windows, a more rigorous and impartial means of quantifying this uncertainty is needed. To date, there has not been a widely adopted common approach to this question within the MFE turbulence community. One possible approach is illustrated by the series of thin horizontal lines plotted in Fig. 9(a), which mirrors standard experimental signal processing techniques. In this approach, the full averaging window is first decomposed into a series of sequential subwindows, each of which is chosen to be long enough to average over fast variations, such that the mean values of $Q_i$ calculated for each subwindow can be taken to
form an ensemble of uncorrelated, independent estimates of
the “true” mean value of Qi. The uncertainty in this “true”
mean value is then taken to be the standard deviation of the mean $\sigma_M = \sqrt{\sum_{j=1}^{N} (Q_{i,j} - \bar{Q}_i)^2 / N^2}$ (where $Q_{i,j}$ is the mean value of $Q_i$ in the $j$-th subwindow, and $\bar{Q}_i$ the ensemble mean value), calculated from the ensemble of subwindow means. For the particular $Q_i$ timeseries shown in Fig. 9(a), applying this approach to a full averaging window from $200\,a/c_s$ to $600\,a/c_s$ with $50\,a/c_s$ subwindows yields a mean $Q_i$ of 1.26 W/cm$^2$ with $\sigma_M = 0.03$ W/cm$^2$. While there have been
some efforts to formalize this approach,127 there remains a
significant opportunity and need to develop more rigorous
algorithms for identifying appropriate time-averaging win-
dows and their corresponding uncertainties. Future studies in
this direction should look to draw upon the experiences and
expertise of other communities which routinely utilize initial
value nonlinear fluid turbulence simulations.
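The subwindowing estimate described above can be sketched as follows. This is an illustrative sketch only: the function name and the synthetic trace are hypothetical stand-ins (not simulation output), and the uncertainty uses the $\sigma_M$ normalization quoted in the text, with the sum of squared deviations divided by $N^2$.

```python
import math


def subwindow_mean_and_uncertainty(series, window):
    """Split a time series into sequential subwindows and return
    (ensemble mean, sigma_M, list of subwindow means)."""
    n_win = len(series) // window  # a trailing partial window is dropped
    if n_win < 2:
        raise ValueError("need at least two full subwindows")
    means = [sum(series[k * window:(k + 1) * window]) / window
             for k in range(n_win)]
    ens_mean = sum(means) / n_win
    # sigma_M = sqrt( sum_j (Q_j - Qbar)^2 / N^2 ), as quoted in the text
    sigma_m = math.sqrt(sum((m - ens_mean) ** 2 for m in means)) / n_win
    return ens_mean, sigma_m, means


# Synthetic stand-in for a saturated flux trace sampled every 1 a/c_s
# between t = 200 and t = 600 (hypothetical values).
trace = [1.26 + 0.3 * math.sin(0.37 * t) + 0.1 * math.sin(0.011 * t)
         for t in range(200, 600)]
q_mean, sigma_m, sub_means = subwindow_mean_and_uncertainty(trace, window=50)
print(f"<Q_i> = {q_mean:.3f} +/- {sigma_m:.3f} from {len(sub_means)} subwindows")
```

Repeating the call for several subwindow lengths is one simple way to check that the quoted uncertainty does not depend strongly on that choice, in the same spirit as the window-insensitivity check of Fig. 9(b).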
In applying this subwindowing technique, one should
note that it assumes that the simulation saturates about a con-
stant mean value that is large relative to amplitude of the fluc-
tuations about that mean level. Such a result is often obtained
for cases when the plasma is robustly unstable. However, for
cases near marginal stability ($R/L_{Ti} \sim R/L_{Ti,\mathrm{crit}}$), gyrokinetic
simulation outputs are often observed to exhibit slow secular
dynamics and significantly skewed fluctuations about mean
values. For such cases, there is no commonly accepted meth-
odology in the MFE community known to the author of cal-
culating a well-justified mean value (in terms of choosing an
averaging window) or associated uncertainty. Given the inter-
est in improving predictions of ITER and reactor plasmas
which are expected to lie in such a near-marginal regime over much of the plasma volume, it is clear that more work is needed to define appropriate analysis and uncertainty quantification methods for such cases.
FIG. 9. (a) Time trace of $Q_i$ from gyrokinetic simulation of a DIII-D discharge. The thick line indicates the average value over the window from $200\,a/c_s$ to $600\,a/c_s$, and the thin lines (red line) indicate mean values of sequential $50\,a/c_s$ subaveraging windows. Adapted from Phys. Plasmas 18, 056113 (2011). Copyright 2011 AIP Publishing LLC. (b) Contour plot of mean $Q_i$ values averaged over the window from $t_{\mathrm{start}}$ to $t_{\mathrm{start}} + t_{\mathrm{avg}}$, for arbitrary values of $t_{\mathrm{start}}$ and $t_{\mathrm{avg}}$. The average value is seen to be insensitive to the choice of specific averaging window when $t_{\mathrm{start}} > 150\,a/c_s$ and $t_{\mathrm{avg}} > 100\,a/c_s$, indicated by dashed lines.
Beyond these finite time-averaging uncertainties, there will be additional uncertainties in the model outputs due to uncertainties in any model inputs other than the control
variable under consideration. In the context of the discussion thus far, these would be uncertainties in any model inputs other than $R/L_{Ti}$. In particular, uncertainties in other key instability-drive gradients such as the electron temperature ($R/L_{Te}$) and density ($R/L_n$), as well as the equilibrium $\vec{E}\times\vec{B}$ shearing rate $\gamma_{E\times B}$ (which in general suppresses the turbulence128,129) and key dimensionless parameters such as magnetic safety factor $q$, magnetic shear $s = (x/q)\,dq/dx$, and impurity fraction (sometimes expressed in terms of $Z_{\mathrm{eff}}$) can yield significant uncertainties in model outputs. These uncertainties exist for both nonlinear initial value turbulence simulations as well as purely deterministic reduced turbulent transport models. Figure 10 shows values of $Q_i$ predicted by GYRO simulations of an ITG-dominant DIII-D L-mode discharge for the nominal “base case” parameters, along with ±25% variations in $R/L_{Ti}$, $R/L_{Te}$, $R/L_n$, and $\gamma_{E\times B}$, all of which are within experimental uncertainties. One can see that the model exhibits a nonlinear response to changes in these parameters as well as in $R/L_{Ti}$ (i.e., a ±25% variation in any input does not necessarily yield a uniform ±X% change in $Q_i$), which significantly complicates formulation of a simple statistical uncertainty estimate. Moreover, each such variation requires its own execution of the model, which quickly becomes prohibitively expensive on currently available computing platforms for nonlinear gyrokinetic simulations. In addition, “cross-terms” and couplings between different parameters are possible (due to, e.g., a mix of strong ITG and TEM instabilities being simultaneously present in the simulation) that may not be well captured by varying each input individually. Given these challenges, there is currently no widely used practical and robust model for estimating such uncertainties for gyrokinetic simulations, and to the author’s knowledge no significant exploration of potential methods for use with computationally cheap reduced models has been undertaken. As such, it represents one of the ripest areas for more research and collaboration between the MFE community and broader computer science, applied math, and UQ communities. Whether next-generation exascale computing platforms can be profitably engaged to productively address this challenge remains to be seen.
Finally, we note that one could also potentially classify numerical errors due to effects such as finite grid resolution, boundary conditions, domain size, source and sink terms, or the accuracy of the time-integration scheme as a third source of contributions to model output uncertainties. However, since these are more properly viewed as (hopefully known and minimized when feasible) systematic errors rather than uncertainties, we do not consider them further. In cases where sufficient computational resources are not available to minimize these errors, it is important to include them in the formulation of the validation metrics.
D. Example: Quantifying local gyrokinetic performance in DIII-D L-mode discharges
In order to illustrate the practical application of the local sensitivity plot analysis combined with uncertainty quantification to a “real-world” validation problem, we consider in this section an assessment of gyrokinetic predictions of ion and electron energy fluxes at different radii in a set of seven DIII-D neutral beam heated L-mode discharges. Experimental details of these discharges can be found in Refs. 111, 114, 130, and 131, and key global experimental parameters are summarized in Table III. Each of these discharges was performed as part of a coordinated transport model validation effort by the DIII-D experimental team, with particular attention paid to obtaining comprehensive, well-converged profile and fluctuation measurements in repeatable conditions.
For each case, the experimental data are averaged over at least 200 ms during which the plasma is slowly evolving, and uncertainties in experimental profiles, gradients, and power balance calculations are estimated by ensemble calculations generated from subdivision of the full averaging window into 20 ms subwindows. Uncertainties in $R/L_{Ti}$ are calculated using $\delta^{\mathrm{RMS}}_{L_{Ti}}$, defined above in Eq. (7). The power balance analysis was performed with the ONETWO code,112 using the Callen analytic model132 to calculate neutral beam sources and TORAY-GA code133 to calculate electron cyclotron heating sources. The gyrokinetic simulations are performed using the GYRO code,54 and all simulations were averaged over more than $250\,a/c_s$.
However, we utilize a constant 10% fractional uncertainty for
all GYRO flux predictions as a conservative estimate of the fi-
nite time-averaging statistical uncertainty. Since we have no
tractable way of fully quantifying uncertainties in the GYRO
outputs due to uncertainties in inputs other than $R/L_{Ti}$, we
leave these uncertainties as unspecified “known unknowns.”
FIG. 10. Illustration of sensitivity of GYRO predictions of $Q_i$ to ±25% changes in $R/L_{Ti}$, $R/L_{Te}$, $R/L_n$, and $\gamma_{E\times B}$.
All simulations were performed with resolutions and algorithms similar to those reported in Refs. 111, 114, 130, and 131, but using a common version of the GYRO source code.134 Time integration was performed with a 4th-order Runge–Kutta scheme that treats fast parallel electron dynamics implicitly and other terms explicitly. The integration timestep $h$ was less than or equal to $0.01\,a/c_s$ in all cases, such that estimates of the numerical integration error are less than 0.1%. Each simulation used a standard 128-point velocity space discretization (eight energy points, eight pitch angles, and two signs of parallel velocity), and physical simulation domains of approximately $100\,\rho_s$ across in both the radial and binormal dimensions, where $\rho_s = c_s/\Omega_{ci}$, with
TABLE III. Global parameters for DIII-D transport model validation discharges. The plotting symbol column indicates the plotting symbol to be used for that discharge in Figs. 11–14 and 19–21.
Discharge  Avg. window  B_T   I_p   n̄_e           P_NBI  P_ECH  Plotting symbol               Reference
number     (ms)         (T)   (MA)  (10^19 m^-3)  (MW)   (MW)
128913     1400–1600    2.05  1.05  2.3           2.6    0      filled circle                 114
136674     1300–1500    2.05  1.15  3.2           2.6    0      brown triangle                130
136693     1300–1500    2.05  0.7   4.1           5.2    0      red inverted triangle         130
138038     1400–1650    2.05  1.0   2.3           2.6    2.2    blue left pointing triangle   111
138040     1400–1650    2.05  1.0   2.3           2.6    0      blue right pointing triangle  111
142351     1400–1600    2.1   0.98  2.3           2.6    0      green rhombus                 131
142371     1800–2000    2.1   0.98  2.3           2.6    3.2    green square                  131
additional $10\,\rho_s$ wide buffer regions at either end of the radial domain. The radial grid resolution was approximately $0.5\,\rho_s$ for simulations at $\rho_{\mathrm{tor}} \simeq 0.25$ and 0.5, and $0.3\,\rho_s$ for those simulations at $\rho_{\mathrm{tor}} \simeq 0.75$, and all cases use 16 toroidal modes with separation $\Delta n$ chosen such that binormal wavenumbers span the range $0 \le k_y\rho_s \lesssim 1$, where $k_y = nq/r_{\mathrm{min}}$. Since these simulations consider only long-wavelength $\rho_s$-scale dynamics for which $k_y\rho_e = k_y\rho_s/60 \ll 1$, the electrons are treated with a simpler drift-kinetic model (which assumes $k_\perp\rho_e = 0$) rather than a full gyrokinetic description. The simulations include finite perpendicular (but not parallel) magnetic fluctuations and two dynamic ion species (deuterium and carbon), and use a generalized Miller representation135 to describe shaped geometry.
In Fig. 11(a), a comparison of the GYRO prediction of the ion energy flux $Q_i$ with the ONETWO power balance calculation at $\rho_{\mathrm{tor}} = 0.75$ in the most well-studied of the discharges considered (as seen in Refs. 113, 114, 131, and 136–139) is plotted as a function of $R/L_{Ti}$ including all of the uncertainties described above. One can clearly see that the difference between the GYRO prediction of $Q_i$ and corresponding ONETWO calculation at the nominal experimental gradient, or alternatively the difference of the predicted flux-matching gradient (i.e., the value of $R/L_{Ti}$ for which $Q_i^{\mathrm{GYRO}} = Q_i^{\mathrm{ONETWO}}$) and the experimental gradient, clearly lie outside the net model and experimental uncertainties. The question then arises as to whether this result is unique to this particular location and discharge, or robust across multiple radii and plasmas. To answer this question, we begin by recasting the local sensitivity plot shown in Fig. 11(a) into a comparison of fractional differences and uncertainties, shown in Fig. 11(b). Thus, instead of plotting the power balance and gyrokinetic flux predictions in W/cm$^2$ as a function of $R/L_{Ti}$, we plot the fractional difference of the ion energy fluxes $\Delta Q_i = (Q_i^{\mathrm{GYRO}} - Q_i^{\mathrm{ONETWO}})/Q_i^{\mathrm{ONETWO}}$ as a function of the fractional change in input value of $dT_i/dx = \nabla T_i$, $\Delta L_{Ti} = (\nabla T_i^{\mathrm{GYRO}} - \nabla T_i^{\mathrm{expt}})/\nabla T_i^{\mathrm{expt}}$. The uncertainty in $\Delta L_{Ti}$ is simply the fractional uncertainty $\delta^{\mathrm{RMS}}_{L_{Ti}}$ in the experimental measurement of $R/L_{Ti}$, and the uncertainty in $\Delta Q_i$ is calculated as $\sqrt{(\sigma_i^{\mathrm{ONETWO}}/Q_i^{\mathrm{ONETWO}})^2 + (\sigma_i^{\mathrm{GYRO}}/Q_i^{\mathrm{GYRO}})^2}$ since the propagation of profile uncertainties through ONETWO represented by $\sigma_i^{\mathrm{ONETWO}}$ is completely independent of the time-averaging uncertainty $\sigma_i^{\mathrm{GYRO}}$. Recasting Fig. 11(a) in this form is analogous to the final step in the validation metric progression of Oberkampf et al. shown in Fig. 1(f), and also yields a simple physical interpretation. If the curve of $\Delta Q_i$ vs. $\Delta L_{Ti}$ passes within its uncertainties through the origin (highlighted by the bold star symbol in Fig. 11(b)), then one can legitimately claim that the model prediction is consistent with the experimental observations, given the known uncertainties. Conversely, if the distance between the $\Delta Q_i$ curve and the origin is always larger than the uncertainties, one has demonstrated a statistically significant difference between the model prediction and experimental measurements.
FIG. 11. Scaling of GYRO predictions (filled circles) of $Q_i$ vs $R/L_{Ti}$ at $\rho_{\mathrm{tor}} = 0.75$ in DIII-D discharge 128913, in both (a) physical units and (b) fractional differences. The broken lines indicate where $\Delta Q_i$ and $\Delta L_{Ti} = 0$ and the joint experimental-power balance results are plotted as (red asterisk).
Recasting the local sensitivity plot in terms of fractional differences has a second practical benefit, which is that it
facilitates direct comparisons of multiple conditions (e.g., different radii and/or discharges) that have different experimental and simulation values. This utility is shown in Fig. 12, which plots the fractional ion and electron energy flux differences for all seven discharges listed in Table III at $\rho_{\mathrm{tor}} = 0.25$, 0.5, and 0.75. From these plots, we can immediately draw two conclusions. First, we see that for all seven discharges there is a systematic underprediction of $Q_i$ and $Q_e$ at $\rho_{\mathrm{tor}} = 0.75$ (or equivalently an overprediction of the flux-matching gradient) which is larger than the experimental and model uncertainties. Second, for all seven discharges, the model predictions are significantly more consistent with the power balance ion and electron energy fluxes at both $\rho_{\mathrm{tor}} = 0.25$ and 0.5. We can therefore conclude that this particular model (ion-scale microturbulence predicted with the GYRO code) cannot simultaneously match the experimental gradients and power balance fluxes (within uncertainties) in NBI-heated DIII-D L-mode discharges at $\rho_{\mathrm{tor}} = 0.75$, but can (at least in some cases) at $\rho_{\mathrm{tor}} = 0.25$ and 0.5. To make these conclusions more quantitative requires formulation of an explicit validation metric, which is the subject of Sec. IV E.
There are other possible choices for normalization of the simulation inputs and outputs beyond the experimental or power balance calculations, and the most useful choice will be case-dependent. For example, while the choice of normalizing quantities to the power balance and experimental levels works well here for $Q_i$ and $R/L_{Ti}$, such a choice is not appropriate or possible when the experimental level is very small relative to the expected range of simulation inputs or outputs. A typical example here would be predictions of momentum transport in a case for which there was no meaningful auxiliary torque source $T_{\mathrm{inj}} = \sum_j T_j$. In this case, a power balance analysis would predict $\Pi_{\mathrm{PB}} \propto \int dV\, T_{\mathrm{inj}} \simeq 0$, and experimentally one often observes small rotation gradients in these plasmas, particularly relative to their uncertainty. In such cases, one might choose to normalize quantities relative to theoretical scalings (e.g., gyroBohm scaling $\Pi_{\mathrm{gB}} = n_i T_i v_{ti} \rho_*^2$) or some combination of experimental and model uncertainties.
E. Using flux-matching gradients to construct validation metrics
While the normalized local sensitivity plots shown in Fig. 12 provide a useful means of illustrating a model’s fidelity for a single experimental condition, they rapidly become cluttered when multiple conditions are plotted. In particular, while one can clearly see in Fig. 12(c) that none of the simulations at $\rho_{\mathrm{tor}} = 0.75$ simultaneously match the fluxes and $R/L_{Ti}$, there is enough scatter in the results at $\rho_{\mathrm{tor}} = 0.25$ (Fig. 12(a)) and 0.5 (Fig. 12(b)) that it is not easy to determine how robust the model (dis)agreements are at those radii. Moreover, in order to better quantify how a model’s performance varies with radius, or as a function of global parameters such as the auxiliary heating mix, plasma current, density, etc., it is desirable to condense the local sensitivity plot to a more compact and quantitative measure of model fidelity. Since the goal of the turbulent transport models discussed here is the prediction of the equilibrium kinetic profiles and gradients, the natural reduction of the local sensitivity plot is the difference between the measured experimental gradient and predicted flux-matching gradient, i.e., the gradient for which the model prediction of the associated flux matches the power balance result. In the spirit of utilizing normalizations that allow comparisons across multiple experimental conditions, we define a flux-matching fractional gradient error metric
$$E_{L_{Ti}} = \Delta L_{Ti}\big|_{\Delta Q_i = 0} \qquad (8)$$
with an associated uncertainty
$$\sigma_{E_{L_{Ti}}} = \sqrt{\sigma^2_{\Delta L_{Ti}} + \sigma^2_{\Delta Q_i}}\,\Big|_{\Delta Q_i = 0}. \qquad (9)$$
FIG. 12. Plots of (a)–(c) $\Delta Q_i$ and (d)–(f) $\Delta Q_e$ vs. $\Delta L_{Ti}$ at (a) and (d) $\rho_{\mathrm{tor}} = 0.25$, (b) and (e) 0.5, and (c) and (f) 0.75 for all seven discharges listed in Table III. The broken lines indicate where $\Delta Q_i$ and $\Delta L_{Ti} = 0$ and the origin is indicated by (red asterisk).
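The flux-matching gradient error of Eqs. (8) and (9) can be sketched for a discrete gradient scan as below. This is a minimal illustration under stated assumptions rather than the procedure used for the results in this paper: the zero crossing of $\Delta Q_i$ is located by linear interpolation between scan points, the flux uncertainty is interpolated the same way, and the function name and scan values are hypothetical.

```python
import math


def flux_matching_gradient_error(delta_l, delta_q, sigma_dq, sigma_dl):
    """Locate Delta_Q = 0 along a gradient scan by linear interpolation.

    delta_l:  fractional gradient changes of the scan points (increasing).
    delta_q:  fractional flux differences at those points.
    sigma_dq: uncertainties in delta_q at those points.
    sigma_dl: fractional uncertainty in the measured gradient (a scalar).
    Returns (E_L, sigma_E) following Eqs. (8) and (9).
    """
    for k in range(len(delta_l) - 1):
        q0, q1 = delta_q[k], delta_q[k + 1]
        if q0 * q1 <= 0.0 and q0 != q1:
            w = q0 / (q0 - q1)  # fractional position of the zero crossing
            e_l = delta_l[k] + w * (delta_l[k + 1] - delta_l[k])
            s_q = sigma_dq[k] + w * (sigma_dq[k + 1] - sigma_dq[k])
            return e_l, math.hypot(sigma_dl, s_q)
    raise ValueError("scan does not bracket Delta_Q = 0")


# Hypothetical four-point scan (illustrative numbers only).
scan_dl = [-0.2, 0.0, 0.2, 0.4]
scan_dq = [-0.9, -0.6, -0.1, 0.4]
scan_sq = [0.1, 0.1, 0.1, 0.1]
e_l, sigma_e = flux_matching_gradient_error(scan_dl, scan_dq, scan_sq, sigma_dl=0.43)
consistent = abs(e_l) <= sigma_e  # agreement within the combined uncertainty
print(f"E_LTi = {e_l:.2f} +/- {sigma_e:.2f}, consistent: {consistent}")
```

A scan that never brackets $\Delta Q_i = 0$ raises an error instead of extrapolating, since an extrapolated flux-matching gradient would carry an uncertainty that this simple estimate does not capture.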
Although the choice of requiring a match between the turbu-

lent and power balance ion energy fluxes is natural (since Qi
is the relevant flux in the ion temperature transport equation
(Eq. (2)), and we are considering ion-scale simulations of
ITG turbulence), it is only one several possible choices.
Equally viable would be the electron energy flux, total energy
flux, particle or momentum flux, or even fluctuation ampli-
tudes or other characteristics (discussed further in Sec. V),
depending upon the specific goals of the validation exercise.
Once the fractional gradient error is defined, one can define error metrics for any other
comparison quantity such as $Q_e$ based upon the difference between the power balance or
experimental measurements and the model predictions of that quantity at the flux-matching
gradient. Thus we would define a fractional error metric for $Q_e$ as

$$E_{Q_e} = \Delta_{Q_e}\big|_{\Delta_{Q_i}=0}. \tag{10}$$

For this metric, we define the error only as

$$\sigma_{E_{Q_e}} = \sigma_{\Delta_{Q_e}}\big|_{\Delta_{Q_i}=0}, \tag{11}$$

i.e., the uncertainty in $\Delta_{L_{Ti}}$ is not included.
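
As an illustration of how Eqs. (8)–(11) might be evaluated in practice, the following Python
sketch locates the flux-matching point of a gradient scan by linear interpolation between scan
points. The function name, the argument names, the choice of linear interpolation, and the
treatment of each uncertainty as a single representative value are assumptions made for this
example; the sample numbers are invented and do not come from the discharges discussed here.

```python
import math


def flux_matching_metrics(d_lti, d_qi, d_qe, sigma_d_lti, sigma_d_qi, sigma_d_qe):
    """Return (E_LTi, sigma_E_LTi, E_Qe, sigma_E_Qe) for one gradient scan.

    d_lti, d_qi, d_qe: fractional differences at each scan point, ordered
    along the scan (hypothetical inputs for this sketch).
    sigma_*: one representative uncertainty per quantity (an assumption).
    """
    for j in range(len(d_lti) - 1):
        q0, q1 = d_qi[j], d_qi[j + 1]
        if q0 * q1 <= 0.0 and q0 != q1:
            # Fraction of the way from point j to j + 1 at which Delta_Qi = 0.
            t = q0 / (q0 - q1)
            e_lti = d_lti[j] + t * (d_lti[j + 1] - d_lti[j])    # Eq. (8)
            e_qe = d_qe[j] + t * (d_qe[j + 1] - d_qe[j])        # Eq. (10)
            sigma_e_lti = math.hypot(sigma_d_lti, sigma_d_qi)   # Eq. (9)
            sigma_e_qe = sigma_d_qe                             # Eq. (11)
            return e_lti, sigma_e_lti, e_qe, sigma_e_qe
    raise ValueError("scan does not bracket the flux-matching point")


# Hypothetical three-point scan in R/L_Ti (values invented for illustration).
result = flux_matching_metrics(
    d_lti=[-0.2, 0.0, 0.2],
    d_qi=[-0.5, -0.1, 0.3],
    d_qe=[-0.6, -0.3, 0.0],
    sigma_d_lti=0.03, sigma_d_qi=0.04, sigma_d_qe=0.1,
)
print(result)
```

A scan that never crosses $\Delta_{Q_i} = 0$ raises an error rather than extrapolating, since an
extrapolated flux-matching gradient would carry an uncertainty that this simple sketch does not
attempt to quantify.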

The relationship between $E_{L_{Ti}}$ and $E_{Q_e}$ is illustrated in Fig. 13, and the results of
calculating these metrics for the simulation data shown in Fig. 12 are plotted in Fig. 14. In
addition to the individual error metrics, the uncertainty-weighted average over all seven
discharges (i.e., using weights $W_i = 1/\sigma_i^2$) is also plotted vs. radius for both
$E_{L_{Ti}}$ and $E_{Q_e}$ in Fig. 14. Confirming the visual impressions of model fidelity from
Fig. 12, we see that at both $\rho_{\mathrm{tor}} = 0.25$ and 0.5 both the mean $E_{L_{Ti}}$ and almost
all individual cases have values of $E_{L_{Ti}}$ smaller than the associated uncertainty, while
every case at $\rho_{\mathrm{tor}} = 0.75$ predicts a positive value of $E_{L_{Ti}}$ (i.e., overpredicts
the local value of $R/L_{Ti}$) that is larger than the associated uncertainty.

FIG. 14. (a) $E_{L_{Ti}}$ and (b) $E_{Q_e}$ vs $\rho_{\mathrm{tor}}$ for all seven discharges listed in
Table III. The uncertainty-weighted ensemble mean values at each radius are plotted as pink
asterisks.
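
The uncertainty-weighted ensemble average with weights $W_i = 1/\sigma_i^2$ can be sketched in a
few lines of Python. The function name is hypothetical, and the uncertainty assigned to the mean,
$1/\sqrt{\sum_i W_i}$, is the standard inverse-variance result for independent errors; it is an
assumption of this example, since the text does not state how the ensemble uncertainty was
computed.

```python
import math


def weighted_ensemble_mean(values, sigmas):
    """Inverse-variance weighted mean of per-discharge metric values."""
    weights = [1.0 / s ** 2 for s in sigmas]          # W_i = 1 / sigma_i^2
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    # Assumed uncertainty of the mean for independent errors.
    sigma_mean = 1.0 / math.sqrt(total)
    return mean, sigma_mean


# Invented metric values and uncertainties for two hypothetical discharges.
print(weighted_ensemble_mean([0.0, 10.0], [1.0, 3.0]))
```

Because the weights scale as $1/\sigma_i^2$, a single well-constrained case can dominate the
ensemble mean, which is worth keeping in mind when individual outliers are present.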
Even more interesting are the results for $E_{Q_e}$, which exhibit a fair amount of scatter. If we
consider only the ensemble-averaged value at each radius, we see that at
$\rho_{\mathrm{tor}} = 0.25$ and 0.75, the mean value of $E_{Q_e}$ is slightly negative but with an
absolute value larger than the associated uncertainty, corresponding to a statistically robust
residual underprediction of $Q_e$ even when $Q_i$ has been matched. On the other hand, $Q_e$ is on
average overpredicted at $\rho_{\mathrm{tor}} = 0.5$. However, in all cases we see clear individual
outliers with significantly different values than the ensemble mean. The question here naturally
arises as to whether these discrepancies lie within the full range of variability and
undetermined uncertainty in model predictions related to uncertainties in inputs other than
$R/L_{Ti}$. The natural quantity to focus on for $Q_e$ would be $R/L_{Te}$, in order to determine how
much uncertainty there is in the level of $Q_e$ driven by the ITG modes themselves, as well as any
other ion-scale modes such as TEMs which might be present but subdominant. To address this
question, one would extend the validation methodology discussed so far to calculation of a local
sensitivity map in
which both $R/L_{Ti}$ and $R/L_{Te}$ are varied, from which a flux-matching fractional gradient error
vector $\vec{E}_z = (E_{L_{Ti}}, E_{L_{Te}})$ could be calculated by determining the simultaneous
values of $R/L_{Ti}$ and $R/L_{Te}$ which when input into the microturbulence model yield predictions
of $Q_i$ and $Q_e$ that simultaneously match the power balance $Q_i$ and $Q_e$ results. While the
computational resources needed to perform such an analysis using long-wavelength gyrokinetic
simulations (to say nothing of multiscale simulations which incorporate electron-scale ETG
dynamics that likely contribute to $Q_e$ in many cases65–68,140–146) over many conditions or
discharges remain prohibitive for current-day computing platforms, such approaches will likely be
feasible on next-generation exascale platforms. Moreover, such an approach, or even further
generalizations to include matching of particle and momentum fluxes via additional variations of
density and rotation gradients, is readily feasible now for most reduced turbulent transport
models with fairly modest computing resources, and should be pursued further.

FIG. 13. (a) $\Delta_{Q_i}$ and (b) $\Delta_{Q_e}$ vs. $\Delta_{L_{Ti}}$ for $\rho_{\mathrm{tor}} = 0.75$ in
discharge 128913, illustrating the connection between $E_{L_{Ti}}$, $E_{Q_e}$, $\Delta_{Q_i}$,
$\Delta_{Q_e}$, and $\Delta_{L_{Ti}}$. The broken lines indicate where $\Delta_{Q_{i,e}} = 0$ and
$\Delta_{L_{Ti}} = 0$, and the origin is indicated by a red asterisk.

F. Alternative metric formulations
While the structure of the $E_{L_{Ti}}$ and $E_{Q_e}$ metrics proposed above has a clear physical
interpretation and is consistent with some recommendations in the literature,10,15 other choices
are possible and have been pursued. In particular, normalizing the model–experiment differences
in terms of uncertainties rather than mean experimental values is an equally viable choice, e.g.,

$$E^{\mathrm{alt}}_{L_{Ti}} = \left[\frac{\nabla T_i^{\mathrm{model}} - \nabla T_i^{\mathrm{expt}}}{\sigma_{\nabla T_i^{\mathrm{expt}}}}\right]_{\Delta_{Q_i}=0}. \tag{12}$$

The primary advantage of this approach is a further compactification of the metric, from
(value)$\pm$(uncertainty) to simply (value), which facilitates a clear means of assessing whether
the model and experiment are consistent within uncertainties (i.e., is the metric greater or
smaller than some order-unity threshold value147,148). On the other hand, by using such a
formulation one cannot discriminate between cases that achieve small metric values (“good
agreement”) due to small differences between the model and experimental predictions or large
uncertainties in the experimental and/or simulation data.
More complex metrics have also been proposed and utilized in a variety of studies. For instance,
drawing upon a widely used climate validation metric,149 Terry et al.14 assess the fidelity of ITG
correlation lengths predicted by different models using the data published in Rhodes et al.150
These approaches quantify agreement between model and experiment using the correlation
coefficient between the predicted ($x$) and measured ($y$) values of a particular observable
obtained at $N$ distinct points

$$R = \frac{\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \sigma_y} \tag{13}$$

and RMS deviation

$$E = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[\frac{x_i - \bar{x}}{\bar{x}} - \frac{y_i - \bar{y}}{\bar{y}}\right]^2}. \tag{14}$$

Here $\bar{x}$ and $\bar{y}$ are the mean values of $x$ and $y$, and $\sigma_x$ and $\sigma_y$ their
standard deviations. As another example, Ricci et al.147,148 define an uncertainty-normalized
distance metric

$$d = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\frac{(x_i - y_i)^2}{\Delta x_i^2 + \Delta y_i^2}} \tag{15}$$

for validation studies of the GBS Braginskii fluid code151 using data from the TORPEX
experiment.152 A bounded error metric of the form

$$R = \frac{1 + \tanh[(d - d_0)/\lambda]}{2} \tag{16}$$

for each observable is then defined, such that $R = 0$ denotes perfect agreement and 1 complete
disagreement. The quantities $d_0$ and $\lambda$ are free parameters chosen to quantify the threshold
level for agreement ($d_0$) and sharpness of transition from agreement to disagreement
($\lambda$), with $d_0 \simeq 1.4$ corresponding to the case of the distance between simulation and
experiment being comparable with their uncertainties. For their studies, they found that the
conclusions drawn were fairly insensitive to the specific choice of $d_0$ and $\lambda$, so long as
they were in the ranges of $1 < d_0 < 2$ and $0.1 < \lambda < 1$. Whether this property would hold
for other studies remains to be seen. One particular advantage of this bounded metric formulation
is that it lends itself well to incorporation into composite metrics,14 which are discussed in
Sec. VI.

V. VALIDATION METRICS FOR PREDICTIONS OF TURBULENT FLUCTUATIONS
In order to fully validate a turbulent transport model, one must assess the fidelity of the
predicted turbulent fluctuations themselves as well as the cross-field fluxes. Comparisons of
predicted and measured fluctuations serve two specific, complementary purposes. First and
foremost, such comparisons provide a more stringent means of testing our understanding of the
fundamental underlying physics of plasma transport, which is needed for confident extrapolation
to future regimes. More specifically, by assessing the ability of a given model to accurately
predict a wide variety of measured fluctuation characteristics (such as amplitudes, spectra,
correlation lengths, etc.) and their scalings with plasma parameters in conjunction with the
cross-field fluxes, we can determine whether or not the models are predicting (in)correct
flux-gradient relationships because they have (in)correct models of the underlying turbulence
dynamics. In doing so, these fluctuation comparisons also provide a means of addressing the
myriad potential systematic uncertainties of the power balance analyses discussed in Sec. IV A.

A. Using synthetic diagnostics to enable quantitative code-experiment comparisons
In order to carry out these comparisons of predicted and measured plasma fluctuation
characteristics, one must invariably use synthetic diagnostics as part of the comparison.
Synthetic diagnostics are computational algorithms used to transform the quantities output by a
simulation into
experimentally measured quantities to enable meaningful quantitative comparisons.153 While there
is a wide array of diagnostic techniques used to measure plasma fluctuations, in virtually every
case the quantity measured by the diagnostic differs in some way from the “native” variables of
the various turbulence models. For instance, a synthetic Langmuir probe109 would translate the
electron density, temperature, and plasma potential fluctuations predicted by a simulation into
the measured ion saturation current and floating potential fluctuations, while a synthetic gas
puff imaging diagnostic154 would calculate predicted light emission fluctuations based upon the
underlying density and temperature fluctuations and plasma ion and neutral species. In many
cases, the synthetic diagnostic algorithm itself can be a sophisticated computational model
requiring its own verification and validation, possibly even including the use of dedicated
experimental devices.
For the family of discharges considered in Sec. IV D, both beam emission spectroscopy (BES)155 and
correlation electron cyclotron emission (CECE) radiometry113 measurements at multiple locations
were obtained as part of the experiments. Obtaining measurements from these diagnostics was
prioritized in the design of the experiments because they provide spatially localized
measurements of the long-wavelength $\rho_i$-scale fluctuations described by the GYRO simulations
utilized in Sec. IV D. Both diagnostics work by integrating plasma radiation emitted from small
but finite spatial volumes, the intensity of which is then related back to instantaneous local
density or electron temperature values. To model each diagnostic, a point-spread function (PSF)
is convolved with the relevant simulation outputs to account for the finite integration volume of
the diagnostic, as described in Ref. 114. Denoting the diagnostic-specific structure of the PSF
as $w_{PSF}(R, Z)$, synthetic electron density or temperature time traces are generated as

$$\delta X_{\mathrm{syn}} = \frac{\iint dR\, dZ\, w_{PSF}\, \delta X_{\mathrm{GYRO}}}{\iint dR\, dZ\, w_{PSF}}, \tag{17}$$

where $\delta X$ can refer to either the normalized electron density fluctuation
$\delta n_e = \tilde{n}_e/n_{e0}$ or temperature fluctuation $\delta T_e = \tilde{T}_e/T_{e0}$. Typical
PSFs for both BES and CECE are visualized in Fig. 15. For the BES system, calculation of the PSF
is made by a separate code developed by the BES diagnostic group;156 for the CECE system the PSF
is a Gaussian in both $R$ and $Z$ with widths provided by the CECE diagnostic group. In both cases
the toroidal extent of the experimental integration volume is significantly smaller than the
typical toroidal correlation lengths of the turbulence, which is approximated in the synthetic
diagnostic as a perfect toroidal localization (i.e., $w_{PSF}$ does not depend upon toroidal
angle). In Fig. 16, typical lab-frame time traces and frequency spectra of the synthetic
fluctuations are plotted against those from corresponding unfiltered signals which are simply
recorded at the nominal diagnostic channel location (i.e.,
$\delta X_{\mathrm{unfiltered}}(R, Z, t) = \delta X_{\mathrm{GYRO}}(R, Z, t)$). One can immediately see
that, as would be expected for PSFs with spatial dimensions comparable to the turbulent eddy
size, there is significant attenuation of fluctuation power at all frequencies in the synthetic
spectra, relative to the unfiltered case. However, the exact frequency dependence of this
attenuation depends upon the specific shape of the PSF. For instance, the synthetic CECE spectrum
shown in Fig. 16(d) is more heavily attenuated at higher frequencies. This particular dependence
arises from the poloidally extended but radially narrow PSF of the CECE diagnostic (Fig. 15(b)),
which preferentially attenuates higher poloidal wavenumbers and thus higher frequencies due to
the strong Doppler shift driven by the finite rotation of the plasma. In contrast, the BES PSF is
more widely elongated radially, which leads to non-negligible attenuation of higher radial
wavenumbers for all poloidal wavenumbers and frequencies (Fig. 16(b)).

FIG. 15. Visualization of 50% contours of (a) BES and (b) CECE PSF functions overlaid on
corresponding $\delta n_e$ and $\delta T_e$ fluctuations from a GYRO simulation. Nominal viewing
locations of individual BES and CECE channels are shown as markers. Reprinted with permission
from J. Phys: Conf. Ser. 125, 012043 (2008). Copyright 2008 IOP Publishing.157

FIG. 16. Time-traces of unfiltered (solid line) and synthetic (red line) (a) BES and (c) CECE
signals. Corresponding (b) BES and (d) CECE frequency spectra illustrate the attenuation of each
synthetic signal relative to the unfiltered case, with a frequency dependence arising from the
interplay of PSF shape and Doppler shifts. (b) and (d) adapted with permission from Phys. Plasmas
16, 052301 (2009). Copyright 2009 AIP Publishing LLC.114

B. Frequency-spectra based fluctuation analysis and comparisons
Once the synthetic time series $\delta X_{\mathrm{syn}}$ have been generated, they can be analyzed in
the same way as the experimental time series, with any differences primarily arising due to
elimination of experimental noise sources (such as background photon shot noise and electronic
noise in the case of BES and CECE) that are not present in the synthetic data. Note that both
diagnostics are comprised of multiple channels measuring fluctuations at distinct spatial
locations, such that the synthetic diagnostic algorithm generates a set of time series
corresponding to the different spatial channels of the diagnostic being modeled. For turbulence
modeling, the most common analysis and quantities of comparison are
correlation functions and power spectra, which are often integrated over some frequency or
wavenumber range to yield a fluctuation intensity or RMS amplitude. In many cases cross-spectra
from neighboring diagnostic channels are utilized rather than single-channel auto-spectra when
analyzing the experimental data to suppress uncorrelated noise sources; the synthetic analysis
should follow the experimental analyses in these cases. The calculation of auto- and
cross-spectra, along with related quantities such as coherence and cross-phase, as well as more
complex measures such as bispectra, are standard signal processing techniques which are
well-described in a variety of textbooks (e.g., Refs. 158 and 159), to which the interested
reader is referred for further details. For the purposes of this discussion, we note only that
there are standard procedures for estimating the uncertainties in these spectral quantities based
upon the length and number of averaging windows used, which should be calculated and included in
any validation analysis. However, as with the time-averaging uncertainties of the simulation flux
predictions described in Sec. IV C, the dominant uncertainty in predictions of fluctuation
quantities is likely to arise from uncertainties in the model inputs, rather than the
time-averaging uncertainty.
Typical synthetic and experimental BES spectra are shown in Fig. 17, from which quantitative
comparisons can be formulated in terms of different moments of the power spectral density $S(f)$.
The most commonly used comparison is the total fluctuation power contained within some frequency
band, often expressed in terms of RMS fluctuation amplitudes $\delta X_{\mathrm{RMS}}$, defined as

$$\delta X^2_{\mathrm{RMS}} = \int_{f_{\min}}^{f_{\max}} df\, S(f). \tag{18}$$

Integration over a finite frequency band, rather than the entire spectrum, is often used to
remove components of the measured or calculated spectrum that are not related to the turbulence.
In the example shown in Fig. 17, the experimental BES measurements below 40 kHz are dominated by
fluctuations in the neutral beam voltage rather than the microturbulence of interest, and so
would be excluded from a comparison with the simulated fluctuations. In cases where rotation or
the presence of large coherent modes leads to a clear relationship between frequency and mean
wavenumber, comparisons of different frequency bands can be used as an effective proxy for
comparisons of different wavenumbers.

FIG. 17. Comparison of synthetic and measured BES spectra at $\rho_{\mathrm{tor}} = 0.75$ in DIII-D
discharge 128913, using data from a GYRO simulation with $\Delta_{Q_i} = 0$. Dashed lines (- - -)
indicate mean frequency $\bar{f}$ (Eq. (20)) and spectral width $W_f$ (Eq. (21)) calculated for each
spectrum over the range 40 kHz $< f <$ 400 kHz. The dark gray shaded region spanning 0 to 40 kHz
indicates the portion of the measured spectrum dominated by non-turbulence fluctuations, and thus
should not be included in comparisons with the turbulence model. Adapted with permission from
Phys. Plasmas 16, 052301 (2009). Copyright 2009 AIP Publishing LLC.114

Although comparisons of fluctuation amplitudes are the clearest zeroth-order test of consistency
between simulated and measured turbulence properties, determining that the fundamental
characteristics of the turbulence are being captured accurately by a given model also requires
comparisons
of the spatiotemporal structures of the turbulence, as well as of the couplings and cross-phases
between different fields. The most obvious means of comparison would be the frequency-integrated
difference between predicted and measured spectra

$$R_{\delta X} = \int_{f_{\min}}^{f_{\max}} df\, [S_{\mathrm{sim}}(f) - S_{\mathrm{expt}}(f)], \tag{19}$$

perhaps normalized to the experimental value of $\delta X^2_{\mathrm{RMS}}$. However, this formulation
is sensitive to uncertainties in rotation and resulting Doppler shifts (described further below),
and moreover loses much of the information about the shape of the spectrum that is of interest.
Therefore, additional comparisons of quantities such as higher moments of the frequency spectra,
beyond simply the integrated power (or differences in $S(f)$), are desirable. For instance, one
could compare not just the RMS fluctuation levels but also the mean frequency

$$\bar{f} = \frac{1}{\delta X^2_{\mathrm{RMS}}} \int_{f_{\min}}^{f_{\max}} df\, f\, S(f) \tag{20}$$

and spectral width

$$W_f = \frac{1}{\delta X^2_{\mathrm{RMS}}} \int_{f_{\min}}^{f_{\max}} df\, (f - \bar{f})^2 S(f) \tag{21}$$

of the measured and simulated spectra, as illustrated in Fig. 17. Alternatively, one could
transform the frequency spectra into correlation functions via

$$C(\tau) = \frac{1}{\delta X^2_{\mathrm{RMS}}}\, \mathrm{Re}\left\{\int_{f_{\min}}^{f_{\max}} df\, S(f)\, e^{-2\pi i f \tau}\right\} \tag{22}$$

and formulate comparisons in terms of different quantities derived from $C(\tau)$. The most common
of these is the decorrelation time, generally obtained by fitting an exponential or Gaussian
function to the envelope of $C(\tau)$, which can be calculated using the Hilbert transform. Other
measures such as spectral indices $\alpha$, obtained by fitting portions of the spectrum as
$S(f) \propto f^{-\alpha}$, can be calculated and compared as well.
In addition to the single-field (e.g., electron density or temperature) comparisons, comparisons
of correlations between different fluctuation fields have proved to be of significant utility
when possible. Physically, the turbulent cross-field fluxes depend upon the correlation of
density, velocity, and temperature fluctuations with radial velocity fluctuations, and as such
can be represented spectrally in terms of coherency $\gamma(f)$, cross-phase $\Theta(f)$, and the
autospectra of the individual fluctuation fields as follows (using the normalized electron
particle flux $\Gamma = \langle \delta n\, \delta v_r \rangle$ as a specific example):

$$\Gamma(f) = \mathrm{Re}\,\langle \delta n^*(f)\, \delta v_r(f) \rangle
= \gamma(f)\, \langle |\delta n(f)|^2 \rangle^{1/2} \langle |\delta v_r(f)|^2 \rangle^{1/2} \cos\Theta(f), \tag{23}$$

$$\gamma(f) = \frac{|\langle \delta n^*(f)\, \delta v_r(f) \rangle|}{\sqrt{\langle |\delta n(f)|^2 \rangle \langle |\delta v_r(f)|^2 \rangle}}, \tag{24}$$

$$\Theta(f) = \tan^{-1}\left(\frac{\mathrm{Im}\,\langle \delta n^*(f)\, \delta v_r(f) \rangle}{\mathrm{Re}\,\langle \delta n^*(f)\, \delta v_r(f) \rangle}\right). \tag{25}$$

Therefore, testing the specific predictions of the coherency and cross-phase against measurements
is greatly desirable for assessing whether the specific nonlinear dynamics of the turbulence that
determine the cross-field fluxes are being accurately captured in the simulation. Unfortunately,
measurements of correlations between fluctuation fields and radial velocity fluctuations (either
the generally dominant $\vec{E} \times \vec{B}$ component or the magnetic flutter component
$v_r^{\delta B} = v_\parallel\, \delta B_r$) are very rarely available on closed flux surfaces in
high-power tokamaks due to the lack of diagnostic techniques available for measuring either
component of $v_r$; they can sometimes be obtained in the plasma edge and scrape-off layer regions
via Langmuir probes. However, more broadly gyrokinetic theory predicts unique phase relationships
between any two fields for each instability of interest (ITG, TEM, ETG, etc.). Therefore,
comparisons of the measured and predicted cross-phase of any two fluctuations (such as $n_e$ and
$T_e$) provide at minimum a test of whether the mix of underlying instabilities predicted by the
simulation is consistent with observations. Such comparisons have been documented in some
validation studies,111,138 which found that not only did nonlinear gyrokinetic simulations
quantitatively predict the cross-phases and their variations with heating power, but also these
results were close to the predictions of cross-phases from linear stability calculations. These
results provide support for the quasilinear transport modeling approach described above in
Sec. III, which describes the transport in terms of a variety of small-amplitude fluctuations
that retain many of the linear dispersion and phasing properties.
One practical challenge for many of these frequency spectra-based comparisons is that they can be
highly sensitive to uncertainties in the toroidal rotation of the plasma, which translates into
uncertainties in the Doppler shift that often dominates the lab-frame spectra. For example, the
spectra shown in Fig. 17 are taken from the discharge documented in Refs. 113 and 114, which has
an approximately 10% uncertainty in the toroidal rotation velocity $V_{\mathrm{tor}}$, which in this
discharge dominates the local $\vec{E} \times \vec{B}$ velocity that determines the Doppler shift
between the plasma and lab reference frames. The lab-frame frequency can be expressed as

$$f_{\mathrm{lab}} = f_{\mathrm{plasma}} + \frac{\vec{k} \cdot \vec{V}_{E \times B}}{2\pi} = f_{\mathrm{plasma}} + f_{\mathrm{Doppler}}, \tag{26}$$

where $f_{\mathrm{plasma}}$ is the plasma-frame frequency of the fluctuation in question. Considering
simply the $k_y \rho_s = 0.3$ fluctuation (where the wavenumber spectrum peaks in the simulation),
and estimating $f_{\mathrm{plasma}}$ as the linear mode frequency $f_{\mathrm{lin}}$ calculated from a
linear gyrokinetic simulation, which is consistent with comparisons of linear and nonlinear
calculations shown in Refs. 114 and 138, we find $f_{\mathrm{lin}} = 12.7$ kHz and
$f_{\mathrm{Doppler}} = 191$ kHz. Thus even a 10% uncertainty in $V_{E \times B}$ yields an uncertainty
in $f_{\mathrm{Doppler}}$ of roughly 19 kHz, comparable with $f_{\mathrm{lin}}$. Given this sensitivity of
the predicted spectra to the Doppler shift,
significant values of which are common for high-power tokamak plasma conditions, the
uncertainties in $f_{\mathrm{Doppler}}$ must be carefully considered before strong conclusions can be
derived from comparisons of these frequency-based quantities with experiment.

C. Fluctuation comparisons based on spatial correlation properties
Beyond the frequency-based model–experiment comparisons described above, comparisons which
utilize the local spatial correlation and wavenumber properties of the turbulence can provide
extremely useful tests of the predicted nonlinear dynamics of the system. At a high level, this
point can be understood by noting that the gyrokinetic-Maxwell equations are more naturally
expressed in terms of couplings between different spatial wavenumbers than frequencies, and so
wavenumber-based comparisons can more directly connect to the underlying theory. More
specifically, although one can often draw connections between the poloidal wavenumber and
lab-frame frequency via the Doppler shift and linear dispersion properties of the turbulence in
question, much of the important nonlinear dynamics involves couplings of different radial
wavenumbers which in general cannot be easily mapped back to frequency space. In particular, one
of the primary saturation mechanisms for ion-scale turbulence is now known to be nonlinear energy
transfer from unstable to stable modes mediated by radially sheared axisymmetric
$\vec{E} \times \vec{B}$ flows in the plasma. These flows can be both part of the equilibrium plasma
(as a bulk rotation of the plasma), or nonlinearly generated by the turbulence itself, in which
case they are often referred to as “zonal flows” reflecting their axisymmetric character. A full
review of the details of these flows, their generation, and back-reaction on the turbulence is
beyond the scope of this paper; the interested reader is referred to the extensive literature on
the topic for more information (see, e.g., Ref. 160). The key point for this discussion is that
in wavenumber space, one can express this shearing process in the nonlinear term of the
gyrokinetic equation as (using a simple Cartesian representation for clarity)

$$\frac{\partial \tilde{f}(k_x, k_y, k_z)}{\partial t} = -\sum_{k_x'} i k_y V^{SF}_{E \times B}(k_x', 0, 0)\, \tilde{f}(k_x - k_x', k_y, k_z) + \ldots. \tag{27}$$

Thus, the radially sheared axisymmetric flow $V^{SF}_{E \times B}$ transfers energy between
fluctuations with different values of radial wavenumber $k_x$ but the same poloidal ($k_y$) and
toroidal ($k_z$) wavenumbers. Through this and other nonlinear processes, energy is transferred
from linearly unstable modes (generally with small $k_x$) to stable modes at a variety of
different wavenumbers,161–163 saturating the turbulence and generally resulting in a wavenumber
spectrum broad in both $k_x$ and $k_y$. Therefore, if we want to test whether a given model is
accurately capturing the nonlinear dynamics of the turbulence at an even deeper level than the
frequency-based comparisons above allow, we should first examine the fidelity of the model in
capturing this wavenumber spectrum.
By analogy to the frequency-based comparisons, these wavenumber comparisons can be formulated in
terms of comparing mean wavenumbers and spectral widths in each spatial dimension. Alternatively,
the wavenumber spectrum can be Fourier-transformed into a correlation function, and comparisons
formulated in terms of mean wavenumber and correlation length, obtained analogously to the
correlation time via a fit to the envelope of the correlation function. An example of such a
comparison is shown in Fig. 18, taken from Shafer et al.164 These results illustrate the strong
impact the BES PSF can have on the results, which is not surprising since the spatial extent of
the PSF is comparable with the size of the turbulent eddies, as shown in Fig. 15. In Fig. 18 the
wavenumber spectra obtained from the unfiltered gyrokinetic results are compared against the
measured wavenumber spectrum from which the k-space representation of the PSF has been
deconvolved, and the synthetic spectrum is compared to the unfiltered measured spectrum. Their
differences can be quantified in terms of wavenumber peaks and widths, as shown in Table IV. Most
notable is the clear overprediction of mean $k_r$ at $\rho_{\mathrm{tor}} = 0.75$, corresponding to
the simulation eddies being more “tilted” in configuration space than what is observed.

FIG. 18. Comparison of GYRO-predicted (solid line) and BES-measured (red line) density fluctuation
wavenumber spectra at (a) and (c) $r/a = \rho_{\mathrm{tor}} = 0.5$ and (b) and (d) $r/a = 0.75$. Raw
(unfiltered) GYRO spectra are compared with “corrected” BES spectra from which the k-space
structure of the PSF has been deconvolved in (a) and (b), while the synthetic GYRO spectra are
compared with raw BES measurements in (c) and (d). Reprinted with permission from Phys. Plasmas
19, 032504 (2012). Copyright 2012 AIP Publishing LLC.164
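
To make the "mean wavenumber and spectral width" comparison concrete, the sketch below computes
the integrated power, mean, and width of a one-dimensional wavenumber spectrum. The moments are
defined here by direct analogy with the frequency moments of Eqs. (20) and (21) (so the width is a
second central moment, without a square root); that analogy, the trapezoidal integration, and all
function names are assumptions of this example rather than the procedure used to produce
Table IV.

```python
def trapezoid(x, y):
    """Trapezoidal-rule integral of samples y over a (possibly uneven) grid x."""
    return sum(0.5 * (y[j] + y[j + 1]) * (x[j + 1] - x[j]) for j in range(len(x) - 1))


def spectral_moments(k, spectrum):
    """Return (power, mean wavenumber, width) of a spectrum S(k).

    power  = integral of S dk
    mean   = integral of k S dk / power
    width  = integral of (k - mean)^2 S dk / power  (second central moment)
    """
    power = trapezoid(k, spectrum)
    mean_k = trapezoid(k, [ki * si for ki, si in zip(k, spectrum)]) / power
    width = trapezoid(k, [(ki - mean_k) ** 2 * si for ki, si in zip(k, spectrum)]) / power
    return power, mean_k, width


# Invented spectrum sampled on a coarse wavenumber grid (arbitrary units).
k_grid = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
s_vals = [0.1, 0.6, 1.0, 0.7, 0.3, 0.1, 0.0]
print(spectral_moments(k_grid, s_vals))
```

Applying the same function to the synthetic and measured spectra over an identical wavenumber
range keeps the comparison like-for-like, in the same way that a common frequency band is used
for the frequency-based moments.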
TABLE IV. Radial and poloidal wavenumber peaks and widths calculated for the spectra shown in
Figs. 18(a) and 18(b). Adapted with permission from Phys. Plasmas 19, 032504 (2012). Copyright
2012 AIP Publishing LLC.164

                          k_r width   k_θ width   k_r peak   k_θ peak
                          (cm^-1)     (cm^-1)     (cm^-1)    (cm^-1)
Raw BES (r/a = 0.5)         1.3         1.4        -0.1        1.3
Syn. GYRO (r/a = 0.5)       1.6         1.2         0.2        1.0
Raw BES (r/a = 0.75)        2.0         1.5        -0.5        1.3
Syn. GYRO (r/a = 0.75)      2.2         1.5         0.1        0.9

D. Fluctuation sensitivity plots and validation metrics
Once a particular fluctuation characteristic has been chosen and quantified for comparison, it is
straightforward to generate local sensitivity plots and error metrics for this quantity analogous
to those for the turbulent fluxes discussed in Sec. IV. In all of the discharges considered, BES
and CECE data are available at $\rho_{\mathrm{tor}} = 0.5$ and 0.75. Using these data, the same
fractional difference approach as described in Sec. IV E is applied to the RMS normalized
electron density and temperature fluctuation amplitudes to generate the results shown in Fig. 19.
Similar plots for other quantities discussed above could also be readily generated, but the use
of RMS fluctuation amplitude comparisons is sufficient for the illustrative goals of this paper.
Comparing the results shown in Fig. 19 with Fig. 12, one can draw the qualitative conclusion that
the predicted fluctuation levels exhibit similar levels of agreement with the measured levels as
did the predicted turbulent fluxes and power balance calculations, namely, broad consistency
between simulation and experiment at $\rho_{\mathrm{tor}} = 0.5$, and systematic underprediction of
the experimental observations at $\rho_{\mathrm{tor}} = 0.75$.

FIG. 19. Local sensitivity plots for (a) and (b) $\Delta_{\delta n_e}$ and (c) and (d)
$\Delta_{\delta T_e}$ at (a) and (c) $\rho_{\mathrm{tor}} = 0.5$ and (b) and (d) 0.75 calculated from the
same simulations as utilized in Fig. 12. The broken lines indicate where $\Delta_{\delta n_e}$,
$\Delta_{\delta T_e}$, and $\Delta_{L_{Ti}} = 0$, and the origin is indicated by a red asterisk.

Quantitative fractional error metrics for the fluctuations can be defined analogously to the
fractional error metric $E_{Q_e}$ for $Q_e$ (Eq. (10)) as

$$E_{\delta n_e} = \Delta_{\delta n_e}\big|_{\Delta_{Q_i}=0}, \tag{28}$$

$$E_{\delta T_e} = \Delta_{\delta T_e}\big|_{\Delta_{Q_i}=0}, \tag{29}$$

with errors defined equivalently to $\sigma_{E_{Q_e}}$ (Eq. (11)). These error metrics are plotted in
Fig. 20, along with their uncertainty-weighted average values. Significant scatter is seen in the
results for both fields and both radii. Focusing on the ensemble-averaged values, we see in
Fig. 20(a) that the predicted flux-matching density fluctuations at both radii match the measured
values within the uncertainty, albeit just barely with, e.g., $E_{\delta n_e}$ only slightly smaller
than $\sigma_{E_{\delta n_e}}$. This result supports the straightforward expectation from the
underlying gyrokinetic model of a close correlation between the amplitude of the $\rho_i$-scale
density fluctuations measured by BES and the ion temperature and velocity fluctuations at those
scales that set $Q_i$. Therefore, by matching the power balance and turbulent values of $Q_i$, one
would expect the predicted $\rho_i$-scale fluctuations to match the measured values. On the other
hand, and somewhat surprisingly, we see that (on average) the predicted $\rho_i$-scale $T_e$
fluctuations match experimental levels at $\rho_{\mathrm{tor}} = 0.5$, but are larger than
experimental levels at $\rho_{\mathrm{tor}} = 0.75$ when the simulations match the power balance
$Q_i$ values, even though the power balance $Q_e$ is overpredicted by the gyrokinetic model at
$\rho_{\mathrm{tor}} = 0.5$ and (modestly) underpredicted at $\rho_{\mathrm{tor}} = 0.75$ (as shown in
Fig. 14). Assuming that the electron thermal transport is also dominated by $\rho_i$-scale
fluctuations, one would generally expect the electron energy flux and temperature fluctuation
error metrics $E_{Q_e}$ and $E_{\delta T_e}$ to exhibit similar trends with radius.
How to resolve these findings with simple theoretical expectations remains an open question. One
possibility would be to investigate whether a more sophisticated synthetic CECE diagnostic, such
as the one presented in Görler et al.,138 which utilizes the perpendicular electron temperature
fluctuation $\tilde{T}_{e\perp} = \int d^3v\, (m_e v_\perp^2/2 - 1)\tilde{f}_e$ that more
closely corresponds to the actual CECE measurement than of fluctuations than fluxes, or the comparisons with smallest
Ð
the total fluctuation T~e ¼ d 3 vðme v2 =2 ' 1Þf~e used in the uncertainties?
GYRO analysis, changes these findings significantly. In formulating Ri, the first challenge is to ensure that
Similarly, one could investigate more sophisticated synthetic individual metrics are combined in such as way as to avoid
BES algorithms that incorporate the impact of ion density cancellations which would yield overly favorable assess-
fluctuations on the atomic physics of the collisional radiation ments of total model fidelity. Thus, direct linear sums of
processes used to calculate the predicted emission levels, or signed metrics such as Ecomp ¼ ELTi þ EQe should not be
the neglect of parallel localization in both synthetic diagnos- used, to avoid a case where combining, e.g., ELTi ’ 1 and
tics employed here. And as with the discussion of the find- EQe ’ '1 yields a composite metric value Ecomp ’ 0, which
ings for EQe in Sec. IV E, a second natural avenue to pursue would suggest very close model–experiment agreement
(if tractable) would be to transition from varying only R=LTi based on two individual metrics indicating significant mode-
to match Qi to simultaneous variations of R=LTi and R=LTe to l–experiment disagreement. Terry et al. recommend utilizing
simultaneously match Qi and Qe. Calculating Edne and EdTe normalized goodness ratings Bi for Ri, where Bi is bounded
and varies from 0 (no agreement) to 1 (perfect agreement).
for these simulations and contrasting them to the results
Such a bounding can be achieved through a number of differ-
presented in Fig. 14 and 20 would provide insight into
ent means, the tanh function being a common one (as seen in
whether it is meaningful to consider the turbulence in these
formulation of the R in Eq. (16) by Ricci et al.147,148). For
discharges as dominated by a single instability with a single
the local flux-matching metrics defined in this paper, using
strong control parameter, or whether they in fact must be
ELTi as an example, a simple approach would be to define
considered a tightly coupled mix of different instabilities
(e.g., ITG and TEM) with multiple, equally important drives BTi ¼ 1 ' tanhðjELTi jÞ; (31)
in determining the transport and turbulence in each channel
and field. A third, even more computationally expensive or by analogy to a suggestion from Greenwald15
avenue of approach would be to investigate the impact of
multiscale simulations which self-consistently incorporate BTi ¼ 1 ' tanhðjELTi j þ jr ELTi jÞ; (32)
q e-scale ETG fluctuations into these q i-scale simula-
if we desired to incorporate the uncertainty into the goodness
tions.65–68,140–146 Other research avenues are possible as
rating. Alternatively one could use
well. However, for the goals of this paper, it should be clear
how utilizing a variety of local validation metrics for multi- BTi ¼ 1 ' tanhðjELTi =r ELTi ja Þ; (33)
ple predicted quantities can be employed to test model fidel-
ity and our physical understanding at a level not possible by where a ¼ 2 would be a natural choice by analogy to chi-
earlier global metrics. squared statistics. A transition parameter k analogous to that
of Eq. (16) could also be included, although one would need
VI. USING COMPOSITE METRICS FOR ASSESSING to carefully assess its impact on the interpretation of the
OVERALL MODEL FIDELITY results.
While the various individual metrics described above Terry et al. recommend using several criteria in choos-
provide extremely useful and detailed insight into how well ing how to weight the goodness measure associated with
a specific aspect of turbulence dynamics and transport is cap- each metric. The most important of these is what they term
tured by a model, it is also desirable to formulate composite the primacy hierarchy. The primacy hierarchy is a way of
metrics which integrate the information contained in the ranking different quantities based upon the number of exper-
individual metrics into assessments overall model fidelity. imental measurements it integrates. For instance, in edge
These composite metrics should be used to complement the turbulence studies using Langmuir probes, at the lowest
insights gained from the individual metrics, and reports of (most “fundamental”) level of the primacy hierarchy would
validation activities should include tables and figures docu- be measurements of equilibrium electrostatic potential and
electron density and temperature profiles, as well as their
menting both the single metric results as well as composite
fluctuations. At the next level of the primacy hierarchy
metric results. In general, these composite metrics will take
would be equilibrium profile gradients (derived from the
the form of weighted sums
measured profiles one level below) as well as auto- and
X
M¼ Wi R i (30) cross-correlations of fluctuation measurements, including
i cross-field fluxes based calculated using correlations of den-
sity or temperature and electric field fluctuations (derived
where Wi is the weight of the i-th comparison included and from finite differencing the measured potential fluctuations).
Ri the value representing the level of model–experiment And at the highest level of the primacy hierarchy would be
agreement found for that comparison. As discussed below, calculations of particle and thermal diffusivities that com-
Ri will generally be a function of the various individual met- bined the fluxes with equilibrium profile gradients. The
rics such as ELTi , but need not be exactly equal to them or higher a quantity is in the primacy hierarchy, the lower a
even a linear function of them. Beyond formulating Ri, a weight it is given. In addition to weighting by the level in the
second challenge lies in deciding how to weight different primacy hierarchy, Terry et al. also recommend including
individual metrics, e.g., should more weight be given to tests weighting factors accounting for the sensitivity of the
comparison and whether it repeats similar measures (to encourage testing of many different
quantities, rather than many tests of effectively the same quantity in slightly different
conditions). An application of this methodology can be found in the studies of Ricci
et al.,147,148,165 where composite metrics of the form

$$ \chi = \frac{\sum_i R_i H_i S_i}{\sum_i H_i S_i} \tag{34} $$

are used. Here $R_i$ is the measure of agreement for a single comparison as defined in Eq. (16),
$H_i$ is a weighting for place in the primacy hierarchy, and $S_i$ the sensitivity weight. The
hierarchy weight $H_i$ is defined as $H_i = 1/h_i$, where $h_i$ is the level of the comparison in
the primacy hierarchy. An interesting refinement of the initial primacy hierarchy proposal used
in Ref. 148 is the definition of separate hierarchies for the experimental data, simulation data,
and comparisons, depending on the number of assumptions or integrations required for each case.
The sensitivity weight $S_i$ is defined as

$$ S_i = \exp\left( - \frac{\sum_j \Delta e_{i,j} + \sum_j \Delta s_{i,j}}{\sum_j |e_{i,j}| + \sum_j |s_{i,j}|} \right), \tag{35} $$

where $e_{i,j}$ and $s_{i,j}$ are the j-th instance of the experimental and simulation values,
respectively, of the i-th comparison quantity; $\Delta e_{i,j}$ and $\Delta s_{i,j}$ are their
associated uncertainties. Through the use of such a weighting, one is able to account for
experimental and simulation uncertainties in a composite metric while still using formulations of
$R_i$ that do not explicitly reference these uncertainties.

FIG. 20. Dependence of (a) $E_{\delta n_e}$ and (b) $E_{\delta T_e}$ on $\rho_{\mathrm{tor}}$ for
all seven discharges listed in Table III. The uncertainty-weighted ensemble mean values are
plotted as (pink asterisk).

Following these works, the values of several different composite metrics calculated for the
DIII-D modeling results presented in Secs. IV E (Fig. 14) and V D (Fig. 20) are given in Table V
and plotted in Fig. 21. At each radius, a single composite metric value is calculated from the
four individual metrics $E_i = \{E_{L_{T_i}}, E_{Q_e}, E_{\delta n_e}, E_{\delta T_e}\}$
evaluated for all seven discharges considered. Using the definition of
$B_{i,j} = 1 - \tanh(|E_{i,j}|)$ from Eq. (31), three composite metrics of increasing complexity
are formulated

$$ M_0 = \frac{\sum_{i=1}^{N} \sum_{j=1}^{7} B_{i,j}}{7N}, \tag{36} $$

$$ M_1 = \frac{\sum_{i=1}^{N} \sum_{j=1}^{7} H_i B_{i,j}}{7 \sum_{i=1}^{N} H_i}, \tag{37} $$

$$ M_2 = \frac{\sum_{i=1}^{N} \sum_{j=1}^{7} H_i S_{i,j} B_{i,j}}{\sum_{i=1}^{N} \sum_{j=1}^{7} H_i S_{i,j}}. \tag{38} $$

The quantity $H_i$ is a primacy hierarchy weight; we use a value of $H_i = 1$ for
$E_{\delta n_e}$ and $E_{\delta T_e}$, and $H_i = 0.5$ for $E_{L_{T_i}}$ and $E_{Q_e}$. Since each
$\sigma_{E_{i,j}}$ represents the uncertainty in a fractional error metric $E_{i,j}$, they are all
treated equivalently in a simple sensitivity weighting $S_{i,j} = \exp(-\sigma_{E_{i,j}})$. At
$\rho_{\mathrm{tor}} = 0.5$ and 0.75, $N = 4$ is used, corresponding to the four individual
metrics in $E_i$ discussed in Secs. IV and V. However, since there are no fluctuation
measurements available at $\rho_{\mathrm{tor}} = 0.25$, we use $N = 2$ there, corresponding to
$E_i = \{E_{L_{T_i}}, E_{Q_e}\}$. Alternatively one could set either $B_{i,j}$ or $S_{i,j}$ equal
to zero for $E_{\delta n_e}$ and $E_{\delta T_e}$ at this location, reflecting this absence
relative to the other radii. Examining the results, one can see that each formulation gives a
very similar answer, namely, the highest values at $\rho_{\mathrm{tor}} = 0.25$ and 0.5
($M \geq 0.7$), and lower values at $\rho_{\mathrm{tor}} = 0.75$ ($M \leq 0.63$). Thus, these
composite metrics provide a compact representation of the trends consistently identified in
earlier sections: that the experimental fidelity of the code is significantly lower at
$\rho_{\mathrm{tor}} = 0.75$ than 0.25 and 0.5 in the modeled neutral beam heated DIII-D L-mode
discharges, and that this trend is robust regardless of whether one

TABLE V. Values of composite metrics $M_0$, $M_1$, and $M_2$.

$\rho_{\mathrm{tor}}$    $M_0$    $M_1$    $M_2$
0.25                     0.75     0.75     0.74
0.5                      0.70     0.70     0.70
0.75                     0.60     0.63     0.63
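As an illustration of how the bounded goodness rating of Eq. (31) and the composite metrics of
Eqs. (36)–(38) fit together, a minimal Python sketch is given here. It is not the analysis code
used for Table V: the function names (`goodness`, `composite_metrics`) and all numerical inputs
are hypothetical, chosen only for illustration, and the number of discharges is taken from the
input rather than fixed at seven.

```python
import math


def goodness(E):
    """Bounded goodness rating B = 1 - tanh(|E|), following Eq. (31)."""
    return 1.0 - math.tanh(abs(E))


def composite_metrics(E, sigma_E, H):
    """Return (M0, M1, M2) following Eqs. (36)-(38).

    E[i][j]       fractional error of metric i for discharge j
    sigma_E[i][j] uncertainty of E[i][j]
    H[i]          primacy hierarchy weight of metric i
    """
    n_metrics = len(E)      # N in Eqs. (36)-(38)
    n_disch = len(E[0])     # 7 in the paper; taken from the input here
    B = [[goodness(e) for e in row] for row in E]
    # Simple sensitivity weighting S_ij = exp(-sigma_Eij)
    S = [[math.exp(-s) for s in row] for row in sigma_E]

    pairs = [(i, j) for i in range(n_metrics) for j in range(n_disch)]
    m0 = sum(B[i][j] for i, j in pairs) / (n_disch * n_metrics)
    m1 = sum(H[i] * B[i][j] for i, j in pairs) / (n_disch * sum(H))
    m2 = (sum(H[i] * S[i][j] * B[i][j] for i, j in pairs)
          / sum(H[i] * S[i][j] for i, j in pairs))
    return m0, m1, m2


if __name__ == "__main__":
    # Hypothetical inputs: two metrics, three discharges (not data from the paper).
    E = [[0.2, -0.1, 0.3], [-0.6, 0.4, -0.5]]
    sigma_E = [[0.1, 0.1, 0.2], [0.3, 0.2, 0.4]]
    H = [0.5, 1.0]
    print(composite_metrics(E, sigma_E, H))
```

Because each $E_{i,j}$ enters only through $|E_{i,j}|$, errors of opposite sign cannot cancel:
for one comparison with $E = 1$ and one with $E = -1$, every metric returns
$1 - \tanh(1) \approx 0.24$ rather than a value suggesting close agreement.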
considers predictions of just gradients and fluxes, or includes comparisons with measured
fluctuation levels as well. However, these particular composite metric formulations are only
intended to be illustrative, and the results can be sensitive to the specific mathematical
formulations used. Future validation studies should more thoroughly investigate other
formulations motivated by the particular physics and intended uses of the models under
consideration.

FIG. 21. Comparison of composite metrics $M_0$ (gray bar), $M_1$ (red bar), and $M_2$ (blue bar),
calculated using Eqs. (36)–(38) and fractional error metrics shown in Figs. 14 and 20. The
numerical values of each metric can be found in Table V.

VII. CONCLUSIONS AND FUTURE DIRECTIONS

Validated predictive models of plasma dynamics will play an increasingly important role for
fusion research, both to ensure that planned future devices such as ITER can be operated safely
and efficiently, and to help identify new and innovative configurations and operating scenarios
that accelerate the realization of fusion as an economically viable commercial energy source.
Validation is essential not only for building confidence in these predictions, but also for
identifying parameter regimes where current models do not perform acceptably and improvements are
needed. Building upon a number of previous reviews of validation best practices,10,14,15 this
paper illustrates some of the practical challenges for MFE validation research through the
development of validation metrics suitable for testing of turbulent transport models. In order to
go beyond earlier global transport metrics, these new metrics utilize the local sensitivity plots
developed by the community for both verification80 and validation17,18 to quantify
model–experiment agreement in terms of fractional flux-matching gradient errors, and associated
residual discrepancies in other quantities such as fluctuation amplitudes. This choice of
comparison quantities and metric formulation follows directly from the intended end use of the
transport models: predicting equilibrium profiles and gradients in MFE plasmas as a function of
magnetic configuration and auxiliary applied heating, fueling, and torque. Since the cross-field
fluxes associated with these sources are never directly measured in the core of high-temperature
MFE plasmas, simultaneous comparisons of predicted and measured fluctuation characteristics
(through the use of synthetic diagnostics) in conjunction with comparisons with independent power
balance flux calculations is identified as a key component for rigorous validation of plasma
microturbulence models. Equally important is the explicit incorporation of uncertainties in
measured profiles and fluctuation levels, power balance analyses, simulation outputs, and
synthetic diagnostics into these validation metrics. The challenges in quantifying each of these
uncertainties are discussed in detail in Secs. IV and V. Sec. VI discusses challenges and
recommendations for combining the many individual validation metrics developed in Secs. IV and V
into composite metrics that provide an integrated overall assessment of model fidelity, including
formulation of bounded measures of agreement and use of primacy hierarchies as a means of
weighting different comparisons. The key result found for the example experimental cases and code
considered is that the code fidelity is significantly lower at $\rho_{\mathrm{tor}} = 0.75$ than
0.25 and 0.5 in the modeled neutral beam heated DIII-D L-mode discharges, and that this trend is
robust regardless of whether one considers predictions of just gradients and fluxes, or includes
comparisons with measured fluctuation levels as well. These trends and results are common across
each metric ($E_{L_{T_i}}$, $E_{Q_e}$, $E_{\delta n_e}$, and $E_{\delta T_e}$) considered, and
identify a need to improve model performance in this region for this class of plasmas.

Many of the challenges illustrated in this paper require further study and attention. Perhaps
most important is to develop more rigorous and extensive methods for UQ appropriate for MFE
plasmas. To do so, the MFE community should draw from the extensive current, on-going work on
this topic by researchers in many other fields. Advances in UQ are needed on both the
experimental and modeling sides, to better quantify and characterize the uncertainties in the
equilibrium profiles and gradients, and then efficiently propagate these uncertainties through
both power balance and turbulent transport models. On the experimental side, using techniques
such as Gaussian process regression119,120 or integrated data assessment121–124 should be
investigated more widely. For simulations, propagation of uncertainties requires a transition
from single, deterministic simulations at a given point to ensembles of calculations as the
default workflow. Such ensemble calculations are possible for conventional ion-scale gyrokinetic
and gyrofluid simulations, but require large numbers of processor-hours to complete. As such,
only a modest number of such ensemble calculations are practically feasible on current high
performance computing (HPC) platforms on useful timescales. However, for robust validation across
the full parameter space of interest for fusion, one must perform many such ensemble
calculations, more than would currently be tractable. It is to be hoped that the next generation
of exascale computing platforms currently being developed supports the workflows needed to
perform larger numbers of such ensemble calculations. On the other hand, many-realization
ensemble simulations of the reduced transport models commonly used in current predictive and
interpretive transport modeling workflows are already viable on currently available HPC
platforms. Since these models execute fairly quickly (on the core-minute or less timescale),
reduced-model ensemble calculations offer an enormous opportunity to begin studying how to
optimally propagate uncertainties in an MFE-relevant and specific context. For instance,
techniques such as polynomial chaos expansions166,167 should be investigated as means of more
efficiently propagating uncertainties through plasma turbulence models than brute-force Monte
Carlo simulation.

Finally, while the metrics that have been discussed in this paper are appropriate for testing
predictions of continuously varying quantities, binary classification tests are also quite common
and important in a variety of settings. In the context of MFE plasmas, these tests arise most
frequently in tests of predicted global mode stability made by ideal or resistive
magnetohydrodynamic (MHD) calculations. In these cases, the test is whether or not the plasma
exhibits a specific behavior (e.g., onset of a specific global mode or disruption168,169) at the
time or condition predicted by a given model. In MFE studies, such tests are generally plotted in
terms of parameter space visualizations which indicate the predicted regions of (in)stability,
combined with data points indicating where the plasma is observed to be (un)stable. Although such
plots are widely utilized in the fusion community and provide an analogous function to the local
sensitivity plot approach described here, formal validation metrics explicitly quantifying the
fidelity of these models in predicting the onset of the dynamics in question have not been widely
utilized to date. Drawing from the broad literature on binary outcome validation metrics in other
fields such as medical research and machine learning, a number of different metric formulations
could be pursued within the MFE community. At the simplest level, one could simply quantify model
performance in terms of fraction correct (i.e., for N experimental measurements or conditions,
how many are correctly predicted to be (un)stable?). More sophisticated widely used binary
classification metrics incorporate information about relative amounts of true and false positive
and negative predictions (such as the F1-score170) or are derived from receiver–operator
characteristic (ROC) plots.171,172 However, while specific metrics to test this aspect of MHD
theory have not been widely applied, many other aspects of MHD instability predictions,
particularly of mode structure and growth rate, have been tested extensively in a variety of
machines. Examples include tests of mode growth rate,173,174 and resistive wall mode,173
neoclassical tearing mode,175 and Alfvén eigenmode structure.176–180 The same advances in data
analysis, computing, and workflows required for improved validation of plasma microturbulence
models will be invaluable for improving our understanding of these equally important macroscopic
phenomena.

ACKNOWLEDGMENTS

The author has benefitted greatly from discussion on this topic with many members of the research
community over the years, including (in alphabetical order): E. M. Bass, R. V. Bravenec, K. H.
Burrell, J. Candy, T. Carter, J. C. DeBoo, P. H. Diamond, A. Dimits, W. Dorland, T. Görler, B. A.
Grierson, W. Guttenfelder, G. W. Hammett, D. R. Hatch, W. W. Heidbrink, N. T. Howard, F. Jenko,
S. M. Kaye, J. E. Kinsey, T. Luce, G. R. McKee, D. R. Mikkelsen, D. E. Newman, W. M. Nevins,
S. E. Parker, C. C. Petty, M. J. Puschel, T. L. Rhodes, L. Schmitz, M. W. Shafer, S. P. Smith,
P. Snyder, G. M. Staebler, P. Terry, G. R. Tynan, R. E. Waltz, and A. E. White. He thanks them
all for their contributions, and apologizes in advance if any were inadvertently omitted. This
research was supported by the Office of Science of the U.S. Department of Energy under Contract
Nos. DE-FG02-06ER54871 and DE-SC0006957. It used resources of the National Energy Research
Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of
Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

1 AIAA, “Guide for the verification and validation of computational fluid dynamics simulations,” Technical Report No. AIAA G-077-1998(2002), The American Institute of Aeronautics and Astronautics, 1998.
2 Fusion Energy Sciences Advisory Committee, Report on Strategic Planning, 2014.
3 U.S. Department of Energy, The Office of Science’s Fusion Energy Sciences Program: A Ten-Year Perspective (U.S. Dept. of Energy, 2015).
4 ITER Physics Basis Editors and ITER Physics Expert Group Chairs and Co-Chairs and ITER Joint Central Team and Physics Integration Unit, Nucl. Fusion 39, 2137 (1999).
5 ASME, “Standard for verification and validation in computational fluid dynamics and heat transfer,” Technical Report No. ASME Standard V&V 20-2009, The American Society of Mechanical Engineers, 2009.
6 P. J. Roache, Ann. Rev. Fluid Mech. 29, 123 (1997).
7 F. Stern, R. V. Wilson, H. W. Coleman, and E. G. Paterson, J. Fluids Eng. 123, 793 (2001).
8 W. L. Oberkampf and T. Trucano, Prog. Aero. Sci. 38, 209 (2002).
9 W. Oberkampf, T. Trucano, and C. Hirsch, App. Mech. Rev. 57, 345 (2004).
10 W. L. Oberkampf and M. F. Barone, J. Comput. Phys. 217, 5 (2006).
11 Y. Liu, W. Chen, P. Arendt, and H.-Z. Huang, J. Mech. Design 133, 071005 (2011).
12 P. J. Roache, Fundamentals of Verification and Validation (Hermosa Publishers, 2009).
13 W. L. Oberkampf and C. J. Roy, Verification and Validation in Scientific Computing (Cambridge University Press, 2012).
14 P. W. Terry, M. Greenwald, J.-N. Leboeuf, G. R. McKee, D. R. Mikkelsen, W. M. Nevins, D. E. Newman, and D. Stotler, Phys. Plasmas 15, 062503 (2008).
15 M. Greenwald, Phys. Plasmas 17, 058101 (2010).
16 ITER Physics Expert Group on Confinement and Transport and ITER Physics Expert Group on Confinement Modelling and Database and ITER Physics Basis Editors, Nucl. Fusion 39, 2175 (1999).
17 D. W. Ross, R. V. Bravenec, W. Dorland, M. A. Beer, G. W. Hammett, G. R. McKee, R. J. Fonck, M. Murakami, K. H. Burrell, G. L. Jackson, and G. M. Staebler, Phys. Plasmas 9, 177 (2002).
18 D. W. Ross and W. Dorland, Phys. Plasmas 9, 5031 (2002).
19 G. W. Hammett and F. W. Perkins, Phys. Rev. Lett. 64, 3019 (1990).
20 E. J. Doyle, W. A. Houlberg, Y. Kamada, V. Mukhovatov, T. H. Osborne, A. Polevoi, G. Bateman, J. W. Connor, J. G. Cordey, T. Fujita, X. Garbet, T. S. Hahm, L. D. Horton, A. E. Hubbard, F. Imbeaux, F. Jenko, J. Kinsey, Y. Kishimoto, J. Li, T. C. Luce, Y. Martin, M. Ossipenko, V. Parail, A. Peeters, T. L. Rhodes, J. E. Rice, C. M. Roach, V. Rozhansky, F. Ryter, G. Saibene, R. Sartori, A. C. C. Sips, J. A. Snipes, M. Sugihara, E. J. Synakowski, H. Takenaga, T. Takizuka, K. Thomsen, M. R. Wade, H. R. Wilson, and ITPA Transport Physics Topical Group and ITPA Confinement Database and Modelling Topical Group and ITPA Pedestal and Edge Topical Group, Nucl. Fusion 47, S18 (2007).
21 See http://juq.siam.org/cgi-bin/main.plex for SIAM/ASA Journal on Uncertainty Quantification (JUQ).
22 D. Borland and R. Taylor, IEEE Comput. Graphics Appl. 27, 14 (2007).
23 J. G. Cordey, B. Balet, D. Campbell, C. D. Challis, J. P. Christiansen, C. Gormezano, C. Gowers, D. Muir, E. Righi, G. R. Saibene, P. M. Stubberfield, and K. Thomsen, Plasma Phys. Controlled Fusion 38, A67 (1996).
24 C. C. Petty, Phys. Plasmas 15, 080501 (2008).
25 T. C. Luce, C. C. Petty, and J. G. Cordey, Plasma Phys. Controlled Fusion 50, 043001 (2008).
26 B. J. Green and ITER International Team and Participant Teams, Plasma Phys. Controlled Fusion 45, 687 (2003).
27 C. Gormezano, A. Sips, T. Luce, S. Ide, A. Becoulet, X. Litaudon, A. Isayama, J. Hobirk, M. Wade, T. Oikawa, R. Prater, A. Zvonkov, B. Lloyd, T. Suzuki, E. Barbato, P. Bonoli, C. Phillips, V. Vdovin, E. Joffrin, T. Casper, J. Ferron, D. Mazon, D. Moreau, R. Bundy, C. Kessel, A. Fukuyama, N. Hayashi, F. Imbeaux, M. Murakami, A. Polevoi, and H. S. John, Nucl. Fusion 47, S285 (2007).
28 A. W. Leonard, Phys. Plasmas 21, 090501 (2014).
29 T. E. Evans, Plasma Phys. Controlled Fusion 57, 123001 (2015).
30 E. Joffrin, M. Baruzzo, M. Beurskens, C. Bourdelle, S. Brezinsek, J. Bucalossi, P. Buratti, G. Calabro, C. D. Challis, M. Clever, J. Coenen, E. Delabie, R. Dux, P. Lomas, E. de la Luna, P. de Vries, J. Flanagan, L. Frassinetti, D. Frigione, C. Giroud, M. Groth, N. Hawkes, J. Hobirk, M. Lehnen, G. Maddison, J. Mailloux, C. Maggi, G. Matthews, M. Mayoral, A. Meigs, R. Neu, I. Nunes, T. Puetterich, F. Rimini, M. Sertoli, B. Sieglin, A. Sips, G. van Rooij, I. Voitsekhovitch, and J. Contributors, Nucl. Fusion 54, 013011 (2014).
31 H. Sugama and W. Horton, Phys. Plasmas 5, 2560 (1998).
32 S. Braginskii, Review of Plasma Physics (Consultants Bureau, New York, 1965), p. 214.
33 F. Hinton and R. Hazeltine, Rev. Mod. Phys. 48, 239 (1976).
34 P. Helander and D. Sigmar, Collisional Transport in Magnetized Plasmas (Cambridge University Press, Cambridge, 2002).
35 W. Horton, Rev. Mod. Phys. 71, 735 (1999).
36 J. Weiland, Collective Modes in Inhomogeneous Plasmas (Institute of Physics Publishing, London, 1999).
37 E. Frieman and L. Chen, Phys. Fluids 25, 502 (1982).
38 G. W. Hammett, W. Dorland, and F. W. Perkins, Phys. Fluids B 4, 2052 (1992).
39 R. E. Waltz, R. R. Dominguez, and G. W. Hammett, Phys. Fluids B 4, 3138 (1992).
40 W. Dorland and G. W. Hammett, Phys. Fluids B 5, 812 (1993).
41 M. A. Beer and G. W. Hammett, Phys. Plasmas 3, 4046 (1996).
42 P. B. Snyder and G. W. Hammett, Phys. Plasmas 8, 3199 (2001).
43 G. M. Staebler, J. E. Kinsey, and R. E. Waltz, Phys. Plasmas 12, 102508 (2005).
44 B. Scott, Phys. Plasmas 17, 102306 (2010).
45 J. Madsen, Phys. Plasmas 20, 072301 (2013).
46 G. M. Staebler, J. E. Kinsey, and R. E. Waltz, Phys. Plasmas 14, 055909 (2007).
47 R. J. Hastie, Astrophys. Space Sci. 256, 177 (1997).
48 H. P. Furth, J. Killeen, and M. N. Rosenbluth, Phys. Fluids 6, 459 (1963).
49 R. Fitzpatrick, Nucl. Fusion 33, 1049 (1993).
50 D. A. Spong, Phys. Plasmas 22, 055602 (2015).
51 J. Scoville, D. Humphreys, J. Ferron, and P. Gohil, in Proceedings of the 24th Symposium on Fusion Technology SOFT-24 [Fusion Eng. Des. 82, 1045 (2007)].
52 M. Kotschenreuther, G. Rewoldt, and W. Tang, Comput. Phys. Commun. 88, 128 (1995).
53 F. Jenko, W. Dorland, M. Kotschenreuther, and B. Rogers, Phys. Plasmas 7, 1904 (2000).
54 J. Candy and R. Waltz, J. Comput. Phys. 186, 545 (2003).
55 A. G. Peeters, Y. Camenen, F. J. Casson, W. A. Hornsby, A. P. Snodin, D. Strintzi, and G. Szepesi, Comput. Phys. Commun. 180, 2650 (2009).
56 S. Maeyama, T. Watanabe, Y. Idomura, M. Nakata, M. Nunami, and A. Ishizawa, Plasma Fusion Res. 8, 1403150 (2013).
57 Z. Lin, T. Hahm, W. Lee, W. Tang, and R. White, Science 281, 1835 (1998).
58 W. Wang, G. Rewoldt, W. Tang, F. Hinton, J. Manickam, L. Zakharov, R. White, and S. Kaye, Phys. Plasmas 13, 082501 (2006).
59 Y. Chen and S. E. Parker, J. Comput. Phys. 220, 839 (2007).
60 V. Grandgirard, Y. Sarazin, P. Angelino, A. Bottino, N. Crouseilles, G. Darmet, G. Dif-Pradalier, X. Garbet, P. Ghendrih, S. Jolliet, G. Latu, E. Sonnendrücker, and L. Villard, Plasma Phys. Controlled Fusion 49, B173 (2007).
61 S. Jolliet, A. Bottino, P. Angelino, R. Hatzky, T. Tran, B. Mcmillan, O. Sauter, K. Appert, Y. Idomura, and L. Villard, Comput. Phys. Commun. 177, 409 (2007).
62 S. Ku, C. S. Chang, and P. H. Diamond, Nucl. Fusion 49, 115021 (2009).
63 J. Candy, C. Holland, R. E. Waltz, M. R. Fahey, and E. Belli, Phys. Plasmas 16, 060704 (2009).
64 M. Barnes, I. G. Abel, W. Dorland, T. Goerler, G. W. Hammett, and F. Jenko, Phys. Plasmas 17, 056109 (2010), APS Invited Paper DI3.00001.
65 N. Howard, A. White, M. Greenwald, C. Holland, and J. Candy, Phys. Plasmas 21, 032308 (2014).
66 N. T. Howard, C. Holland, A. E. White, M. Greenwald, and J. Candy, Phys. Plasmas 21, 112510 (2014).
67 S. Maeyama, Y. Idomura, T.-H. Watanabe, M. Nakata, M. Yagi, N. Miyato, A. Ishizawa, and M. Nunami, Phys. Rev. Lett. 114, 255002 (2015).
68 N. T. Howard, C. Holland, A. E. White, M. Greenwald, and J. Candy, Nucl. Fusion 56, 014004 (2016).
69 A. Fukuyama, K. Itoh, S. I. Itoh, M. Yagi, and M. Azumi, Plasma Phys. Controlled Fusion 37, 611 (1995).
70 C. Bourdelle, X. Garbet, F. Imbeaux, A. Casati, N. Dubuit, R. Guirlet, and T. Parisot, Phys. Plasmas 14, 112501 (2007).
71 T. Rafiq, A. H. Kritz, J. Weiland, A. Y. Pankin, and L. Luo, Phys. Plasmas 20, 032506 (2013).
72 M. Kotschenreuther, W. Dorland, M. A. Beer, and G. W. Hammett, Phys. Plasmas 2, 2381 (1995).
73 R. E. Waltz, G. M. Staebler, W. Dorland, G. W. Hammett, M. Kotschenreuther, and J. A. Konings, Phys. Plasmas 4, 2482 (1997).
74 X. Garbet, P. Mantica, F. Ryter, G. Cordey, F. Imbeaux, C. Sozzi, A. Manini, E. Asp, V. Parail, R. Wolf, and the JET EFDA Contributors, Plasma Phys. Controlled Fusion 46, 1351 (2004).
75 F. Ryter, Y. Camenen, J. C. DeBoo, F. Imbeaux, P. Mantica, G. Regnoli, C. Sozzi, U. Stroth, A. Upgrade, DIII-D, FTU, J.-E. contributors, TCV, T. Supra, and W.-A. Teams, Plasma Phys. Controlled Fusion 48, B453 (2006).
76 P. Mantica, D. Strintzi, T. Tala, C. Giroud, T. Johnson, H. Leggate, E. Lerche, T. Loarer, A. G. Peeters, A. Salmi, S. Sharapov, D. Van Eester, P. C. de Vries, L. Zabeo, and K.-D. Zastrow, Phys. Rev. Lett. 102, 175002 (2009).
77 J. C. DeBoo, C. C. Petty, A. E. White, K. H. Burrell, E. J. Doyle, J. C. Hillesheim, C. Holland, G. R. McKee, T. L. Rhodes, L. Schmitz, S. P. Smith, G. Wang, and L. Zeng, Phys. Plasmas 19, 082518 (2012).
78 J. C. Hillesheim, J. C. DeBoo, W. A. Peebles, T. A. Carter, G. Wang, T. L. Rhodes, L. Schmitz, G. R. McKee, Z. Yan, G. M. Staebler, K. H. Burrell, E. J. Doyle, C. Holland, C. C. Petty, S. P. Smith, A. E. White, and L. Zeng, Phys. Rev. Lett. 110, 045003 (2013).
79 C. Roach, M. Walters, R. Budny, F. Imbeaux, T. Fredian, M. Greenwald, J. Stillerman, D. Alexander, J. Carlsson, J. Cary, F. Ryter, J. Stober, P. Gohil, C. Greenfield, M. Murakami, G. Bracco, B. Esposito, M. Romanelli, V. Parail, P. Stubberfield, I. Voitsekhovitch, C. Brickley, A. Field, Y. Sakamoto, T. Fujita, T. Fukuda, N. Hayashi, G. Hogeweij, A. Chudnovskiy, N. Kinerva, C. Kessel, T. Aniel, G. Hoang, J. Ongena, E. Doyle, W. Houlberg, A. Polevoi, and ITPA Confinement Database and Modelling Topical Group and ITPA Transport Physics Topical Group, Nucl. Fusion 48, 125001 (2008).
80 A. M. Dimits, G. Bateman, M. A. Beer, B. I. Cohen, W. Dorland, G. W. Hammett, C. Kim, J. E. Kinsey, M. Kotschenreuther, A. H. Kritz, L. L. Lao, J. Mandrekas, W. M. Nevins, S. E. Parker, A. J. Redd, D. E. Shumaker, R. Sydora, and J. Weiland, Phys. Plasmas 7, 969 (2000).
81 G. L. Falchetto, B. D. Scott, P. Angelino, A. Bottino, T. Dannert, V. Grandgirard, S. Janhunen, F. Jenko, S. Jolliet, A. Kendl, B. F. McMillan, V. Naulin, A. H. Nielsen, M. Ottaviani, A. G. Peeters, M. J. Pueschel, D. Reiser, T. T. Ribeiro, and M. Romanelli, Plasma Phys. Controlled Fusion 50, 124015 (2008).
82 D. R. Ernst, P. T. Bonoli, P. J. Catto, W. Dorland, C. L. Fiore, R. S. Granetz, M. Greenwald, A. E. Hubbard, M. Porkolab, M. H. Redi, J. E. Rice, K. Zhurovich, and A. C.-M. Group, Phys. Plasmas 11, 2637 (2004).
83 P. Mantica, C. Angioni, C. Challis, G. Colyer, L. Frassinetti, N. Hawkes, T. Johnson, M. Tsalas, P. C. deVries, J. Weiland, B. Baiocchi, M. N. A. Beurskens, A. C. A. Figueiredo, C. Giroud, J. Hobirk, E. Joffrin, E. Lerche, V. Naulin, A. G. Peeters, A. Salmi, C. Sozzi, D. Strintzi, G. Staebler, T. Tala, D. Van Eester, and T. Versloot, Phys. Rev. Lett. 107, 135004 (2011).
84 W. Guttenfelder, J. Candy, S. M. Kaye, W. M. Nevins, E. Wang, J. Zhang, R. E. Bell, N. A. Crocker, G. W. Hammett, B. P. LeBlanc, D. R. Mikkelsen, Y. Ren, and H. Yuh, Phys. Plasmas 19, 056119 (2012).
85 C. Holland, J. Kinsey, J. DeBoo, K. Burrell, T. Luce, S. Smith, C. Petty, A. White, T. Rhodes, L. Schmitz, E. Doyle, J. Hillesheim, G. McKee, Z. Yan, G. Wang, L. Zeng, B. Grierson, A. Marinoni, P. Mantica, P. Snyder, R. Waltz, G. Staebler, and J. Candy, Nucl. Fusion 53, 083027 (2013).
86 112
J. Citrin, F. Jenko, P. Mantica, D. Told, C. Bourdelle, J. Garcia, J. W. H. E. S. John, T. S. Taylor, Y. R. Lin-Liu, and A. D. Turnbull, Plasma
Haverkort, G. M. D. Hogeweij, T. Johnson, and M. J. Pueschel, Phys. Phys. Controlled Nucl. Fusion Res. 3, 603 (1994).
113
Rev. Lett. 111, 155001 (2013). A. E. White, L. Schmitz, G. R. McKee, C. Holland, W. A. Peebles, T. A.
87
J. Citrin, F. Jenko, P. Mantica, D. Told, C. Bourdelle, R. Dumont, J. Carter, M. W. Shafer, M. E. Austin, K. H. Burrell, J. Candy, J. C. DeBoo,
Garcia, J. Haverkort, G. Hogeweij, T. Johnson, M. Pueschel, and JET- E. J. Doyle, M. A. Makowski, R. Prater, T. L. Rhodes, G. M. Staebler, G.
EFDA contributors, Nucl. Fusion 54, 023008 (2014). R. Tynan, R. E. Waltz, and G. Wang, Phys. Plasmas 15, 056116 (2008).
88 114
D. Ernst, K. Burrell, W. Guttenfelder, T. Rhodes, L. Schmitz, A. Dimits, C. Holland, A. White, G. McKee, M. Shafer, J. Candy, R. Waltz, L.
E. Doyle, B. Grierson, M. Greenwald, C. Holland, G. McKee, R. Perkins, Schmitz, and G. Tynan, Phys. Plasmas 16, 052301 (2009).
115
C. Petty, J. Rost, D. Truong, G. Wang, L. Zeng, and the DIII-D and W. W. Heidbrink, Phys. Plasmas 15, 055501 (2008).
116
Alcator C-Mod Teams, in Proceedings of the 2014 IAEA FEC C. S. Chang and F. L. Hinton, Phys. Fluids 25, 1493 (1982).
117
Conference (2014), Paper No. EX/2. W. A. Houlberg, K. C. Shaing, S. P. Hirshman, and M. C. Zarnstorff,
89
N. T. Howard, A. E. White, M. Greenwald, C. Holland, J. Candy, and J. E. Phys. Plasmas 4, 3230 (1997).
118
Rice, Plasma Phys. Controlled Fusion 56, 124004 (2014). E. A. Belli and J. Candy, Plasma Phys. Controlled Fusion 50, 095010
90
J. Citrin, J. Garcia, T. G€orler, F. Jenko, P. Mantica, D. Told, C. (2008); 54, 015015 (2012).
119
Bourdelle, D. R. Hatch, G. M. D. Hogeweij, T. Johnson, M. J. Pueschel, C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine
and M. Schneider, Plasma Phys. Controlled Fusion 57, 014032 (2015). Learning (MIT Press, Cambridge, MA, 2006).
91 120
A. Ba~ n!
on Navarro, T. Happel, T. G€orler, F. Jenko, J. Abiteboul, A. M. A. Chilenski, M. Greenwald, Y. Marzouk, N. T. Howard, A. E. White,
Bustos, H. Doerk, D. Told, and ASDEX Upgrade Team, Phys. Plasmas J. E. Rice, and J. R. Walk, Nucl. Fusion 55, 023012 (2015).
121
22, 042513 (2015). R. Fischer, C. j. Fuchs, B. Kurzan, W. Suttrop, E. Wolfrum, and ASDEX
92
N. Bonanomi, P. Mantica, G. Szepesi, N. Hawkes, E. Lerche, P. Upgrade Team, Fusion Sci. Technol. 58, 675 (2010).
122
Migliano, A. Peeters, C. Sozzi, M. Tsalas, D. V. Eester, and JET B. P. van Milligen, T. Estrada, E. Ascas!ıbar, D. Tafalla, D. L!
opez-Bruna,
Contributors, Nucl. Fusion 55, 113016 (2015). A. L. Fraguas, J. A. Jim!enez, I. Garc!ıa-Cort!es, A. Dinklage, and R.
93
S. Smith, C. Petty, A. White, C. Holland, R. Bravenec, M. Austin, L. Fischer, Rev. Sci. Instrum. 82, 073503 (2011).
123
Zeng, and O. Meneghini, Nucl. Fusion 55, 083011 (2015). J. Svensson, O. Ford, D. C. McDonald, A. Meakins, A. Werner, M. Brix,
94
A. E. White, N. T. Howard, A. J. Creely, M. A. Chilenski, M. Greenwald, A. Boboc, M. Beurskens, and JET EFDA Contributors, Contrib. Plasma
A. E. Hubbard, J. W. Hughes, E. Marmar, J. E. Rice, J. M. Sierchio, C. Phys. 51, 152 (2011).
124
Sung, J. R. Walk, D. G. Whyte, D. R. Mikkelsen, E. M. Edlund, C. Kung, M. Galante, L. Reusch, D. D. Hartog, P. Franz, J. Johnson, M. McGarry,
C. Holland, J. Candy, C. C. Petty, M. L. Reinke, and C. Theiler, Phys. M. Nornberg, and H. Stephens, Nucl. Fusion 55, 123016 (2015).
125
Plasmas 22, 056109 (2015). K. H. Burrell, W. P. West, E. J. Doyle, M. E. Austin, J. S. deGrassie, P.
95
R. Prater, D. Farina, Y. Gribov, R. Harvey, A. Ram, Y.-R. Lin-Liu, E. Gohil, C. M. Greenfield, R. J. Groebner, R. Jayakumar, D. H. Kaplan, L.
Poli, A. Smirnov, F. Volpe, E. Westerhof, A. Zvonkov, and the ITPA L. Lao, A. W. Leonard, M. A. Makowski, G. R. McKee, W. M. Solomon,
Steady State Operation Topical Group, Nucl. Fusion 48, 035006 (2008). D. M. Thomas, T. L. Rhodes, M. R. Wade, G. Wang, J. G. Watkins, and
96
R. Budny, Nucl. Fusion 34, 1247 (1994). L. Zeng, Plasma Phys. Controlled Fusion 46, A165 (2004).
97 126
R. V. Budny, D. R. Ernst, T. S. Hahm, D. C. McCune, J. P. Christiansen, H. Bindslev and D. Bartlett, A Technique for Improving the Relative
J. G. Cordey, C. G. Gowers, K. Guenther, N. Hawkes, O. N. Jarvis, P. M. Accuracy of JET ECE Temperature Profiles, Report No. JET-R-88-04,
Stubberfield, K.-D. Zastrow, L. D. Horton, G. Saibene, R. Sartori, K. 1988.
127
Thomsen, and M. G. von Hellermann, Phys. Plasmas 7, 5038 (2000). D. R. Mikkelsen and W. Dorland, Bull. Am. Phys. Soc. 50, 196 (2005).
98 128
W. Heidbrink and G. Sadler, Nucl. Fusion 34, 535 (1994). K. H. Burrell, Phys. Plasmas 4, 1499 (1997).
99 129
Y. Peysson and J. Decker, Physics of Plasmas 15, 092509 (2008). P. W. Terry, Rev. Mod. Phys. 72, 109 (2000).
100 130
O. Meneghini, S. Shiraiwa, I. Faust, R. R. Parker, A. Schmidt, and G. C. Holland, L. Schmitz, T. Rhodes, W. Peebles, J. Hillesheim, G. Wang,
Wallace, Fusion Sci. Technol. 60, 40 (2011). L. Zeng, E. Doyle, S. Smith, R. Prater, K. Burrell, J. Candy, R. Waltz, J.
101
J. C. Wright, A. Bader, L. A. Berry, P. T. Bonoli, R. W. Harvey, E. F. Kinsey, G. Staebler, J. DeBoo, C. Petty, G. McKee, Z. Yan, and A.
Jaeger, J.-P. Lee, A. Schmidt, E. D’Azevedo, I. Faust, C. K. Phillips, and White, Phys. Plasmas 18, 056113 (2011).
131
E. Valeo, Plasma Phys. Controlled Fusion 56, 045007 (2014). T. L. Rhodes, C. Holland, S. P. Smith, A. E. White, K. H. Burrell, J.
102
Y. Lin, S. J. Wukitch, P. T. Bonoli, E. Marmar, D. Mossessian, E. Candy, J. C. DeBoo, E. J. Doyle, J. C. Hillesheim, J. E. Kinsey, G. R.
Nelson-Melby, P. Phillips, M. Porkolab, G. Schilling, S. Wolfe, and J. McKee, D. Mikkelsen, W. A. Peebles, C. C. Petty, R. Prater, S. Parker,
Wright, Plasma Phys. Controlled Fusion 45, 1013 (2003). Y. Chen, L. Schmitz, G. M. Staebler, R. E. Waltz, G. Wang, Z. Yan, and
103
R. Prater, Phys. Plasmas 11, 2349 (2004). L. Zeng, Nucl. Fusion 51, 063022 (2011).
104 132
P. T. Bonoli, J. Ko, R. Parker, A. E. Schmidt, G. Wallace, J. C. Wright, J. Callen, R. Colchin, R. Fowler, D. McAlees, and J. Rome, in
C. L. Fiore, A. E. Hubbard, J. Irby, E. Marmar, M. Porkolab, D. Terry, S. Proceedings of the 5th International Conference on Plasma Physics and
M. Wolfe, S. J. Wukitch, the Alcator C-Mod Team, J. R. Wilson, S. Controlled Nuclear Fusion Research, Tokyo (1974), Vol. 1, p. 645.
133
Scott, E. Valeo, C. K. Phillips, and R. W. Harvey, Phys. Plasmas 15, K. Matsuda, IEEE Trans. Plasma Sci. 17, 6 (1989).
134
056117 (2008) The GYRO source code is available at https://github.com/gafusion/
105
P. T. Bonoli, Phys. Plasmas 21, 061508 (2014). gacode. Version ID r4-864-g6ea4 was used in this work.
106 135
W. W. Heidbrink, Rev. Sci. Instrum. 81, 10D727 (2010). J. Candy, Plasma Phys. Controlled Fusion 51, 105009 (2009).
107 136
Y. Lin, S. Wukitch, A. Parisot, J. C. Wright, N. Basse, P. Bonoli, E. R. V. Bravenec, J. Candy, M. Barnes, and C. Holland, Phys. Plasmas 18,
Edlund, L. Lin, M. Porkolab, G. Schilling, and P. Phillips, Plasma Phys. 122505 (2011).
137
Controlled Fusion 47, 1207 (2005). J. Chowdhury, W. Wan, Y. Chen, S. E. Parker, R. J. Groebner, C.
108
N. Tsujii, M. Porkolab, P. T. Bonoli, E. M. Edlund, P. C. Ennever, Y. Holland, and N. T. Howard, Phys. Plasmas 21, 112503 (2014).
138
Lin, J. C. Wright, S. J. Wukitch, E. F. Jaeger, D. L. Green, and R. W. T. G€orler, A. E. White, D. Told, F. Jenko, C. Holland, and T. L. Rhodes,
Harvey, Phys. Plasmas 22, 082502 (2015). Phys. Plasmas 21, 122307 (2014).
109 139
I. H. Hutchinson, Principles of Plasma Diagnostics (Cambridge T. G€orler, A. E. White, D. Told, F. Jenko, C. Holland, and T. L. Rhodes,
University Press, 2005). Fusion Sci. Technol. 69, 537 (2015).
110 140
A. Donn!e, A. Costley, R. Barnsley, H. Bindslev, R. Boivin, G. Conway, F. Jenko, J. Plasma Fusion Res. 6, 11 (2004).
141
R. Fisher, R. Giannella, H. Hartfuss, M. von Hellermann, E. Hodgson, L. R. E. Waltz, J. Candy, and M. Fahey, Phys. Plasmas 14, 056116 (2007).
142
Ingesson, K. Itami, D. Johnson, Y. Kawano, T. Kondoh, A. Krasilnikov, J. Candy, R. E. Waltz, M. R. Fahey, and C. Holland, Plasma Phys.
Y. Kusama, A. Litnovsky, P. Lotte, P. Nielsen, T. Nishitani, F. Orsitto, B. Controlled Fusion 49, 1209 (2007).
143
Peterson, G. Razdobarin, J. Sanchez, M. Sasao, T. Sugie, G. Vayakis, V. T. G€orler and F. Jenko, Phys. Rev. Lett. 100, 185002 (2008).
144
Voitsenya, K. Vukolov, C. Walker, K. Young, and the ITPA Topical T. G€orler and F. Jenko, Phys. Plasmas 15, 102508 (2008).
145
Group on Diagnostics, Nucl. Fusion 47, S337 (2007). L. Schmitz, C. Holland, T. L. Rhodes, G. Wang, L. Zeng, A. E. White,
111
A. E. White, W. A. Peebles, T. L. Rhodes, C. Holland, G. Wang, L. J. C. Hillesheim, W. A. Peebles, S. P. Smith, R. Prater, G. R. McKee, Z.
Schmitz, T. A. Carter, J. C. Hillesheim, E. J. Doyle, L. Zeng, G. R. Yan, W. M. Solomon, K. H. Burrell, C. T. Holcomb, E. J. Doyle, J. C.
McKee, G. M. Staebler, R. E. Waltz, J. C. DeBoo, C. C. Petty, and K. H. DeBoo, M. E. Austin, J. S. deGrassie, and C. C. Petty, Nucl. Fusion 52,
Burrell, Phys. Plasmas 17, 056103 (2010). 023003 (2012).
146 168
N. T. Howard, C. Holland, A. E. White, M. Greenwald, and J. Candy, ITER Physics Expert Group on Disruptions, Plasma Control, and MHD
Plasma Phys. Controlled Fusion 57, 065009 (2015). and ITER Physics Basis Editors, Nucl. Fusion 39, 2251 (1999).
147 169
P. Ricci, C. Theiler, A. Fasoli, I. Furno, K. Gustafson, D. Iraji, and J. T. Hender, J. Wesley, J. Bialek, A. Bondeson, A. Boozer, R. Buttery, A.
Loizu, Phys. Plasmas 18, 032109 (2011). Garofalo, T. Goodman, R. Granetz, Y. Gribov, O. Gruber, M.
148
P. Ricci, F. Riva, C. Theiler, A. Fasoli, I. Furno, F. D. Halpern, and J. Gryaznevich, G. Giruzzi, S. G€ unter, N. Hayashi, P. Helander, C. Hegna,
Loizu, Phys. Plasmas 22, 055704 (2015). D. Howell, D. Humphreys, G. Huysmans, A. Hyatt, A. Isayama, S.
149
K. E. Taylor, J. Geo. Res. 106, 7183 (2001). Jardin, Y. Kawano, A. Kellman, C. Kessel, H. Koslowski, R. L. Haye, E.
150
T. L. Rhodes, J.-N. Leboeuf, R. D. Sydora, R. J. Groebner, E. J. Doyle, Lazzaro, Y. Liu, V. Lukash, J. Manickam, S. Medvedev, V. Mertens, S.
G. R. McKee, W. A. Peebles, C. L. Rettig, L. Zeng, and G. Wang, Phys. Mirnov, Y. Nakamura, G. Navratil, M. Okabayashi, T. Ozeki, R.
Plasmas 9, 2141 (2002). Paccagnella, G. Pautasso, F. Porcelli, V. Pustovitov, V. Riccardo, M.
151
P. Ricci, F. D. Halpern, S. Jolliet, J. Loizu, A. Mosetto, A. Fasoli, I. Sato, O. Sauter, M. Schaffer, M. Shimada, P. Sonato, E. Strait, M.
Furno, and C. Theiler, Plasma Phys. Controlled Fusion 54, 124047 Sugihara, M. Takechi, A. Turnbull, E. Westerhof, D. Whyte, R. Yoshino,
(2012). H. Zohm, and the ITPA MHD, Disruption and Magnetic Control Topical
152
A. Fasoli, B. Labit, M. McGrath, S. H. M€uller, G. Plyushchev, M. Group, Nucl. Fusion 47, S128 (2007).
Podest!a, and F. M. Poli, Phys. Plasmas 13, 055902 (2006); A. Fasoli, A. 170
See https://en.wikipedia.org/wiki/F1_score for details on the how F1
Burckel, L. Federspiel, I. Furno, K. Gustafson, D. Iraji, B. Labit, J. Loizu, score is calculated and used for binary classification tests.
G. Plyushchev, P. Ricci, C. Theiler, A. Diallo, S. H. Mueller, M. Podest!a, 171
N. A. Macmillan and C. D. Creelman, Detection Theory: A User’s Guide
and F. Poli, Plasma Phys. Controlled Fusion 52, 124020 (2010).
153 (Lawrence Erlbaum Associates, Inc., 2005).
R. V. Bravenec and W. M. Nevins, Rev. Sci. Instrum. 77, 015101 (2006). 172
154 T. Fawcett, Pattern Recognit. Lett. 27, 861 (2006).
D. A. Russell, J. R. Myra, D. A. D’Ippolito, T. L. Munsat, Y. Sechrest, R. 173
A. Turnbull, D. Brennan, M. Chu, L. Lao, J. Ferron, A. Garofalo, P.
J. Maqueda, D. P. Stotler, S. J. Zweben, and T. N. Team, Phys. Plasmas
Snyder, J. Bialek, I. Bogatu, J. Callen, M. Chance, K. Comer, D. Edgell,
18, 022306 (2011).
155 S. Galkin, D. Humphreys, J. Kim, R. L. Haye, T. Luce, G. Navratil, M.
G. R. McKee, R. J. Fonck, D. K. Gupta, D. J. Schlossberg, M. W. Shafer,
Okabayashi, T. Osborne, B. Rice, E. Strait, T. Taylor, and H. Wilson,
and R. L. Boivin, Rev. Sci. Instrum. 77, 10F104 (2006).
156 Nucl. Fusion 42, 917 (2002).
M. W. Shafer, R. J. Fonck, G. R. McKee, and D. J. Schlossberg, Rev. Sci. 174
A. D. Turbull, D. P. Brennan, M. S. Chu, L. L. Lao, and P. B. Snyder,
Instrum. 77, 10F110 (2006).
157
C. Holland, J. Candy, R. E. Waltz, A. E. White, G. R. McKee, M. W. Fusion Sci. Technol. 48, 875 (2005).
175
Shafer, L. Schmitz, and G. R. Tynan, J. Phys.: Conf. Ser. 125, 012043 J. H. Yu, M. A. Van Zeeland, M. S. Chu, V. A. Izzo, and R. J. La Haye,
(2008). Phys. Plasmas 16, 056114 (2009).
176
158
J. S. Bendat and A. G. Piersol, Engineering Applications of Correlation M. A. Van Zeeland, G. J. Kramer, M. E. Austin, R. L. Boivin, W. W.
and Spectral Analysis (John Wiley & Sons, Inc., 1993). Heidbrink, M. A. Makowski, G. R. McKee, R. Nazikian, W. M. Solomon,
159
J. S. Bendat and A. G. Piersol, Random Data: Analysis and Measurement and G. Wang, Phys. Rev. Lett. 97, 135001 (2006).
177
Procedures (John Wiley & Sons, Inc., 2010). I. G. J. Classen, P. Lauber, D. Curran, J. E. Boom, B. J. Tobias, C. W.
160
P. H. Diamond, S.-I. Itoh, K. Itoh, and T. S. Hahm, Plasma Phys. Domier, N. C. Luhmann, Jr., H. K. Park, M. G. Munoz, B. Geiger, M.
Controlled Fusion 47, R35 (2005). Maraschek, M. A. V. Zeeland, S. da Graa, and the ASDEX Upgrade
161
P. W. Terry, D. A. Baver, and S. Gupta, Phys. Plasmas 13, 022307 Team, Plasma Phys. Controlled Fusion 53, 124018 (2011).
178
(2006). B. J. Tobias, I. G. J. Classen, C. W. Domier, W. W. Heidbrink, N. C.
162
K. D. Makwana, P. W. Terry, J.-H. Kim, and D. R. Hatch, Phys. Plasmas Luhmann, R. Nazikian, H. K. Park, D. A. Spong, and M. A. Van Zeeland,
18, 012302 (2011). Phys. Rev. Lett. 106, 075003 (2011).
163 179
D. R. Hatch, P. W. Terry, F. Jenko, F. Merz, and W. M. Nevins, Phys. B. J. Tobias, R. L. Boivin, J. E. Boom, I. G. J. Classen, C. W. Domier, A. J.
Rev. Lett. 106, 115003 (2011). H. Donn!e, W. W. Heidbrink, N. C. Luhmann, T. Munsat, C. M. Muscatello,
164 R. Nazikian, H. K. Park, D. A. Spong, A. D. Turnbull, M. A. Van Zeeland,
M. W. Shafer, R. J. Fonck, G. R. McKee, C. Holland, A. E. White, and D.
J. Schlossberg, Phys. Plasmas 19, 032504 (2012). G. S. Yun, and D.-D. Team, Phys. Plasmas 18, 056107 (2011).
165 180
P. Ricci, C. Theiler, A. Fasoli, I. Furno, B. Labit, S. H. M€ uller, M. D. A. Spong, E. M. Bass, W. Deng, W. W. Heidbrink, Z. Lin, B. Tobias,
Podest!a, and F. M. Poli, Phys. Plasmas 16, 055703 (2009). M. A. Van Zeeland, M. E. Austin, C. W. Domier, and N. C. Luhmann,
166
N. Wiener, Am. J. Math. 60, 897 (1938). Phys. Plasmas 19, 082511 (2012).
167 181
H. N. Najm, Ann. Rev. Fluid Mech. 41, 35 (2009). J. L. Luxon, Nucl. Fusion 42, 614 (2002).
