Abstract: We provide reasons and evidence supporting the informal rule that the
number of runs for an effective initial computer experiment should be about 10 times
the input dimension. Our arguments quantify two key characteristics of computer
codes that affect the sample size required for a desired level of accuracy when
approximating the code via a Gaussian process (GP). The first characteristic is
the total sensitivity of a code output variable to all input variables. The second
corresponds to the way this total sensitivity is distributed across the input variables,
specifically the possible presence of a few prominent input factors and many impotent
ones (effect sparsity). Both measures relate directly to the correlation structure in
the GP approximation of the code. In this way, the article moves towards a more formal
treatment of sample size for a computer experiment. The evidence supporting these
arguments stems primarily from a simulation study and from specific codes modeling climate
and ligand activation of G-protein.
This is joint work with Jerome Sacks (NISS) and William Welch (UBC).
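
As a rough illustration of the quantities named in the abstract, the sketch below sizes an initial design by the 10-runs-per-input rule and evaluates a GP correlation in which a few inputs dominate. It rests on assumptions not stated in the abstract: a product Gaussian correlation with one parameter theta_j per input, the sum of the theta_j as a stand-in for total sensitivity, a random Latin hypercube as the design, and made-up theta values. All function names are hypothetical.

```python
import math
import random

def runs_for_dimension(d, runs_per_input=10):
    """Rule-of-thumb sample size: about 10 runs per input dimension."""
    return runs_per_input * d

def latin_hypercube(n, d, rng):
    """Random Latin hypercube design with n runs on [0, 1]^d."""
    columns = []
    for _ in range(d):
        # One point in each of n equal-width strata, then shuffle the order.
        col = [(i + rng.random()) / n for i in range(n)]
        rng.shuffle(col)
        columns.append(col)
    # Transpose the d columns into a list of n design points.
    return [list(point) for point in zip(*columns)]

def gaussian_correlation(x, y, theta):
    """Assumed product Gaussian correlation: exp(-sum_j theta_j (x_j - y_j)^2)."""
    return math.exp(-sum(t * (a - b) ** 2 for t, a, b in zip(theta, x, y)))

d = 5
n = runs_for_dimension(d)
design = latin_hypercube(n, d, random.Random(0))

# Illustrative (made-up) effect sparsity: two active inputs, three nearly inert.
theta = [5.0, 2.0, 0.01, 0.01, 0.01]

# Assumed proxy for total sensitivity: the sum of the correlation parameters.
total_sensitivity = sum(theta)

# Correlation between the first two design points under this structure.
r01 = gaussian_correlation(design[0], design[1], theta)
```

In this parameterization a larger theta_j makes the correlation decay faster along input j, so the same sum of theta values can be spread evenly over all inputs or concentrated on a few, which is the distinction between the two characteristics described above.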