QPRC Invited Session Abstracts
Session Title: Measurement System Analysis: Methods and Applications
Organizer: Connie Borror, Arizona State University West
Characteristics of Technical Variation in Expression Microarray Assays
Author: Walter Liggett (National Institute of Standards and Technology)
Speaker: Walter Liggett
email: walter.liggett "AT" nist.gov
Technical variation in expression microarray assays affects the results of a particular biological study according to choices of study design, normalization, and other details. Questions about technical variation are often raised, however, before these choices have been made. In response to the need reflected in such questions, this talk presents generally applicable statistical methods for summarizing technical variation. I will discuss how to assess quantitatively the reproducibility of expression microarrays before it has been determined how the microarrays are to be used in an experiment. I will also demonstrate how the purposes served by the univariate measurement properties of repeatability and reproducibility can be extended to the expression microarray case.
Assessing Binary Measurement Systems
Authors: Oana Danila, Stefan Steiner, and R. Jock MacKay (University of Waterloo)
Speaker: Stefan Steiner
email: shsteine "AT" uwaterloo.ca
Binary measurement systems are widely used in manufacturing, medical, and other contexts. To support production and quality improvement, it is critical that the misclassification rates of such binary measurement systems be assessed. In this talk we compare and contrast some existing methods of assessing binary measurement systems and propose some new plans. We focus on the situation where there is no gold standard measurement method and thus each part’s true class is unknown. We quantify the benefit of assuming a known pass rate, as would be reasonable when the binary measurement system is in current use; explore a variety of sampling plans for selecting the parts that will be repeatedly classified in the measurement study; and assess the precision of the estimate of the proportion of good parts (a property of the manufacturing process) obtained from the measurement assessment investigation.
Assessing the Effect of Measurement Error on Process Yield
Authors: Greg Larsen and Vick Agarwal (Agilent Technologies)
Speaker: Greg Larsen
email: greg_larsen "AT" agilent.com
Gas chromatographs (GCs) are used to detect various organic compounds in samples of material. A GC uses thin glass columns containing a polymer which interacts with the compounds. The column manufacturing process yield is well below world-class levels and presents a large opportunity for improvement. A Six Sigma project was undertaken to improve the yields and thereby improve product profitability. This talk describes the effort to model the measurement error and the total process variation so that yield improvements could be predicted for various hypothetical improvements to the measurement process. This information was then used to target the measurement parameters on which to focus improvement efforts. Simulation was used to estimate yield improvement, and a regression model was created to generalize the simulation results for easier application to different test parameters. Ultimately, a spreadsheet tool was developed to support the project.
Session Title: Design and Analysis of Industrial Experiments
Organizer: Rob McLeod, University of Winnipeg
Axial Distance Choices in a Split-Plot CCD
Authors: Scott Kowalski, Li Wang, and Geoff Vining (Minitab Inc.)
Speaker: Scott Kowalski
email: SKowalski "AT" minitab.com
The central composite design is a popular response surface design for fitting second-order models. The axial distance can be chosen to satisfy various properties of the design such as rotatability and/or orthogonal blocking. In the completely randomized case, Box and Hunter (1957) provide the machinery for calculating these distances. However, many real life experiments have restrictions that do not allow for complete randomization. In these cases, it is common to run a split-plot experiment. This talk looks at choices for the axial distance within the split-plot framework.
On Frequentist and Bayesian Approaches to Inference in Industrial Split-plot Screening Experiments
Author: John Brewster (University of Manitoba)
Speaker: John Brewster
email: john_brewster "AT" umanitoba.ca
If some factors are hard to vary in industrial experiments, then restrictions on randomization often lead to the use of designs with a split-plot structure. Common designs for screening purposes are two-level fractional factorial split-plot (FFSP) designs. If center points are added to such designs, then estimates of the variance components associated with the whole-plot and subplot errors can be obtained. However, these estimates are often based on few degrees of freedom, particularly at the whole-plot level, because of the cost of resetting the levels of the whole-plot factors. As a consequence, it is not unusual to encounter difficulties with inferences. For example, the usual (frequentist) estimate of the whole-plot error variance component may be negative. Here we discuss this problem from both frequentist and Bayesian points of view and show that the post-data pivotal (PDP) approach can provide a useful way of thinking in such situations. This approach has both a Bayesian interpretation and a conditional (frequentist) interpretation and leads to estimates which satisfy the parameter constraints of the model. The PDP estimators also dominate the usual frequentist estimators in a decision-theoretic sense.
A Bayesian Framework for Planning Reliability Experiments
Author: Michael Hamada (Los Alamos National Laboratory)
Speaker: Michael Hamada
email: hamada "AT" lanl.gov
In this talk, I consider planning for reliability data collection, i.e., how to optimally collect data, given a limited amount of resources. I will discuss various planning criteria and present a simulation-based framework using a Bayesian approach to evaluate these criteria. Data collection planning can involve single and multiple planning variables.
For situations involving multiple planning variables, a genetic algorithm can be used to find a near optimal data collection plan. I will illustrate the Bayesian framework for reliability data collection planning with a number of examples.
Session Title: Countering Common Misperceptions and Misunderstandings in Statistical Analysis
Organizer: Will Guthrie, National Institute of Standards and Technology
Chair: Lynne Hare, Kraft Foods
Misconceptions about Statistics in an Industrial Setting
Author: William Brenneman (The Procter & Gamble Company)
Speaker: William Brenneman
email: brenneman.wa "AT" pg.com
Industrial statisticians frequently encounter misconceptions about statistics when collaborating with non-statisticians. A few of these misconceptions involve sampling plans, design of experiments and the use of statistical software. We discuss the potential root causes of these misconceptions along with ways to contend with them.
Basic Points of Confusion with Inference
Author: Mark Bailey (SAS Institute Inc.)
Speaker: Mark Bailey
email: Mark.Bailey "AT" sas.com
Many consumers and producers of statistics remain confused by the basic terminology and reasoning involved in the frequentist version of a hypothesis test in spite of the broad applicability and popularity of this technique. They may eventually come to treat it like a 'black box,' applying it mechanically and losing the essential qualities of each component: two exclusive hypotheses, significance, a sample statistic and its sampling distribution, a sample p-value, and power. I will illustrate the meaning of each part and the reason why each is vital, even after you have done such a hypothesis test 'a million times.'
Countering Misconceptions Associated with Correlation in Uncertainty Assessment
Author: Will Guthrie (National Institute of Standards and Technology)
Speaker: Will Guthrie
email: will.guthrie "AT" nist.gov
The current de facto standard for uncertainty assessment in metrology is essentially the delta method, as outlined in the ISO Guide to the Expression of Uncertainty in Measurement. In the decade and a half since its publication, a continually increasing number of metrologists and other scientists have begun using these methods to express the uncertainty in their measurement results. Although these new users have made excellent progress in understanding the statistical concepts underlying propagation of uncertainty, there are still misconceptions that cause confusion and can lead to inaccurate uncertainty statements. In particular, issues associated with the recognition and handling of correlations among the input values needed to obtain a measurement result are often a major source of confusion. Some common misconceptions include the ideas that 1) correlation is not frequently encountered in metrological applications, 2) correlations between parameter values do not affect the uncertainty of a measurement result, and 3) ignoring correlations in an uncertainty assessment will give conservative results. This talk will illustrate some of these misunderstandings in further detail and provide strategies that can help clarify these concepts for users.
Session Title: Issues and Applications Involving SPC, Lean, and Six Sigma
Organizer: Aparna Huzurbazar, University of New Mexico
Linking Statistical Thinking to Six Sigma
Author: Lynne Hare (Kraft Foods)
Speaker: Lynne Hare
email: lynne.hare "AT" kraft.com
While many will say they engage in Statistical Thinking, few can actually tell you what it is. We begin with an operational definition of Statistical Thinking, embellish it, and then explore its implications for the manufacturing environment, especially as Statistical Thinking leads to holistic programs like Six Sigma for continuous improvement. Since Six Sigma is all about reducing variation in manufacturing and other processes, we motivate that aim and elaborate on the elements necessary for success. Tools and techniques of Six Sigma, motivated by Statistical Thinking, include process flow diagrams, cause-and-effect diagrams and matrices, careful assessments of process capability and performance, and other tools necessary to assure that gains, once attained, are held.
Statistical Thinking also emphasizes the need for an understanding of variation. In practice, such understanding extends to the development of plans for its unbiased quantification in order that reliable estimates of process capability (“entitlement,” in Six Sigma parlance) and process performance (what the end user gets) may be derived. Done properly, the benefits of variation quantification include identification of opportunity, which is the difference between performance and capability, and the generation of clues that light the path to improvement.
Effects of Process Variation on the Use of Lean Principles in the US Air Force's Maintenance and Overhaul Facilities
Author: Elvira Loredo (The RAND Corporation)
Speaker: Elvira Loredo
email: loredo "AT" rand.org
The concepts of lean manufacturing have been proven to reduce cycle time, work in process, and inventory requirements in high-volume manufacturing. This success has led the Air Force to use Lean manufacturing in its aircraft overhaul depots. However, the process of overhaul is more akin to a job shop than to a high-volume manufacturing assembly line. Variability in the work requirements during overhaul is a common and expected feature of the overhaul process. Although there is some planning before overhaul begins, the specific parts that will be required and the scope of the work to be accomplished are often very different from aircraft to aircraft. This variability presents interesting challenges to the application of just-in-time manufacturing. In this presentation I discuss the challenges of using Lean manufacturing in an overhaul depot and provide some insights on how the Air Force was able to meet those challenges.
Using Control Charts to Detect Anomalous Morphological Measurements in Brain Imaging
Authors: Sumner Williams and Jeremy Bockholt (The MIND Institute), and Aparna V. Huzurbazar (The University of New Mexico)
Speaker: Sumner Williams
email: swilliams "AT" themindinstitute.org
Research subjects in magnetic resonance studies have a structural image taken of their brains to allow for region-of-interest analysis, image registration, and radiological review. Region-of-interest analysis is performed once the structural image has been segmented into the different regions that interest the researcher; segmentation can be done by a researcher or automated by a program such as FreeSurfer. Segmentation errors can occur, or brains can be statistically aberrant. This talk shows how quality control charts can be used to check for segmentation errors. Out-of-control brains can be flagged for reprocessing, or sent for immediate radiological review if they have already been reprocessed and reviewed by a researcher. Because only so many neurological parameters can be tested for in these studies, unhealthy volunteers can enter a study as healthy. Quality control charts can flag subjects who are possibly unhealthy due to abnormal brain structure and allow researchers to disenroll them from the study. The literature has shown that certain disease states are associated with different brain volumes in certain areas of the brain, so this approach has the potential to flag extreme outliers as potentially unhealthy.
Session Title: Detection of Aberrations in Public Health Data
Organizer: Karen Kafadar, University of Colorado-Denver & Health Sciences Center
SPC Applications in Biosurveillance: Methods and Issues
Author: Ron Fricker (Naval Postgraduate School)
Speaker: Ron Fricker
email: rdfricke "AT" nps.edu
Motivated by the threat of bioterrorism, syndromic surveillance systems are being developed and implemented around the world. Syndromic surveillance is the regular collection, analysis, and interpretation of real-time and near-real-time indicators of possible disease outbreaks and bioterrorism events by public health organizations. As one form of biosurveillance, these systems frequently, and sometimes naively, employ standard SPC methods, such as Shewhart and CUSUM charts, to attempt to detect temporal changes in disease incidence. It is unknown how effective these systems will be at quickly detecting a bioterrorism attack and, in fact, there is some evidence, in the form of excessive false alarm rates, that they are being suboptimally employed. This talk will describe the issues and challenges in applying SPC methods to the syndromic surveillance problem.
SPC and Disease Surveillance: Detecting Disease Clusters of West Nile Virus in Space and Time
Authors: Karen Kafadar and Kathe E. Bjork (University of Colorado-Denver & Health Sciences Center)
Speaker: Karen Kafadar
email: kk "AT" math.cudenver.edu
The timely detection of potential outbreaks of serious communicable diseases is a critical function of public health departments such as the Colorado Department of Public Health and Environment (CDPHE). CDPHE maintains a database which includes, for each case, the diagnosed disease, temporal information (e.g., date of occurrence, date of report), and somewhat vague spatial information (the population centroid of the census tract where the diseased person lives, the suspected location where the disease may have been acquired, etc.). The application of traditional SPC methods to such data is not straightforward, due to the complexities of the data even within a single data set and especially across diverse sources of data streams. Using four disease series (West Nile Virus, Pertussis, E. coli, Salmonellosis), we illustrate the difficulties of applying traditional SPC and propose some ways to address them.
The Inspection Paradox: Length-Biased Sampling with Variable Test Sensitivity in Industrial Inspection Programs
Authors: Sonya Heltshe and Karen Kafadar (University of Colorado)
Speaker: Sonya Heltshe
email: Sonya.Heltshe "AT" UCHSC.edu
Length-biased sampling exists in inspection programs where the target of the inspection can influence the likelihood of detecting a problem or a degraded unit. An analogous phenomenon exists in screening for diseases such as cancer or cardiovascular risk, where persons with longer preclinical durations are more likely to be screen-detected than those with very brief preclinical stages. This research quantifies the effect of length-biased sampling when units are subjected to periodic screening/inspection. We model test sensitivity as a function of age and degradation phase duration (or preclinical duration in screening trials), and assume an underlying bivariate distribution for the degradation and repair phases that incorporates a parameter for the proportion of units approaching failure slowly versus quickly. We show that the ratio of screening interval length to mean degradation phase duration influences the magnitude of the effect that length-biased sampling has on the distribution of the overall lifetime of the unit. Examples and results for both the disease progression model and industrial inspection schemes are presented.
Session Title: Challenges in Sensor Networks
Organizer: George Michailidis, University of Michigan
Robust Target Detection and Localization in Wireless Sensor Networks
Authors: Liza Levina, Natallia Katenka, and George Michailidis (University of Michigan)
Speaker: Liza Levina
email: elevina "AT" umich.edu
Detecting and localizing a target and estimating its signal are among the fundamental tasks of wireless sensor networks. Performing them efficiently while conserving energy and communications costs often requires distributed processing. We propose an algorithm for local neighborhood voting and show that performing these operations before global data fusion can significantly improve target detection and reduce communications costs compared to standard global methods. We also show that the same algorithm can improve target localization, where methods based on collecting signal data from a small subset of sensors determined by local operations can achieve performance levels similar to methods based on the full data, but at a fraction of the communications cost. In particular, EM-type algorithms we have developed for localizing the target and estimating its signal from this limited information are shown to perform as well as, or at low SNRs even better than, maximum likelihood methods based on the full data (the current "gold standard" in the field).
Embedded Networked Sensing
Author: Mark Hansen (UCLA)
Speaker: Mark Hansen
email: cocteau "AT" stat.ucla.edu
Embedded networked sensing (ENS) combines innovations in sensor technology, computing, and low-power communications. The resulting sensing devices can be "embedded" (deployed) in both the natural and built environments to provide data at unprecedented resolution about how these places "function" and the phenomena that unfold there. In short, ENS systems are transforming how we observe physical (and recently, even social) processes. As with many advances in information technologies, we might view these new observational capabilities in purely technological terms, as triumphs of engineering or computer science. But delivering on the technological promise requires tight collaborations with "data scientists," researchers trained in learning from data. This talk will provide an overview of ENS technology, illustrating its history, the original vision, and the subsequent re-vision of these systems. We will cast ENS as a rich field with numerous implications for statistics. Given the theme of this conference, we will give special attention to problems related to data quality and data integrity for ENS.
Sensor Network Deployment Designs
Author: Nicholas Hengartner (Los Alamos National Laboratory)
Speaker: Nicholas Hengartner
email: nickh "AT" lanl.gov
Large area biological air monitoring of metropolitan areas enables early detection of terrorist biological releases. An important statistical question is how best to deploy the air monitors over the region we want to protect. This talk discusses and compares spatial-temporal sampling designs using either moving or fixed air-sampling units for detecting the release of a biological pathogen into the atmosphere and subsequently for mapping its evolution in space and time. While this talk is motivated by problems in homeland security, the discussion and methodology are generally applicable to environmental air monitoring.
Session Title: Design and Analysis Issues for Split-Plot Experiments
Organizer: Douglas Montgomery, Arizona State University
D-optimal Design of Split-split-plot Experiments
Author: Bradley Jones (SAS Institute)
Speaker: Bradley Jones
email: Bradley.Jones "AT" jmp.com
In industrial experimentation there is growing interest in studies that span more than one processing step. Convenience often dictates restrictions in randomization in passing from one processing step to another. When the study encompasses three processing steps, this leads to split-split-plot designs.
In this talk, I will show how to compute D-optimal split-split-plot designs and provide illustrative examples using a pre-release version of JMP software. I conclude by considering D-optimal alternatives to a previously run split-split-plot design for cheese production.
Designing Two-Level Split-Plot Experiments
Author: Murat Kulahci (Arizona State University and Technical University of Denmark)
Speaker: Murat Kulahci
email: Murat.Kulahci "AT" asu.edu
There has lately been great interest in the design and analysis of split-plot experiments. There are, however, still some unanswered questions on both fronts. When it comes to designing split-plot experiments, practitioners currently have somewhat limited choices. In this paper we present a methodology for designing two-level split-plot experiments using the Kronecker product representation of a two-level full factorial design. This representation is flexible enough to allow for the design of split-plot experiments based on various design criteria. Moreover, using the same Kronecker product representation, designs for multi-stage experimentation, such as split-split-plot experiments, can be achieved relatively easily.
Bayesian Analysis of Split-Plot Experiments with Non-Normal Responses for Evaluating Non-Standard Performance Criteria
Authors: Timothy Robinson (University of Wyoming), Christine Anderson-Cook and Michael Hamada (Los Alamos National Laboratory), Shane Reese (Brigham Young University)
Speaker: Timothy Robinson
email: TJRobin "AT" uwyo.edu
Non-normal responses are common in industrial experiments and many experiments contain factors whose levels are difficult/costly to change, resulting in a split-plot randomization structure. For valid statistical inferences, it is important to account for the correlation structure induced by the split-plot randomization. Generalized linear mixed models (GLMMs), generalized estimating equations (GEEs), and hierarchical generalized linear models (HGLMs) are useful tools for modeling non-normal, exponential family responses in a split-plot setting. These tools are especially useful when interest focuses upon the mean and variance of the response. As an alternative, we demonstrate the utility of Bayesian methods when the user is interested in specific quantiles of the response or functions of the response distribution itself such as the proportion of items within specifications. We pose several such questions for a particular non-normal split-plot example and offer solutions within a Bayesian framework.
Session Title: Statistical Issues in Communication Networks
Organizer: Earl Lawrence, Los Alamos National Laboratory and Vijay Nair, University of Michigan
Chair: Kary Myers, Los Alamos National Laboratory
Estimation of Traffic Flow Characteristics from Sampled Data
Authors: Lili Yang and George Michailidis (University of Michigan)
Speaker: Lili Yang
email: yanglili "AT" umich.edu
Understanding the characteristics of traffic flows is crucial for allocating the necessary resources (bandwidth) to accommodate users' demand. The problem of using sampled flow statistics to estimate the number of active flows on a link, as well as their packet length and byte size distributions, has recently attracted a lot of interest among networking researchers. In this talk, we consider the problem of nonparametric estimation of network flow characteristics, namely packet lengths and byte sizes, based on sampled flow data. The data are obtained through single-stage Bernoulli sampling of packets. An adaptive expectation-maximization (EM) algorithm is used for the flow length distribution, which in addition provides an estimate of the number of active flows. The estimation of the flow sizes (in bytes) is accomplished through a random effects regression model that utilizes the flow length information previously obtained. A variation of this approach, particularly suited for the mixture distributions that appear in real network traces, is also considered. The proposed approaches are illustrated and compared on a number of synthetic and real data sets.
An Aspect of Topology Determination
Authors: Fei Chen, Lorraine Denby, and Jean Meloche (Avaya Labs)
Speaker: Fei Chen
email: feic "AT" avaya.com
Network topology is an important covariate for end-to-end performance monitoring and tuning. But end-to-end paths, when discovered using the traceroute utility, can be hard to interpret. This is because every physical router on a network will have more than one IP address, one for each interface the router has. An accurate representation of the network topology, containing only physical routers rather than multiple aliases of them, requires post-processing of traceroute results. In this talk I will discuss a simple method for reducing the set of IP addresses to the underlying set of physical routers using end-to-end delay data.
Toward a Complete Active Tomography Framework
Author: Earl Lawrence (Los Alamos National Laboratory)
Speaker: Earl Lawrence
email: earl "AT" lanl.gov
Active network tomography is concerned with inferring packet delay and loss across links in a network based upon end-to-end probing. There are two outstanding problems in this area. The first is the need for a framework that adequately models transmission and empty-queue probabilities simultaneously with a flexible continuous component for the heavy-tailed delays. The second problem lies in the spatio-temporal independence assumption used for many models; an assumption known to be violated. This talk will discuss possible solutions to both of these problems. A mixture distribution framework will be discussed for the first problem and a multivariate probit solution will be discussed for the second problem in the narrower loss-only context. We will include a brief discussion of the synthesis of these two ideas as well.
Session Title: Recent Advances in Response Surface Methods
Organizer: Timothy Robinson, University of Wyoming
A Graphical Approach for Assessing Optimal Operating Conditions in Robust Design
Authors: Anu Abraham (HSBC), Timothy Robinson (University of Wyoming), and Christine Anderson-Cook (Los Alamos National Laboratory)
Speaker: Anu Abraham
email: anu_abraham2001 "AT" yahoo.com
The determination of optimal operating conditions in robust parameter design often involves the use of an objective function, such as the mean squared error of the response, along with graphical overlays of the estimated process mean and variance functions. Existing graphical methods in robust design have limitations when more than two control factors are involved, since visualizing the entire control design space becomes difficult with many control factors. Here we present a new graphical technique for assessing optimal operating conditions which is based upon the estimated proportion of items within specification limits. The new plots offer users the ability to compare competing sets of optimal operating conditions, to assess possible violations of assumptions regarding the distribution of the noise factors, and to observe the shape of the response distribution as it relates to the stated specification limits. An example from manufacturing serves as the basis for illustration.
A New Approach to the Design of Experiments for Robust Parameter Design
Authors: William Myers (Procter & Gamble), Timothy Robinson (University of Wyoming), and Raymond Myers (Virginia Tech)
Speaker: William Myers
email: myers.wr "AT" pg.com
This paper considers an approach to the design of experiments for dual response surface analysis in robust parameter design. The response surfaces dealt with are derived from the response models developed in a combined array.
Assessing Uncertainty of Regression Estimates in a Response Surface Model for Repeated Measures
Authors: Shaun Wulff and Timothy Robinson (University of Wyoming)
Speaker: Shaun Wulff
email: Wulff "AT" uwyo.edu
In response surface modeling, not only is the estimation of regression coefficients of interest, but so is the uncertainty or the estimate of standard error of these regression coefficients. When observations are correlated, as in a repeated measures design, the usual formulas for calculating standard errors may or may not be adequate. This talk will discuss some of the proposed adjustments to the standard error formulas, the performance of these adjustments, and when such adjustments should be considered.
ISBIS Special Session: A Conversation about the Proper Choice of Experimental Design
Organizer: Geoff Vining, Virginia Tech
Chair: Stefan Steiner, University of Waterloo
A Mixture Design Planning Process
Authors: Pat Whitcomb (Stat-Ease, Inc.) and Gary W. Oehlert (University of Minnesota)
Speaker: Patrick Whitcomb
email: pat "AT" statease.com
Newcomers to mixture design find it difficult to choose appropriate designs with adequate precision. We demonstrate why standard power calculations (used for factorial designs) are not of much use, due to the collinearity present in mixture designs. However, when using the fitted mixture model for drawing contour maps or 3D surfaces, making predictions, or performing optimization, it is important that the model adequately represent the response behavior over the region of interest. The emphasis is on the ability of the design to support modeling certain types of behavior (linear, quadratic, etc.); we are not generally interested in the individual model coefficients. Therefore, power to detect individual model parameters is not a good measure of what we are designing for. A discussion and pertinent examples will show attendees how the precision of the fitted surface relative to the noise is a critical criterion in design selection. In this presentation, we introduce a process for determining whether a particular mixture design has adequate precision, and attendees will take away a strategy for judging whether a design's precision is appropriate for their modeling needs.
The Beauty of Classical Designs
Author: Geoff Vining (Virginia Tech)
Speaker: Geoff Vining
George Box has long extolled the virtues of classical designs. In the 1970s, Box and Kiefer engaged each other in an interesting debate over the use of variance-optimal designs as an alternative. Box first articulated his fourteen basic principles of experimental design at that time.
The 2000s seem to have regenerated the discussion on the use of variance-based optimal designs. This talk reiterates Box’s fourteen basic principles, particularly with regard to industrial experimentation, where results are often available almost immediately and experimenters build their designs as a sequence. This talk outlines the virtues of classical designs within a sequential learning strategy. It does, however, recognize that classical designs are not always the best approach, due to particular experimental constraints, and it concludes with a brief discussion of situations where computer-generated designs have significant value.
A Practitioner’s View of Optimality
Author: Julia O'Neill (Merck & Co., Inc.)
Speaker: Julia O'Neill
email: julia_oneill "AT" merck.com
A conscientious statistician will carefully consider multiple alternative designs before selecting an optimal design for a problem. The optimal choice depends entirely on the criteria used to define optimality. In this session we will reach beyond the classical numerical definitions of optimality to explore some other criteria which impact the probability of successful implementation of the design. Examples from industrial practice will be used to illustrate the risks of ignoring the non-numeric optimality criteria.
Session Title: Statistical Advances in High Technology
Organizers: Roshan Vengazhiyil and CF Jeff Wu, Georgia Institute of Technology
Chair: Leroy A Franklin, Eli Lilly
Gaussian Process Models for Computer Experiments With Qualitative and Quantitative Factors
Authors: Zhiguang (Peter) Qian (University of Wisconsin-Madison), Huaiqing Wu (Iowa State University), and C. F. Jeff Wu (Georgia Institute of Technology)
Speaker: Zhiguang (Peter) Qian
email: zhiguang "AT" stat.wisc.edu
Computer experiments frequently involve both qualitative and quantitative factors, and modeling them jointly is an important issue. Some Gaussian process models that incorporate both qualitative and quantitative factors are proposed. The key to the development of these new models is an approach for constructing correlation functions with qualitative and quantitative factors. An iterative estimation procedure is developed for the proposed models, with modern optimization techniques used in the estimation to ensure the validity of the constructed correlation functions. The proposed method is illustrated with an example involving a known function and a real example for modeling the thermal distribution of a data center.
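A simplified sketch of the basic idea (not the authors' actual construction; the exchangeable cross-correlation `c` and the range parameter `theta` below are illustrative assumptions) is a product correlation that is Gaussian in the quantitative input and down-weights pairs with differing qualitative levels:

```python
import numpy as np

def corr(u, v, theta=2.0, c=0.6):
    """Correlation between inputs u = (x, z): quantitative x, qualitative z.

    Gaussian correlation in x, multiplied by a cross-correlation c
    between different qualitative levels (1 when the levels match).
    theta and c are illustrative values, not fitted estimates.
    """
    (x1, z1), (x2, z2) = u, v
    quant = np.exp(-theta * (x1 - x2) ** 2)
    qual = 1.0 if z1 == z2 else c
    return quant * qual

# Build the correlation matrix for a few training inputs and check that
# it is a valid correlation matrix (symmetric, positive definite).
inputs = [(0.0, "A"), (0.5, "A"), (0.2, "B"), (0.9, "B")]
R = np.array([[corr(u, v) for v in inputs] for u in inputs])
print(np.linalg.eigvalsh(R).min() > 0)
```

Validity here follows because the product of two positive definite kernels is positive definite; the proposed models generalize this by constructing richer cross-correlation structures among the qualitative levels.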
Statistical Modeling and Analysis for Robust Synthesis of Nanostructures
Authors: Tirthankar Dasgupta, Christopher Ma, Roshan J. Vengazhiyil, Zhong Lin Wang, and C. F. Jeff Wu (Georgia Institute of Technology)
Speaker: Tirthankar Dasgupta
email: tdasgupt "AT" isye.gatech.edu
The transition from laboratory-level synthesis of nanostructures to their large-scale, controlled and designed synthesis necessarily demands systematic investigation of the manufacturing conditions under which the desired nanostructures are synthesized reproducibly, in large quantity and with controlled or isolated morphology. A systematic study on the growth of Cadmium Selenide nanostructures through statistical modeling and optimization of the experimental parameters is conducted, with the objective of identifying the process conditions that ensure synthesis with high yield and reproducibility. Through a designed experiment and rigorous statistical analysis of experimental data, models linking the probabilities of obtaining specific morphologies to the process variables are developed. A new iterative algorithm for fitting a multinomial logistic model is proposed and used. The optimum process conditions, which maximize the above probabilities and make the synthesis process robust (i.e., less sensitive) to variations of process variables around set values, are derived from the fitted models using Monte Carlo simulations. Current research focuses on developing a sequential and space-filling design strategy to address the problem of economically optimizing a very complex and non-regular response surface containing several no-yield regions.
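To show what such a model does (this is a plain gradient-ascent fit of a standard multinomial logit on synthetic stand-in data, not the authors' new algorithm or their experimental results), morphology-class probabilities can be linked to process variables as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: an intercept plus two standardized process
# variables (e.g. temperature, pressure), and three morphology classes.
n, k = 300, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_B = np.array([[0.0, 0.0, 0.0],    # intercepts
                   [1.0, 2.0, -0.5],   # effect of variable 1 per class
                   [-1.0, 0.5, 2.0]])  # effect of variable 2 per class

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

y = np.array([rng.choice(k, p=p) for p in softmax(X @ true_B)])
Y = np.eye(k)[y]  # one-hot encoding of observed classes

# Plain gradient ascent on the multinomial log-likelihood.
B = np.zeros((X.shape[1], k))
for _ in range(2000):
    P = softmax(X @ B)
    B += 0.1 * X.T @ (Y - P) / n

P_hat = softmax(X @ B)
print("mean fitted prob. of observed class:",
      round(P_hat[np.arange(n), y].mean(), 3))
```

The fitted probability surfaces are what the optimization step then maximizes over the process variables, with robustness assessed by perturbing the variables around candidate set points.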
Design and Analysis of Experiments Using Nested Factors With Applications to Machining
Authors: Ying Hung, Roshan J. Vengazhiyil, and Shreyes N. Melkote (Georgia Institute of Technology)
Speaker: Ying Hung
email: yhung "AT" isye.gatech.edu
In many applications, some of the factors in the experiments can change with respect to the level of another factor. Such factors are often called nested factors. A factor within which other factors are nested is called a branching factor. For example, suppose we want to experiment with two processing methods. The factors involved in these two methods can be different. Thus, in this experiment the processing method is a branching factor and the other factors are nested within the branching factor. Constructing a fractional factorial design or Latin hypercube design involving branching factors is more challenging and has not received much attention in the literature. In this paper, a new class of designs is proposed. Based on different requirements in physical and computer experiments, criteria such as minimum aberration and orthogonal maximin distance are used to rank competing designs. The optimal design can be found by several modern optimization algorithms. Moreover, a modified kriging model is introduced by constructing a new correlation function in the Gaussian process model. The proposed methodology is illustrated using a computer experiment on hard turning. Optimal machining conditions and tool edge geometry are attained, resulting in a remarkable improvement in the machining process.
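As a toy illustration of the branching structure (the factor names and levels are hypothetical, not the paper's machining example), each nested factor is varied only within its own branch:

```python
from itertools import product

# Branching factor: processing method ("A" or "B").
# Hypothetical nested factors: method A uses (speed, depth),
# method B uses (pressure, time); None marks "not applicable",
# since a nested factor has no meaning outside its branch.
runs = []
for speed, depth in product((-1, 1), repeat=2):     # factorial within A
    runs.append(("A", speed, depth, None, None))
for pressure, time in product((-1, 1), repeat=2):   # factorial within B
    runs.append(("B", None, None, pressure, time))

for r in runs:
    print(r)
```

Ranking such designs by criteria like minimum aberration or maximin distance is nontrivial precisely because distances and aliasing are only defined among runs sharing a branch.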
Session Title: Advances in Warranty Data Analysis
Organizer: Emmanuel Yashchin, IBM
Chair: Wayne Nelson, Wayne Nelson Statistical Consulting
Some Challenges in Warranty Data Analysis and Its Use
Author: Jeff Robinson (General Motors R&D Center)
Speaker: Jeff Robinson
email: jeffrey.a.robinson "AT" gm.com
Warranties have been characterized as signals of quality to potential customers. The resulting data are certainly used by companies internally to monitor quality, and often for much more. This presentation will describe a number of challenges related to analyzing warranty data and using the results. The examples are from the auto business, but I believe many of the issues are more generally applicable. Some of the challenges are old and relate only to warranty data itself. These include the proper assessment of the number of units at risk, dealing with dual usage measures (e.g., age and mileage for cars), no-data forecasts, monitoring for new quality problems, and “mining” warranty data to analyze recurring problems. Other challenges involve relating internal warranty data to other quality metrics, such as those from external surveys. Still other challenges are emerging due to the availability of new data sources such as on-board diagnostic information. With access to such data we may choose to view a customer warranty event as predictable and potentially preventable.
Analysis of Window-Observation Recurrence Data
Authors: Bill Meeker, Jianying Zuo, and Huaiqing Wu (Iowa State University)
Speaker: Bill Meeker
email: wqmeeker "AT" iastate.edu
Many systems experience recurrent events. Recurrence data are collected to analyze quantities of interest, such as the mean cumulative number of events or the mean cumulative cost of events. Methods of analysis are available for recurrence data with left and/or right censoring. Due to practical constraints, however, recurrence data are sometimes recorded only in windows with gaps between the windows. This paper extends existing methods, both nonparametric and parametric, to window-observation recurrence data. The nonparametric estimator requires minimum assumptions, but will be biased if the size of the risk set is not positive over the entire period of interest. There is no such difficulty when using a parametric model for the recurrence data. For cases in which the size of the risk set is zero for some periods of time, we propose a simple method that uses a parametric adjustment to the nonparametric estimator. The methods are illustrated with two numerical examples. The first example considers extended warranty data where there are gaps in coverage (and thus in data collection). The second considers data from military vehicles for which data is collected only during certain exercises.
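A minimal sketch of the nonparametric estimator for window-observation data (the three-system data set below is hypothetical, invented for illustration): at each event time, the mean cumulative function (MCF) increments by the number of events divided by the size of the risk set, where the risk set contains only systems whose observation windows cover that time.

```python
# Hypothetical window-observation recurrence data: each system has
# observation windows (start, end) with gaps, and observed event times.
systems = [
    {"windows": [(0, 10), (15, 30)], "events": [2, 8, 20]},
    {"windows": [(0, 25)],           "events": [5, 12, 18]},
    {"windows": [(5, 30)],           "events": [7, 22, 27]},
]

def observed(t, windows):
    """Is time t inside any of the system's observation windows?"""
    return any(a <= t <= b for a, b in windows)

# Nonparametric MCF estimate: at each distinct event time t, add
# (events at t) / (number of systems under observation at t).
times = sorted({t for s in systems for t in s["events"]})
mcf, total = [], 0.0
for t in times:
    at_risk = sum(observed(t, s["windows"]) for s in systems)
    d = sum(s["events"].count(t) for s in systems)
    total += d / at_risk
    mcf.append((t, total))

for t, m in mcf:
    print(f"t = {t:2d}  MCF = {m:.3f}")
```

This is exactly where the bias warning in the abstract bites: if at some time no system is under observation, `at_risk` is zero there and the estimator cannot account for events in that period, which motivates the parametric adjustment.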
Warranty Analysis of Repairable Systems
Authors: David Trindade and Jeff Glosup (Sun Microsystems, Inc.), William D. Heavlin (Google)
Speaker: David Trindade
email: dave "AT" trindade.com
Warranty analysis is often based on MTBF measures. However, summary statistics such as MTBF rest on strong assumptions, and using MTBF in situations where those assumptions do not hold can result in misleading inferences and erroneous conclusions. In this talk we will discuss time-dependent analysis methods, which provide a more accurate and realistic basis for dealing with warranty issues.