A new distribution-free Phase-I procedure for bi-aspect monitoring based on the multi-sample Cucconi statistic
Introduction
Statistical process monitoring (SPM) schemes are widely used to monitor the stability of a process in diverse fields of application, from product or service quality to environmental pollution or network traffic. In general, process monitoring comprises two phases: Phase I and Phase II. During Phase I, practitioners sequentially collect and analyze a set of sample observations to assess the process characteristics. The Phase-I analysis, also known as the retrospective analysis, aims at understanding the sources of process variability and evaluating process stability. See Jones-Farmer, Woodall, Steiner, and Champ (2014) for more details. Phase-I analysis also helps in selecting a suitable in-control (IC) reference (training) sample or in estimating the right model for the underlying process distribution. Further, during Phase-I analysis, practitioners attempt to identify abnormal observations that result from assignable causes and discard them. Often, by repeating the Phase-I analysis and removing extreme values, we may determine a reference sample that represents the IC process characteristics and use it as a benchmark in the subsequent Phase-II analysis. Phase-II monitoring then observes incoming data streams and monitors process stability by comparing each newly collected sample with the benchmarked reference sample.
Woodall (2000) emphasized that Phase-I analysis and follow-up measures are often more critical than Phase-II monitoring and noted that this area of research is grossly neglected. He also pointed out that Phase-I applications are imperative “in practical considerations of quality characteristic selection, measurement and sampling issues, and rational subgrouping”. Jones-Farmer et al. (2014) recommended using standard Phase-II SPM schemes to analyze Phase-I data retrospectively for determining the reference sample. They also pointed out that one should take great care to reduce and control the false alarm probability (FAP). Recently, Woodall (2017), Testik et al. (2018), and Li et al. (2019), among others, highlighted the importance of Phase-I analysis because a wrongly benchmarked reference sample can lead to poor Phase-II performance of the charting scheme. In other words, if the Phase-I sample does not reflect the characteristics of the process, we are likely to miss a signal due to an assignable cause during Phase-II monitoring, or we may receive too many false alarms. The literature on Phase-II SPM schemes is extensive, whereas that on Phase-I analysis is relatively limited. A large proportion of Phase-I control schemes are parametric and assume a particular distribution function for the underlying process. A widespread assumption is that the process characteristic is normally distributed, as in Champ and Jones (2004). However, before a Phase-I analysis there is often little or no information about the process distribution, including whether the assumption of normality is valid. Woodall (2000) and Capizzi (2014), among others, emphasized that making a distributional assumption is not justifiable before process stability has been established via Phase-I analysis.
Distribution-free schemes are inherently IC robust irrespective of the functional form and underlying characteristics of the process distribution, and consequently they are more appealing than parametric ones and a natural option for Phase-I analysis. Recent years have witnessed the development of some nonparametric schemes useful in Phase-I analysis. Jones-Farmer et al. (2014) provided an excellent synopsis of the Phase-I literature until about 2014. See Abbasi et al. (2015), Cheng and Shiau (2015), Coelho et al. (2015), Ning et al. (2015), Capizzi and Masarotto (2017), and Li et al. (2019) for recent research on distribution-free Phase-I monitoring. For some other nonparametric schemes, readers may see, among others, Abbasi et al. (2017), Abid et al. (2017), Abid et al. (2018), and Riaz et al. (2019).
In Phase I, many practitioners wish to jointly assess and monitor two aspects of the process, the location and the scale, assuming that the distribution of the underlying process characteristic is normal. The joint and scheme is a simple scheme for bi-aspect monitoring under the normality assumption. Among the early works on distribution-free approaches for Phase-I analysis based on two charts is Jones-Farmer and Champ (2010), who proposed a bi-aspect monitoring scheme known as the RANK scheme. Their scheme simultaneously operates the mean-rank chart of Jones-Farmer, Jordan, and Champ (2009) for the location parameter and the scale-rank chart. Another significant contribution to joint Phase-I monitoring is by Capizzi and Masarotto (2013). They designed the RS/P scheme for Phase-I monitoring and developed the contributed “rsp” R package for practical execution. The RS/P scheme is also a two-chart scheme, but it relies on recursive segmentation and permutation instead of ordinary ranking. The RS/P scheme is particularly useful if there is a sustained shift. However, we noticed that the RS/P scheme often fails to identify the actual problematic sample in the case of an isolated shift and that its behaviour is also markedly affected by the position of the change point, that is, by the inertia effect. Some authors, like Gan (1997), criticized the use of schemes based on two separate charts in the context of bi-aspect monitoring because such schemes overlook the impact of a change in one aspect on the other. The RS/P scheme, being a two-chart scheme, suffers from similar drawbacks. Moreover, Li et al. (2019) emphasized that plotting a single statistic that reflects the influence of both location and scale aspects is more appropriate than using two-chart schemes.
Traditionally, most of the conjoint test statistics for equality of location and scale parameters of two samples are of a quadratic form involving two orthogonal rank statistics, one for the location aspect and the other for the scale aspect. For example, the familiar Lepage statistic is the squared Euclidean distance of the standardized Wilcoxon rank-sum (WRS) statistic and the standardized Ansari-Bradley (AB) statistic from the origin. Mukherjee and Chakraborti (2012), Chowdhury et al. (2015), Mukherjee and Marozzi (2017a, 2017b), Chong et al. (2017), Chong et al. (2018), Mukherjee and Sen (2018), and Song et al. (2019), among others, developed Phase-II SPM schemes using the Lepage and Lepage-type statistics for bi-aspect monitoring. Li et al. (2019) first proposed a single charting scheme for bi-aspect Phase-I analysis based on a multi-sample version of the Lepage statistic from Rublík (2005).
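The quadratic-form construction of the two-sample Lepage statistic can be sketched numerically. The following is only a minimal illustration and is not code from any of the cited papers: the function names `standardize` and `lepage` are hypothetical, ties are assumed to be absent, and the null mean and variance are taken from the general formula for a simple linear rank statistic rather than from closed-form WRS or AB moments.

```python
import numpy as np

def standardize(scores, in_first):
    """Standardize the sum of first-sample scores under the null (assumes no ties)."""
    N = scores.size
    m = int(in_first.sum())
    n = N - m
    centered = scores - scores.mean()
    # Null variance of a simple linear rank statistic (sampling without replacement).
    variance = m * n / (N * (N - 1)) * (centered ** 2).sum()
    return (scores[in_first].sum() - m * scores.mean()) / np.sqrt(variance)

def lepage(x, y):
    """Two-sample Lepage statistic: squared standardized WRS plus squared standardized AB."""
    pooled = np.concatenate([np.asarray(x, float), np.asarray(y, float)])
    N = pooled.size
    ranks = pooled.argsort().argsort() + 1            # ranks 1..N in the pooled sample
    in_first = np.arange(N) < len(x)                  # positions of the first sample
    wrs = standardize(ranks.astype(float), in_first)  # location component
    ab_scores = np.minimum(ranks, N + 1 - ranks).astype(float)
    ab = standardize(ab_scores, in_first)             # scale component
    return wrs ** 2 + ab ** 2
```

Because each component is squared after standardization, the value does not depend on which of the two samples is labelled first.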
One-chart schemes for Phase-I analysis, such as the Phase-I Lepage chart, have some benefits over two-chart schemes. The first advantage lies in the determination of control limits: only one set of control limits is required instead of two sets for two charts. If the FAP of each individual chart in a two-chart scheme is set to a nominal level $\alpha$, the overall FAP is much higher than $\alpha$. In the case of independent location and scale statistics, the overall FAP is $1-(1-\alpha)^2$. If they are not independent, things become more complex. Therefore, one-chart schemes are often easier to design and use. Secondly, charts based on location statistics, such as the Kruskal-Wallis (KW) statistic, inherently assume that the scale is stable. When the scale parameter varies, the use of such statistics is not recommended (it is statistically questionable, a point many practitioners do not realize). Similarly, charts based on scale statistics inherently assume that the location is stable, and when the location varies, the use of such statistics is not advised. This fact is well established in the literature on simultaneous inference.
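The inflation of the overall FAP in a two-chart scheme can be checked with a short calculation. This is a sketch under the independence assumption stated above; the symbol `alpha` for the per-chart FAP and the function names are illustrative choices, not notation confirmed by the source.

```python
def overall_fap(alpha, charts=2):
    """Overall FAP when `charts` independent charts each have FAP `alpha`."""
    return 1.0 - (1.0 - alpha) ** charts

def per_chart_fap(target, charts=2):
    """Per-chart FAP needed so that independent charts attain overall FAP `target`."""
    return 1.0 - (1.0 - target) ** (1.0 / charts)
```

For example, `overall_fap(0.05)` evaluates to 0.0975, nearly double the per-chart level, while `per_chart_fap(0.05)` gives the smaller level each chart would need in order to hold the overall FAP at 0.05. A one-chart scheme avoids this adjustment altogether.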
In recent years, several authors, for example Marozzi (2013), showed that the Cucconi (1968) statistic is often as good as or better than the Lepage statistic in capturing process shifts in bi-aspect monitoring. The Cucconi statistic has a fascinating history. Cucconi (1968) designed it using ranks and anti-ranks and not as a quadratic combination of two statistics. The original paper appeared in an Italian journal and in the Italian language. Therefore, the international scientific community was not aware of it. Marozzi (2009, 2013) popularised the two-sample Cucconi statistic outside Italy, and many researchers have since used it in developing SPM schemes. Chowdhury et al. (2014), Mukherjee and Marozzi (2017b), and Mahmood et al. (2017) considered Phase-II joint monitoring schemes based on the two-sample Cucconi statistic. Song, Mukherjee, Marozzi, and Zhang (2020) showed that the Cucconi statistic is also a quadratic form of a location and a scale statistic and could be an elegant choice for Phase-II monitoring when a training sample is available from the IC population. Various works have established that Phase-II SPM schemes based on the Cucconi statistic compete well with SPM procedures involving the Lepage or Lepage-type statistics. The latest discussion to this end appeared in Xiang, Gao, Li, Pu, and Dou (2019) and some of the references therein. However, to date, no Phase-I SPM scheme has used the Cucconi statistic for bi-aspect process monitoring.
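To make the rank and anti-rank construction concrete, here is a minimal sketch of the two-sample Cucconi statistic in its commonly stated form, built from standardized sums of squared ranks and squared anti-ranks. It is an illustration only: the function name `cucconi` is hypothetical, ties are assumed absent, the pooled sample size must be at least three, and the standardization and correlation are computed from the rank scores directly instead of from closed-form expressions.

```python
import numpy as np

def cucconi(x, y):
    """Two-sample Cucconi statistic (sketch; assumes no ties and pooled size >= 3)."""
    pooled = np.concatenate([np.asarray(x, float), np.asarray(y, float)])
    N = pooled.size
    m, n = len(x), len(y)
    ranks = pooled.argsort().argsort() + 1.0   # ranks 1..N in the pooled sample
    in_first = np.arange(N) < m                # positions of the first sample
    a = ranks ** 2                             # squared ranks
    b = (N + 1 - ranks) ** 2                   # squared anti-ranks

    def standardized_sum(scores):
        # Null mean and variance of a sum of m scores drawn without replacement.
        var = m * n / (N * (N - 1)) * ((scores - scores.mean()) ** 2).sum()
        return (scores[in_first].sum() - m * scores.mean()) / np.sqrt(var)

    u, v = standardized_sum(a), standardized_sum(b)
    rho = np.corrcoef(a, b)[0, 1]              # null correlation between u and v
    return (u ** 2 + v ** 2 - 2 * rho * u * v) / (2 * (1 - rho ** 2))
```

Swapping the roles of the two samples flips the signs of both `u` and `v`, so the statistic is unchanged.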
In this paper, we present a novel distribution-free SPM scheme for Phase-I monitoring and assessment that uses the multi-sample Cucconi statistic as the pivot. The multi-sample Cucconi statistic, introduced by Marozzi (2014), is an extension of the two-sample Cucconi statistic. Our proposed procedure is a single-chart scheme and is suitable for simultaneously detecting subgroup location and scale shifts. The new scheme is not affected by the shortcomings of two-chart schemes, and it is a competitive alternative to the Phase-I Lepage scheme introduced by Li et al. (2019) in different situations. In Section 2, we revisit the multi-sample extension of the Lepage statistic and the corresponding Phase-I Lepage chart as in Li et al. (2019). Section 3 offers a brief overview of the multi-sample Cucconi statistic. We explain the design and implementation algorithm of the proposed Phase-I Cucconi scheme in Section 4 and discuss the determination of its control limits. In Section 5, we study and compare the IC and out-of-control (OOC) performance of the Phase-I Cucconi and the Lepage schemes using Monte-Carlo simulations. We discuss a real example and some practical advantages of the proposed scheme in Section 6. Section 7 summarises the results and suggests some directions for future research.
Review of the multi-sample Lepage statistic and the corresponding Phase-I scheme
Rublík (2005) extended the traditional two-sample Lepage test in a multi-sample situation combining the KW statistic for subgroup location along with the multi-sample version of the AB statistic for scale. Li et al. (2019) utilized this statistic to develop the Phase-I Lepage scheme. Now we review the multi-sample Lepage statistic.
Let , denote a random sample of size from the th subgroup. Here, ’s are univariate and continuous having cumulative distribution function
The multi-sample Cucconi statistic
Observe that,
See, Marozzi (2014) for proofs. Define, for and note that
Marozzi (2014) introduced a multi-sample version of the Cucconi statistic as
Considering , and , we may write,
Implementation
The main idea is to use the multi-sample Cucconi statistic to develop a Phase-I scheme. The steps for implementing the Shewhart-type Phase-I Cucconi scheme to form a reference sample are outlined as follows.
1. Consider a set of subgroups, with size , Note that varying subgroup size is permitted, with , as the particular case where the subgroup size is constant.
2. Compute the charting statistic for , based on the set of data.
3. Plot ,
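The general shape of such a Shewhart-type Phase-I procedure can be sketched as follows. This is a hedged illustration and not the authors' algorithm: the charting statistic used here (a two-sample Cucconi comparison of each subgroup against all remaining observations), the Monte-Carlo control limit, the restriction to a constant subgroup size, and all function names are assumptions made for the example.

```python
import numpy as np

def subgroup_cucconi(data):
    """One statistic per subgroup: that subgroup versus all remaining data (no ties)."""
    data = np.asarray(data, float)
    m, n = data.shape                          # m subgroups of common size n (assumed)
    N = m * n
    ranks = (data.ravel().argsort().argsort() + 1.0).reshape(m, n)
    a = ranks ** 2                             # squared ranks
    b = (N + 1 - ranks) ** 2                   # squared anti-ranks
    mean = (N + 1) * (2 * N + 1) / 6           # mean of j**2 over j = 1..N
    sd = np.sqrt(n * (N - n) * (N + 1) * (2 * N + 1) * (8 * N + 11) / 180)
    u = (a.sum(axis=1) - n * mean) / sd
    v = (b.sum(axis=1) - n * mean) / sd
    rho = 2 * (N ** 2 - 4) / ((2 * N + 1) * (8 * N + 11)) - 1
    return (u ** 2 + v ** 2 - 2 * rho * u * v) / (2 * (1 - rho ** 2))

def simulate_ucl(m, n, fap=0.05, reps=2000, seed=0):
    """Monte-Carlo upper limit so that P(any subgroup signals | IC) is about `fap`.

    The statistic depends only on ranks, so any continuous IC distribution
    (standard normal here) yields the same null distribution.
    """
    rng = np.random.default_rng(seed)
    maxima = [subgroup_cucconi(rng.standard_normal((m, n))).max()
              for _ in range(reps)]
    return float(np.quantile(maxima, 1 - fap))

def phase_one(data, fap=0.05, reps=2000, seed=0):
    """Return the plotted statistics, the limit, and indices of signalling subgroups."""
    stats = subgroup_cucconi(data)
    ucl = simulate_ucl(*np.shape(data), fap=fap, reps=reps, seed=seed)
    return stats, ucl, np.flatnonzero(stats > ucl)
```

In a retrospective analysis, subgroups flagged by `phase_one` would be investigated for assignable causes, removed if warranted, and the procedure repeated on the remaining subgroups until no signal remains, which yields the reference sample.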
Comparison study
In this section, we compare the proposed Phase-I Cucconi scheme with the other one-chart Phase-I scheme, i.e. the Phase-I Lepage scheme of Li et al. (2019). For brevity, we do not consider competing schemes other than the Phase-I Lepage scheme because Li et al. (2019) already compared it with some existing two-chart procedures for joint Phase-I analysis, including the RANK scheme as well as the RS/P scheme, and established that the Phase-I Lepage scheme is superior in many situations.
We design
Practical implementation
In this section, we illustrate a practical application of the proposed Phase-I Cucconi scheme. We consider a real dataset on the outer diameters of guide bush, given in Table 9 in Song et al. (2020). Song et al. (2020) indicated that guide bush is one of the critical components in a fuze 117 MK 20, and variations in the dimensional nature of guide bush (e.g. increment or reduction in the outer diameter) may lead to abnormal functioning. The target diameter for the guide bush is 27.03 mm, and
Concluding remarks
In this paper, we have offered an attractive distribution-free one-chart scheme for simultaneous assessment and control of location and scale parameters in Phase I. The proposed scheme uses the multi-sample version of the Cucconi statistic as the pivot. Its IC robustness is a significant advantage because, in many fields, observed data are not normal. The proposed scheme could be a natural choice when the assumption of normality is difficult to justify. The Phase-I Cucconi scheme works even if
CRediT authorship contribution statement
Chenglong Li: Software, Formal analysis, Investigation, Resources, Data curation, Writing - review & editing, Visualization, Funding acquisition. Amitava Mukherjee: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Visualization, Supervision, Project administration. Marco Marozzi: Supervision, Writing - review & editing.
Acknowledgement
The work described in this paper was supported by the National Natural Science Foundation of China (No. 71801179).
References (43)
- et al. (2017). Distribution-free Shewhart-Lepage type premier control schemes for simultaneous monitoring of location and scale. Computers & Industrial Engineering.
- et al. (2018). Some distribution-free Lepage-type schemes for simultaneous monitoring of one-sided shifts in location and scale. Computers & Industrial Engineering.
- et al. (2019). A distribution-free Phase-I monitoring scheme for subgroup location and scale based on the multi-sample Lepage statistic. Computers & Industrial Engineering.
- et al. (2018). Optimal design of Shewhart-Lepage type schemes and its application in monitoring service quality. European Journal of Operational Research.
- et al. (2019). Optimizing joint location-scale monitoring – an adaptive distribution-free approach with minimal loss of information. European Journal of Operational Research.
- et al. (2015). On the performance of Phase-I dispersion control charts for process monitoring. Quality and Reliability Engineering International.
- et al. (2017). On enhanced sensitivity of nonparametric EWMA control charts for process monitoring. Scientia Iranica. Transaction E Industrial Engineering.
- et al. (2017). An efficient nonparametric EWMA Wilcoxon signed-rank chart for monitoring location. Quality and Reliability Engineering International.
- et al. (2018). On designing a new cumulative sum Wilcoxon signed rank chart for monitoring process location. PLOS ONE.
- et al. (2005). New corrections for old control charts. Quality Engineering.
- Recent advances in process monitoring: Nonparametric and variable-selection methods for Phase-I and Phase II. Quality Engineering.
- Phase-I distribution-free analysis of univariate data. Journal of Quality Technology.
- Phase-I distribution-free analysis of multivariate data. Technometrics.
- Phase I statistical process control charts: An overview and some results. Quality Engineering.
- Designing Phase I X-bar charts with small sample sizes. Quality and Reliability Engineering International.
- Cluster-based profile analysis in Phase I. Journal of Quality Technology.
- A distribution-free multivariate control chart for Phase-I applications. Quality and Reliability Engineering International.
- A new distribution-free control chart for joint monitoring of unknown location and scale parameters of continuous distributions. Quality and Reliability Engineering International.
- Distribution-free Phase II CUSUM control chart for joint monitoring of location and scale. Quality and Reliability Engineering International.
- A comparison of Phase-I control charts. South African Journal of Industrial Engineering.
- Un nuovo test non parametrico per il confronto tra due gruppi campionari. Giornale degli Economisti.