
Response improvement in complex experiments by co-information composite likelihood optimization

Published in: Statistics and Computing

Abstract

We propose an adaptive procedure for improving the response outcomes of complex combinatorial experiments. New experiment batches are chosen by minimizing the co-information composite likelihood (COIL) objective function, which is derived by coupling importance sampling and composite likelihood principles. We show convergence of the best experiment within each batch to the globally optimal experiment in finite time, and carry out simulations to assess the convergence behavior as the design space size increases. The procedure is tested as a new enzyme engineering protocol in an experiment with a design space size of order $10^7$.
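The adaptive batch loop described above builds on importance-sampling ideas in the spirit of the cross-entropy method. The following is a minimal sketch of that generic sample/threshold/update loop only, not of the COIL objective itself; the toy response function, factor count, and smoothing constant are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def response(x):
    """Hypothetical noisy response over a binary design space; this toy
    phi(x) + eps stands in for the expensive laboratory response."""
    weights = np.linspace(-1.0, 1.0, x.shape[-1])
    return x @ weights + rng.normal(scale=0.1, size=x.shape[0])

def adaptive_batch_search(n_factors=10, batch_size=50, elite_frac=0.2,
                          n_batches=30, smooth=0.7):
    """Sample a batch from an independence model p(x | theta), keep the
    designs whose response exceeds the running quantile gamma_t, and refit
    theta on those elites (a smoothed cross-entropy-style update)."""
    theta = np.full(n_factors, 0.5)      # Bernoulli inclusion probabilities
    best_x, best_y = None, -np.inf
    for _ in range(n_batches):
        batch = (rng.random((batch_size, n_factors)) < theta).astype(float)
        y = response(batch)
        gamma_t = np.quantile(y, 1.0 - elite_frac)   # elite threshold
        elites = batch[y >= gamma_t]
        theta = smooth * theta + (1.0 - smooth) * elites.mean(axis=0)
        if y.max() > best_y:
            best_y, best_x = float(y.max()), batch[np.argmax(y)].copy()
    return best_x, best_y
```

In the paper's setting each "batch" is a physical experiment round and the sampling distribution is updated by minimizing the COIL objective rather than by this plain elite refit.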


Fig. 1
Fig. 2
Fig. 3



Author information


Corresponding author

Correspondence to Davide Ferrari.

Appendix: Proofs

A.1 Proof of Proposition 1

The parameter estimate computed at step $t+1$ solves the following set of equations over $\Theta_{\mathrm{KSA}}$:

(6.1)

As $n \to \infty$, by Assumptions A1–A3, the right-hand side of the above equation converges to a quantity proportional to

(6.2)

Note that

$$\mathbb{P}(Y_t \geq \gamma_t, \mathbf{X}_t = \mathbf{x}) = \frac{\mathbb{P}(\varphi(\mathbf{x}) + \epsilon \geq \gamma_t)\, p(\mathbf{x} \mid \theta_t)}{\rho_t}, $$

where $\rho_t = \mathbb{P}(Y_t \geq \gamma_t)$. By Assumption A4, there exists $\theta_{t+1} \in \Theta_{\mathrm{KSA}}$ such that setting $\theta = \theta_{t+1}$ in (6.2) gives

As $n \to \infty$, for the solution of (6.2) evaluated at the optimal point $\mathbf{x}^\ast$, we have

A.2 Proof of Proposition 2

Let $\beta_t = c_\epsilon / \rho_t$. Proposition 1 implies

$$p_t (\mathbf{x}^\ast) \geq p_0 \beta_1 \beta_2 \cdots \beta_t = p_0 \prod_{i = 1}^t \beta_i, $$

where $p_0 = p_0(\mathbf{x}^\ast)$. When $n$ is large, Proposition 1 implies

(6.3)

Taking the log of the right-hand side of the above expression gives

(6.4)

as $T \to \infty$ if . This proves (3.1). Moreover, (6.3) and (6.4) imply the bound in (3.2).
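With $\beta_i = c_\epsilon / \rho_i$ as defined above, the logarithmic step is the following algebraic identity (a sketch only; the precise statement of (6.4) is in the full text):

$$\log\Bigl( p_0 \prod_{i=1}^{T} \beta_i \Bigr) = \log p_0 + \sum_{i=1}^{T} \log \beta_i = \log p_0 + \sum_{i=1}^{T} \log \frac{c_\epsilon}{\rho_i}, $$

so the lower bound on $p_T(\mathbf{x}^\ast)$ increases at every step at which $\rho_i < c_\epsilon$, i.e. $\beta_i > 1$.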


About this article

Cite this article

Ferrari, D., Borrotti, M. & De March, D. Response improvement in complex experiments by co-information composite likelihood optimization. Stat Comput 24, 351–363 (2014). https://doi.org/10.1007/s11222-013-9374-8

