Computation of the normalising constant for product-form models of distributed systems with synchronisation

https://doi.org/10.1016/j.future.2019.10.045Get rights and content

Highlights

  • We consider stochastic models of distributed systems with fork-join and batches of jobs.

  • We provide an algorithm for the efficient computation of the performance indices.

  • We compare the performance of the algorithm with the state of the art.

  • We show some application scenarios in the case of distributed systems.

Abstract

Performance evaluation of large distributed systems plays a pivotal role in the design of Internet of Things (IoT) applications, or Big Data analysis, where high scalability and low response times are required. However, this performance assessment requires models, methodologies and tools tailored for this type of systems. Product-form queueing networks have been widely adopted for the analysis of sequential computations thanks to the availability of efficient algorithms for the computation of the average performance indices. However, this formalism has strong limitations since it does not allow one to model fork-join constructs or batches of jobs. Product-form stochastic Petri nets partially overcome these limitations but, on the other hand, general algorithms for the computation of the expected performance indices are not known or require strong assumptions on the model structure. In this paper, we define a new algorithm for the analysis of product-form stochastic Petri nets which works under much less restrictive assumptions than those previously proposed. The idea is that, in many cases, the entire state space of the model can be stored in memory thanks to multi-valued decision diagrams and the computation of the net’s performance indices can take advantage of the tree structure that characterises this representation. Finally, we present a case study and several examples that will be used to study the performance of the algorithm for realistic scenarios.

Introduction

Performance evaluation of large distributed systems is a challenging research topic that has attracted much attention over the last decades. In many practical cases, benchmarking the systems is too complicated and simulations may be time expensive and inappropriate for studies that must compare many possible configurations with the aim of choosing the best according to a certain metric. Analytical models, despite their strong assumptions, have shown to be a viable route to understand and tune the system’s performance at the early stages of its development (see, e.g., [1]).

Many works rely on queueing systems to study distributed architectures (see, e.g., [2], [3] for some recent references) but due to their very abstract level of representation of resource contention, Markovian queueing networks (QNs) find a wider application (see, e.g., [4], [5], [6]). Under mild conditions, Markovian queueing networks admit a product-form solution, i.e., the expression of the stationary distribution can be derived without resorting to the explicit solution of the Markov chain underlying the model. However, product-forms do not immediately imply the analytical tractability of the models. Indeed, for queueing networks several algorithms have been defined for the computation of the average performance indices in an efficient and numerically stable way (see [7] for a survey of the classical algorithms and [8] for a recent application). The drawback of this formalism consists in its limitations: aside from the assumptions on the distributions of service times and on the scheduling disciplines, product-form queueing networks are not able to model fork-join constructs nor batches of jobs or resources. Clearly, in distributed systems, this is a serious limitation. For example, in MapReduce based computations, the Map and Reduce phases correspond to fork and join constructs, respectively.

One of the most important formalisms that overcome the limitations of QN are the stochastic Petri nets (SPNs) [9], [10], [11]. They are widely used for modelling and performance evaluation of computer systems, computer networks, telecommunication systems as well as in bioinformatics where they are employed in modelling of various biological and biochemical processes. Like QNs [7], their underlying stochastic process is a continuous-time Markov chain (CTMC). Also like QNs, in practical use, SPNs suffer from the problem of state space explosion which consists in an exponential, in model size, growth of the number of distinct states of the underlying CTMC. This limits the scope of performance evaluation methods to models for which the underlying CTMC is small enough to be solved (i.e., to compute its transient or stationary probability distribution), or has a special structure. Because of these difficulties with solving models with large state spaces, a wide range of approximation methods and special solution methods for models with special structure or characteristics have been developed.

Product-form models, firstly identified for QNs [12], [13], have been studied for SPNs [14], [15], [16], [17] and other modelling formalisms; for a model in this class, the stationary probability distribution can be obtained in the form of a product over model components. State probabilities for these models are usually obtained in an unnormalised form (formally, a measure over the state space of the process is obtained), requiring the computation of the normalising constant which can be computed directly as a sum of unnormalised probabilities over the state space of the underlying CTMC or using more efficient convolution algorithms [18], [19]. The normalisation phase is crucial for determining most of the performance indices of the model.

The stationary analysis of product-form models is greatly simplified in comparison to other models and more complex models with larger state spaces can be efficiently analysed. However, the convolution algorithm for SPNs requires a special structure of the reachability set and can only be applied to the class of S-invariant reachable SPNs. Furthermore, at the current state of the art, it is not known how to automatically check the membership of this class without generating the reachability set, and the convolution is less efficient than in the case of QNs due to the need to solve a large number of systems of linear Diophantine equations. The S-invariant reachability restriction severely limits the utility of the convolution algorithm.

Multi-valued decision diagrams (MDDs) [20], data structures that encode discrete functions of discrete variables, have been used with great success in generation and storage of very large state spaces for bounded SPN models [21], [22]. In the present work, we use MDDs that encode state spaces of product-form models in order to efficiently compute the normalising constant for models with finite state spaces. We also propose related methods for the computation of performance measures of product-form SPNs with finite reachability sets. While we need to generate the reachability set of the SPN, we do not require S-invariant reachability and the proposed algorithm can thus be applied to general SPNs. Also, in contrast to convolution, we do not need to solve systems of Diophantine equations. Hence, the performance of the proposed algorithm on S-invariant reachable SPNs is thus much better than the convolution algorithm if reachability set generation is not taken into account, while being comparable otherwise.

The paper is laid out as follows. First, in Section 2, we propose a review of the literature on this topic. In Section 3, we introduce SPNs and product-forms. Section 4 introduces some notation for state spaces of the models handled in the paper, and defines MDDs. Based on these definitions, we introduce a novel recursive algorithm for computation of the normalising constant for general product-form models. In particular, the algorithm can be applied to SPNs. Section 5 compares the proposed algorithm to the convolution algorithm for a class of product-form SPNs [19] and proposes methods for performance evaluation, comparing them to the mean value algorithm (MVA) [23]. In Section 6 we report results of computational experiments on SPNs in which we test performance of the proposed algorithm and compare it to the performance of the SPN convolution algorithm. Finally, Section 7 concludes the paper. Appendix A contains the proofs of statements from the main part of the paper. While our main contributions are directed to the case of SPNs, for the sake of clarity we include in Appendix B a comparison with the convolution algorithm for a class of product-form QNs [18], which can be considered as a special case of what we propose here. The point of the latter appendix is not to compare the proposed algorithm with the convolution algorithm on QNs, but to drive the intuition of the reader who is expert on the queueing network analysis.

Section snippets

Related work

In this section, we review the literature on the solution of SPNs and the theoretical frameworks used for the definition of our algorithm. We leave the discussion on the applications of product-form SPNs for the performance evaluation of distributed systems to Section 3, where the reader will have a clearer view of their potential.

In [24], [25] decision diagrams have been used for the analysis of SPNs. However, unlike in the present paper, (1) these approaches rely on encoding, using decision

Stochastic Petri nets

In this section, we introduce stochastic Petri nets (SPNs) [9], [10], [11] and review their main properties. Then, we introduce a model of a distributed computation that will serve as the running example.

Algorithm MDD-rec

As explained in Section 3, the main problem in the analysis of product-form models is efficient computation of the normalising constant G. In this section, we introduce a novel recursive algorithm, named MDD-rec, for its computation. MDD-rec computes the normalising constant by walking over nodes of a multi-valued decision diagram (MDD) that encodes the state space S of the model. Generation of this MDD is out of the scope of this paper (for further details see [39]) and it is assumed that the

Algorithm comparison with convolution on product-form stochastic Petri nets

One of the well-known algorithms for computation of the normalising constant in product-form SPNs is the convolution algorithm presented in [19]. However, this algorithm, in contrast with MDD-rec, is not applicable to any product-form SPN, but only to a subclass of S-invariant reachable SPNs. In this section, we first introduce the conditions required by S-invariant reachable SPNs and then we compare the convolution algorithm of [19] with the proposed algorithm MDD-rec.

Experiments

We have implemented the proposed algorithm MDD-rec 1 for SPNs in C++, using the Multi-terminal and Edge-valued Decision Diagram LibrarY (MEDDLY) [42]. In our implementation, the generation of the MDD encoding of the reachability set of the SPN is handled by an efficient saturation algorithm [39] provided by MEDDLY. We have implemented the labelling function L from the algorithm MDD-rec using the hashing-based map implementation unordered_map from the C++11 standard.

To compare the performance of

Conclusion

In this paper, we have presented the algorithm MDD-rec for computation of the normalising constant for product-form models with finite state spaces, based on efficient generation and encoding of the models’ state spaces using MDDs. The algorithm can be applied for the computation of the stationary distribution of general product-form models with finite state spaces, although its main domain of application consists of models expressed in terms of SPNs.

Product-form SPNs are crucial for the

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Simonetta Balsamo is a full professor of Computer Science at the University Ca’ Foscari of Venice, Italy. Her research interests include performance and reliability modelling and analysis of computer and communication systems, parallel and distributed processing, cloud computing, distributed simulation, quantitative analysis of software architectures and integration of specification languages and performance models. She has published several papers in international journals, conference

References (46)

  • BonaldT. et al.

    Balanced fair resource sharing in computer clusters

    Perform. Eval.

    (2017)
  • AyestaU. et al.

    On a unifying product form framework for redundancy models

    Perform. Eval.

    (2018)
  • SmithC. et al.

    Performance Solutions: A Practical Guide to Creating Responsive, Scalable Software

    (2001)
  • P.D. Ezhilchelvan, I. Mitrani, Multi-class resource sharing with batch arrivals and complete blocking, in: QEST 2017,...
  • J. Rolia, G. Casale, D. Krishnamurthy, S. Dawson, S. Kraft, Predictive modelling of SAP ERP applications: challenges...
  • E.C. d’Oro, S. Colombo, M. Gribaudo, M. Iacono, D. Manca, P. Piazzolla, Modeling and evaluating performances of complex...
  • BalsamoS. et al.

    Queueing Networks

    (2007)
  • MolloyM.K.

    Performance analysis using stochastic Petri nets

    IEEE Trans. Comput.

    (1982)
  • MarsanM.A. et al.

    Modelling with Generalized Stochastic Petri Nets

    (1995)
  • MurataT.

    Petri nets: Properties, analysis and applications

    Proc. IEEE

    (1989)
  • JacksonJ.R.

    Jobshop-like queueing systems

    Manage. Sci.

    (1963)
  • BaskettF. et al.

    Open, closed, and mixed networks of queues with different classes of customers

    J. ACM

    (1975)
  • HendersonW. et al.

    A net level performance analysis of Stochastic Petri Nets

    J. Aust. Math. Soc. Ser. B

    (1989)
  • Cited by (0)

    Simonetta Balsamo is a full professor of Computer Science at the University Ca’ Foscari of Venice, Italy. Her research interests include performance and reliability modelling and analysis of computer and communication systems, parallel and distributed processing, cloud computing, distributed simulation, quantitative analysis of software architectures and integration of specification languages and performance models. She has published several papers in international journals, conference proceedings, books and special editions. She has served as associated editor of Performance Evaluation Journal, general chair, programme chair and programme committee member for several international conferences. She has taught various courses at undergraduate and graduate level, including Stochastic Modelling, Performance Analysis, Discrete-event Simulation, Operating Systems, Distributed Systems and Computer Networks.

    Andrea Marin is associate professor of Computer Science at the University Ca’ Foscari of Venice, Italy. His research interests include performance evaluation and reliability modelling using stochastic models or simulations. He has published more than 80 peer-review papers on theoretical and applicative domains, including analysis of cloud architectures, distributed software systems and telecommunication infrastructures. He is in the programme committee of many conferences and symposiums in the field of performance evaluation.

    Ivan Stojic is a researcher at the Fondazione Bruno Kessler in Trento, Italy. He received his Ph.D. in computer science at the Ca’ Foscari University of Venice. His research interests concern formal languages and analysis methods for modelling, verification and performance evaluation of computer systems, especially in the context of distributed and dynamically adaptive systems. He has been working in publicly and privately funded research projects, publishing in international computer science conferences.

    View full text