# Model Checking Parameterized Asynchronous Shared-Memory Systems

- First Online:

## Abstract

We characterize the complexity of liveness verification for parameterized systems consisting of a leader process and arbitrarily many anonymous and identical contributor processes. Processes communicate through a shared, bounded-value register. While each operation on the register is atomic, there is no synchronization primitive to execute a sequence of operations atomically.

We analyze the case in which processes are modeled by finite-state machines or pushdown machines and the property is given by a Büchi automaton over the alphabet of read and write actions of the leader. We show that the problem is decidable, and has a surprisingly low complexity: it is NP-complete when all processes are finite-state machines, and is PSPACE-hard and in NEXPTIME when they are pushdown machines. This complexity is lower than for the non-parameterized case: liveness verification of finitely many finite-state machines is PSPACE-complete, and undecidable for two pushdown machines.

For finite-state machines, our proofs characterize infinite behaviors using existential abstraction and semilinear constraints. For pushdown machines, we show how contributor computations of high stack height can be simulated by computations of many contributors, each with low stack height. Together, our results characterize the complexity of verification for parameterized systems under the assumptions of anonymity and asynchrony.

## 1 Introduction

We study the verification problem for *parameterized asynchronous shared-memory systems* [9, 12]. These systems consist of a *leader* process and arbitrarily many identical *contributors*, processes with no identity, running at arbitrary relative speeds. The shared-memory consists of a read/write register that all processes can access to perform either a read operation or a write operation. The register is bounded: the set of values that can be stored is finite. Read/write operations execute atomically but sequences of operations do not: no process can conduct an atomic sequence of reads and writes while excluding all other processes. In a previous paper [9], we have studied the complexity of safety verification, which asks to check if a safety property holds no matter how many contributors are present. In a nutshell, we showed that the problem is coNP-complete when both leader and contributors are finite-state automata and PSPACE-complete when they are pushdown automata.

In this paper we complete the study of this model by addressing the verification of liveness properties specified as \(\omega \)-regular languages (which in particular encompasses LTL model-checking). Given a property like “every request is eventually granted” and a system with a fixed number of processes, one is often able to guess an upper bound on the maximal number of steps until the request is granted, and replace the property by the safety property “every request is granted after at most \(K\) steps.” In parameterized systems this bound can depend on the (unbounded) number of processes, and so reducing liveness to safety, or to finitary reasoning, is not obvious. Indeed, for many parameterized models, liveness verification is undecidable even if safety is decidable [8, 13].

Our results show that there is no large complexity gap between liveness and safety verification: liveness verification (existence of an infinite computation violating a property) is NP-complete in the finite-state case, and PSPACE-hard and in NEXPTIME in the pushdown case. In contrast, remember that liveness checking is already PSPACE-complete for a *finite* number of finite-state machines, and undecidable for a *fixed* number of pushdown systems. Thus, not only is liveness verification decidable in the parameterized setting but the complexity of the parameterized problem is *lower* than in the non-parameterized case, where all processes are part of the input. We interpret this as follows: in asynchronous shared-memory systems, the existence of arbitrarily many processes leads to a “noisy” environment, in which contributors may hinder progress by replying to past messages from the leader, long after the computation has moved forward to a new phase. It is known that imperfect communication can *reduce* the power of computation and the complexity of verification problems: the best known example are lossy channel systems, for which many verification problems are decidable, while they are undecidable for perfect channels (see e.g. [1, 3]). Our results reveal another instance of the same phenomenon.

Technically, our proof methods are very different from those used for safety verification. Our previous results [9] relied on a fundamental Simulation Lemma, inspired by Hague’s work [12], stating that the *finite* behaviors of an arbitrary number of contributors can be simulated by a finite number of *simulators*, one for each possible value of the register. Unfortunately, the Simulation Lemma does not extend to infinite behaviors, and so we have to develop new ideas. In the case in which both leader and contributors are finite-state machines, the NP-completeness result is obtained by means of a combination of an abstraction that overapproximates the set of possible infinite behaviors, and a semilinear constraint that allows us to regain precision. The case in which both leader and contributors are pushdown machines is very involved. In a nutshell, we show that pushdown runs in which a parameter called the *effective stack height* grows too much can be “distributed” into a number of runs with smaller effective stack height. We then prove that the behaviors of a pushdown machine with a bounded effective stack height can be simulated by an exponentially larger finite-state machine.

*Related Work.* Parameterized verification has been studied extensively, both theoretically and practically. While very simple variants of the problem are already undecidable [6], many non-trivial parameterized models retain decidability. There is no clear “rule of thumb” that allows one to predict what model checking problems are decidable, nor their complexities, other than “liveness is generally harder than safety.” For example, coverability for Petri nets—in which finite-state, identityless processes communicate via rendezvous or global shared state— is EXPSPACE-complete, higher than the PSPACE-completeness of the non-parameterized version, and verification of liveness properties can be equivalent to Petri net reachability, for which we only know non-primitive recursive upper bounds, or even undecidable. Safety verification for extensions to Petri nets with reset or transfer, or broadcast protocols, where arbitrarily many finite-state processes communicate through broadcast messages, are non-primitive recursive; liveness verification is undecidable in all cases [2, 8, 13]. Thus, our results, which show simultaneously lower complexity than non-parameterized problems, as well as similar complexity for liveness and safety, are quite unexpected.

German and Sistla [10] and Aminof *et al.* [4] have studied a parameterized model with rendezvous as communication primitive, where processes are finite-state machines. Model checking the fully symmetrical case—only contributors, no leaders—runs in polynomial time (other topologies have also been considered [4]), while the asymmetric case with a leader is EXPSPACE-complete. In this paper we study the same problems, but for a shared memory communication primitive.

*Population protocols* [5] are another well-studied model of identityless asynchronous finite-state systems communicating via rendezvous. The semantics of population protocols is given over fair runs, in which every potential interaction that is infinitely often enabled is infinitely often taken. With this semantics, population protocols compute exactly the semilinear predicates [5]. In this paper we do not study what our model can compute (in particular, we are agnostic with respect to which fairness assumptions are reasonable), but what we can compute or decide about the model.

## 2 Formal Model: Non-atomic Networks

In this paper, we identify systems with languages. System actions are modeled as symbols in an alphabet, executions are modeled as infinite words, and the system itself is modeled as the language of its executions. Composition operations that combine systems into larger ones are modeled as operations on languages.

### 2.1 Systems as Languages

An *alphabet*\({\varSigma }\) is a finite, non-empty set of *symbols*. A *word* over \({\varSigma }\) is a finite sequence over \({\varSigma }\) including the empty sequence denoted \(\varepsilon \), and a *language* is a set of words. An \(\omega \)*-word* over \({\varSigma }\) is an infinite sequence of symbols of \({\varSigma }\), and an \(\omega \)*-language* is a set of \(\omega \)-words. We use \({\varSigma }^*\) (resp. \({\varSigma }^\omega \)) to denote the language of all words (resp. \(\omega \)-words) over \({\varSigma }\). When there is no ambiguity, we use “words” to refer to words or \(\omega \)-words. We do similarly for languages. Let \(w\) be a sequence over some alphabet, define \(\text {dom}(w)=\{1,\ldots ,n\}\) if \(w = a_1 a_2 \ldots a_n\) is a word; else (\(w\) is an \(\omega \)-word) \(\text {dom}(w)\) denote the set \(\mathbb {N}\setminus \{0\}\). Elements of \(\text {dom}(w)\) are called *positions*. The *length* of a sequence \(w\) is defined to be \(\sup \text {dom}(w)\) and is denoted \(\vert {w}\vert \). We denote by \((w)_i\) the symbol of \(w\) at position \(i\) if \(i\in \text {dom}(w)\), \(\varepsilon \) otherwise. Moreover, let \((w)_{i..j}\) with \(i,j\in \mathbb {N}\) and \(i<j\) denote \( (w)_{i} (w)_{i+1} \ldots (w)_{j}\). Also \( (w)_{i..\infty }\) denotes \( (w)_{i}(w)_{i+1}\ldots \) For words \(u, v\in ({\varSigma }^\omega \cup {\varSigma }^{*})\), we say \(u\) is a *prefix* of \(v\) if either \(u = v\) or \(u \in {\varSigma }^*\) and there is a \(w\in ({\varSigma }^\omega \cup {\varSigma }^{*})\) such that \(v = u w\).

*Combining Systems: Shuffle.* Intuitively, the shuffle of systems \(L_1\) and \(L_2\) is the system interleaving the executions of \(L_1\) with those of \(L_2\). Given two \(\omega \)-languages \(L_1 \subseteq {\varSigma }_1^\omega \) and \(L_2 \subseteq {\varSigma }_2^\omega \), their *shuffle*, denoted by \(L_1 \between L_2\), is the \(\omega \)-language over \(({\varSigma }_1 \cup {\varSigma }_2)\) defined as follows. Given two \(\omega \)-words \(x \in {\varSigma }_1^\omega , y \in {\varSigma }_2^\omega \), we say that \(z \in ({\varSigma }_1 \cup {\varSigma }_2)^\omega \) is an *interleaving* of \(x\) and \(y\) if there exist (possibly empty) words \(x_1, x_2, \ldots , x_i, \ldots \in {\varSigma }_1^*\) and \( y_1, y_2, \ldots , y_i, \ldots \in {\varSigma }_2^*\) such that each \(x_1 x_2 \cdots x_i\) is a prefix of \(x\), and each \(y_1 y_2 \cdots y_i\) is a prefix of \(y\), and \(z=x_1y_1 x_2y_2 \cdots x_iy_i \cdots \in {\varSigma }^\omega \) is an \(\omega \)-word. Then \(L_1 \between L_2 = \textstyle {\bigcup _{x\in L_1, y\in L_2}} x \between y\), where \(x \between y \) denotes the set of all interleavings of \(x\) and \(y\). For example, if \(L_1 = ab^\omega \) and \(L_2 = ab^\omega \), we get \(L_1 \between L_2 = (a+ab^*a)b^\omega \). Shuffle is associative and commutative, and so we can write \(L_1 \between \cdots \between L_n\) or \({}\between _{i=1}^n L_i\).

*Combining Systems: Asynchronous product.* The asynchronous product of \(L_1 \subseteq {\varSigma }_1^\omega \) and \(L_2 \subseteq {\varSigma }_2^\omega \) also interleaves the executions but, this time, the actions in the common alphabet must now be executed jointly. The \(\omega \)-language of the resulting system, called the *asynchronous product* of \(L_1\) and \(L_2\), is denoted by \(L_1 \parallel L_2\), and defined as follows. Let \( Proj _{{\varSigma }}(w)\) be the word obtained by erasing from \(w\) all occurrences of symbols not in \({\varSigma }\). \(L_1 \parallel L_2\) is the \(\omega \)-language over the alphabet \({\varSigma }={\varSigma }_1\cup {\varSigma }_2\) such that \(w\in L_1 \parallel L_2\) iff \( Proj _{{\varSigma }_1}(w)\) and \( Proj _{{\varSigma }_2}(w)\) are prefixes of words in \(L_1\) and \(L_2\), respectively. We abuse notation and write \(w_1 \parallel L_2\) instead of \(\{w_1\} \parallel L_2\) when \(L_1={\left\{ w_1 \right\} }\). For example, let \({\varSigma }_1 = \{a,c\}\) and \({\varSigma }_2 = \{b,c\}\). For \(L_1 = (ac)^\omega \) and \(L_2 = (bc)^\omega \) we get \(L_1 \parallel L_2 = ((ab+ba)c)^\omega \). Observe that the language \(L_1\parallel L_2\) depends on \(L_1\), \(L_2\) and also on \({\varSigma }_1\) and \({\varSigma }_2\). For example, if \({\varSigma }_1 = \{a\}\) and \({\varSigma }_2 = \{b\}\), then \(\{a^\omega \} \parallel \{b^\omega \} = (a+b)^\omega \), but if \({\varSigma }_1 = \{a,b\}={\varSigma }_2\), then \(\{a^\omega \} \parallel \{b^\omega \} = \emptyset \). So we should more properly write \(L_1\parallel _{{\varSigma }_1,{\varSigma }_2} L_2\). However, since the alphabets \({\varSigma }_1\) and \({\varSigma }_2\) will be clear from the context, we will omit them. Like shuffle, asynchronous product is also associative and commutative, and so we write \(L_1 \parallel \cdots \parallel L_n\). Notice finally that shuffle and asynchronous product coincide if \({\varSigma }_1 \cap {\varSigma }_2 = \emptyset \), but usually differ otherwise. For instance, if \(L_1 = ab^\omega \) and \(L_2 = ab^\omega \), we get \(L_1 \parallel L_2 = ab^\omega \).

We describe systems as combinations of shuffles and asynchronous products, for instance we write \(L_1 \parallel (L_2 \between L_3)\). In these expressions we assume that \(\between \) binds tighter than \(\parallel \), and so \(L_1 \between L_2 \parallel L_3\) is the language \((L_1 \between L_2)\parallel L_3\), and not \(L_1 \between (L_2 \parallel L_3)\).

### 2.2 Non-atomic Networks

A non-atomic network is an infinite family of systems parameterized by a number \(k\). The \(k\)th element of the family has \(k+1\) components communicating through a global store by means of read and write actions. The store is modeled as an atomic register whose set of possible values is finite. One of the \(k+1\) components is the leader, while the other \(k\) are the contributors. All contributors have exactly the same possible behaviors (they are copies of the same \(\omega \)-language), while the leader may behave differently. The network is called non-atomic because components cannot atomically execute sequences of actions, only one single read or write.

Formally, we fix a finite set \(\mathcal{G}\) of *global values*. A *read-write alphabet* is any set of the form \(\mathcal{A}\times \mathcal{G}\), where \(\mathcal{A}\) is a set of *read* and *write (actions)*. We denote a symbol \((a,g) \in \mathcal{A}\times \mathcal{G}\) by \(a(g)\) and define \(\mathcal{G}(a_1, \ldots , a_n) = \{ a_i(g) \mid 1\le i\le n,\,~g\in \mathcal{G}\} \).

We fix two languages \(\mathcal{D}\subseteq {\varSigma }_{\mathcal{D}}^\omega \) and \(\mathcal{C}\subseteq {\varSigma }_{\mathcal{C}}^\omega \), called the *leader* and the *contributor*, with alphabets \({\varSigma }_{\mathcal{D}}=\mathcal{G}(r_d,w_d)\) and \({\varSigma }_{\mathcal{C}}=\mathcal{G}(r_c, w_c)\), respectively, where \(r_d, r_c\) are called *reads* and \(w_c, w_d\) are called *writes*. We write \(w_{\star }\) (respectively, \(r_{\star }\)) to stand for either \(w_c\) or \(w_d\) (respectively, \(r_c\) or \(r_d\)). We further assume that \( Proj _{\{r_{\star }(g),w_{\star }(g)\}}(\mathcal{D}\cup \mathcal{C})\ne \emptyset \) holds for every \(g\in \mathcal{G}\), else the value \(g\) is never used and can be removed from \(\mathcal{G}\).

*store*, over \({\varSigma }_{\mathcal{D}}\cup {\varSigma }_{\mathcal{C}}\). It models the sequences of read and write operations supported by an atomic register: a write \(w_{\star }(g)\) writes \(g\) to the register, while a read \(r_{\star }(g)\) succeeds when the register’s current value is \(g\). Initially the store is only willing to execute a write. Formally \(\mathcal{S}\) is defined as \( \Bigl ( \sum _{g \in \mathcal{G}} \big (\;w_{\star }(g)\, (r_{\star }(g))^*\;\big ) \Bigr )^\omega + \Bigl ( \sum _{g \in \mathcal{G}} \big (\;w_{\star }(g)\, (r_{\star }(g))^*\;\big )^* \; \sum _{g \in \mathcal{G}} \big (\;w_{\star }(g)\, (r_{\star }(g))^\omega \;\big ) \Bigr ) \) and any finite prefix thereof. Observe that \(\mathcal{S}\) is completely determined by \({\varSigma }_{\mathcal{D}}\) and \({\varSigma }_{\mathcal{C}}\). Figure 1 depicts a store with \(\{1,2,3\}\) as possible values as the language of a transition system.

**Definition 1**

Let \(\mathcal{D}\subseteq {\varSigma }_{\mathcal{D}}^\omega \) and \(\mathcal{C}\subseteq {\varSigma }_{\mathcal{C}}^\omega \) be a leader and a contributor, and let \(k\ge 1\). The \(k\)*-instance of the*\((\mathcal{D},\mathcal{C})\)*-network* is the \(\omega \)-language \(\mathcal{N}^{(k)} = (\mathcal{D}\parallel \mathcal{S}\parallel {}\between _{k} \mathcal{C}) \) where \({}{\between }_{k} \mathcal{C}\) stands for \({}\between _{i=1}^k \mathcal{C}\). The \((\mathcal{D},\mathcal{C})\)-network \(\mathcal{N}\) is the \(\omega \)-language \(\mathcal{N}= \bigcup _{k=1}^\infty \mathcal{N}^{(k)}\). We omit the prefix \((\mathcal{D},\mathcal{C})\) when it is clear from the context. It follows easily from the properties of shuffle and asynchronous product that \(\mathcal{N}= (\mathcal{D}\parallel \mathcal{S}\parallel \, {\between }_{\infty } \mathcal{C})\), where \({\between }_{\infty } \mathcal{C}\) is an abbreviation of \({}\bigcup _{k=1}^\infty {\between }_{k} \mathcal{C}\).

Next we introduce a notion of *compatibility* between a word of the leader and a multiset of words of the contributor (a multiset because several contributors may execute the same sequence of actions). Intuitively, compatibility means that all the words can be interleaved into a legal infinite sequence of reads and writes supported by an atomic register—that is, an infinite sequence belonging to \(\mathcal{S}\). Formally:

**Definition 2**

Let \(u\in {\varSigma }_{\mathcal{D}}^\omega \), and let \(M={\left\{ v_1, \ldots , v_k \right\} }\) be a multiset of words over \({\varSigma }_{\mathcal{C}}^\omega \) (possibly containing multiple copies of a word). We say that \(u\) is *compatible* with \(M\) iff the \(\omega \)-language \((u \parallel \mathcal{S}\parallel {}\between _{i=1}^k v_i)\) is non-empty. When \(u\) and \(M\) are compatible, there exists a word \(s \in \mathcal{S}\) such that \((u \parallel s \parallel {}\between _{i=1}^k v_i) \ne \emptyset \). We call \(s\) a *witness* of compatibility.

*Example 1*

Consider the network with \(\mathcal{G}=\{1,2,3\}\) where the leader, store, and contributor languages are given by the infinite paths of the transition systems from Fig. 1. The only \(\omega \)-word of \(\mathcal{D}\) is \((r_d(1)r_d(2)r_d(3))^\omega \) and the \(\omega \)-language of \(\mathcal{C}\) is \((w_c(1)r_c(3)r_c(1)+w_c(2)r_c(1)r_c(2)+w_c(3)r_c(2)r_c(3))^\omega \). For instance, \(\mathcal{D}=(r_d(1)r_d(2)r_d(3))^\omega \) is compatible with the multiset \(M\) of \(6\)\(\omega \)-words obtained by taking two copies of \( (w(1) r(3) r(1))^{\omega }\), \( (w(2) r(1) r(2))^{\omega }\) and \( (w(3) r(2) r(3))^{\omega }\). The reader may be interested in finding another multiset compatible with \(\mathcal{D}\) and containing only 4 \(\omega \)-words.

*Stuttering Property.* Intuitively, the stuttering property states that if we take an \(\omega \)-word of a network \( \mathcal{N}\) and “stutter” reads and writes of the contributors, going e.g. from \(w_d(1) r_c(1)w_c(2)r_d(2) \ldots \) to \(w_d(1) r_c(1)r_c(1) w_c(2)w_c(2)w_c(2) r_d(2) \ldots \), the result is again an \(\omega \)-word of the network.

Let \(s\in \mathcal{S}\) be a witness of compatibility of \(u \in {\varSigma }_{\mathcal{D}}^{\omega }\) and \(M = {\left\{ v_1, \ldots , v_k \right\} }\). Pick a set \(I\) of positions (viz. \(I \subseteq \text {dom}(s)\)) such that \( (s)_i \in {\varSigma }_{\mathcal{C}}\) for each \(i\in I\), and pick a number \(\ell _i \ge 0\) for every \(i \in I\). Let \(s'\) be the result of simultaneously replacing each \( (s)_i \) by \( (s)_i^{\ell _i+1}\) in \(s\). We have that \(s' \in \mathcal{S}\). Now let \(v_s = (s)_{i_1}^{\ell _{i_1}} \cdot (s)_{i_2}^{\ell _{i_2}}\cdots \), where \(i_1 = \min (I)\), \(i_2 = \min (I\setminus \{i_1\})\), \(\ldots \) It is easy to see that \((u \parallel s' \parallel v_s \between {}\between _{i=1}^k v_i) \ne \emptyset \), and so \(u\) is compatible with \(M\oplus \{v_s\}\), the multiset consisting of \(M\) and \(v_s\), and \(s'\) is a witness of compatibility.

An easy consequence of the stuttering property is the *copycat lemma* [9].

**Lemma 1**

**( Copycat Lemma).** Let \(u \in {\varSigma }_{\mathcal{D}}^{\omega }\) and let \(M\) be a multiset of words of \({\varSigma }_{\mathcal{C}}^{\omega }\). If \(u\) is compatible with \(M\), then \(u\) is also compatible with \(M \oplus \{v\}\) for every \(v \in M\).

### 2.3 The Model-Checking Problem for Linear-Time Properties

We consider the model checking problem for linear-time properties, that asks, given a network \(\mathcal{N}\) and an \(\omega \)-regular language \(L\), decide whether \(\mathcal{N}\parallel L\) is non-empty. We assume \(L\) is given as a Büchi automaton \(A\) over \({\varSigma }_{\mathcal{D}}\). Intuitively, \(A\) is a tester that observes the actions of the leader; we call this the *leader model checking problem*.

**Given**: machines \(D \in \mathtt D \) and \(C \in \mathtt C \), and a Büchi automaton \(A\)**Decide**: Is \(\mathcal{N}_A=(L(A) \parallel L(D) \parallel \mathcal{S}\parallel {\between }_{\infty } L(C))\) non-empty?

In the next sections we prove that MC(FSM,FSM) and MC(PDM,FSM) are NP-complete, while MC(PDM,PDM) is in NEXPTIME and PSPACE-hard.

*Example 2*

Consider the instance of the model checking problem where \(\mathcal{D}\) and \(\mathcal{C}\) are as in Fig. 1, and \(A\) is a Büchi automaton recognizing all words over \({\varSigma }_{\mathcal{D}}\) containing infinitely many occurrences of \(r_d(1)\). Since \(\mathcal{D}\) is compatible with a multiset of words of the contributors, \(\mathcal{N}_A\) is non-empty. In particular, \(\mathcal{N}_A^{(4)}\ne \emptyset \).

Since \({\varSigma }_A={\varSigma }_\mathcal{D}\), we can replace \(A\) and \(D\) by a machine \(A\times D\) with a Büchi acceptance condition. The construction of \(A\times D\) given \(A\) and \(D\) is standard. In what follows, we assume that \(D\) comes with a Büchi acceptance condition and forget about \(A\).

There are two natural variants of the model checking problem, where \({\varSigma }_A = {\varSigma }_{\mathcal{C}}\), i.e., the alphabet of \(A\) contains the actions of all contributors, or \({\varSigma }_A = {\varSigma }_{\mathcal{D}} \cup {\varSigma }_{\mathcal{C}}\). In both these variants, the automaton A can be used to simulate atomic networks. Indeed, if the language of A consists of all sequences of the form \((w_d() r_c() w_c() r_d() )^\omega \), and we design the contributors so that they alternate reads and writes, then the accepting executions are those in which the contributors read a value from the store and write a new value in an atomic step. So the complexity of the model-checking problem coincides with the complexity for atomic networks (undecidable for PDMs and EXPSPACE-complete for FSMs), and we do not study it further.

## 3 MC(FSM,FSM) is NP-Complete

We fix some notations. A finite-state machine (FSM) \((Q,\delta , q_0)\) over \({\varSigma }\) consists of a finite set of states \(Q\) containing an initial state \(q_0\) and a transition relation \(\delta \subseteq Q\times {\varSigma }\times Q\). A word \(v \in {\varSigma }^{\omega }\) is *accepted* by an FSM if there exists a sequence \(q_1q_2\cdots \) of states such that \( (q_i, (v)_{i+1}, q_{i+1})\in \delta \) for all \(i\ge 0\). We denote by \(q_0 \xrightarrow {(v)_1} q_1 \xrightarrow {(v)_2} \cdots \) the *run* accepting \(v\). A Büchi automaton \((Q,\delta , q_0, F)\) is an FSM \((Q,\delta , q_0)\) together with a set \(F\subseteq Q\) of accepting states. An \(\omega \)-word \(v\in {\varSigma }^\omega \) is accepted by a Büchi automaton if there is a run \(q_0 \xrightarrow {(v)_1} q_1 \xrightarrow {(v)_2} \cdots \) such that \(q_j\in F\) for infinitely many positions \(j\). The \(\omega \)-language of a FSM or Büchi automaton \(A\), denoted by \(L(A)\), is the set of \(\omega \)-words accepted by \(A\).

In the rest of the section we show that MC(FSM,FSM) is NP-complete. Section 3.1 defines the infinite transition system associated to a (FSM,FSM)-network. Section 3.2 introduces an associated finite abstract transition system. Section 3.3 states and proves a lemma (Lemma 3) characterizing the cycles of the abstract transition system that, loosely speaking, can be concretized into infinite executions of the concrete transition system. Membership in NP is then proved using the lemma. NP-hardness follows from NP-hardness of reachability [9].

### 3.1 (FSM,FSM)-Networks: Populations and Transition System

We fix a Büchi automaton \(D=(Q_D,\delta _D, q_{0D}, F)\) over \({\varSigma }_{\mathcal{D}}\) and an FSM \(C=(Q_C, \delta _C, q_{0C})\) over \({\varSigma }_{\mathcal{C}}\). A *configuration* is a tuple \((q_D, g, \mathbf {p})\), where \(q_D\in Q_D\), \(g \in \mathcal{G}\cup \{\#\}\), and \(\mathbf {p} :Q_C \rightarrow \mathbb {N}\) assigns to each state of \(C\) a natural number. Intuitively, \(q_D\) is the current state of \(D\); \(g\) is a value or the special value \(\#\), modelling that the store has not been initialized yet, and no process read before some process writes; finally, \(\mathbf {p}(q)\) is the number of contributors currently at state \(q \in Q_C\). We call \(\mathbf {p}\) a *population* of \(Q_C\), and write \(|\mathbf {p}| = \sum _{q \in Q_C} \mathbf {p}(q)\) for the *size* of \(\mathbf {p}\). Linear combinations of populations are defined componentwise: for every state \(q \in Q_C\), we have \((k_1 \mathbf {p_1} + k_2 \mathbf {p_2})(q) := k_1 \mathbf {p_1}(q) +k_2 \mathbf {p_2}(q)\). Further, given \(q \in Q_C\), we denote by \(\mathbf {q}\) the population \(\mathbf {q}(q') = 1\) if \(q=q'\) and \(\mathbf {q}(q') = 0\) otherwise, i.e., the population with one contributor in state \(q\) and no contributors elsewhere. A configuration is *accepting* if the state of \(D\) is accepting, that is whenever \(q_D\in F\). Given a set of populations \(\mathbf {P}\), we define \((q_D, g, \mathbf {P}) := \{ ( q_D, g, \mathbf {p}) \mid \mathbf {p} \in \mathbf {P} \}\).

\(X\) is the set of all configurations, and \(X_0 \subseteq X\) is the set of initial configurations, given by \((q_{0D}, \#, \mathbf {P_0})\), where \(\mathbf {P_0}= \{k\mathbf {q_{0C}} \mid k \ge 1\}\);

- \(T = T_D \cup T_C\), where
\(T_D\) is the set of triples \(\big ( \, ( q_D, g, \mathbf {p}) \, , \, t \, , \, ( q_D', g', \mathbf {p}) \, \big )\) such that \(t\) is a transition of \(D\), viz. \(t\in \delta _D\), and one of the following conditions holds: (i) \(t=(q_D, w_d(g'), q_D')\); or (ii)\(t=(q_D, r_d(g), q_D')\), \(g = g'\).

\(T_C\) is the set of triples \(\big ( \, (q_D, g, \mathbf {p}) \, , \, t \, , \, (q_D, g', \mathbf {p}') \, \big )\) such that \(t\in \delta _C\), and one of the following conditions holds: (iii) \(t=(q_C, w_c(g'), q_C')\), \(\mathbf {p} \ge \mathbf {q_C}\), and \(\mathbf {p}' = \mathbf {p} - \mathbf {q_C} + \mathbf {q'_C}\); or (iv) \(t=(q_C, r_c(g), q_C')\), \(\mathbf {p} \ge \mathbf {q_C}\), \(g=g'\), and \(\mathbf {p}' = \mathbf {p} - \mathbf {q_C} + \mathbf {q'_C}\).

Observe that \(|\mathbf {p}| = |\mathbf {p}'|\), because the total number of contributors of a population remains constant. Given configurations \(c\) and \(c'\), we write \(c{\displaystyle \mathop {\rightarrow }^{t}} c'\) if \((c,t,c')\in T\).

We introduce a notation important for Lemma 3 below. We define \({\varDelta }(t) := \mathbf {p}' - \mathbf {p}\). Observe that \({\varDelta }(t)=\mathbf {0}\) in cases (i) and (ii) above, and \({\varDelta }(t) = - \mathbf {q_C} + \mathbf {q_C}'\) in cases (iii) and (iv). So \({\varDelta }(t)\) depends only on the transition \(t\), but not on \(\mathbf {p}\).

### 3.2 The Abstract Transition System

*abstraction function*\(\alpha \) that assigns to a set \(\mathbf {P}\) of populations the set of states of \(Q_C\) populated by \(\mathbf {P}\). We also introduce a

*concretization function*\(\gamma \) that assigns to a set \(Q \subseteq Q_C\) the set of all populations \(\mathbf {p}\) that only populate states of \(Q\). Formally:

*abstract configuration*is a tuple \(( q_D, g, Q)\), where \(q_D \in Q_D\), \(g \in \mathcal{G}\cup \{\#\}\), and \(Q \subseteq Q_C\). We extend \(\alpha \) and \(\gamma \) to (abstract) configurations in the obvious way. An abstract configuration is

*accepting*when the state of \(D\) is accepting, that is whenever \(q_D\in F\).

*abstraction*\(\alpha { TS} = ({\alpha }X,{\alpha }T,{\alpha }X_0)\) as follows:

\({\alpha }X = Q_D \times (\mathcal{G}\cup \{\#\}) \times 2^{Q_C}\) is the set of all abstract configurations.

\({\alpha }X_0 = (q_{0D}, \#, \alpha (\mathbf {P_0})) = (q_{0D}, \#, \{q_{0C}\}) \) is the initial configuration.

\((\ (q_D, g, Q),\ t,\ (q_D', g', Q')\ ) \in \alpha {T}\) iff there is \(\mathbf {p}\in \gamma (Q)\) and \(\mathbf {p}'\) such that \((q_D,g,\mathbf {p})\xrightarrow {t} (q_D',g',\mathbf {p}')\) and \(Q' = \alpha (\{ \mathbf {p}' \mid \exists \mathbf {p} \in \gamma (Q) :(q_D, g, \mathbf {p}) \xrightarrow {t} (q_D', g', \mathbf {p}') \})\).

- (A)
For each \(\omega \)-path \(c_0 \xrightarrow {t_1} c_1 \xrightarrow {t_2} c_2 \cdots \) of \({ TS}\), there exists an \(\omega \)-path \(a_0 \xrightarrow {t_1}_\alpha a_1 \xrightarrow {t_2}_\alpha a_2 \cdots \) in \(\alpha { TS}\) such that \(c_i \in \gamma (a_i)\) for all \(i\ge 0\).

- (B)If \((q_D, g, Q) \xrightarrow {t}_\alpha (q_D', g', Q')\), then \(Q \subseteq Q'\). To prove this claim, consider two cases:
\(t\in \delta _D\). Then \((q_D, g, \mathbf {p}) \xrightarrow {t} (q_D', g', \mathbf {p})\) for every population \(\mathbf {p}\) (because only the leader moves). So \((q_D, g, Q) \xrightarrow {t}_\alpha (q_D', g', Q)\).

\(t\in \delta _C\). Consider the population \(\mathbf {p}= 2 \sum _{q\in Q} \mathbf {q} \in \gamma (Q)\). Then \(( q_D, g, \mathbf {p}) \xrightarrow {t} ( q_D, g', \mathbf {p}')\), where \(\mathbf {p}'= \mathbf {p} - \mathbf {q_C} + \mathbf {q_C}'\). But then \(\mathbf {p}' \ge \sum _{q\in Q} \mathbf {q}\), and so \(\alpha (\{\mathbf {p}'\}) \supseteq Q\), which implies \(( q_D, g, Q) \xrightarrow {t}_\alpha ( q_D, g', Q')\) for some \(Q' \supseteq Q\).

### 3.3 Realizable Cycles of the Abstract Transition System

We show that the existence of an infinite accepting path in \({ TS}\) reduces to the existence of a certain lasso path in \({\alpha }{ TS}\). A lasso path consists of a stem and a cycle. Lemma 2 shows how every abstract finite path (like the stem) has a counterpart in \({ TS}\). Lemma 3 characterizes precisely those cycles in \({\alpha }{ TS}\) which have an infinite path counterpart in \({ TS}\).

**Lemma 2**

Let \( (q_D,g,Q) \) be an abstract configuration of \( \alpha { TS}\) reachable from \( ( q_{0D}, \#, \alpha (\mathbf {P_0}))\) (\(={\alpha }X_0\)). For every \(\mathbf {p}\in \gamma (Q)\), there exists \(\mathbf {\hat{p}}\) such that \( ( q_D, g, \mathbf {\hat{p}})\) is reachable from \( ( q_{0D}, \#, \mathbf {P_0}) \) and \(\mathbf {\hat{p}} \ge \mathbf {p}\).

We proceed to characterized the cycles of the abstract transition system that can be “concretized”. A *cycle* of \(\alpha { TS}\) is a path \(a_0 \xrightarrow {t_1}_\alpha a_1 \xrightarrow {t_2}_\alpha a_2 \cdots \xrightarrow {t_{n-1}}_\alpha a_n\) such that \(a_n = a_0\). A cycle is *realizable* if there is an infinite path \(c_0 \xrightarrow {t'_1} c_1 \xrightarrow {t'_2} c_2 \cdots \) of \({ TS}\) such that \(c_k \in \gamma (a_{(k \mod n)})\) and \(t'_{k+1} = t_{(k+1\mod n)}\) for every \(k \ge 0\).

**Lemma 3**

A cycle \(a_0 \xrightarrow {t_1}_\alpha a_1 \xrightarrow {t_2}_\alpha a_2 \cdots \xrightarrow {t_{n}}_\alpha a_n\) of \(\alpha { TS}\) is realizable iff \(\sum _{i=1}^{n} {\varDelta }(t_i) = \mathbf {0}\).

**Theorem 1**

MC(FSM,FSM) is NP-complete.

*Proof*

- 1.
Guess a sequence \(Q_1,\ldots ,Q_{\ell }\) of subsets of \(Q_C\) such that \(Q_i \subsetneq Q_{i+1}\) for all \(i\), \(0 < i < \ell \). Note that \(\ell \le \vert {Q_C}\vert \).

- 2.
Compute the set \(\mathcal {Q} = Q_D \times (\mathcal{G}\cup \{\#\}) \times \{ \{q_{0C}\} , Q_1,\ldots ,Q_{\ell } \} \) of abstract configurations and the set \(\mathcal {T}\) of abstract transitions between configurations of \(\mathcal {Q}\).

- 3.
Guess an accepting abstract configuration \(a \in \mathcal {Q}\), that is, an \( a = (q_D,g,Q)\) such that \(q_D\) is accepting in \(D\).

- 4.
Check that \(a\) is reachable from the initial abstract configuration \( ( q_{0D}, \#, \{q_{0C}\}) \) by means of abstract transitions of \(\mathcal {T}\).

- 5.
Check that the transition system with \(\mathcal {Q}\) and \(\mathcal {T}\) as states and transitions contains a cycle \(a_0 \xrightarrow {t_1}_\alpha a_1 \cdots a_{n-1} \xrightarrow {t_n}_\alpha a_n \) such that \(n\ge 1\), \(a_0=a_n=a\) and \(\sum _{i=1}^n {\varDelta }(t_i) = \mathbf {0}\).

Compute an FSA

^{1}\(A^{\circlearrowright }_{a}\) over the alphabet \(\delta _D\cup \delta _C\) with \(\mathcal {Q}\) as set of states, \(\mathcal {T}\) as set of transitions, \(a\) as initial state, and \( \{a\} \) as set of final states.Use the polynomial construction of Seidl

*et al.*[15] to compute an (existential) Presburger formula \({\varOmega }\) for the Parikh image of \(L(A_{a}^{\circlearrowright })\). The free variables of \({\varOmega }\) are in one-to-one correspondence with the transitions of \(\delta _D\cup \delta _C\). Denote by \(x_t\) the variable corresponding to transition \(t\in \delta _D\cup \delta _C\).- Compute the formulawhere \( tgt \) and \( src \) returns the target and source states of the transition passed in argument. \({\varOmega }'\) adds to \({\varOmega }\) the realizability condition of Lemma 3.$$\begin{aligned} \textstyle {\varOmega }' = {\varOmega }\; \wedge \; \bigwedge _{q_c \in Q_c} \Bigl ( \sum _{ tgt (t) = q_c} x_t = \sum _{ src (t) = q_c} x_t \Bigr ) \; \wedge \; \sum _{t\in \delta _D\cup \delta _C} x_t > 0 \end{aligned}$$
Check satisfiability of \({\varOmega }'\). This step requires nondterministic polynomial time because satisfiability of an existential Presburger formula is in NP [11].\(\quad \square \)

## 4 MC(PDM,FSM) is NP-Complete

A pushdown system (PDM) \(P = (Q, {\varGamma }, \delta , q_0)\) over \({\varSigma }\) consists of a finite set \(Q\) of states including the initial state \(q_0\), a *stack alphabet*\({\varGamma }\) including the bottom stack symbol \(\bot \), and a set of *rules*\(\delta \subseteq Q\times {\varSigma }\times {\varGamma }\times Q \times ({\varGamma }\backslash \{\bot \} \cup \{\mathsf {pop}\})\) which either push or pop as explained below. A *PDM-configuration*\(qw\) consists of a state \(q\in Q\) and a word \(w\in {\varGamma }^*\) (denoting the stack content). For \(q,q'\in Q\), \(a \in {\varSigma }\), \(\gamma ,\gamma '\in {\varGamma }\), \(w,w'\in {\varGamma }^*\), we say a PDM-configuration \(q'w\) (resp. \(q'\gamma '\gamma w\)) \(a\)-follows \(q \gamma w\) if \((q, a, \gamma , q', \mathsf {pop})\in \delta \), (resp. \((q,a,\gamma ,q',\gamma ') \in \delta \)); we write \(qw \xrightarrow {a} q'w'\) if \(q'w'\)\(a\)-follows \(qw\), and call it a *transition*. A *run*\(c_0 \xrightarrow {(v)_1} c_1\xrightarrow {(v)_2}\ldots \) on a word \(v\in {\varSigma }^\omega \) is a sequence of PDM-configurations such that \(c_0 = q_0 \bot \) and \(c_i \xrightarrow {(v)_{i+1}} c_{i+1}\) for all \(i\ge 0\). We write \(c\xrightarrow {*} c'\) if there is a run from \(c\) to \(c'\). The language \(L(P)\) of \(P\) is the set of all words \(v\in {\varSigma }^\omega \) such that \(P\) has a run on \(v\).

A Büchi PDM is a PDM with a set \(F\subseteq Q\) of accepting states. A word is accepted by a Büchi PDM if there is a run on the word for which some state in \(F\) occurs infinitely often along the PDM-configurations. The following lemma characterizes accepting runs.

**Lemma 4**

[7] Let \(c\) be a configuration. There is an accepting run starting from \(c\) if there are states \(q\in Q\), \(q_f\in F\), a stack symbol \(\gamma \in {\varGamma }\) such that \(c \xrightarrow {*} q\gamma w\) for some \(w\in {\varGamma }^*\) and \(q\gamma \xrightarrow {*} q_f u \xrightarrow {*} q \gamma w'\) for some \(u, w'\in {\varGamma }^*\).

We now show MC(PDM, FSM) is decidable, generalizing the proof from Sect. 3. Fix a Büchi PDM \(P=(Q_D,{\varGamma }_D, \delta _D, q_{0D}, F)\), and a FSM \(C=(Q_C, \delta _C, q_{0C})\). A *configuration* is a tuple \((q_D, w, g, \mathbf {p})\), where \(q_D\in Q_D\), \(w\in {\varGamma }_D^*\) is the stack content, \(g \in \mathcal{G}\cup \{\#\}\), and \(\mathbf {p}\) is a population. Intuitively, \(q_D w\) is the PDM-configuration of the leader. We extend the definitions from Sect. 3 like accepting configuration in the obvious way.

We define a labeled transition system \({ TS} = (X, T, X_0)\), where \(X\) is the set of configurations including the set \(X_0 = (q_{0D}, \bot , \#, \mathbf {P_0})\) of initial configurations, and the transition relation \(T = T_D \cup T_C\), where \(T_C\) is as before and \(T_D\) is the set of triples \(\big ( ( q_D, w, g, \mathbf {p}), t, ( q_D', w', g', \mathbf {p})\big )\) such that \(t\) is a transition (not a rule) of \(D\), and one of the following conditions holds: (i) \(t = (q_D w \xrightarrow {w_d(g')} q_D' w')\); or (ii) \(t= (q_Dw \xrightarrow {r_d(g)} q_D'w')\) and \(g = g'\). We define the abstraction \(\alpha { TS}\) of \({ TS}\) as the obvious generalization of the abstraction in Sect. 3. An accepting path of the (abstract) transition system is an infinite path with infinitely many accepting (abstract) configurations. As for MC(FSM, FSM), not every accepting path of the abstract admits a concretization, but we find a realizability condition in terms of linear constraints. Here we use again the polynomial construction of Seidl *et al.* [15] mentioned in the proof of Theorem 1, this time to compute an (existential) Presburger formula for the Parikh image of a pushdown automaton.

**Theorem 2**

MC(PDM,FSM) is NP-complete.

## 5 MC(PDM,PDM) is in NEXPTIME

We show how to reduce MC(PDM,PDM) to MC(PDM,FSM). We first introduce the notion of *effective stack height* of a PDM-configuration in a run of a PDM, and define, given a PDM \(C\), an FSM \(C_k\) that simulates all the runs of \(C\) of effective stack height \(k\). Then we show that, for \(k \in O(n^3)\), where \(n\) is the size of \(C\), the language \((L(D) \parallel \mathcal{S}\parallel {\between }_{\infty } L(C))\) is empty iff \((L(D) \parallel \mathcal{S}\parallel {\between }_{\infty } L(C_k))\) is empty.

### 5.1 A FSM for Runs of Bounded Effective Stack Height

Consider a run of a PDM that repeatedly pushes symbol on the stack. The stack height of the configurations^{2} is unbounded, but, intuitively, the PDM only uses the topmost stack symbol during the run. To account for this we define the notion of effective stack height.

**Definition 3**

Let \(\rho =c_0 \xrightarrow {(v)_1} c_1 \xrightarrow {(v)_2} \cdots \) be an infinite run of a PDM on \(\omega \)-word \(v\), where \(c_i = q_iw_i\). The *dark suffix* of \(c_i\) in \(\rho \), denoted by \({ ds}(w_i)\), is the longest suffix of \(w_i\) that is also a proper suffix of \(w_{i+k}\) for every \(k \ge 0\). The *active prefix*\({ ap}(w_i)\) of \(w_i\) is the prefix satisfying \(w_i = { ap}(w_i) \cdot { ds}(w_i)\). The *effective stack height* of \(c_i\) in \(\rho \) is \(\vert {{ ap}(w_i)}\vert \). We say that \(\rho \) is *effectively*\(k\)*-bounded* (or simply \(k\)-bounded for the sake of readability) if every configuration of \(\rho \) has an effective stack height of at most \(k\). Further, we say that \(\rho \) is *bounded* if it is \(k\)-bounded for some \(k \in \mathbb {N}\). Finally, an \(\omega \)-word of the PDM is \(k\)*-bounded*, respectively bounded, if it is the word generated by some \(k\)-bounded, respectively bounded, run (other runs for the same word may not be bounded).

Intuitively, the effective stack height measures the actual memory required by the PDM to perform its run. For example, repeatedly pushing symbols on the stack produces a run with effective stack height 1. Given a position in the run, the elements of the stack that are never popped are those in the longest common suffix of all subsequent stacks. The first element of that suffix may be read, therefore only the longest *proper* suffix is effectively useless, so no configuration along an infinite run has effective stack height \(0\).

**Proposition 1**

Every infinite run of a PDM contains infinitely many positions at which the effective stack height is 1.

*Proof*

Let \(p_0w_0 \xrightarrow {} p_1w_1 \xrightarrow {} p_2w_2 \xrightarrow {} \cdots \) be any infinite run. Notice that \(|w_i|\ge 1\) for every \(i \ge 0\), because otherwise the run would not be infinite. Let \(X\) be the set of positions of the run defined as: \(i \in X\) iff \(|w_i| \le |w_j|\) for every \(j > i\). Observe that \(X\) is infinite, because the first configuration of minimal stack height, say \(p_kw_k\) belongs to it, and so does the first configuration of minimal stack height of the suffix \(p_{k+1}w_{k+1} \xrightarrow {} \cdots \), etc. By construction, the configuration at every position in \(X\) has effective stack height \(1\).\(\quad \square \)

In a \(k\)-bounded run, whenever the stack height exceeds \(k\), the \(k+1\)-th stack symbol will never become the top symbol again, and so it becomes useless. So, we can construct a finite-state machine \(P_k\) recognizing the words of \(L(P)\) accepted by \(k\)-bounded runs.

**Definition 4**

Given a PDM \(P = (Q,{\varGamma }, \delta , q_0)\), the FSM \(P_k = (Q_k, \delta _{k}, q_{0k})\), called the k-restriction of Ps, is defined as follows: (a) \(Q_k=Q \times \bigcup _{i=1}^k {\varGamma }^{i}\) (a state of \(P_k\) consists of a state of \(P\) and a stack content no longer than \(k\)); (b) \(q_{0k} = (q_0, \bot )\); (c) \(\delta _{k}\) contains a transition \((q, (w)_{1..k}) \xrightarrow {a} (q', (w')_{1..k})\) iff \(q w \xrightarrow {a} q' w'\) is a transition (not a rule) of \(P\).

**Theorem 3**

Given a PDM \(P\), \(w\) admits a \(k\)-bounded run in \(P\) iff \(w \in L(P_{k})\).

### 5.2 The Reduction Theorem

We fix a Büchi PDM \(D\) and a PDM \(C\). By Theorem 3, in order to reduce MC(PDM,PDM) to MC(PDM,FSM) it suffices to prove the following Reduction Theorem:

**Theorem 4**

**( Reduction Theorem).**Let \(N = 2 |Q_C|^2 |{\varGamma }_C| + 1\), where \(Q_C\) and \({\varGamma }_C\) are the states and stack alphabet of \(C\), respectively. Let \(C_N\) be the \(N\)-restriction of \(C\). We have:There are PDMs \(D\), \(C\) for which (\(\dagger \)) holds only for \(N\in {\varOmega }(|Q_C|^2|{\varGamma }_C|)\).

Theorems 4 and 2 provide an upper bound for MC(PDM,PDM). PSPACE-hardness of the reachability problem [9] gives a lower bound.

**Theorem 5**

MC(PDM,PDM) is in NEXPTIME and PSPACE-hard. If the contributor is a one counter machine (with zero-test), it is NP-complete.

The proof of Theorem 4 is very involved. Given a run of \(D\) compatible with a finite multiset of runs of \(C\), we construct another run of \(D\) compatible with a finite multiset of \(N\)-bounded runs of \(C_N\). (Here we extend compatibility to runs: runs are compatible if the words they accept are compatible.)

The proof starts with the Distributing lemma, which, loosely speaking, shows how to replace a run of \(C\) by a multiset of “smaller” runs of \(C\) without the leader “noticing”. After this preliminary result, the first key proof element is the Boundedness Lemma. Let \(\sigma \) be an infinite run of \(D\) compatible with a finite multiset \(R\) of runs of \(C\). The Boundedness Lemma states that, for any number \(Z\), the first \(Z\) steps of \(\sigma \) are compatible with a (possibly larger) multiset \(R_Z\) of runs of \(C_N\). Since the size of \(R_Z\) may grow with \(Z\), this lemma does not yet prove Theorem 4: it only shows that \(\sigma \) is compatible with an *infinite* multiset of runs of \(C_N\). This obstacle is overcome in the final step of the proof. We show that, for a sufficiently large \(Z\), there are indices \(i<j\) such that, not \(\sigma \) itself, but the run \((\sigma )_{1..i} \big ( (\sigma )_{i+1..j} \big )^\omega \) for adequate \(i\) and \(j\) is compatible with a *finite* multiset of runs of \(C_N\). Loosely speaking, this requires to prove not only that the leader can repeat \((\sigma )_{i+1..j}\) infinitely often, but also that the runs executed by the instances of \(C_N\) while the leader executes \((\sigma )_{i+1..j}\) can be repeated infinitely often.

*The Distributing Lemma.* Let \(\rho = c_0 \xrightarrow {a_1} c_1 \xrightarrow {a_2} c_2 \xrightarrow {a_3} \cdots \) be a (finite or infinite) run of \(C\). Let \(r_i\) be the PDM-rule of \(C\) generating the transition \(c_{i-1} \xrightarrow {a_i} c_i\). Then \(\rho \) is completely determined by \(c_0\) and the sequence \(r_1r_2r_3\) ...Since \(c_0\) is also fixed (for fixed \(C\)), in the rest of the paper we also sometimes write \(\rho = r_1r_2r_3 \ldots \) This notation allows us to speak of \(\text {dom}(\rho )\), \( (\rho )_k\), \( (\rho )_{i..j}\) and \( (\rho )_{i..\infty }\).

*distributes to a multiset*\(R\)

*of runs of*\(C\) if there exists an

*embedding function*\(\psi \) that assigns to each run \(\rho ' \in R\) and to each position \(i \in \text {dom}(\rho ')\) a position \(\psi (\rho ', i) \in \text {dom}(\rho )\), and satisfies the following properties:

\( (\rho ')_i = (\rho )_{\psi (\rho ',i)}\). (A rule occurrence in \(\rho '\) is matched to another occurrence of the same rule in \(\rho \).)

\(\psi \) is surjective. (For every position \(k \in \text {dom}(\rho )\) there is at least one \(\rho ' \in R\) and a position \(i \in \text {dom}(\rho ')\) such that \(\psi (\rho ',i)=k\), or, informally, \(R\) “covers” \(\rho \).)

If \(i < j\), then \(\psi (\rho ', i) < \psi (\rho ', j)\). (So \(\psi (\rho ',1)\psi (\rho ',2)\cdots \) is a scattered subword of \(\rho \).)

*Example 3*

**Lemma 5**

**(Distributing Lemma).** Let \(u \in L(D)\), and let \(M\) be a multiset of words of \(L(C)\) compatible with \(u\). Let \(v \in M\) and let \(\rho \) an accepting run of \(v\) in \(C\) that distributes to a multiset \(R\) of runs of \(C\), and let \(M_R\) the corresponding multiset of words. Then \(M \ominus \{v\} \oplus M_R\) is also compatible with \(u\).

*The Boundedness Lemma.* We are interested in distributing a multiset of runs of \(C\) into another multiset with, loosely speaking, “better” effective stack height.

Fix a run \(\rho \) of \(C\) and a distribution \(R\) of \(\rho \) with embedding function \(\psi \). In Example 3, \((\rho )_{1..4}\) is distributed into \((\rho '_1)_{1..1}\), \((\rho '_2)_{1..2}\) and \((\rho '_3)_{1..3}\). Assume \(\rho \) is executed by one contributor. We can replace it by 3 contributors executing \(\rho '_1,\rho '_2,\rho '_3\), without the rest of the network noticing any difference. Indeed, the three processes can execute \(r_a\) immediately after each other, which for the rest of the network is equivalent to the old contributor executing one \(r_a\). Then we replace the execution of \((\rho )_{2..4}\) by \((\rho '_2)_2 (\rho '_3)_{2..3}\).

We introduce some definitions allowing us to formally describe such facts. Given \(k \in \text {dom}(\rho )\), we denote by \(c(\rho ,k)\) the configuration reached by \(\rho \) after \(k\) steps. We naturally extend this notation to define \(c(\rho , 0)\) as the initial configuration. We denote by \({ last}_{\psi }(\rho ', i)\) the largest position \(k \in \text {dom}(\rho ')\) such that \(\psi (\rho ',k)\le i\) (similarly if none exists, we fix \({ last}_\psi (\rho ',i) =0\)). Further, we denote by \(c_\psi (\rho ',k)\) the configuration reached by \(\rho '\) after \(k\) steps of \(\rho \), that is, the configuration reached by \(\rho '\) after the execution of \({ last}_{\psi }(\rho ', k)\) transitions; formally, \(c_\psi (\rho ',k) = c(\rho ', { last}_{\psi }(\rho ', k))\).

*Example 4*

Let \(\rho \), \(R\), and \(\psi \) as in Example 3. Assuming that the PDM \(P\) has one single state \(p\), stack symbols \(\{\bot , \alpha \}\) such that the three rules \(r_a,r_b\) and \(r_c\) are given by \(r_a :p\bot \rightarrow p\alpha \bot \), \(r_b :p\alpha \rightarrow p\alpha \alpha \), and \(r_c :p\alpha \rightarrow p\), then we have \(c(\rho ,5)=p\alpha \bot \). Further, \({ last}_\psi (\rho '_1, 5)=1\), \({ last}_\psi (\rho '_2, 5)=3\), and \({ last}_\psi (\rho '_3, 5)=3\). Finally, \(c_\psi (\rho '_1, 5)= p \alpha \bot \), \(c_\psi (\rho '_2, 5)=p\alpha \bot \), and \(c_\psi (\rho '_3, 5) = p\alpha \bot \).

Given \(Z \in \text {dom}(\rho )\) and \(K \in \mathbb {N}\), we say that a distribution \(R\) of \(\rho \) is \((Z,K)\)-bounded if for every \(\rho ' \in R\) and for every \(i \le Z\), the effective stack height of \(c_\psi (\rho ',i)\) is bounded by \(K\). Further, we say that \(R\) is synchronized if for every configuration \(c(\rho , i)\) with effective stack height 1 and for every \(\rho ' \in R\), \(c_\psi (\rho ', i) = c(\rho ,i)\) (same control state and same stack content), and also has effective stack height 1.^{3} The Boundedness Lemma states that there is a constant \(N\), depending only on \(C\), such that for every run \(\rho \) of \(C\) and for every \(Z\in \text {dom}(\rho )\) there is a \((Z,N)\)-bounded and synchronized distribution \(R_Z\) of \(\rho \). The key of the proof is the following lemma.

**Lemma 6**

Let \(N = 2 |Q_C|^2 |{\varGamma }_C|+1\). Let \(\rho \) be a run of \(C\) and \(Z \in \text {dom}(\rho )\) be the first position of \(\rho \) such that \(c(\rho , Z)\) is not \(N\)-bounded. Then there is a \((Z,N)\)-bounded and synchronized distribution of \(\rho \).

*Proof sketch.* We construct a \((Z,N)\)-bounded and synchronized distribution \(\{\rho _a, \rho _b\}\) of \(\rho \). Let \(\alpha _{N+1}\alpha _N \cdots \alpha _1w_0\) be the stack content of \(c(\rho ,Z)\). Define \(\{ \overrightarrow{p}_1, \overleftarrow{p}_1, \overrightarrow{p}_2, \overleftarrow{p}_2, \ldots , \overrightarrow{p}_N, \overleftarrow{p}_N\} \subseteq \text {dom}(\rho ) \) such that for each \(i\), \(1\le i\le N\) we have \(c(\rho ,\overrightarrow{p}_i)\) and \(c(\rho ,\overleftarrow{p}_i)\) are the configurations immediately after the symbol \(\alpha _i\) in \(c(\rho ,Z)\) is pushed, respectively popped and such that the stack content of each configuration between \(\overrightarrow{p}_i\) (included) and \(\overleftarrow{p}_i\) (excluded) equals \(w_p \alpha _i \alpha _{i-1} \cdots \alpha _1 w_0\) for some \(w_p \in {\varGamma }_C^*\). We get \(c(\rho ,\overrightarrow{p}_i) = q_i \alpha _i \alpha _{i-1} \ldots \alpha _0 w_0\) and \(c(\rho ,\overleftarrow{p}_i) = q'_i \alpha _{i-1} \ldots \alpha _0 w_0\) for some \(q_i, q'_i\in Q_C\). Observe that the following holds: \(\overrightarrow{p}_1 < \cdots < \overrightarrow{p}_{N-1} < \overrightarrow{p}_N < Z < \overleftarrow{p}_N < \overleftarrow{p}_{N-1} < \cdots < \overleftarrow{p}_1\).

Now define \(\rho _a\) from \(\rho \) by simultaneously deleting \( (\rho )_{\overrightarrow{p}_{j_1+1}..\overrightarrow{p}_{j_2}}\) and \( (\rho )_{\overleftarrow{p}_{j_2+1}..\overleftarrow{p}_{j_1}}\). We similarly define \(\rho _b\) by deleting \( (\rho )_{\overrightarrow{p}_{j_2+1}..\overrightarrow{p}_{j_3}}\) and \( (\rho )_{\overleftarrow{p}_{j_3+1}..\overleftarrow{p}_{j_2}}\).

The following shows that \(\rho _a\) defines a legal run since it is given by A similar reasoning holds for \(\rho _b\). Finally, one can show that \(\{\rho _a, \rho _b\}\) is a \((Z,N)\)-bounded and synchronized distribution of \(\rho \).

**Lemma 7**

**(Boundedness Lemma).** Let \(N = 2|Q_C|^2 |{\varGamma }_C|+1\), and let \(\rho \) be a run of \(C\). For every \(Z \in \text {dom}(\rho )\) there is an \((Z,N)\)-bounded and synchronized distribution \(R_Z\) of \(\rho \).

The proof is by induction on \(Z\). The distribution \(\psi _{Z+1},R_{Z+1}\) is obtained from \(\psi _{Z}, R_Z\) by distributing each run \(\rho '\) of \(R_Z\) to a \((\psi _Z(\rho ',Z)+1,N)\)-bounded run (applying Lemma 6).

*Proof Sketch of Theorem* 4. Given a run \(\sigma \) of \(D\) compatible with a finite multiset \(M\) of runs of \(C\), we construct another run \(\tau \) of \(D\), and a multiset \(R\) of \(N\)-bounded runs of \(C_N\) such that \(\tau \) and \(R\) are compatible as well. We consider only the special case in which \(M\) has one single element \(\rho \) (and one single copy of it). Since \(\sigma \) is compatible with \(\rho \), we fix a witness \(\pi \in \mathcal{S}\) such that \(\pi \in \sigma \between \rho \). We construct a “lasso run” out of \(\pi \) of the form \( \lambda _1 [\lambda _2]^{\omega } \). It suffices to find two positions in \(\pi \) where the content of the store is the same, the corresponding configurations of the leader are the same, and similarly for each contributor; the fragment between these two positions can be repeated (is “pumpable”).

^{4}Further, for every \(Z\) let \(R_Z\) be a \((Z, N)\)-bounded and synchronized distribution of \(\rho \) with embedding function \(\psi \) (which exists by the Boundedness Lemma). Let \(R_Z(i_\rho ) = \{ c_\psi (\eta , i_\rho ) \mid \eta \in R_Z \}\) denote the multiset of configurations reached by the runs of \(R_Z\) after \(i\) steps of \(\pi \). Using Proposition 1 and that (i)the store has a finite number of values, (ii) \(R_Z\) is \((Z, N)\)-bounded, and (iii)there are only finitely many active prefixes of length at most \(N\), we can apply the pigeonhole principle to find a sufficiently large number \(Z\) and three positions \(i < j < k \le Z\) in \(\pi \) satisfying the following properties:

- (1)
The contents of the store at positions \(i\) and \(k\) of \(\pi \) coincide.

- (2)
The configurations \(c(\sigma ,i_\sigma )\) and \( c(\sigma ,k_\sigma )\) of the leader have effective stack height 1, same topmost stack symbol and same control state. Further, \(\sigma \) enters and leaves some accepting state between \(i_\sigma \) and \(k_\sigma \).

- (3)
The configuration \(c(\rho , j_\rho )\) has effective stack height 1.

- (4)
For every configuration of \(R_Z(i_\rho )\) there is a configuration of \(R_Z(k_\rho )\) with the same control state and active prefix, and vice versa.

Condition (4) means that, after removing the dark suffixes, \(R_Z(i_\rho )\) and \(R_Z(k_\rho )\) contain the same pruned configurations, although possibly a different number of times (same set, different multisets). If we obtain the same multiset, then the fragment of \(\pi \) between positions \(i\) and \(k\) is pumpable by (1) and (2), and we are done. Otherwise, we use (3) and the fact that \(R_Z\) is synchronized (which had not been used so far) to obtain a new distribution in which the multisets coincide. This is achieved by adding new runs to \(R_Z\).

A finite-state automaton (FSA) is an FSM which decides languages of finite words. Therefore an FSA is an FSM with a set \(F\) of accepting states.

Notice that the effective stack height of a configuration depends on the run it belongs to, and so \(c(\rho , i) = c_\psi (\rho ', i)\) does not necessarily imply that they have the same effective stack height.

Position \(p\) in \(\pi \) defines position \(p_{\sigma }\) in \(\sigma \) such that \( (\sigma )_{1..p_\sigma } = Proj _{{\varSigma }_{\mathcal{D}}}( (\pi )_{1..p})\), similarly \(p_\rho \) is defined as satisfying \((\rho )_{1..p_\rho } = Proj _{{\varSigma }_{\mathcal{C}}}((\pi )_{1..p})\).