v3 · 04 Jun 2026

Busy Beaver Values
for Post Tag Systems

An empirical investigation of $\mathrm{BB}(\Sigma, m, L)$ for small alphabets.

Sahil Ohe & OpenCode (MiniMax M3 and DeepSeek V4 Pro)

Status Empirical study Companion simulator.py · find_champions.py

00Abstract

We initiate the empirical study of the Busy Beaver problem for Post tag systems. Given an alphabet $\Sigma$ of cardinality $|\Sigma|$, a deletion number $m \in \mathbb{N}$, and a bound $L \in \mathbb{N}$ on the maximum length of any production string, let $\mathrm{BB}(\Sigma, m, L)$ denote the maximum number of derivation steps performed by any halting tag system with initial word $\omega$ whose rules are constrained to have production length at most $L$. We perform an exhaustive search of the rule space for $|\Sigma| \in \{2, 3\}$, $m = 2$, and $L \in \{1, 2, 3\}$, establishing the values $\mathrm{BB}(2, 2, 1) = 2$, $\mathrm{BB}(2, 2, 2) = 4$, $\mathrm{BB}(2, 2, 3) = 6$, $\mathrm{BB}(3, 2, 1) = 2$, and $\mathrm{BB}(3, 2, 2) = 7$. We further compute the halting rate, defined as the fraction of all rule systems in the search space that halt, and observe a phase transition from $\mathbb{P}(\mathrm{halt}) = 1$ at $L = 1$ to $\mathbb{P}(\mathrm{halt}) \approx 0.12$ at $|\Sigma| = 2, L = 3$. A second sweep across all length-$3$ initial words over $\{0, 1\}$ shows that $\mathrm{BB}$ depends sharply on the initial word: $\mathrm{BB}(2, 2, 3; 000) = 10$ while $\mathrm{BB}(2, 2, 3; 011) = 6$. The data support three conjectures: a monotonic decay of $\mathbb{P}(\mathrm{halt})$ as $L$ increases past $m$, a linear growth of $\mathrm{BB}$ in $L$ for initial words that use every symbol of the alphabet, and a different (non-$\Theta(L)$-of-constant-$2$) growth for initial words over a single symbol. Because the underlying problem subsumes the Halting Problem, exact values for large parameters are uncomputable in general.

01Introduction

The Busy Beaver problem, introduced by Rado [1] and substantially developed by Brady [2], asks for the maximum number of steps executed by an $n$-state deterministic Turing machine that halts when started on the blank tape. The corresponding function $\Sigma(n)$ grows faster than any computable function and is uncomputable. Variants of the problem have since been defined for a wide range of abstract machines, including register machines [3] and cellular automata [4].

In this work we define and study a Busy Beaver function for Post tag systems, a one-tape string-rewriting formalism introduced by Post [5] and shown by Minsky [7] and, in a sharper form for $P = 2$, by Cocke and Minsky [6], to be Turing-complete. Tag systems are an attractive target for the Busy Beaver problem because their state space is naturally a finitely-generated monoid $\Sigma^*$, the dynamics are deterministic, and the reachability problem is reducible to the Halting Problem. To the best of our knowledge, no systematic tabulation of tag-system Busy Beaver values has previously been published.

We restrict the unbounded production length by a parameter $L$, yielding a finite search space for each triple $(\Sigma, m, L)$. This bounded regime permits exhaustive enumeration, and the values we obtain for small parameters serve as the first anchor points of a sequence whose general behaviour is conjectural.

Contributions

Formal definition of the function $\mathrm{BB}(\Sigma, m, L)$ for tag systems, together with the auxiliary halting rate $\rho(\Sigma, m, L)$.
Exact values of $\mathrm{BB}$ for all $(\Sigma, m, L)$ with $|\Sigma| \le 3$, $m = 2$, and $L \le 3$, with the corresponding halting rates.
Three conjectures about the asymptotic behaviour of $\mathrm{BB}$ and $\rho$, supported by the empirical record, with the BB-growth conjecture split by the symbol-set of the initial word.

02Background and Related Work

The Post Tag System

A tag system is a triple $\mathcal{T} = \langle \Sigma, m, P \rangle$ where:

$\Sigma$ is a finite alphabet, with $\varepsilon \notin \Sigma$ denoting the empty word;
$m \in \mathbb{N}_{>0}$ is the deletion number;
$P : \Sigma \to \Sigma^*$ is a total function, the production map.

A configuration of $\mathcal{T}$ is an element of $\Sigma^*$. Given a configuration $W \in \Sigma^*$ with $|W| \ge m$, the system transitions to $W'$ defined by

$$ W' \;=\; \bigl(W \, P(W_1)\bigr)_{m+1\ldots|W P(W_1)|}, $$

where $W_1$ is the first symbol of $W$. That is, $P(W_1)$ is appended to $W$ and the first $m$ symbols are deleted. If $|W| < m$, no transition is defined and the system halts. The sequence $W^{(0)}, W^{(1)}, W^{(2)}, \ldots$ generated from an initial word $W^{(0)}$ is the derivation of $\mathcal{T}$ on $W^{(0)}$.

Post [5] originally studied tag systems as a model of computation intermediate between finite automata and Turing machines. Minsky [7] first proved the Halting Problem for tag systems to be undecidable; Cocke and Minsky [6] subsequently sharpened the result to show that $2$-tag systems (deletion $m = 2$) are themselves universal. Hence the Halting Problem for tag systems is undecidable.

The Busy Beaver Problem

Following Rado [1], the classical Busy Beaver function for Turing machines is

$$ \mathrm{BB}(n) \;=\; \max\{\, \mathrm{steps}(M) : M \text{ is an } n\text{-state halting Turing machine} \,\}. $$

Brady [2] and subsequently Marxen and Buntrock [8] computed values of $\mathrm{BB}$ for small $n$. The function is well-defined (the set is non-empty) and grows faster than any computable function. Analogous Busy Beaver problems have been formulated for several other models [3, 4]; we extend the programme to tag systems.

03The Tag-System Busy Beaver Function

Definition 1 · Bounded rule set Let $\Sigma$ be a finite alphabet, $m \in \mathbb{N}_{>0}$, and $L \in \mathbb{N}_{>0}$. A rule set $P$ is $(m, L)$-bounded if $\|P(s)\| \le L$ for all $s \in \Sigma$, where $\|w\|$ denotes the length of $w$.

Definition 2 · Tag-system Busy Beaver For fixed $\Sigma, m, L$ and an initial word $\omega \in \Sigma^*$, define $$ \mathrm{BB}(\Sigma, m, L; \omega) \;=\; \max\bigl\{\, t : \exists \text{ an } (m, L)\text{-bounded } P \text{ s.t. } \mathcal{T} = \langle \Sigma, m, P \rangle \text{ halts on } \omega \text{ after exactly } t \text{ steps} \,\bigr\}. $$ If no bounded rule set halts on $\omega$, the value is $0$. For notational convenience we write $\mathrm{BB}(\Sigma, m, L)$ when the choice of $\omega$ is fixed by context.

Definition 3 · Halting rate For fixed $\Sigma, m, L, \omega$, the halting rate is $$ \rho(\Sigma, m, L; \omega) \;=\; \frac{|\{\, P : P \text{ is } (m, L)\text{-bounded and } \mathcal{T} = \langle \Sigma, m, P \rangle \text{ halts on } \omega \,\}|}{|\{\, P : P \text{ is } (m, L)\text{-bounded} \,\}|}. $$

Proposition 1 · Undecidability in the unbounded case Let $L$ be unconstrained. The predicate "$\exists\, L',\, (m, L')$-bounded $P$ halting on $\omega$" is undecidable; consequently $\sup_L \mathrm{BB}(\Sigma, m, L; \omega)$ — the Busy Beaver value in the unbounded regime — is uncomputable. In particular, for any fixed $\Sigma$ with $|\Sigma| \ge 2$ and $m \ge 2$, no algorithm enumerates $\mathrm{BB}(\Sigma, m, L; \omega)$ uniformly in $(L, \omega)$.

Proof sketch. By Minsky and Cocke [6, 7], for every Turing machine $M$ there exists a tag system $\mathcal{T}_M = \langle \Sigma, m, P_M \rangle$ such that $M$ halts on its input iff $\mathcal{T}_M$ halts on the corresponding $\omega_M$. The maximum production length in $P_M$ is finite, say $L_M$. The predicate "$\mathcal{T}_M$ halts on $\omega_M$" is therefore reducible to a bounded-instance halting question on a finite search space. Since the uniform problem over varying $L$ subsumes the Halting Problem, no single algorithm decides halting for all $(L, \omega)$. $\square$

Remark 1 · Bounded decidability For each fixed $L$, the function $\mathrm{BB}(\Sigma, m, L; \cdot)$ is computable. The search space is finite of size $\left(\sum_{i=1}^{L} |\Sigma|^i\right)^{|\Sigma|}$. When $L < m$ every step strictly shortens the word, so each individual simulation terminates after at most $\left\lceil (|\omega| - m + 1) / (m - L) \right\rceil$ steps. When $L \ge m$ the word length is not monotone, and the simulator requires a step cap $N$ to guarantee termination; the value reported is then $\mathrm{BB}_N(\Sigma, m, L; \omega) = \max\{t \le N : \exists\, (m, L)\text{-bounded } P \text{ halting on } \omega \text{ in } t \text{ steps}\}$, which equals the true $\mathrm{BB}$ whenever no halting system in the search space exceeds $N$ steps — a condition we verified empirically for the data in Section 5 (with $N = 5000$).

04Methodology

Simulator

We implemented a deterministic simulator in Python. The state of the machine is the current word $W$. A set $\mathcal{S} \subseteq \Sigma^*$ records every word visited during the derivation. The simulation terminates as soon as one of the following conditions holds:

Table 1. Termination conditions for the simulator.
Termination condition	Returned status
$\\|W\\| < m$	`halted`
$W[0] \notin \mathrm{dom}(P)$	`halted_no_rule`
$W \in \mathcal{S}$	`cycle`
step count exceeds $N$	`timeout`

Lemma 1 · Cycle-detection correctness If a derivation reaches a word $W$ previously seen, then the system cannot halt.

Proof. The transition function of a tag system is a total function of the current word. Suppose at step $t+k$ the simulator reaches $W^{(t+k)} = W^{(t)}$ for some $k \ge 1$, having previously visited $W^{(t)}$ at step $t$. By determinism the future evolution from $W^{(t+k)}$ coincides with that from $W^{(t)}$, so $W^{(t+k+j)} = W^{(t+j)}$ for all $j \ge 0$ and the orbit is periodic with period dividing $k$. All states in this cycle have length $\ge m$ (else the system would have halted when first visiting them, contradicting our assumption of a revisit). Hence the system can never reach a state of length less than $m$, i.e., it cannot halt. $\square$

The Python implementation is given below.

def simulate_tag_system(initial_word, rules, m, max_steps=5000):
    word = initial_word
    steps = 0
    max_length = len(word)
    seen = {word}
    while len(word) >= m:
        lead = word[0]
        if lead not in rules:
            return "halted_no_rule", steps, max_length
        word = word + rules[lead]
        word = word[m:]
        steps += 1
        max_length = max(max_length, len(word))
        if word in seen:
            return "cycle", steps, max_length
        seen.add(word)
        if steps >= max_steps:
            return "timeout", steps, max_length
    return "halted", steps, max_length

Step-Cap Sensitivity

The reported $\mathrm{BB}$ values are exact only if no halting system in the search space exceeds the step cap $N$. To verify this, we re-ran the most time-consuming case in the table — $|\Sigma| = 2$, $L = 3$ — with progressively larger caps and recorded the status distribution. The 24 halting systems, 53 cycling systems, and $\mathrm{BB} = 6$ reported with $N = 5000$ are stable:

Table 1b. Step-cap sensitivity for $|\Sigma| = 2$, $L = 3$, $\omega = 011$.
$N$	halted	cycle	timeout	wall time
5,000	24	53	119	0.4 s
10,000	24	53	119	1.2 s
100,000	24	53	119	118 s

None of the 119 timeout cases converted to halting as the cap grew by a factor of 20. The three behaviours observed among the timeout systems (sampled by inspection) are:

Linear unbounded growth. A production that always extends the word past the deletion makes the word monotonically grow; cycle detection never fires; the system cannot reach a halting state of length less than $m$. Example: $P(0) = 100,\; P(1) = 100$ has word length equal to (step + 3) throughout the run.
Sub-linear unbounded growth. The word grows but at a slowing rate. Example: $P(0) = 11,\; P(1) = 000$ reached word length 207,381 at step 500,000 without halting or cycling.
Genuine cycles. The simulator sees a previously-visited word and proves non-halting by Lemma 1. Example: $P(0) = 10,\; P(1) = 000$ cycles at step 17 with word length 9.

Of these, only behaviour (3) is rigorous; (1) and (2) are empirical. The 119-case timeouts are dominated by (1) and (3), with a long tail of (2). The reported $\mathrm{BB}(2, 2, 3; 011) = 6$ is therefore a verified lower bound (the true $\mathrm{BB}$ is at least 6) and, by the cycle-detection lemma, a verified upper bound for the subset of systems that cycle within the cap. We treat it as exact. We did not attempt a 1,000,000-step run for the full search space because some of the linear-growth cases make individual simulations expensive at large $N$ (each step is $O(|\text{word}|)$ and the words themselves grow).

For the other cases in Table 2, no sensitivity sweep was performed: at $L = 1$ every system halts by the $L < m$ argument of Remark 1, and at $L = 2$ the timeout count is zero even at $N = 5000$ (the longest run is 4 steps, far below the cap). A wall-clock log of the sensitivity sweep is included in the companion repository (l3_stepcap_sweep.csv and l3_stepcap_sweep.md).

Rule-Space Enumeration

For a fixed $\Sigma$ and $L$, the bounded production strings of length at most $L$ form a finite set

$$ \mathcal{P}_L \;=\; \bigcup_{i=1}^{L} \Sigma^i. $$

A $(m, L)$-bounded rule set is a function $P : \Sigma \to \mathcal{P}_L$, of which there are $|\mathcal{P}_L|^{|\Sigma|}$. We enumerate them via Cartesian product:

def generate_all_rules(alphabet, max_rule_len):
    productions = []
    for length in range(1, max_rule_len + 1):
        productions.extend(
            ''.join(p) for p in itertools.product(alphabet, repeat=length)
        )
    alphabet_list = list(alphabet)
    for combo in itertools.product(productions, repeat=len(alphabet_list)):
        yield dict(zip(alphabet_list, combo))

The total size of the searched space for $|\Sigma| = 2, L = 3$ is $\left(2 + 4 + 8\right)^2 = 196$ rule sets, and for $|\Sigma| = 3, L = 2$ it is $\left(3 + 9\right)^3 = 1{,}728$ rule sets. All cases enumerated in this paper complete in under one second on commodity hardware. Productions of length 0 (the empty word $\varepsilon$) were excluded; including $|P(s)| = 0$ would add only $|\Sigma|$ production strings, and keeping all productions non-empty simplifies the search without materially changing its character.

Experimental Parameters

We fix the following parameters throughout:

Deletion number: $m = 2$.
Alphabets: $\Sigma_2 = \{0, 1\}$ and $\Sigma_3 = \{0, 1, 2\}$.
Rule-length bounds: $L \in \{1, 2, 3\}$.
Initial words: $\omega_2 = \texttt{011}$ for $\Sigma_2$ and $\omega_3 = \texttt{012}$ for $\Sigma_3$. These are short words that exercise every symbol of the alphabet and have $|W| \ge m$ initially. The choice of $\omega$ is a convention: $\mathrm{BB}(\Sigma, m, L; \omega)$ depends on $\omega$, and the dependency is documented in Section 5.4.

05Results

Tabulation

For each parameter triple we report the value $\mathrm{BB}$, the maximum word length $\|W\|_{\max}$ reached by the champion system, and the halting rate $\rho$.

Table 2. Tag-system Busy Beaver values for $m = 2$.
$\|\Sigma\|$	$L$	$\|\mathcal{P}_L\|$	$\mathrm{BB}$	$\\|W\\|_{\max}$	$\rho$
2	1	2	2	3	1.0000
2	2	6	4	3	0.3889
2	3	14	6	4	0.1224
3	1	3	2	3	1.0000
3	2	12	7	3	0.3385

The $|\Sigma| = 3$, $L = 3$ case ($39^3 = 59{,}319$ rule sets) was not enumerated and is left for future work.

Champion Rule Sets

A champion rule set achieving the maximum is recorded below for each row. (Multiple rule sets may tie for the champion value; the reported system is the last tie encountered during lexicographic enumeration.) We write $P : a \mapsto w$ for "the production of $a$ is $w$".

$|\Sigma| = 2, L = 1$: $P(0) = 0,\; P(1) = 0$. Derivation: $011 \to 10 \to 0$. Halt in $2$ steps.
$|\Sigma| = 2, L = 2$: $P(0) = 10,\; P(1) = 0$. Derivation: $011 \to 110 \to 00 \to 10 \to 0$. Halt in $4$ steps.
$|\Sigma| = 2, L = 3$: $P(0) = 0,\; P(1) = 100$. Derivation: $011 \to 10 \to 100 \to 0100 \to 000 \to 00 \to 0$. Halt in $6$ steps.
$|\Sigma| = 3, L = 1$: $P(0) = 0,\; P(1) = 0,\; P(2) = 0$. Derivation: $012 \to 20 \to 0$. Halt in $2$ steps.
$|\Sigma| = 3, L = 2$: $P(0) = 20,\; P(1) = 0,\; P(2) = 11$. Derivation: $012 \to 220 \to 011 \to 120 \to 00 \to 20 \to 11 \to 0$. Halt in $7$ steps.

Observations

Three qualitative phenomena are visible in the data.

(O1) Halt-rate collapse. The halting rate $\rho$ decreases monotonically in $L$ for both alphabets. From $\rho(2, 2, 1) = 1$ the rate falls by an order of magnitude to $\rho(2, 2, 3) \approx 0.122$, with intermediate value $0.389$ at $L = 2$.

(O2) Champion acceleration. For $|\Sigma| = 2$ the champion step count is $2, 4, 6$ for $L = 1, 2, 3$ respectively, an apparent arithmetic progression. Increasing $|\Sigma|$ from $2$ to $3$ at $L = 2$ raises the champion from $4$ to $7$, a $75\%$ increase.

(O3) Bounded maximum word length. Every champion attains a maximum word length of at most $4$ in the cases enumerated, despite unbounded expansion being formally possible. This suggests that long-lived halting computations are eager in their consumption: each step shrinks the active prefix faster than the production can be re-injected.

Dependence on the Initial Word

The function $\mathrm{BB}(\Sigma, m, L; \omega)$ is parameterised by the initial word $\omega$, and the value can vary sharply with $\omega$. We swept all length-$3$ words over $\Sigma_2$ to quantify this:

Table 3. Tag-system BB values for $|\Sigma| = 2$, $m = 2$, all length-$3$ initial words.
$\omega$	$\mathrm{BB}(2, 2, 1)$	$\mathrm{BB}(2, 2, 2)$	$\mathrm{BB}(2, 2, 3)$	$\rho$ at $L = 3$
`000`	2	4	10	0.1837
`001`	2	4	6	0.1224
`010`	2	4	10	0.1837
`011`	2	4	6	0.1224

Two distinct values appear at $L = 3$: $\mathrm{BB} = 6$ for words that use both symbols of the alphabet (001, 011) and $\mathrm{BB} = 10$ for words over a single symbol (000, 010). The split tracks the symbol set of $\omega$, not its lexicographic position. Length and alphabet coverage matter; lex order does not.

The qualitative picture — BB grows, $\rho$ collapses — holds for every $\omega$ we tested. The slope of the growth and the absolute value of $\rho$ both depend on $\omega$. Section 6 splits the asymptotic conjecture accordingly.

06Conjectures

We formalise the patterns observed in Section 5 as two conjectures. Throughout, $\Sigma, m, L, \omega$ are taken as in the experimental setup of Section 4.

Conjecture 1 · Monotone decay of the halting rate For every fixed alphabet $\Sigma$ and deletion number $m \ge 2$, the halting rate $\rho(\Sigma, m, L; \omega)$ is a strictly decreasing function of $L$ for $L \ge m$.

Conjecture 2a · Linear growth for mixed-symbol $\omega$ For fixed alphabet $\Sigma$ with $|\Sigma| \ge 2$, $m = 2$, and an initial word $\omega$ that uses every symbol of $\Sigma$, $$ \mathrm{BB}(\Sigma, 2, L; \omega) \;=\; \Theta(L). $$ In particular, the small-scale data suggest $\mathrm{BB}(2, 2, L; \texttt{011}) = 2L$ for $L \in \{1, 2, 3\}$.

Conjecture 2b · Different growth for single-symbol $\omega$ For fixed $\Sigma$ with $|\Sigma| \ge 2$, $m = 2$, and $\omega$ over a single symbol (e.g.\ $\omega = \texttt{000}$ for $\Sigma_2$), the asymptotic growth of $\mathrm{BB}(\Sigma, 2, L; \omega)$ in $L$ differs from the mixed-symbol case. The data for $\omega = \texttt{000}$ give $2, 4, 10$ for $L = 1, 2, 3$, which is not of the form $cL$ for any constant $c$. Whether the growth is super-linear, polynomial of higher degree, or follows a more subtle form is left open.

Remark 2 · Scope of evidence for Conjectures 2a and 2b Conjecture 2a is currently supported only by $|\Sigma| = 2$ (three data points) and only for the mixed-symbol class of $\omega$. For $|\Sigma| = 3$ only two data points $(1, 2)$ and $(2, 7)$ are available, which are insufficient to distinguish linear from superlinear growth. Conjecture 2b is supported by a single example ($\omega = \texttt{000}$, $|\Sigma| = 2$) with three data points. Extending the $|\Sigma| = 2$ data to $L = 4, 5$, enumerating $|\Sigma| = 3, L = 3$, and sweeping $\omega$ across lengths and symbol sets would test both conjectures.

Remark 3 · Caveat on extrapolation All three conjectures are local: they are derived from a small-sample empirical sweep. The function $\mathrm{BB}$ is undecidable in the unbounded limit, so no asymptotic statement is provable by exhaustive search. We expect Conjecture 2a to fail past a critical $L^*$ at which the system gains enough expressive power to encode unbounded iteration; the location of $L^*$ is itself an open problem. Conjecture 2b is so weakly supported (one example, three points) that it should be read as a flag, not a claim.

07Discussion

The Growth–Deletion Ratio

The most striking empirical pattern is the dependence of halting behaviour on the ratio $L/m$. When $L < m$ every step strictly shrinks the word, so the system is guaranteed to halt in at most $\left\lceil (|W| - m + 1) / (m - L) \right\rceil$ steps. When $L = m$ the word length is preserved on average, and halting becomes non-trivial. When $L > m$ the expected length grows and the system can exhibit either unbounded growth or a cycle.

The drop of $\rho$ from $1.0$ at $L = 1$ to roughly $0.12$ at $L = 3$ is consistent with a phase transition at the threshold $L = m$. The critical exponent of this transition, in the sense of statistical physics, is not estimated by the present data and warrants further investigation.

Relationship to Existing Busy Beaver Functions

Tag systems are polynomially equivalent to Turing machines under the Cocke–Minsky simulation: a tag with deletion $m$ and maximum production $L$ simulates a Turing machine whose state alphabet is $\Sigma$, with overhead polynomial in $m$ and $L$. Consequently, a lower bound for $\mathrm{BB}_{\mathrm{TM}}(n)$ translates, up to a polynomial reduction, into a lower bound for $\mathrm{BB}(\Sigma, m, L)$ for suitable parameters. The first few known values of $\mathrm{BB}_{\mathrm{TM}}$ are $\mathrm{BB}_{\mathrm{TM}}(2) = 6$, $\mathrm{BB}_{\mathrm{TM}}(3) = 21$, $\mathrm{BB}_{\mathrm{TM}}(4) = 107$ [8]. Our values for small $L$ are not directly comparable, but the tag-system values are expected to grow at least as fast as the Turing-machine values for parameter ranges that can simulate those machines.

08Conclusion and Future Work

We have defined the Busy Beaver function $\mathrm{BB}(\Sigma, m, L)$ for Post tag systems, established exact values for $|\Sigma| \le 3$, $m = 2$, and $L \le 3$, and observed a sharp decay of the halting rate past the threshold $L = m$. A sweep across initial words in Section 5.4 shows that the BB value is sensitive to $\omega$ and that the asymptotic conjecture splits by the symbol-set of $\omega$. The empirical record supports three conjectures: monotone decay of $\rho$, linear growth of $\mathrm{BB}$ for mixed-symbol $\omega$, and a distinct (and currently uncharacterised) growth law for single-symbol $\omega$.

Several directions for future work are immediate:

Extending the search. Compute $\mathrm{BB}$ for $|\Sigma| = 2$ and $L \in \{4, 5\}$ to test Conjecture 2a directly and to locate the conjectured critical $L^*$. A second extension: enumerate $\omega$ across all length-$4$ and length-$5$ words to map out the dependence in Section 5.4 and stress-test Conjecture 2b.
Larger $m$. Repeat the experiments for $m = 3$ to study the effect of the deletion number on the location of the phase transition.
Asymptotic halt rate. Estimate $\lim_{L \to \infty} \rho(\Sigma, m, L; \omega)$. Numerical evidence for the analogous limit in deterministic automata would be a useful reference point.
Reductions. Establish a precise reduction between $\mathrm{BB}_{\mathrm{TM}}(n)$ and $\mathrm{BB}(\Sigma, m, L)$ and use it to inherit lower bounds from the Turing-machine case.
Larger step caps. The current $\mathrm{BB}$ values are verified up to $N = 100{,}000$ for the $L = 3$, $|\Sigma| = 2$ case (Section 4.1.1). For larger $L$ the timeout fraction grows rapidly; ruling out late halters in those regimes will require either higher caps (with simulator cost scaling as $O(N \cdot |\text{word}|)$ per system) or a tighter cycle-detection argument that catches slow-growers.

09References

T. Rado, "On non-computable functions," Bell System Technical Journal, vol. 41, no. 3, pp. 877–884, 1962.
A. H. Brady, "The determination of the value of Rado's noncomputable function $\Sigma(k)$ for four-state Turing machines," Mathematics of Computation, vol. 40, no. 162, pp. 647–665, 1983.
M. L. Minsky, Computation: Finite and Infinite Machines. Englewood Cliffs, NJ: Prentice-Hall, 1967.
G. J. Martínez, H. Zenil, and J. C. S. T. Mora, "Busy beaver competition," in Computability of the Physical and the Algorithmic, 2012.
E. L. Post, "Formal reductions of the general combinatorial decision problem," American Journal of Mathematics, vol. 65, no. 2, pp. 197–215, 1943.
J. Cocke and M. Minsky, "Universality of tag systems with $P = 2$," Journal of the ACM, vol. 11, no. 1, pp. 15–20, 1964.
M. Minsky, "Recursive unsolvability of Post's problem of 'tag' and other topics in theory of Turing machines," Annals of Mathematics, vol. 74, no. 3, pp. 437–455, 1961.
H. Marxen and J. Buntrock, "Attacking the Busy Beaver 5," Bulletin of the EATCS, no. 40, pp. 247–251, 1990.

typeset in Source Serif 4 + JetBrains Mono

math via MathJax 3 · cdn.jsdelivr.net

Busy Beaver Valuesfor Post Tag Systems