On the Order Type of Scattered Context-Free Orderings


J. Leroux and J.-F. Raskin (Eds.): Tenth International Symposium on Games, Automata, Logics, and Formal Verification (GandALF’19).

EPTCS 305, 2019, pp. 169–182, doi:10.4204/EPTCS.305.12

Kitti Gelle

University of Szeged, Hungary kgelle@inf.u-szeged.hu

Szabolcs Iván

University of Szeged, Hungary szabivan@inf.u-szeged.hu

We show that if a context-free grammar generates a language whose lexicographic ordering is well-ordered of type less than ω², then its order type is effectively computable.

1 Introduction

If an alphabet Σ is equipped with a linear order <, this order can be extended to the lexicographic ordering <_ℓ on Σ^∗ as follows: u <_ℓ v if and only if either u is a proper prefix of v, or u = xay and v = xbz for some x, y, z ∈ Σ^∗ and letters a < b. So any language L ⊆ Σ^∗ can be viewed as a linear ordering (L, <_ℓ). Since {a,b}^∗ contains the dense ordering (aa+bb)^∗ab and every countable linear ordering can be embedded into any countably infinite dense ordering, every countable linear ordering is isomorphic to one of the form (L, <_ℓ) for some language L ⊆ {a,b}^∗. A linear ordering (or an order type) is called regular or context-free if it is isomorphic to the linear ordering (or is the order type) of some language of the appropriate type. It is known [2] that an ordinal is regular if and only if it is less than ω^ω, and is context-free if and only if it is less than ω^ω^ω. Also, the Hausdorff rank [13] of any scattered regular (context-free, resp.) ordering is less than ω (ω^ω, resp.) [10, 8].

It is known [9] that the order type of a well-ordered language generated by a prefix grammar (i.e. one in which each nonterminal generates a prefix-free language) is computable, thus the isomorphism problem of context-free ordinals is decidable if the ordinals in question are given as the lexicographic orderings of prefix grammars. The isomorphism problem of regular orderings is decidable as well [15, 3]. On the other hand, it is undecidable for a context-free grammar whether it generates a dense language, hence the isomorphism problem of context-free orderings in general is undecidable [7].

Algorithms that work for the well-ordered case can in many cases be “tweaked” to make them work for the scattered case as well: e.g. it is decidable whether (L, <_ℓ) is well-ordered or scattered [6], and the two algorithms are quite similar.

In this paper we continue to explore the boundary of decidability of the isomorphism problem of context-free orderings. We show that if the order type o(L) of a context-free language L is known to have the form ω×k+n for some integers k and n, then k and n can be effectively computed. The main building blocks of the proof are a decision procedure for the question o(L(X)) =? ω for each nonterminal X, and a recursive algorithm that terminates for languages of order type less than ω².

2 Notation

A linear ordering is a pair (Q, <), where Q is some set and < is a transitive, antisymmetric and connex (that is, for each x, y ∈ Q exactly one of x < y, y < x or x = y holds) binary relation on Q. The pair (Q, <) is also written simply Q if the ordering is clear from the context. A (necessarily injective) function h: Q_1 → Q_2, where (Q_1, <_1) and (Q_2, <_2) are some linear orderings, is called an (order) embedding if for each x, y ∈ Q_1, x <_1 y implies h(x) <_2 h(y). If h is also surjective, h is an isomorphism, in which case the two orderings are isomorphic. An isomorphism class is called an order type. The order type of the linear ordering Q is denoted by o(Q).

*Ministry of Human Capacities, Hungary grant 20391-3/2018/FEKUSTRAT is acknowledged. Szabolcs Iván was supported by the János Bolyai Scholarship of the Hungarian Academy of Sciences. Kitti Gelle was supported by the ÚNKP-19-3-SZTE-86 New National Excellence Program of the Ministry of Human Capacities.

For example, the class of all linear orderings contains all the finite linear orderings and the orderings of the integers (Z), the positive integers (N) and the negative integers (N₋), whose order types are denoted ζ, ω and −ω respectively. Order types of the finite sets are denoted by their cardinality, and [n] denotes {1, ..., n} for each n ≥ 0, ordered in the standard way.

The ordered sum ∑_{x∈Q} Q_x, where Q is some linear ordering and for each x ∈ Q, Q_x is a linear ordering, is defined as the ordering with domain {(x,q) : x ∈ Q, q ∈ Q_x} and ordering relation (x,q) < (y,p) if and only if either x < y, or x = y and q < p in the respective Q_x. If each Q_x has the same order type o_1 and Q has order type o_2, then the above sum has order type o_1 × o_2. If Q = [2], then the sum is usually written as Q_1 + Q_2.
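For finite orderings the definition of the ordered sum can be tried out directly. The following is only an illustrative sketch: the function name `ordered_sum` and the concrete summands are assumptions made for the example, and finite lists stand in for the linear orderings.

```python
def ordered_sum(index, summands):
    """List the elements (x, q) of the ordered sum in increasing order.

    `index` is a finite linear ordering given as an increasing list, and
    `summands[x]` is the increasing list of the elements of Q_x.
    (Hypothetical helper, for illustration only.)
    """
    return [(x, q) for x in index for q in summands[x]]


# Q = [2], Q_1 of order type 2 and Q_2 of order type 3: the sum Q_1 + Q_2.
elements = ordered_sum([1, 2], {1: ["p", "q"], 2: ["p", "q", "r"]})

# Python compares tuples by the first component and then by the second,
# which is exactly the ordering relation of the ordered sum.
assert elements == sorted(elements)
```

Here the sum has five elements, in line with the convention that order types of finite sets are denoted by their cardinality.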

If (Q, <) is a linear ordering and Q′ ⊆ Q, we also write (Q′, <) for the subordering of (Q, <), that is, to ease notation we also use < for the restriction of < to Q′.

A linear ordering (Q, <) is called dense if it has at least two elements and for each x, y ∈ Q where x < y there exists a z ∈ Q such that x < z < y. A linear ordering is scattered if no dense ordering can be embedded into it. It is well-known that every scattered sum of scattered linear orderings is scattered, and any finite union of scattered linear orderings is scattered. A linear ordering is called a well-ordering if it has no subordering of type −ω. Clearly, any well-ordering is scattered. Since isomorphism preserves well-orderedness and scatteredness, we can call an order type well-ordered or scattered as well, or say that an order type embeds into another. The well-ordered order types are called ordinals. For any set Ω of ordinals, (Ω, <) is well-ordered by the relation o_1 < o_2 ⇔ “o_1 can be embedded injectively into o_2 but not vice versa”. The principle of well-founded induction can be formulated as follows. Assume P is a property of ordinals such that for any ordinal o, if P holds for all ordinals smaller than o, then P holds for o. Then P holds for all the ordinals.

For standard notions and useful facts about linear orderings see e.g. [13] or [14].

Hausdorff classified the countable scattered linear orderings with respect to their rank. We will use the definition of the Hausdorff rank from [8], which slightly differs from the original one (in which H_0 contains only the empty ordering and the singletons, and the classes H_α are not required to be closed under finite sum, see e.g. [13]). For each countable ordinal α, we define the class H_α of countable linear orderings as follows. H_0 consists of all finite linear orderings, and when α > 0 is a countable ordinal, then H_α is the least class of linear orderings closed under finite ordered sum and isomorphism which contains all linear orderings of the form ∑_{i∈Z} Q_i, where each Q_i is in H_{β_i} for some β_i < α.

By Hausdorff’s theorem, a countable linear ordering Q is scattered if and only if it belongs to H_α for some countable ordinal α. The rank r(Q) of a countable scattered linear ordering is the least ordinal α with Q ∈ H_α.

As an example, ω, ζ, −ω and ω+ζ, or any finite sum of the form ∑_{i∈[n]} o_i with o_i ∈ {ω, −ω, 1} for each i ∈ [n], each have rank 1, while (ω+ζ)×ω has rank 2.

Let Σ be an alphabet (a finite nonempty set) and let Σ^∗ (Σ⁺, resp.) stand for the set of all (all nonempty, resp.) finite words over Σ, ε for the empty word, |u| for the length of the word u, and u·v or simply uv for the concatenation of u and v. A language is an arbitrary subset L of Σ^∗. We assume that each alphabet is equipped with some (total) linear order. Two (strict) partial orderings, the strict ordering <_s and the prefix ordering <_p, are defined over Σ^∗ as follows:


• u <_s v if and only if u = u_1 a u_2 and v = u_1 b v_2 for some words u_1, u_2, v_2 ∈ Σ^∗ and letters a < b,

• u <_p v if and only if v = uw for some nonempty word w ∈ Σ^∗.

The union of these partial orderings is the lexicographical ordering <_ℓ = <_s ∪ <_p. We call the language L well-ordered or scattered if (L, <_ℓ) has the appropriate property, and we define the rank r(L) of a scattered language L as r(L, <_ℓ). The order type o(L) of a language L is the order type of (L, <_ℓ). For example, if a < b, then o({a^k b : k ≥ 0}) = −ω and o({(bb)^k a : k ≥ 0}) = ω.
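The two partial orderings and their union are easy to experiment with on finite samples. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the order on the letters is the usual order of Python characters (so a < b).

```python
def prefix_less(u: str, v: str) -> bool:
    """u <_p v: u is a proper prefix of v."""
    return len(u) < len(v) and v.startswith(u)


def strict_less(u: str, v: str) -> bool:
    """u <_s v: at the first position where the words differ, u has the smaller letter."""
    for a, b in zip(u, v):
        if a != b:
            return a < b
    return False  # no differing position: one word is a prefix of the other


def lex_less(u: str, v: str) -> bool:
    """u <_l v: the union of the strict and the prefix ordering."""
    return strict_less(u, v) or prefix_less(u, v)


# Finite samples of the two example languages, listed by increasing k.
descending = ["a" * k + "b" for k in range(6)]   # a^k b
ascending = ["bb" * k + "a" for k in range(6)]   # (bb)^k a

# a^(k+1) b <_s a^k b: the sample is a descending chain (order type -omega in the limit).
assert all(strict_less(descending[k + 1], descending[k]) for k in range(5))
# (bb)^k a <_s (bb)^(k+1) a: the sample is an ascending chain (order type omega in the limit).
assert all(strict_less(ascending[k], ascending[k + 1]) for k in range(5))
```

A finite sample can of course only suggest the order type of the infinite language; it does not prove it.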

When ρ is a relation over words (like <_ℓ or <_s), we write K ρ L if u ρ v for each word u ∈ K and v ∈ L.

An ω-word over Σ is an ω-sequence a_1a_2... of letters a_i ∈ Σ. The set of all ω-words over Σ is denoted Σ^ω. The orderings <_ℓ and <_p are extended to ω-words. An ω-word w is called regular if w = uv^ω = uvvvv... for some finite words u ∈ Σ^∗ and v ∈ Σ⁺. When w is a (finite or ω-) word over Σ and L ⊆ Σ^∗ is a language, then L_{<w} stands for the language {u ∈ L : u < w}. Notations like L_{≥w} and L_{<_s w} are also used, with the analogous semantics.

A context-free grammar is a tuple G = (N,Σ,P,S), where N is the alphabet of the nonterminal symbols, Σ is the alphabet of terminal symbols (or letters) which is disjoint from N, S ∈ N is the start symbol and P is a finite set of productions of the form A → α, where A ∈ N and α is a sentential form, that is, α = X_1X_2...X_k for some k ≥ 0 and X_1, ..., X_k ∈ N∪Σ. The derivation relations ⇒, ⇒_ℓ, ⇒^∗ and ⇒^∗_ℓ are defined as usual (where the subscript ℓ stands for “leftmost”). The language generated by a grammar G is defined as L(G) = {u ∈ Σ^∗ | S ⇒^∗ u}. Languages generated by some context-free grammar are called context-free languages. For any set ∆ of sentential forms, the language generated by ∆ is L(∆) = {u ∈ Σ^∗ | α ⇒^∗ u for some α ∈ ∆}. As a shorthand, we define o(∆) as o(L(∆)). A language L is prefix (or prefix-free) if there are no words u, v ∈ L with u <_p v. A context-free grammar G = (N,Σ,P,S) is called a prefix grammar if L(A) is a prefix language for each A ∈ N. When X, Y ∈ N∪Σ are symbols of a grammar G, we write Y ⪯ X if X ⇒^∗ uYv for some words u and v; X ≈ Y if X ⪯ Y and Y ⪯ X both hold; and Y ≺ X if Y ⪯ X but not X ⪯ Y. A production of the form X → X_1...X_n with X_i ≺ X for each i ∈ [n] is called an escaping production.

A regular language over Σ is one which can be built up from the singleton languages {a}, a ∈ Σ, and the empty language ∅ with finitely many applications of taking (finite) union, concatenation KL = {uv : u ∈ K, v ∈ L} and iteration K^∗ = {u_1...u_n : n ≥ 0, u_i ∈ K}. For standard notions on regular and context-free languages the reader is referred to any standard textbook, such as [11].

Linear orderings which are isomorphic to the lexicographic ordering of some context-free (regular, resp.) language are called context-free (regular, resp.) orderings.

3 If o(L) < ω², then o(L) is computable

In this section we consider a context-free grammar G = (N,Σ,P,S) which generates a(n infinite) scattered language such that for each X ∈ N, X is usable and L(X) is an infinite language of nonempty words; moreover, each nonterminal but possibly S is recursive and there is no left-recursive nonterminal (that is, X ⇒⁺ uXv implies u ≠ ε). Any context-free grammar can effectively be transformed into such a form, see e.g. [9].

The section is broken into two parts: the first subsection contains some technical decidability lemmas, while the second one contains the main result that if we know that o(L) < ω² for a well-ordered context-free language L (so that the Hausdorff rank of L is at most one), then o(L) is effectively computable. This computability is already known for so-called ordinal grammars, which generate a well-ordered language such that for each nonterminal X, L(X) is a prefix language [9]. However, this is a serious restriction and makes many proofs easier, since if K is a prefix language, then o(KL) = o(L) × o(K) for any language L. This does not hold for arbitrary languages, since e.g. o(a^∗) = ω, o(b) = 1 and o(a^∗b) = −ω; o(a^∗a^∗) = ω; and o((ac)^∗) = ω while o((ac)^∗(b+ab)) = ω + (−ω). So a more careful case analysis is required. The reader is advised to skip the first subsection on a first reading; the proofs of the second part refer extensively to the lemmas of the first part.
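The counterexamples can be inspected on finite samples. The following sketch assumes the letters are ordered a < b < c, so that Python's built-in string comparison coincides with the lexicographic ordering; the helper name `sample` is hypothetical.

```python
def sample(base: str, tails, n_max: int):
    """Finite sample of base^* (tail_1 + ... + tail_m): powers 0 .. n_max-1 of `base`."""
    return [base * n + tail for n in range(n_max) for tail in tails]


# a^* b: a longer block of a's gives a SMALLER word, so the sample descends.
assert sorted(sample("a", ["b"], 4)) == ["aaab", "aab", "ab", "b"]

# (ac)^* (b + ab): the words (ac)^n ab form an ascending chain, followed by
# the words (ac)^n b in descending order, matching omega + (-omega).
words = sorted(sample("ac", ["b", "ab"], 4))
assert words == ["ab", "acab", "acacab", "acacacab",
                 "acacacb", "acacb", "acb", "b"]
```

As the sample grows, new words are only ever inserted in the middle of the sorted list, between the ascending and the descending half, which is the shape one expects from an ordering of type ω + (−ω).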

3.1 Some technical lemmas

For an ω-word w, let Pref(w) ⊆ Σ^∗ stand for the set of the finite prefixes of w. For each u = a_1...a_k ∈ Σ^∗ and v = b_1...b_t ∈ Σ⁺ let M_{u,v} denote the automaton (without specified final states) depicted in Figure 1.

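[Figure 1: The automaton M_{u,v}. Diagram not reproduced; it shows the start state q_ε, the states q_{a_1}, ..., q_u, q_{ub_1}, ..., q_{uv}, q_{<_s} and q_{>_s}, the state set Q_{<_p}, and transitions labelled a_1, ..., a_k, b_1, ..., b_t, σ<a_i, σ>a_i, σ<b_j, σ>b_j and σ.]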

Proposition 1. For any words u ∈ Σ^∗ and v ∈ Σ⁺, the languages Pref(uv^ω), {w ∈ Σ^∗ : w <_s uv^ω} and {w ∈ Σ^∗ : uv^ω <_s w} are regular.

Proof. Let u = a_1...a_k and v = b_1...b_t for the integers k ≥ 0, t > 0 and letters a_i, b_j, and consider the automaton M_{u,v} given in Figure 1. Then, by setting q_{<_s} (q_{>_s}, respectively) as the unique accepting state we recognize {w ∈ Σ^∗ : w <_s uv^ω} ({w ∈ Σ^∗ : uv^ω <_s w}, resp.), and by setting Q_{<_p} as the set of final states we recognize Pref(uv^ω).
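The behaviour of such an automaton on a single input word can be simulated without building it explicitly. The sketch below is an assumption-laden illustration rather than the construction of Figure 1 itself: the function name `classify` and its string results are hypothetical, and the letter order is taken to be the order of Python characters.

```python
def classify(w: str, u: str, v: str) -> str:
    """Classify the finite word w against the regular omega-word u v^omega.

    Returns "prefix" if w is in Pref(u v^omega), "less" if w <_s u v^omega
    and "greater" if u v^omega <_s w.
    """
    if not v:
        raise ValueError("v must be nonempty")
    for i, letter in enumerate(w):
        # The i-th letter of u v^omega: read u first, then cycle through v.
        expected = u[i] if i < len(u) else v[(i - len(u)) % len(v)]
        if letter != expected:
            # First mismatch: from here on the answer cannot change, which
            # corresponds to entering one of the two trap states.
            return "less" if letter < expected else "greater"
    return "prefix"


print(classify("abab", "", "ab"))  # prefix
print(classify("abaa", "", "ab"))  # less
print(classify("b", "a", "b"))     # greater
```

The loop keeps only the current position, so the set of reachable configurations is finite (positions inside u, positions inside v modulo |v|, and the two final answers), which is the reason the three languages are regular.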

Lemma 1. For each sentential form α with L(α) being infinite, we can generate a sequence w_0, w_1, ... ∈ L(α) and a regular word w ∈ Σ^ω satisfying one of the following cases:

i) w_0 <_s w_1 <_s ... and w = ⋁_{i≥0} w_i

ii) w_0 >_s w_1 >_s ... and w = ⋀_{i≥0} w_i

iii) w_0 <_p w_1 <_p ... and w = ⋁_{i≥0} w_i

Proof. By the pumping lemma of context-free languages, as L(α) is infinite, one can generate a word u ∈ L(α) and a partition u = u_1u_2u_3u_4u_5 such that |u_2u_4| ≥ 1 and for each n ≥ 0, the word u_1 u_2^n u_3 u_4^n u_5 is in L(α).

Based on the relative order of the five subwords we consider the following cases:


1. There exists an n_0 such that u_3 u_4^{n_0} <_s u_2 u_3 u_4^{n_0}.

Let us define the sequence as w_n = u_1 u_2^{n_0+n} u_3 u_4^{n_0+n} u_5. Let us fix n and let m = n_0 + n.

Here we get that w_n <_s w_{n+1} if and only if u_3 u_4^m u_5 <_s u_2 u_3 u_4^{m+1} u_5, which is true since u_3 u_4^{n_0} <_s u_2 u_3 u_4^{n_0} implies u_3 u_4^{n_0} x <_s u_2 u_3 u_4^{n_0} y for any x, y ∈ Σ^∗, thus in particular for x = u_4^n u_5 and y = u_4^{n+1} u_5 as well.

Hence the sequence is of type i) and the supremum is w = ⋁_{i≥0} w_i = u_1 u_2^ω. (Observe that u_2 ≠ ε, as the empty word could not satisfy u_3 u_4^{n_0} <_s u_2 u_3 u_4^{n_0}.)

2. There exists an n_0 such that u_2 u_3 u_4^{n_0} <_s u_3 u_4^{n_0}.

Again, u_2 cannot be the empty word. Similarly to the first case, let us define the sequence as w_n = u_1 u_2^{n_0+n} u_3 u_4^{n_0+n} u_5, fix n and let m = n_0 + n.

Here we get that w_{n+1} <_s w_n if and only if u_2 u_3 u_4^{m+1} u_5 <_s u_3 u_4^m u_5, which is true since u_2 u_3 u_4^{n_0} <_s u_3 u_4^{n_0} implies u_2 u_3 u_4^{n_0} x <_s u_3 u_4^{n_0} y for any x, y ∈ Σ^∗, thus in particular for x = u_4^{n+1} u_5 and y = u_4^n u_5 as well. So we get that the type of the sequence is ii) (with order type −ω) and the infimum is w = ⋀_{i≥0} w_i = u_1 u_2^ω.

3. For each n it holds that u_3 u_4^n ≤_p u_2 u_3 u_4^n, and u_4 ≠ ε. In this case u_3 u_4^ω = u_2 u_3 u_4^ω.

Let us fix N = ⌈|u_2| / |u_4|⌉ + 1 and define the sequence as w_n = u_1 u_2^{N+n} u_3 u_4^{N+n} u_5. Furthermore, let x ∈ Σ^∗ be the unique word with u_3 u_4^N x = u_2 u_3 u_4^N. That is, x is the unique suffix of u_4^N of length |u_2|. Hence, for any n ≥ N we also have u_3 u_4^n x = u_2 u_3 u_4^n for the same x (as we know that u_3 u_4^n ≤_p u_2 u_3 u_4^n, their lengths differ by |u_2|, and the latter word ends with u_4^N).

We have three subcases:

(a) It holds that u_3 u_4^N u_5 <_s u_2 u_3 u_4^{N+1} u_5.

First observe that, as u_2 u_3 u_4^{N+1} u_5 = u_3 u_4^N x u_4 u_5, the assumption of the subcase yields u_5 <_s x u_4 u_5. Then for each m ≥ N we have u_3 u_4^m u_5 <_s u_3 u_4^m x u_4 u_5 = u_2 u_3 u_4^m u_4 u_5 = u_2 u_3 u_4^{m+1} u_5, implying u_1 u_2^m u_3 u_4^m u_5 <_s u_1 u_2^{m+1} u_3 u_4^{m+1} u_5. So the sequence is of type i) and its supremum is either u_1 u_2^ω (if u_2 is nonempty) or u_1 u_3 u_4^ω (otherwise).

(b) It holds that u_2 u_3 u_4^{N+1} u_5 <_s u_3 u_4^N u_5.

First, since u_2 u_3 u_4^{N+1} u_5 can be written as u_3 u_4^N x u_4 u_5, the assumption of the subcase yields x u_4 u_5 <_s u_5. Then for each m ≥ N we have u_3 u_4^m x u_4 u_5 = u_2 u_3 u_4^m u_4 u_5 = u_2 u_3 u_4^{m+1} u_5 <_s u_3 u_4^m u_5, so we get a sequence of words such that u_1 u_2^{m+1} u_3 u_4^{m+1} u_5 <_s u_1 u_2^m u_3 u_4^m u_5. Hence, the sequence is of type ii) with the infimum being either u_1 u_2^ω (if u_2 is nonempty) or u_1 u_3 u_4^ω (otherwise).

(c) It holds that u_3 u_4^N u_5 <_p u_2 u_3 u_4^{N+1} u_5.

In this case for each m ≥ N we have u_3 u_4^m u_5 <_p u_2 u_3 u_4^{m+1} u_5, so we get an ascending prefix chain. Hence the sequence is of type iii), with order type ω and supremum either u_1 u_2^ω (if u_2 is nonempty) or u_1 u_3 u_4^ω (otherwise).

4. It holds that u_3 <_p u_2 u_3 and u_4 = ε. Note that u_2 cannot be empty in this case.

We have three subcases:

(a) It holds that u_3 u_5 <_s u_2 u_3 u_5.

In this case for each n ≥ 0 we have u_1 u_2^n u_3 u_5 <_s u_1 u_2^{n+1} u_3 u_5 if and only if u_3 u_5 <_s u_2 u_3 u_5, which is the assumption of this subcase. So we get that the sequence type is i) and the supremum is w = u_1 u_2^ω.

(b) It holds that u_2 u_3 u_5 <_s u_3 u_5.

Here, similarly to the previous case, for each n ≥ 0 we have u_1 u_2^{n+1} u_3 u_5 <_s u_1 u_2^n u_3 u_5 if and only if u_2 u_3 u_5 <_s u_3 u_5, which is the assumption of this subcase. Hence we have an infinite descending chain with sequence type ii) and infimum w = u_1 u_2^ω.

(c) It holds that u_3 u_5 <_p u_2 u_3 u_5.

In the last case, since u_3 u_5 <_p u_2 u_3 u_5, for each n ≥ 0 we have u_1 u_2^n u_3 u_5 <_p u_1 u_2^{n+1} u_3 u_5, which is a prefix chain with sequence type iii) and supremum w = u_1 u_2^ω.

Observe that it is also decidable which (sub)case applies: first we check for the condition of Case 4 (which is clearly decidable). Then, if that condition does not hold, we check whether u_3 u_4^ω = u_2 u_3 u_4^ω holds. As equality of regular words is decidable, this can be done, and if they are the same, then we again have three sub-conditions concerning finite words. Otherwise, if u_3 u_4^ω ≠ u_2 u_3 u_4^ω, then either u_3 u_4^ω <_s u_2 u_3 u_4^ω, that is, u_3 u_4^{n_0} <_s u_2 u_3 u_4^{n_0} for some n_0, or u_2 u_3 u_4^ω <_s u_3 u_4^ω, in which case u_2 u_3 u_4^{n_0} <_s u_3 u_4^{n_0} for some n_0. But since we know that one of these two cases has to hold, we only have to iterate through all the integers n and compare u_3 u_4^n with u_2 u_3 u_4^n, and eventually there will be an n for which these two become comparable by <_s. (A more efficient algorithm also exists, e.g. by analyzing the direct product automaton M_{u_3,u_4} × M_{u_2u_3,u_4}.)
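The top-level case distinction described above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the function names and the returned pair (case number, n_0) are hypothetical, the letter order is that of Python characters, and the equality test for the two regular words uses the observation that both are periodic with period |u_4| from position |u_2u_3| on, so comparing their first |u_2| + |u_3| + |u_4| letters suffices.

```python
from itertools import count


def first_difference(x: str, y: str) -> int:
    """Return -1 if x <_s y, 1 if y <_s x, and 0 if one word is a prefix of the other."""
    for a, b in zip(x, y):
        if a != b:
            return -1 if a < b else 1
    return 0


def omega_prefix(head: str, period: str, length: int) -> str:
    """The first `length` letters of head . period^omega (period must be nonempty)."""
    reps = length // len(period) + 1
    return (head + period * reps)[:length]


def which_case(u2: str, u3: str, u4: str):
    """Return (case, n0) for the pumping decomposition; n0 is None in Cases 3 and 4."""
    if not (u2 or u4):
        raise ValueError("u2 u4 must be nonempty")
    if not u4:
        # u4 is empty, so u2 is nonempty: compare u3 with u2 u3 directly.
        cmp = first_difference(u3, u2 + u3)
        if cmp == 0:
            return (4, None)            # u3 <_p u2 u3
        return (1, 0) if cmp < 0 else (2, 0)
    window = len(u2) + len(u3) + len(u4)
    if omega_prefix(u3, u4, window) == omega_prefix(u2 + u3, u4, window):
        return (3, None)                # u3 u4^omega = u2 u3 u4^omega
    # The omega-words differ, so this search is guaranteed to terminate.
    for n in count():
        cmp = first_difference(u3 + u4 * n, u2 + u3 + u4 * n)
        if cmp != 0:
            return (1, n) if cmp < 0 else (2, n)


print(which_case("ab", "", "a"))  # (1, 2)
print(which_case("a", "", "a"))   # (3, None)
print(which_case("a", "a", ""))   # (4, None)
```

The subcases (a), (b) and (c) of Cases 3 and 4 additionally depend on u_5 and only involve comparisons of finite words, so they could be added with the same `first_difference` helper.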

We recall the following characterizations of those context-free grammars generating a scattered (or well-ordered) language from [1]:

Theorem 1 ([1]). Assume G = (N,Σ,P,S) is a context-free grammar such that each nonterminal is usable, ε-free and there are no left-recursive nonterminals. Then

• L(G) is scattered if and only if for each recursive nonterminal X there exists a word u_X ∈ Σ⁺ such that whenever X ⇒⁺ uXα for some u ∈ Σ^∗, α ∈ (N∪Σ)^∗, then u ∈ u_X⁺.

• If L(G) is scattered and X ≈ X′ are recursive nonterminals, then there exists a word u_{X,X′} <_p u_X such that whenever X ⇒⁺ uX′α for some u ∈ Σ^∗, α ∈ (N∪Σ)^∗, then u ∈ u_X^∗ u_{X,X′}.

• L(G) is well-ordered if and only if it is scattered and for each recursive nonterminal X, L(X) <_ℓ u_X^ω.

Moreover, for each X, X′ the words u_X and u_{X,X′} are effectively computable and it is decidable whether L(X) is scattered, or well-ordered.

Proposition 2. If L(X) is well-ordered for the recursive nonterminal X, then ⋁L(X) = u_X^ω.

Proof. By Theorem 1, L(X) <_ℓ u_X^ω. Since X is recursive and all the nonterminals are usable, there exists some derivation of the form X ⇒⁺ uXv for some words u, v ∈ Σ^∗, and so u = u_X^m for some m > 0. Since X is usable, there exists some word w ∈ L(X), and so for each n > 0 the word u_X^{m·n} w v^n is in L(X) and is still bounded from above by u_X^ω. As the supremum of these words is u_X^ω, we get the claimed result.

We call an infinite language L ⊆ Σ^∗ a prefix chain if for each u, v ∈ L, either u ≤_p v or v ≤_p u, that is, L is totally ordered by the prefix relation, or equivalently, L ⊆ Pref(w) for some ω-word w. Clearly, any language L is either a prefix chain or contains two words u, v with u <_s v. Note that if L is a prefix chain, then o(L) = ω.
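On a finite sample of words, the dichotomy between prefix chains and languages containing two <_s-comparable words can be checked directly. The sketch below only illustrates the definition on finite sets (the definition itself concerns infinite languages); the function name `is_prefix_chain` is hypothetical.

```python
def is_prefix_chain(words) -> bool:
    """Check whether a finite set of words is totally ordered by the prefix relation."""
    ordered = sorted(set(words), key=len)
    # In a chain, every word must be a prefix of the next longer one;
    # two distinct words of equal length already violate this.
    return all(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


print(is_prefix_chain(["ab", "a", "abab"]))  # True
print(is_prefix_chain(["a", "ab", "b"]))     # False
```

Checking consecutive words is enough because the prefix relation is transitive.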

Lemma 2. It is decidable for each nonterminal X whether L(X) is a prefix chain.


Proof. By Lemma 1 we can effectively generate an infinite sequence w_0, w_1, ..., either ascending or descending, belonging to L(X) along with its limit, which is of the form uv^ω for some u ∈ Σ^∗, v ∈ Σ⁺. Now if the sequence is either a >_s-chain or a <_s-chain, then L(X) cannot be a prefix chain.

Otherwise, the sequence itself is a prefix chain and its limit is uv^ω, hence the whole language L(X) is a prefix chain if and only if L(X) ⊆ Pref(uv^ω), which can be effectively decided since Pref(uv^ω) is a regular language.

Lemma 3. If L is a context-free language with o(L) = ω, then ⋁L is a computable regular word.

Proof. Applying Lemma 1 we can generate a (necessarily increasing) sequence w_0 < w_1 < ... of words belonging to L along with their supremum uv^ω. Since the order type of L is also ω, its supremum has to coincide with uv^ω.

Lemma 4. Let X be a nonterminal such that L(X) is not a prefix chain and α be a sentential form with L(α) being infinite. Then o(Xα) is an infinite order type different from ω.

Proof. Since L(X) is not a prefix chain and is infinite, there exist u, v ∈ L(X) with u <_s v. Then uL(α) <_s vy for any member y of L(α), hence each such vy has infinitely many lower bounds in L(Xα), thus o(Xα) cannot be ω.

The last lemma of the subsection is a bit technical:

Lemma 5. If L_1 ⊆ Pref(uv^ω) is a context-free prefix chain with order type ω for some words u ∈ Σ^∗ and v ∈ Σ⁺, and L_2 ⊆ Σ^∗ is a context-free language with order type ω, then it is decidable whether there exist some w_1 ∈ L_1, u′ ∈ Σ^∗ and a ∈ Σ with w_1u′ <_p uv^ω, w_1u′a <_s uv^ω and u′a <_p ⋁L_2.

Proof. Let us write u = a_1...a_k and v = b_1...b_t and consider the automaton M_{u,v} of Figure 1. For each state q of M_{u,v}, let L_{u,v}(q) stand for the (regular) language {w ∈ Σ^∗ : q_ε·w = q}.

Let Q_1 ⊆ Q_{<_p} be the set of those states q for which L_1 ∩ L_{u,v}(q) is nonempty. Since each L_{u,v}(q) is regular, Q_1 is computable; moreover, q ∈ Q_1 if and only if q_ε·w = q for some w ∈ L_1.

Now by Lemma 3 we can compute the regular word u_2 v_2^ω = ⋁L_2 and consider the direct product automaton M = M_{u,v} × M_{u_2,v_2}, where in M_{u_2,v_2} we use the primed version q′ of each state q.

We claim that there exist words w_1, u′ and a letter a satisfying the conditions of the lemma if and only if a state of the form (q_{<_s}, q′) is reachable from a state (p, q′_ε) in M for some q′ ∈ Q′_{<_p} and p ∈ Q_1.

Indeed: assume (p, q′_ε)·w = (q_{<_s}, q′) for such states, and let us choose p, q′ and w so that |w| is the shortest possible. Since p ∈ Q_1 and q_{<_s} ∉ Q_1, w = u′a for some word u′ ∈ Σ^∗ and a ∈ Σ. Then q′_ε·u′ ∈ Q′_{<_p}, since both q′_{<_s} and q′_{>_s} are trap states in M_{u_2,v_2}. Since w is a shortest possible word and q_{<_s}, q_{>_s} are trap states in M_{u,v}, we get that p·u′ ∈ Q_{<_p}. Since p ∈ Q_1, there is some word w_1 ∈ L_1 with q_ε·w_1 = p.

Thus, this choice of w_1, u′ and a satisfies the conditions of the lemma.

And similarly, given w_1 ∈ L_1, u′ ∈ Σ^∗ and a ∈ Σ satisfying the conditions, we can define p = q_ε·w_1 and q′ = q′_ε·u′a.

3.2 The main decision procedures

In this part we flesh out the “top-level” results leading to the aforementioned computability result: that the order type of well-ordered context-free languages with Hausdorff-rank at most 1 is computable.

The main building block is the result that it is decidable for any context-free language L whether o(L) = ω holds.


Lemma 6. If α = X_1X_2 is a sentential form and for each 1 ≤ i ≤ 2 we know whether o(X_i) = ω holds or not, then it is also effectively computable whether o(α) is ω.

Proof. Clearly, if o(X_1) or o(X_2) is some infinite order type different from ω, then o(α) is also such an order type and we can stop. Also, if X_1 ∈ Σ, then o(α) = o(X_2) and we are done. We can also decide whether L(X_1X_2) is well-ordered, and if not, its order type cannot be ω and we can stop.

So we can assume that X_1 ∈ N, thus L(X_1) is infinite and hence o(X_1) = ω by assumption. Let L_1, L_2 and L respectively stand for L(X_1), L(X_2) and L(X_1X_2).

If X_2 ∈ Σ, then we have several subcases:

1. If there exists a word u ∈ L_1 and some letter a < X_2 with ua being a prefix of infinitely many words in L_1, then uX_2 is strictly larger than each of these words v, and vX_2 <_s uX_2 as well; thus o(L) cannot be ω but is some other infinite order type (as uX_2 ∈ L is preceded by infinitely many members of L).

2. Otherwise, let u ∈ L_1. It suffices to show that there are only finitely many elements in L which are smaller than uX_2. Assume vX_2 ∈ L is such that vX_2 <_p uX_2; then v <_p u as well, and as u has only finitely many proper prefixes, there can only be a finite number of such words v. Now if vX_2 <_s uX_2, then either v <_ℓ u (that is again a finite number of possibilities, as o(L_1) = ω implies that any word u ∈ L_1 has only a finite number of lower bounds in L_1) or u <_ℓ v. In the latter case (as vX_2 <_s uX_2 rules out the possibility of u <_s v) it has to hold that u <_p v and v = uax for some a < X_2. There are only finitely many possible choices for such letters a < X_2, and by assumption (see the condition of the previous subcase), for each such letter, ua can be a prefix of only finitely many words v ∈ L_1.

Thus, uX_2 is larger than only a finite number of members of L for each u ∈ L_1, and o(L) = ω in this case.

We still have to show that it is decidable which of the two cases holds. Observe that if ua is a prefix of infinitely many words in L_1 for some word u ∈ L_1 and letter a < X_2, then no word w ∈ L_1 can satisfy ua <_s w, as then the order type of L_1 could not be ω. Thus, ua <_p ⋁L_1 in this case for some word u ∈ L_1 and letter a < X_2. On the other hand, if ua <_p ⋁L_1 for some word u ∈ L_1 and letter a < X_2, then for any w ∈ L_1 with ua <_ℓ w we cannot have ua <_s w, since in that case ua <_s ⋁L_1 would hold because w <_ℓ ⋁L_1. Hence, whenever ua <_ℓ w for some word w ∈ L_1, then ua is a prefix of w. Since the order type of L_1 is assumed to be ω, and ua is a prefix of ⋁L_1, there has to be an infinite number of such words w.

Thus, if X_2 ∈ Σ, then o(L) is not ω if and only if ua <_p ⋁L_1 for some u ∈ L_1 and a < X_2. This condition is decidable: ⋁L_1 is a computable regular word u_1 v_1^ω by Lemma 3, and we only have to check whether the language L_1a ∩ Pref(u_1 v_1^ω) is nonempty for some letter a < X_2; the latter is a regular language by Proposition 1.

If X_2 ∈ N, and thus o(X_2) = ω, then we again have several cases:

3. If there exist words u, v ∈ L_1 with u <_s v, then by Lemma 4 we get that o(L) is some infinite order type different from ω.

4. Otherwise, L_1 is an infinite prefix chain, that is, L_1 ⊆ Pref(uv^ω) for some words u ∈ Σ^∗, v ∈ Σ⁺. We have several subcases.

(a) Assume ⋁L_1 <_ℓ ⋁L. Since both are ω-words, we have <_s here. Thus there exists some w = w_1w_2 ∈ L, w_1 ∈ L_1, w_2 ∈ L_2 with ⋁L_1 <_s w. Since L_1 is an infinite prefix chain, there exists some w′_1 ∈ L_1 with w′_1 <_p ⋁L_1 and |w′_1| > |w|, yielding w′_1 <_s w. So w′_1L_2 <_s w, thus w ∈ L has an infinite number of lower bounds in L and o(L) ≠ ω in this subcase.


(b) Assume there exists some w_1 ∈ L_1 such that w_1·⋁L_2 <_ℓ ⋁L_1. Again, both being ω-words, this has to be a <_s relation. This means that there exists some w′_1 ∈ L_1 with w_1·⋁L_2 <_s w′_1, and so w_1·L_2 <_s w′_1y for any member y of L_2; thus again, o(L) ≠ ω in this case.

(c) Assume none of the previous conditions hold: ⋁L ≤_ℓ ⋁L_1 (hence ⋁L_1L_2 = ⋁L_1 = uv^ω as well) and for each w_1 ∈ L_1 we have ⋁L_1 ≤_ℓ w_1·⋁L_2.

We claim that in this case o(L) = ω if and only if for each w_1 ∈ L_1 and w <_p uv^ω with w_1 <_p w there exist only finitely many words w_2 ∈ L_2 such that w_1w_2 <_s w.

For one direction, assume the latter condition holds. It suffices to show that for each w <_p uv^ω there exist only finitely many words w_1 ∈ L_1, w_2 ∈ L_2 with w_1w_2 <_ℓ w, since (as the supremum of these words is ⋁L_1L_2 = uv^ω) this yields that each proper initial segment of L is finite, thus o(L) = ω. So let w_1w_2 <_ℓ w. Since w_1 ∈ L_1 and L_1 ⊆ Pref(uv^ω), and w <_p uv^ω, we either have w_1 <_p w or w ≤_p w_1. The latter would contradict w_1w_2 <_ℓ w, hence we have w_1 <_p w.

Thus, there are only finitely many options for choosing such a word w_1 ∈ L_1. Clearly, for each fixed w_1 ∈ L_1 there are only finitely many options for choosing words w_2 with w_1w_2 <_p w, and by the condition there are only finitely many words w_2 ∈ L_2 with w_1w_2 <_s w; hence in total there are only finitely many words in L_1L_2 preceding w, showing o(L) = ω.

For the other direction, assume the latter condition does not hold. Then there exist some w_1 ∈ L_1 and w <_p uv^ω with w_1 <_p w such that w_1w_2 <_s w for infinitely many words w_2 ∈ L_2. In this case we can write w_2 = w′_2ax and w = w_1w′_2by uniquely for some letters a < b and words w′_2, x, y. Since, for the fixed words w and w_1, there are only finitely many options to choose w′_2, b and a, for some pair a < b of letters and word w′_2 with w_1w′_2b ≤_p w there are infinitely many words w_2 ∈ L_2 such that w′_2a ≤_p w_2. Let L′_2 ⊆ L_2 denote the (infinite) set of these words and let w′_1 ∈ L_1 be some word in L_1 with |w′_1| ≥ |w_1w′_2b|. Since L_1 is an infinite prefix chain, such a word w′_1 exists and w_1w′_2b ≤_p w′_1, thus w_1L′_2 <_s w′_1, and so w_1L′_2 <_s w′_1y for an arbitrary member y of L_2, and so w′_1y is preceded by infinitely many words in L, yielding o(L) ≠ ω.

We still have to show that the condition of Subcase (c) is decidable. We claim that the condition does not hold if and only if there exist some w_1 ∈ L_1, a word u′ ∈ Σ^∗ and a letter a ∈ Σ with w_1u′ <_p uv^ω, w_1u′a <_s uv^ω and u′a <_p ⋁L_2. Indeed, if w_1, u′, a are such objects, then there is a unique letter b ∈ Σ with w_1u′b <_p uv^ω. Now we can choose w = w_1u′b, as the condition u′a <_p ⋁L_2 implies the existence of infinitely many words w_2 ∈ L_2 with u′a ≤_p w_2. The other direction is already treated in the proof of Subcase (c).

The condition in this form is decidable due to Lemma 5.

As we covered all the possible scenarios, and in each case we got decidability, we proved the lemma.

Corollary 1. Assume α = X_1...X_n is some sentential form where for each X_i we know whether o(X_i) = ω holds. Then we can decide whether o(α) = ω holds.

Proof. We can assume that α ∉ Σ^∗ (otherwise o(α) = 1 and we can stop). Using the standard construction of introducing fresh nonterminals Y_1, ..., Y_{n−1} and productions Y_1 → X_1Y_2, Y_2 → X_2Y_3, ..., Y_{n−1} → X_{n−1}X_n, we can successively decide for Y_{n−1}, Y_{n−2}, ..., Y_1 whether o(Y_i) = ω; if for any of them we have that o(Y_i) is some other infinite order type, then so is o(Y_1) = o(α), otherwise o(α) = ω.
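The chain of fresh nonterminals used in this proof can be sketched as follows. This is a minimal illustration only: symbols are represented as strings, the names Y1, Y2, ... are assumed not to clash with existing symbols, and the function names are hypothetical.

```python
def binarize(symbols):
    """Split X_1 ... X_n (n >= 2) into productions with two symbols on the right.

    Returns a dict mapping each fresh nonterminal Y_i to its right-hand side:
    Y_i -> X_i Y_{i+1} for i < n-1, and Y_{n-1} -> X_{n-1} X_n.
    """
    n = len(symbols)
    if n < 2:
        raise ValueError("need at least two symbols")
    productions = {}
    for i in range(1, n):
        right = symbols[i] if i == n - 1 else f"Y{i + 1}"
        productions[f"Y{i}"] = (symbols[i - 1], right)
    return productions


def evaluation_order(productions):
    """Order in which the Y_i are examined: Y_{n-1} first, Y_1 last."""
    return sorted(productions, key=lambda name: int(name[1:]), reverse=True)


rules = binarize(["X1", "X2", "X3", "X4"])
print(rules)                    # {'Y1': ('X1', 'Y2'), 'Y2': ('X2', 'Y3'), 'Y3': ('X3', 'X4')}
print(evaluation_order(rules))  # ['Y3', 'Y2', 'Y1']
```

Each production has exactly two symbols on its right-hand side, so Lemma 6 applies to every Y_i once the answer for Y_{i+1} is known.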

Theorem 2. It is decidable for each recursive nonterminal X whether o(X) =ω holds.

Proof. By our assumptions on G, L(X) is infinite. In the first step, we decide whether L(X) is well-ordered. If not, then o(X) is clearly not ω and we can stop.