Unavoidable pattern

In mathematics and theoretical computer science, a pattern is an unavoidable pattern if it is unavoidable on any finite alphabet.

Definitions

Pattern

Like a word, a pattern (also called term) is a sequence of symbols over some alphabet.

The minimum multiplicity of the pattern p {\displaystyle p} is m ( p ) = min ( c o u n t p ( x ) : x p ) {\displaystyle m(p)=\min(\mathrm {count_{p}} (x):x\in p)} where c o u n t p ( x ) {\displaystyle \mathrm {count_{p}} (x)} is the number of occurrence of symbol x {\displaystyle x} in pattern p {\displaystyle p} . In other words, it is the number of occurrences in p {\displaystyle p} of the least frequently occurring symbol in p {\displaystyle p} .

Instance

Given finite alphabets Σ {\displaystyle \Sigma } and Δ {\displaystyle \Delta } , a word x Σ {\displaystyle x\in \Sigma ^{*}} is an instance of the pattern p Δ {\displaystyle p\in \Delta ^{*}} if there exists a non-erasing semigroup morphism f : Δ Σ {\displaystyle f:\Delta ^{*}\rightarrow \Sigma ^{*}} such that f ( p ) = x {\displaystyle f(p)=x} , where Σ {\displaystyle \Sigma ^{*}} denotes the Kleene star of Σ {\displaystyle \Sigma } . Non-erasing means that f ( a ) ε {\displaystyle f(a)\neq \varepsilon } for all a Δ {\displaystyle a\in \Delta } , where ε {\displaystyle \varepsilon } denotes the empty string.

Avoidance / Matching

A word w {\displaystyle w} is said to match, or encounter, a pattern p {\displaystyle p} if a factor (also called subword or substring) of w {\displaystyle w} is an instance of p {\displaystyle p} . Otherwise, w {\displaystyle w} is said to avoid p {\displaystyle p} , or to be p {\displaystyle p} -free. This definition can be generalized to the case of an infinite w {\displaystyle w} , based on a generalized definition of "substring".

Avoidability / Unavoidability on a specific alphabet

A pattern p {\displaystyle p} is unavoidable on a finite alphabet Σ {\displaystyle \Sigma } if each sufficiently long word x Σ {\displaystyle x\in \Sigma ^{*}} must match p {\displaystyle p} ; formally: if n N .   x Σ .   ( | x | n x  matches  p ) {\displaystyle \exists n\in \mathrm {N} .\ \forall x\in \Sigma ^{*}.\ (|x|\geq n\implies x{\text{ matches }}p)} . Otherwise, p {\displaystyle p} is avoidable on Σ {\displaystyle \Sigma } , which implies there exist infinitely many words over the alphabet Σ {\displaystyle \Sigma } that avoid p {\displaystyle p} .

By Kőnig's lemma, pattern p {\displaystyle p} is avoidable on Σ {\displaystyle \Sigma } if and only if there exists an infinite word w Σ ω {\displaystyle w\in \Sigma ^{\omega }} that avoids p {\displaystyle p} .[1]

Maximal p-free word

Given a pattern p {\displaystyle p} and an alphabet Σ {\displaystyle \Sigma } . A p {\displaystyle p} -free word w Σ {\displaystyle w\in \Sigma ^{*}} is a maximal p {\displaystyle p} -free word over Σ {\displaystyle \Sigma } if a w {\displaystyle aw} and w a {\displaystyle wa} match p {\displaystyle p} a Σ {\displaystyle \forall a\in \Sigma } .

Avoidable / Unavoidable pattern

A pattern p {\displaystyle p} is an unavoidable pattern (also called blocking term) if p {\displaystyle p} is unavoidable on any finite alphabet.

If a pattern is unavoidable and not limited to a specific alphabet, then it is unavoidable for any finite alphabet by default. Conversely, if a pattern is said to be avoidable and not limited to a specific alphabet, then it is avoidable on some finite alphabet by default.

k-avoidable / k-unavoidable

A pattern p {\displaystyle p} is k {\displaystyle k} -avoidable if p {\displaystyle p} is avoidable on an alphabet Σ {\displaystyle \Sigma } of size k {\displaystyle k} . Otherwise, p {\displaystyle p} is k {\displaystyle k} -unavoidable, which means p {\displaystyle p} is unavoidable on every alphabet of size k {\displaystyle k} .[2]

If pattern p {\displaystyle p} is k {\displaystyle k} -avoidable, then p {\displaystyle p} is g {\displaystyle g} -avoidable for all g k {\displaystyle g\geq k} .

Given a finite set of avoidable patterns S = { p 1 , p 2 , . . . , p i } {\displaystyle S=\{p_{1},p_{2},...,p_{i}\}} , there exists an infinite word w Σ ω {\displaystyle w\in \Sigma ^{\omega }} such that w {\displaystyle w} avoids all patterns of S {\displaystyle S} .[1] Let μ ( S ) {\displaystyle \mu (S)} denote the size of the minimal alphabet Σ {\displaystyle \Sigma '} such that w Σ ω {\displaystyle \exists w'\in {\Sigma '}^{\omega }} avoiding all patterns of S {\displaystyle S} .

Avoidability index

The avoidability index of a pattern p {\displaystyle p} is the smallest k {\displaystyle k} such that p {\displaystyle p} is k {\displaystyle k} -avoidable, and {\displaystyle \infty } if p {\displaystyle p} is unavoidable.[1]

Properties

  • A pattern q {\displaystyle q} is avoidable if q {\displaystyle q} is an instance of an avoidable pattern p {\displaystyle p} .[3]
  • Let avoidable pattern p {\displaystyle p} be a factor of pattern q {\displaystyle q} , then q {\displaystyle q} is also avoidable.[3]
  • A pattern q {\displaystyle q} is unavoidable if and only if q {\displaystyle q} is a factor of some unavoidable pattern p {\displaystyle p} .
  • Given an unavoidable pattern p {\displaystyle p} and a symbol a {\displaystyle a} not in p {\displaystyle p} , then p a p {\displaystyle pap} is unavoidable.[3]
  • Given an unavoidable pattern p {\displaystyle p} , then the reversal p R {\displaystyle p^{R}} is unavoidable.
  • Given an unavoidable pattern p {\displaystyle p} , there exists a symbol a {\displaystyle a} such that a {\displaystyle a} occurs exactly once in p {\displaystyle p} .[3]
  • Let n N {\displaystyle n\in \mathrm {N} } represent the number of distinct symbols of pattern p {\displaystyle p} . If | p | 2 n {\displaystyle |p|\geq 2^{n}} , then p {\displaystyle p} is avoidable.[3]

Zimin words

Given alphabet Δ = { x 1 , x 2 , . . . } {\displaystyle \Delta =\{x_{1},x_{2},...\}} , Zimin words (patterns) are defined recursively Z n + 1 = Z n x n + 1 Z n {\displaystyle Z_{n+1}=Z_{n}x_{n+1}Z_{n}} for n Z + {\displaystyle n\in \mathrm {Z} ^{+}} and Z 1 = x 1 {\displaystyle Z_{1}=x_{1}} .

Unavoidability

All Zimin words are unavoidable.[4]

A word w {\displaystyle w} is unavoidable if and only if it is a factor of a Zimin word.[4]

Given a finite alphabet Σ {\displaystyle \Sigma } , let f ( n , | Σ | ) {\displaystyle f(n,|\Sigma |)} represent the smallest m Z + {\displaystyle m\in \mathrm {Z} ^{+}} such that w {\displaystyle w} matches Z n {\displaystyle Z_{n}} for all w Σ m {\displaystyle w\in \Sigma ^{m}} . We have following properties:[5]

  • f ( 1 , q ) = 1 {\displaystyle f(1,q)=1}
  • f ( 2 , q ) = 2 q + 1 {\displaystyle f(2,q)=2q+1}
  • f ( 3 , 2 ) = 29 {\displaystyle f(3,2)=29}
  • f ( n , q ) n 1 q = q q q n 1 {\displaystyle f(n,q)\leq {^{n-1}q}=\underbrace {q^{q^{\cdot ^{\cdot ^{q}}}}} _{n-1}}

Z n {\displaystyle Z_{n}} is the longest unavoidable pattern constructed by alphabet Δ = { x 1 , x 2 , . . . , x n } {\displaystyle \Delta =\{x_{1},x_{2},...,x_{n}\}} since | Z n | = 2 n 1 {\displaystyle |Z_{n}|=2^{n}-1} .

Pattern reduction

Free letter

Given a pattern p {\displaystyle p} over some alphabet Δ {\displaystyle \Delta } , we say x Δ {\displaystyle x\in \Delta } is free for p {\displaystyle p} if there exist subsets A , B {\displaystyle A,B} of Δ {\displaystyle \Delta } such that the following hold:

  1. u v {\displaystyle uv} is a factor of p {\displaystyle p} and u A {\displaystyle u\in A} u v {\displaystyle uv} is a factor of p {\displaystyle p} and v B {\displaystyle v\in B}
  2. x A B B A {\displaystyle x\in A\backslash B\cup B\backslash A}

For example, let p = a b c b a b {\displaystyle p=abcbab} , then b {\displaystyle b} is free for p {\displaystyle p} since there exist A = a c , B = b {\displaystyle A=ac,B=b} satisfying the conditions above.

Reduce

A pattern p Δ {\displaystyle p\in \Delta ^{*}} reduces to pattern q {\displaystyle q} if there exists a symbol x Δ {\displaystyle x\in \Delta } such that x {\displaystyle x} is free for p {\displaystyle p} , and q {\displaystyle q} can be obtained by removing all occurrence of x {\displaystyle x} from p {\displaystyle p} . Denote this relation by p x q {\displaystyle p{\stackrel {x}{\rightarrow }}q} .

For example, let p = a b c b a b {\displaystyle p=abcbab} , then p {\displaystyle p} can reduce to q = a c a {\displaystyle q=aca} since b {\displaystyle b} is free for p {\displaystyle p} .

Locked

A word w {\displaystyle w} is said to be locked if w {\displaystyle w} has no free letter; hence w {\displaystyle w} can not be reduced.[6]

Transitivity

Given patterns p , q , r {\displaystyle p,q,r} , if p {\displaystyle p} reduces to q {\displaystyle q} and q {\displaystyle q} reduces to r {\displaystyle r} , then p {\displaystyle p} reduces to r {\displaystyle r} . Denote this relation by p r {\displaystyle p{\stackrel {*}{\rightarrow }}r} .

Unavoidability

A pattern p {\displaystyle p} is unavoidable if and only if p {\displaystyle p} reduces to a word of length one; hence w {\displaystyle \exists w} such that | w | = 1 {\displaystyle |w|=1} and p w {\displaystyle p{\stackrel {*}{\rightarrow }}w} .[7][4]

Graph pattern avoidance[8]

Avoidance / Matching on a specific graph

Given a simple graph G = ( V , E ) {\displaystyle G=(V,E)} , a edge coloring c : E Δ {\displaystyle c:E\rightarrow \Delta } matches pattern p {\displaystyle p} if there exists a simple path P = [ e 1 , e 2 , . . . , e r ] {\displaystyle P=[e_{1},e_{2},...,e_{r}]} in G {\displaystyle G} such that the sequence c ( P ) = [ c ( e 1 ) , c ( e 2 ) , . . . , c ( e r ) ] {\displaystyle c(P)=[c(e_{1}),c(e_{2}),...,c(e_{r})]} matches p {\displaystyle p} . Otherwise, c {\displaystyle c} is said to avoid p {\displaystyle p} or be p {\displaystyle p} -free.

Similarly, a vertex coloring c : V Δ {\displaystyle c:V\rightarrow \Delta } matches pattern p {\displaystyle p} if there exists a simple path P = [ c 1 , c 2 , . . . , c r ] {\displaystyle P=[c_{1},c_{2},...,c_{r}]} in G {\displaystyle G} such that the sequence c ( P ) {\displaystyle c(P)} matches p {\displaystyle p} .

Pattern chromatic number

The pattern chromatic number π p ( G ) {\displaystyle \pi _{p}(G)} is the minimal number of distinct colors needed for a p {\displaystyle p} -free vertex coloring c {\displaystyle c} over the graph G {\displaystyle G} .

Let π p ( n ) = max { π p ( G ) : G G n } {\displaystyle \pi _{p}(n)=\max\{\pi _{p}(G):G\in G_{n}\}} where G n {\displaystyle G_{n}} is the set of all simple graphs with a maximum degree no more than n {\displaystyle n} .

Similarly, π p ( G ) {\displaystyle \pi _{p}'(G)} and π p ( n ) {\displaystyle \pi _{p}'(n)} are defined for edge colorings.

Avoidability / Unavoidability on graphs

A pattern p {\displaystyle p} is avoidable on graphs if π p ( n ) {\displaystyle \pi _{p}(n)} is bounded by c p {\displaystyle c_{p}} , where c p {\displaystyle c_{p}} only depends on p {\displaystyle p} .

  • Avoidance on words can be expressed as a specific case of avoidance on graphs; hence a pattern p {\displaystyle p} is avoidable on any finite alphabet if and only if π p ( P n ) c p {\displaystyle \pi _{p}(P_{n})\leq c_{p}} for all n Z + {\displaystyle n\in \mathrm {Z} ^{+}} , where P n {\displaystyle P_{n}} is a graph of n {\displaystyle n} vertices concatenated.

Probabilistic bound on πp(n)

There exists an absolute constant c {\displaystyle c} , such that π p ( n ) c n m ( p ) m ( p ) 1 c n 2 {\displaystyle \pi _{p}(n)\leq cn^{\frac {m(p)}{m(p)-1}}\leq cn^{2}} for all patterns p {\displaystyle p} with m ( p ) 2 {\displaystyle m(p)\geq 2} .[8]

Given a pattern p {\displaystyle p} , let n {\displaystyle n} represent the number of distinct symbols of p {\displaystyle p} . If | p | 2 n {\displaystyle |p|\geq 2^{n}} , then p {\displaystyle p} is avoidable on graphs.

Explicit colorings

Given a pattern p {\displaystyle p} such that c o u n t p ( x ) {\displaystyle count_{p}(x)} is even for all x p {\displaystyle x\in p} , then π p ( K 2 k ) 2 k 1 {\displaystyle \pi _{p}'(K_{2}^{k})\leq 2^{k}-1} for all k 1 {\displaystyle k\geq 1} , where K n {\displaystyle K_{n}} is the complete graph of n {\displaystyle n} vertices.[8]

Given a pattern p {\displaystyle p} such that m ( p ) 2 {\displaystyle m(p)\geq 2} , and an arbitrary tree T {\displaystyle T} , let S {\displaystyle S} be the set of all avoidable subpatterns and their reflections of p {\displaystyle p} . Then π p ( T ) 3 μ ( S ) {\displaystyle \pi _{p}(T)\leq 3\mu (S)} .[8]

Given a pattern p {\displaystyle p} such that m ( p ) 2 {\displaystyle m(p)\geq 2} , and a tree T {\displaystyle T} with degree n 2 {\displaystyle n\geq 2} . Let S {\displaystyle S} be the set of all avoidable subpatterns and their reflections of p {\displaystyle p} , then π p ( T ) 2 ( n 1 ) μ ( S ) {\displaystyle \pi _{p}'(T)\leq 2(n-1)\mu (S)} .[8]

Examples

  • The Thue–Morse sequence is cube-free and overlap-free; hence it avoids the patterns x x x {\displaystyle xxx} and x y x y x {\displaystyle xyxyx} .[2]
  • A square-free word is one avoiding the pattern x x {\displaystyle xx} . The word over the alphabet { 0 , ± 1 } {\displaystyle \{0,\pm 1\}} obtained by taking the first difference of the Thue–Morse sequence is an example of an infinite square-free word.[9][10]
  • The patterns x {\displaystyle x} and x y x {\displaystyle xyx} are unavoidable on any alphabet, since they are factors of the Zimin words.[11][1]
  • The power patterns x n {\displaystyle x^{n}} for n 3 {\displaystyle n\geq 3} are 2-avoidable.[1]
  • All binary patterns can be divided into three categories:[1]
    • ε , x , x y x {\displaystyle \varepsilon ,x,xyx} are unavoidable.
    • x x , x x y , x y y , x x y x , x x y y , x y x x , x y x y , x y y x , x x y x x , x x y x y , x y x y y {\displaystyle xx,xxy,xyy,xxyx,xxyy,xyxx,xyxy,xyyx,xxyxx,xxyxy,xyxyy} have avoidability index of 3.
    • others have avoidability index of 2.
  • a b w b a x b c y a c z c a {\displaystyle abwbaxbcyaczca} has avoidability index of 4, as well as other locked words.[6]
  • a b v a c w b a x b c y c d a z d c d {\displaystyle abvacwbaxbcycdazdcd} has avoidability index of 5.[12]
  • The repetitive threshold R T ( n ) {\displaystyle RT(n)} is the infimum of exponents k {\displaystyle k} such that x k {\displaystyle x^{k}} is avoidable on an alphabet of size n {\displaystyle n} . Also see Dejean's theorem.

Open problems

  • Is there an avoidable pattern p {\displaystyle p} such that the avoidability index of p {\displaystyle p} is 6?
  • Given an arbitrarily pattern p {\displaystyle p} , is there an algorithm to determine the avoidability index of p {\displaystyle p} ?[1]

References

  1. ^ a b c d e f g Lothaire, M. (2002). Algebraic Combinatorics on Words. Cambridge University Press. ISBN 9780521812207.
  2. ^ a b Combinatorics on Words: Christoffel Words and Repetitions in Words. American Mathematical Soc. p. 127. ISBN 978-0-8218-7325-0.
  3. ^ a b c d e Schmidt, Ursula (1987-08-01). "Long unavoidable patterns". Acta Informatica. 24 (4): 433–445. doi:10.1007/BF00292112. ISSN 1432-0525. S2CID 7928450.
  4. ^ a b c Zimin, A. I. (1984). "Blocking Sets of Terms". Mathematics of the USSR-Sbornik. 47 (2): 353–364. Bibcode:1984SbMat..47..353Z. doi:10.1070/SM1984v047n02ABEH002647. ISSN 0025-5734.
  5. ^ Joshua, Cooper; Rorabaugh, Danny (2013). Bounds on Zimin Word Avoidance. arXiv:1409.3080. Bibcode:2014arXiv1409.3080C.
  6. ^ a b Baker, Kirby A.; McNulty, George F.; Taylor, Walter (1989-12-18). "Growth problems for avoidable words". Theoretical Computer Science. 69 (3): 319–345. doi:10.1016/0304-3975(89)90071-6. ISSN 0304-3975.
  7. ^ Bean, Dwight R.; Ehrenfeucht, Andrzej; McNulty, George F. (1979). "Avoidable patterns in strings of symbols". Pacific Journal of Mathematics. 85 (2): 261–294. doi:10.2140/pjm.1979.85.261. ISSN 0030-8730.
  8. ^ a b c d e Grytczuk, Jarosław (2007-05-28). "Pattern avoidance on graphs". Discrete Mathematics. The Fourth Caracow Conference on Graph Theory. 307 (11): 1341–1346. doi:10.1016/j.disc.2005.11.071. ISSN 0012-365X.
  9. ^ Combinatorics on Words: Christoffel Words and Repetitions in Words. American Mathematical Soc. p. 97. ISBN 978-0-8218-7325-0.
  10. ^ Fogg, N. Pytheas (2002-09-23). Substitutions in Dynamics, Arithmetics and Combinatorics. Springer Science & Business Media. p. 104. ISBN 978-3-540-44141-0.
  11. ^ Allouche, Jean-Paul; Shallit, Jeffrey; Shallit, Professor Jeffrey (2003-07-21). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press. p. 24. ISBN 978-0-521-82332-6.
  12. ^ Clark, Ronald J. (2006-04-01). "The existence of a pattern which is 5-avoidable but 4-unavoidable". International Journal of Algebra and Computation. 16 (2): 351–367. doi:10.1142/S0218196706002950. ISSN 0218-1967.
  • Allouche, Jean-Paul; Shallit, Jeffrey (2003). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press. ISBN 978-0-521-82332-6. Zbl 1086.11015.
  • Berstel, Jean; Lauve, Aaron; Reutenauer, Christophe; Saliola, Franco V. (2009). Combinatorics on words. Christoffel words and repetitions in words. CRM Monograph Series. Vol. 27. Providence, RI: American Mathematical Society. ISBN 978-0-8218-4480-9. Zbl 1161.68043.
  • Lothaire, M. (2011). Algebraic combinatorics on words. Encyclopedia of Mathematics and Its Applications. Vol. 90. With preface by Jean Berstel and Dominique Perrin (Reprint of the 2002 hardback ed.). Cambridge University Press. ISBN 978-0-521-18071-9. Zbl 1221.68183.
  • Pytheas Fogg, N. (2002). Berthé, Valérie; Ferenczi, Sébastien; Mauduit, Christian; Siegel, A. (eds.). Substitutions in dynamics, arithmetics and combinatorics. Lecture Notes in Mathematics. Vol. 1794. Berlin: Springer-Verlag. ISBN 3-540-44141-7. Zbl 1014.11015.