Aperiodic Binary Strings Redux

Computer Science Level 5

A binary string is a string whose characters are only 0s and/or 1s. For example, $0101, 11111, 110001$ are binary strings.

A string $s$ is periodic if there exists another string $t$, shorter than $s$, such that concatenating several copies of $t$ produces $s$. For example, $0101$ is periodic ($t = 01$), and $1111$ is periodic ($t = 11$ or $t = 1$), but $10101$ is not periodic. A string is aperiodic if it is not periodic.
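The definition can be checked directly on small strings. The following is a minimal sketch; the helper name `is_periodic` is an assumption made for illustration and does not come from the problem statement.

```python
def is_periodic(s: str) -> bool:
    """Return True if s equals some strictly shorter string t repeated."""
    n = len(s)
    for length in range(1, n):  # candidate lengths of t, all shorter than s
        if n % length == 0 and s[:length] * (n // length) == s:
            return True
    return False

# The three examples from the definition above.
print(is_periodic("0101"), is_periodic("1111"), is_periodic("10101"))
# prints: True True False
```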

This problem asks about the number of aperiodic binary strings of length $23$. For this problem, compute the number of aperiodic binary strings of length $(23!)^{23}$ (that is, 23 factorial raised to the 23rd power). Enter your answer modulo $10^9 + 7 = 1000000007$.

Clearly inspired by the linked problem above.

The answer is 676103472.

1 solution

Ivan Koswara
Nov 28, 2015

Let's skip to the formal stuff immediately. For a gentler introduction, try the linked problem above.

First, definitions.

If $s$ is a string, define $|s|$ to be its length.
If $s$ is a string, let $t$ be a string such that $s$ is made up of several (possibly one) copies of $t$; call $t$ a building block of $s$. (A string is its own building block; the periodic strings are exactly the strings that have a building block other than themselves. There can be multiple building blocks; for example, $1111$ has building blocks $1111, 11, 1$.)
If $s$ is a string and $t$ is one of its building blocks, let $t$'s repeat count be $\frac{|s|}{|t|}$; that is, the number of copies of $t$ necessary to make $s$.
Define $n$ to be a natural number such that we want to compute the number of aperiodic binary strings of length $n$ . (That is, $n = (23!)^{23}$ in this problem, but we generalize.)
Define $P$ to be the set of prime divisors of $n$ .
If $X$ is a subset of the natural numbers, define $\Pi(X)$ to be the product of its elements.
For all $X \subset P$, define $S_X$ to be the set of all binary strings of length $n$ that have a building block with repeat count $\Pi(X)$.

Now, the claim: the number of aperiodic strings of length $n$ is

$\displaystyle\Large{\sum_{X \subset P} (-1)^{|X|} \cdot 2^{\frac{n}{\Pi(X)}}}$
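Before the proof, the claim can be sanity-checked against brute-force enumeration for small $n$. This is only a sketch: the function names `prime_divisors`, `count_by_formula` and `count_by_brute_force` are assumptions introduced for illustration, and the brute force is feasible only for small lengths.

```python
from itertools import combinations, product
from math import prod

def prime_divisors(n):
    """Distinct prime divisors of n (the set P), by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def count_by_formula(n):
    """Sum over subsets X of P of (-1)^|X| * 2^(n / Pi(X))."""
    P = prime_divisors(n)
    return sum((-1) ** k * 2 ** (n // prod(X))
               for k in range(len(P) + 1)
               for X in combinations(P, k))

def count_by_brute_force(n):
    """Enumerate all 2^n strings and count the aperiodic ones."""
    def periodic(s):
        return any(n % L == 0 and s[:L] * (n // L) == s for L in range(1, n))
    return sum(not periodic(s) for s in product("01", repeat=n))

for n in range(1, 13):
    assert count_by_formula(n) == count_by_brute_force(n)
print(count_by_formula(6), count_by_formula(12))
# prints: 54 4020
```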

To prove this is correct, we will begin with several simple observations about our $S_X$'s.

Result 1 : If $s$ is periodic with repeat count $m$ , then $m$ divides $|s|$ .
Trivial; having $m$ copies of a string means the total length is a multiple of $m$ .

Result 2 : If $s$ is periodic with repeat count $m$ and $d|m$, then $s$ is periodic with repeat count $d$.
If $s$ is made up of building block $t$ with repeat count $m$, then $s$ is also made up of building block $t'$ with repeat count $d$, where $t'$ is $t$ repeated $\frac{m}{d}$ times. As a corollary, if $A \subset B \subset P$, then $S_B \subset S_A$, because $\Pi(A) | \Pi(B)$.

Result 3 : $S_\emptyset$ is precisely the set of all binary strings of length $n$ .
Result 4 : If $X \neq \emptyset$ , $S_X$ only contains periodic binary strings (of length $n$ ).
Trivial by definition.

Result 5 : Every periodic binary string (of length $n$) is contained in $S_{\{d\}}$ for some $d \in P$.
This follows from Result 2. Suppose $s$ is periodic with repeat count $m$, and let $d$ be a prime divisor of $m$ (one exists because $m \neq 1$ by definition of being periodic). Then $s$ is also periodic with repeat count $d$. Since $d$ divides $n$ (Result 1) and is prime, it is a prime divisor of $n$, and this implies $s \in S_{\{d\}}$.

Result 6 : If $A,B \subset P$ , then $S_A \cap S_B = S_{A \cup B}$ .
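A sketch of a proof: the inclusion $S_{A \cup B} \subset S_A \cap S_B$ follows from Result 2. Conversely, a string of length $n$ has a building block of length $\ell$ (with $\ell | n$) exactly when it is unchanged by a cyclic rotation of $\ell$ positions. If $s \in S_A \cap S_B$, then $s$ is unchanged by rotations of $\frac{n}{\Pi(A)}$ and $\frac{n}{\Pi(B)}$ positions, hence by a rotation of their greatest common divisor, which is $\frac{n}{\Pi(A \cup B)}$ because the least common multiple of $\Pi(A)$ and $\Pi(B)$ is $\Pi(A \cup B)$. Thus $s \in S_{A \cup B}$.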

Result 8 : If $X \subset P$ , then $|S_X| = 2^{n/\Pi(X)}$ .
Straightforward. All building blocks are of length $\frac{n}{\Pi(X)}$ (because they are repeated $\Pi(X)$ times), and so we're counting binary strings of length $\frac{n}{\Pi(X)}$.
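Results 6 and 8 can be spot-checked by building the sets $S_X$ explicitly for a small length. The sketch below uses $n = 12$, so $P = \{2, 3\}$; the helper name `S` mirrors the notation above but is otherwise an assumption made for illustration.

```python
from itertools import product
from math import prod

def S(n, X):
    """All binary strings of length n with a building block of repeat count prod(X)."""
    repeat = prod(X)                  # prod of the empty set is 1
    block_len = n // repeat
    return {"".join(t) * repeat for t in product("01", repeat=block_len)}

n = 12
# Result 6: intersecting S_A and S_B gives S_(A union B).
assert S(n, {2}) & S(n, {3}) == S(n, {2, 3})
# Result 8: |S_X| = 2^(n / Pi(X)) for every subset X of P.
for X in (set(), {2}, {3}, {2, 3}):
    assert len(S(n, X)) == 2 ** (n // prod(X))
print(len(S(n, set())), len(S(n, {2})), len(S(n, {3})), len(S(n, {2, 3})))
# prints: 4096 64 16 4
```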

Let $A$ be our sought number. By Results 3, 4, 5, we have $A = \left| S_\emptyset \setminus \cup_{d \in P} S_{\{d\}} \right|$ . By Results 6 and 7, we can implement this effectively by recursion. (This is also the Principle of Inclusion-Exclusion.) The base case is just a single set (without anything subtracted from it), which Result 8 can handle. We can actually expand this recursion fully to show that it's equal to the claim above.

import itertools

MOD = 1000000007

# one-liner factorial
def factorial(n): return 1 if n == 0 else n * factorial(n-1)

def factorize(n):
    # factorizes number n into [(f1,e1), (f2,e2), ...]
    # where n = f1^e1 * f2^e2 * ...
    divisor = 2
    result = []
    while n > 1:
        count = 0
        while n % divisor == 0:
            n //= divisor
            count += 1
        if count: result.append((divisor, count))
        divisor += 1
    return result

def triple_pow(a, b, c):
    # computes a^(b^c)
    result = a
    for i in range(c): result = pow(result, b, MOD)
    return result

def solve(factors, value=2):
    # factors is [(f1,e1), (f2,e2), ...]
    # finds number of aperiodic binary strings on f1^e1 * f2^e2 * ... bits
    if len(factors) == 0: return value
    f = factors[0]
    return (solve(factors[1:], triple_pow(value, f[0], f[1])) -
            solve(factors[1:], triple_pow(value, f[0], f[1]-1))) % MOD

number = factorial(23) ** 23
factors = factorize(number)
print(solve(factors))
# prints 676103472

Handling $(23!)^{23}$ directly is hard; we will denote our number by its prime factorization instead. Our number $n$ will be represented as [(f1,e1), (f2,e2), ...], which means $n = f_1^{e_1} f_2^{e_2} \ldots$. The functions factorial and factorize, along with the assignment to factors at the end, are just means to compute the factorization. (There should be a more efficient way of doing this, but I suppose this is the clearest. If necessary, you can sidestep it entirely by computing the expansion by hand, which also avoids problems with finite-precision integers.)
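One possible shortcut for the factorization step, not the method used in the listing above, is Legendre's formula: the exponent of a prime $p$ in $m!$ is $\sum_{i \ge 1} \lfloor m / p^i \rfloor$. The sketch below is an illustration under that assumption; the names `primes_up_to`, `legendre_exponent` and `factorize_factorial` are hypothetical helpers.

```python
from math import factorial, prod

def primes_up_to(m):
    """Primes <= m by trial division (fine for small m)."""
    return [p for p in range(2, m + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def legendre_exponent(m, p):
    """Exponent of the prime p in m!, via Legendre's formula."""
    e, power = 0, p
    while power <= m:
        e += m // power
        power *= p
    return e

def factorize_factorial(m):
    """Prime factorization of m! as [(prime, exponent), ...]."""
    return [(p, legendre_exponent(m, p)) for p in primes_up_to(m)]

# Sanity check against the direct computation, which is still cheap for 23!.
assert prod(p ** e for p, e in factorize_factorial(23)) == factorial(23)

# Raising to the 23rd power multiplies every exponent by 23.
factors = [(p, 23 * e) for p, e in factorize_factorial(23)]
print(factors)
```

The resulting list has the same [(f1,e1), (f2,e2), ...] shape that solve expects, without ever building the full integer $(23!)^{23}$.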

triple_pow computes $a^{b^c}$ . Since our number is represented in its prime factorization, we will use this a lot. For example, computing $2^n$ becomes computing $2^{f_1^{e_1} f_2^{e_2} \ldots} = ((2^{f_1^{e_1}})^{f_2^{e_2}})^\ldots$ . Of course, we apply modulo here, since we don't actually need to care about the result, just its value modulo $10^9+7$ .
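The chained-exponent identity can be checked on a number small enough to exponentiate directly. This sketch repeats `triple_pow` so that it runs on its own; the helper `pow2_of_factored` and the sample factorization of $360$ are assumptions chosen for illustration.

```python
MOD = 1000000007

def triple_pow(a, b, c):
    # a^(b^c) mod MOD, obtained by raising to the b-th power c times
    result = a % MOD
    for _ in range(c):
        result = pow(result, b, MOD)
    return result

def pow2_of_factored(factors):
    # 2^(f1^e1 * f2^e2 * ...) mod MOD, peeling off one prime power at a time
    value = 2
    for f, e in factors:
        value = triple_pow(value, f, e)
    return value

factors = [(2, 3), (3, 2), (5, 1)]  # 2^3 * 3^2 * 5 = 360
assert pow2_of_factored(factors) == pow(2, 360, MOD)
```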

solve is the meat of the problem, which implements the recursion in Result 7 in a straightforward manner.

Another method to compute $A$ is possible, thanks to Challenge Master. Let $A(k)$ be the number of aperiodic strings of length $k$. Every binary string of length $n$ has a unique smallest building block, which is necessarily aperiodic; thus we can map every binary string of length $n$ to an aperiodic string. All aperiodic strings obtained this way have lengths that divide $n$; additionally, if $d|n$ and $t$ is an aperiodic string of length $d$, then it has a unique pre-image among the strings of length $n$ (namely $t$ repeated $\frac{n}{d}$ times). Thus we have the equality $2^n = \sum_{d|n} A(d)$: every binary string of length $n$ is mapped to exactly one aperiodic string whose length divides $n$, and every such aperiodic string has exactly one pre-image. Using the Möbius inversion formula, we obtain $A(n) = \sum_{d|n} \mu(d) 2^{n/d}$. Since $\mu(d) = 0$ if $d$ is not square-free, the only terms that matter have square-free $d$, and each such $d$ corresponds to a subset $X$ of the prime divisors of $n$, where $\mu(d) = (-1)^{|X|}$ and $d = \Pi(X)$; this gives the same formula as above.
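Both identities in this argument can be verified numerically for small lengths. The following is a sketch only; `mobius`, `divisors` and `A` are illustrative helper names, and the divisor enumeration is deliberately naive.

```python
def mobius(d):
    """Moebius function: 0 if d has a squared prime factor, else (-1)^(number of primes)."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:      # p^2 divided the original d
                return 0
            result = -result
        p += 1
    if d > 1:                   # one remaining prime factor
        result = -result
    return result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def A(n):
    """Number of aperiodic binary strings of length n, by Moebius inversion."""
    return sum(mobius(d) * 2 ** (n // d) for d in divisors(n))

# Check 2^n = sum of A(d) over the divisors d of n.
for n in range(1, 50):
    assert sum(A(d) for d in divisors(n)) == 2 ** n
print(A(6), A(12))
# prints: 54 4020
```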

Moderator note:

A slightly faster approach would be to find the Möbius inversion of $2^n$. As pointed out, if we associate each binary string with its smallest aperiodic part, we get the bijection that gives $\sum_{d \mid n} A(d) = 2^n$. Applying the Möbius inversion formula, we get that $A(n) = \sum_{d \mid n} \mu(d) 2^{n/d}$. This allows us to ignore divisors that have a square factor, which means that we only need to look at $2^9$ summands.
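A sketch of how the $2^9$ summands might be evaluated directly follows. It relies on two assumptions that are not part of the note itself: the exponents of $23!$ are written out by hand ($23! = 2^{19} \cdot 3^9 \cdot 5^4 \cdot 7^3 \cdot 11^2 \cdot 13 \cdot 17 \cdot 19 \cdot 23$), and the exponent $n/d$ is reduced modulo $10^9 + 6$ using Fermat's little theorem, which is valid because $10^9 + 7$ is prime and does not divide $2$. The name `aperiodic_count_mod` is hypothetical.

```python
from itertools import combinations

MOD = 1_000_000_007
# Exponents of the primes in (23!)^23: those of 23! multiplied by 23.
EXPONENTS = {2: 19 * 23, 3: 9 * 23, 5: 4 * 23, 7: 3 * 23, 11: 2 * 23,
             13: 23, 17: 23, 19: 23, 23: 23}

def aperiodic_count_mod(exponents, mod=MOD):
    """Return (sum over square-free d of mu(d) * 2^(n/d) mod `mod`, number of summands)."""
    primes = list(exponents)
    total, terms = 0, 0
    for k in range(len(primes) + 1):
        for X in combinations(primes, k):
            # n / Pi(X), reduced modulo mod - 1 (Fermat's little theorem)
            e = 1
            for p, a in exponents.items():
                e = e * pow(p, a - (p in X), mod - 1) % (mod - 1)
            total += (-1) ** k * pow(2, e, mod)
            terms += 1
    return total % mod, terms

answer, terms = aperiodic_count_mod(EXPONENTS)
print(terms, answer)
```

Each subset of the nine primes contributes one summand, so the loop runs $2^9 = 512$ times.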

Challenge Master: At the end, I do actually compute that number, although the method to reach it is a little roundabout with PIE. However, that method is so elegant that I'll probably edit it into the solution. I guess I'm just not well-versed enough in number theory to think of that.

Ivan Koswara - 5 years, 6 months ago
