Brutomolecule: hacking biological defences (Part III)

Optimisation of Breach-time. \underline{\text{Optimisation of Breach-time.}}

Consider as per Part II a solar system with universal signature x Σ ω x\in\Sigma^{\omega} . Assume that x x satisfies the necessary and sufficient conditions under which h ( x ; p ) = lim n log ( E [ T x n ; p ] ) n h(x;~p)=\lim_{n}\frac{\log(\mathbb{E}[T_{x\mid n;~p}])}{n} exists for all distributions p p on Σ \Sigma .

Task 1. What distribution, p p , minimises h ( x ; p ) h(x;~p) ? What is the significance of this expression?

Task 2. Compute the minimised value of h ( x ; p ) h(x;~p) numerically under the setup Σ = { G , C , A , T } \Sigma=\{G,C,A,T\} and for

  • x = G G G G A C T T G G G G C T A T G G G G C A T T G G G G T T C A x = GGGGACTT\,GGGGCTAT\,GGGGCATT\,GGGGTTCA\ldots , ie blocks of the form G G G G A C T T , G G G G C A T T , GGGGACTT,~GGGGCATT,~\ldots randomly strewn together.

Bonus Task. Interpret this in terms of the problem in Part II with the Brutomolecule breaching life forms in a solar system, which deploy ever larger keys from the signature x Σ ω x\in\Sigma^{\omega} .


Please note. This is mathematics, so log \log is to base e e and not base 10 10 .


The answer is 1.213008.

This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try refreshing the page, (b) enabling javascript if it is disabled on your browser and, finally, (c) loading the non-javascript version of this page . We're sorry about the hassle.

2 solutions

R Mathe
Jun 3, 2018

As per Part I the necessary and sufficient condition for h ( x ; p ) h(x;p) to exist under all non-trivial p p was that A α : = { i ω x ( i ) = α } A_{\alpha}:=\{i\in\omega\mid x(i)=\alpha\} be a set of asymptotic density for each α Σ \alpha\in\Sigma . Under this condition, one has

h ( x ; p ) = α Σ d ( A α ) log ( p ( α ) ) . h(x;~p)=-\sum_{\alpha\in\Sigma}d(A_{\alpha})\log(p(\alpha)).

Working with the inequality (basic result in Analysis) log ( x ) x 1 \log(x)\leq x-1 for all x ( 0 , ) x\in(0,\infty) , one has x log ( y / x ) x ( y / x 1 ) = x y -x\log(y/x)\geq -x(y/x-1)=x-y . From this one obtains the following identity

[ m c ] r c l α Σ d ( A α ) log ( p ( α ) ) = α Σ d ( A α ) log ( d ( A α ) ) + α Σ d ( A α ) log ( p ( α ) / d ( A α ) ) α Σ d ( A α ) log ( d ( A α ) ) + α Σ d ( A α ) p ( α ) = α Σ d ( A α ) log ( d ( A α ) ) + α Σ d ( A α ) α Σ p ( α ) = α Σ d ( A α ) log ( d ( A α ) ) + 1 1 \begin{array}{c}[mc]{rcl} -\sum_{\alpha\in\Sigma}d(A_{\alpha})\log(p(\alpha)) &= &-\sum_{\alpha\in\Sigma}d(A_{\alpha})\log(d(A_{\alpha})) +\sum_{\alpha\in\Sigma}-d(A_{\alpha})\log(p(\alpha)/d(A_{\alpha}))\\ &\geq &-\sum_{\alpha\in\Sigma}d(A_{\alpha})\log(d(A_{\alpha})) +\sum_{\alpha\in\Sigma}d(A_{\alpha})-p(\alpha)\\ &= &-\sum_{\alpha\in\Sigma}d(A_{\alpha})\log(d(A_{\alpha})) +\sum_{\alpha\in\Sigma}d(A_{\alpha})-\sum_{\alpha\in\Sigma}p(\alpha)\\ &= &-\sum_{\alpha\in\Sigma}d(A_{\alpha})\log(d(A_{\alpha}))+1-1\\ \end{array}

Hence h ( x ; p ) h(x;~p) is bounded below by h ( x ; p x ) h(x;~p_{x}) , where p x ( α ) = d ( A α ) p_{x}(\alpha)=d(A_{\alpha}) . Thus it is necessary and sufficient, in order to minimise h ( x ; ) h(x;\cdot) , that the distribution of letters in Σ \Sigma coincides with the asymptotic distribution of letters in x x . Under this choice one has

h min ( x ) = h ( x ; p x ) = α Σ d ( A α ) log ( d ( A α ) ) = H ( x ) h_{\min}(x) = h(x;~p_{x}) = -\sum_{\alpha\in\Sigma}d(A_{\alpha})\log(d(A_{\alpha})) = \mathcal{H}(x)

Thus the minimised exponential growth-exponent for waiting times coincides with the asymptotic entropy of the sequence x x . This completes Task 1.

Towards Task 2 we have that the letters in x x are asymptotically distributed with d ( A G ) = 1 2 d(A_{G})=\frac{1}{2} , d ( A T ) = 1 4 d(A_{T})=\frac{1}{4} and d ( A A ) = d ( A C ) = 1 8 d(A_{A})=d(A_{C})=\frac{1}{8} . The minimised value of h min ( x ) h_{\min}(x) is as above given by the asymptotic entropy

h min ( x ) = H ( x ) = 1 2 log 1 2 + 1 4 log 1 4 + 1 8 log 1 8 + 1 8 log 1 8 1.213008 h_{\min}(x)=\mathcal{H}(x)=-\frac{1}{2}\log\frac{1}{2}+-\frac{1}{4}\log\frac{1}{4}+-\frac{1}{8}\log\frac{1}{8}+-\frac{1}{8}\log\frac{1}{8}\approx 1.213008

In terms of Part II , the sequence x x represents the signature of all life forms in a solar system. The Brutomolecule can only steer how it randomly transmits its sequence of molecules. According to this, if the Brutomolecule can work out the asymptotic distribution of the letters in the growing key x n x\vert n , even without knowledge of the exact sequence of those letters, then it can adjust its strategy (choice of p p ) accordingly and minimise the exponential growth rate of its randomise brute force attack. The asymptotic entropy of the sequence x x thus represents a physical boundary on efficiency of these randomise brute force attacks. Accordingly. since entropy is maximised under uniformly distributed letters, the best protection against such randomised brute force attacks is for solar systems to choose keys with asymptotically uniformly distribute letters.

Lee Isaac
Jun 7, 2018

{G C A T G A} = 6.21 & (CATAGACA) == 29.22 29.22 (+) 6.21 = 79.31622 sqrt(79.31622) is not equal to 1.213 Thus the answer is 1.213

Not one part of what you wrote has any relevance to Tasks 1 and 2, let alone the problem. Nor is it coherent. For example:

  • why on Earth look at G C A T G A G\,C\,A\,T\,G\,A and C A T A G A C A CATAGACA ?
  • what do { s } \{s\} and ( s ) (s) mean for finite sequences, s s ?
  • why does { G C A T G A } = 6.21 \{G\,C\,A\,T\,G\,A\} = 6.21 ?
  • why does ( C A T A G A C A ) = = 29.22 (CATAGACA) == 29.22 ?
  • why write = = == ?
  • what on Earth is \oplus here?
  • why does 29.22 6.21 = 79.31622 29.22 \oplus 6.21 = 79.31622 ?
  • 79.31622 1.213 \sqrt{79.31622}\neq 1.213 , thus the answer is 1.213 1.213 \leftarrow this argument makes no sense whatsoever!

R Mathe - 3 years ago

0 pending reports

×

Problem Loading...

Note Loading...

Set Loading...