Infinite Xor Problem

Let \(m>n>d\) be positive naturals and \(X=\langle x_{i}\mid i<m\rangle\in\{0,1\}^m\). Define \(f_{X}:[m]^{n}\longrightarrow \{0,1\}\) via \(f_{X}(i_{0},i_{1},\ldots,i_{n-1})=1\) exactly in case \(|\{j<m\mid x_{i_{j}}=1\}|=d\). Note, that the map \(X\in\{0,1\}^{m}\mapsto f_{X}\) is not necessarily injective.

Now assume one has access to just m,n,d,fm,n,d,f. How can one recover the list of all possible sequences XX via computational means, such that fX=ff_{X}=f? Theoretically one can go through (via lexical ordering) the list of all elements Y{0,1}mY\in\{0,1\}^{m} and compute each fYf_{Y} and check if fY=ff_{Y}=f. But this will have a long average running time.

Construct an algorithm, A(,,,)\mathcal{A}(\cdot,\cdot,\cdot,\cdot) which satisfies A(m,n,d,f){0,1}m\mathcal{A}(m,n,d,f)\subseteq\{0,1\}^{m} and X{0,1}m: XA(m,n,d,f)fX=f\forall{X\in\{0,1\}^{m}:~}X\in \mathcal{A}(m,n,d,f)\Leftrightarrow f_{X}=f for all m>n>dm>n>d and f:[m]n{0,1}f:[m]^{n}\longrightarrow\{0,1\}. Further let T(m,n,d,f)\mathcal{T}(m,n,d,f) be the total number of calls to the oracle ff during the execution of A(m,n,d,f)\mathcal{A}(m,n,d,f) and T(m,n,d):=12[m]nf{0,1}[m]nT(m,n,d,f)\overline{\mathcal{T}}(m,n,d):=\frac{1}{2^{|[m]^{n}|}}\sum_{f\in\{0,1\}^{[m]^{n}}}\mathcal{T}(m,n,d,f)

which is effectively the average running time. Find an algorithm, A(,,,)\mathcal{A}(\cdot,\cdot,\cdot,\cdot), such that T(m,n,d)\overline{\mathcal{T}}(m,n,d) is minimised for all m,n,dm,n,d, if this is at all possible.

There is a more relaxed version of this problem in which d=1d = 1, and is a fixed value. This version may be more likely to have a polynomial time solution algorithm, or with an algorithm such that m n dX:T(m,n,d,f)<X \exists m\ \exists n\ \exists d \exists X: \mathcal{T}(m,n,d,f) < |X| .

A basic flowchart of how the solution algorithm should function in practice:

An example procedure:

The problem instance generator decides values m=3, n=2, d=1, X={0,1,1} m = 3,\ n = 2, \ d = 1, \ X = \{0, 1, 1\} . This information is sent to the oracle who can perform function ff, and the values of mm, nn and dd are sent to the algorithm.

The algorithm decides to call f(0, 1) to which the oracle returns 1 as it knows exactly one of x0 x_{0} and x1 x_{1} is 1 and the rest are 0. Then, the algorithm calls f(0, 2) which returns 1 and f(1, 2) which returns 0. Since x1 x_{1} and x2 x_{2} match and both are different to x0 x_{0} , the algorithm can determine that X={0,1,1} X = \{0, 1, 1\} or X={1,0,0}X = \{1, 0, 0\} . The algorithm sends both, and they compared with the XX sent by the problem instance generator. Since there is a match, the algorithm has done a successful run for one combination of XX for values m=3,n=2,d=1 m = 3, n = 2, d = 1.

Credit to R Mathe for helping rewrite and improve this.

#ComputerScience

Note by Darcy Morgan
3 years ago

No vote yet
1 vote

  Easy Math Editor

This discussion board is a place to discuss our Daily Challenges and the math and science related to those challenges. Explanations are more than just a solution — they should explain the steps and thinking strategies that you used to obtain the solution. Comments should further the discussion of math and science.

When posting on Brilliant:

  • Use the emojis to react to an explanation, whether you're congratulating a job well done , or just really confused .
  • Ask specific questions about the challenge or the steps in somebody's explanation. Well-posed questions can add a lot to the discussion, but posting "I don't understand!" doesn't help anyone.
  • Try to contribute something new to the discussion, whether it is an extension, generalization or other idea related to the challenge.
  • Stay on topic — we're all here to learn more about math and science, not to hear about your favorite get-rich-quick scheme or current world events.

MarkdownAppears as
*italics* or _italics_ italics
**bold** or __bold__ bold

- bulleted
- list

  • bulleted
  • list

1. numbered
2. list

  1. numbered
  2. list
Note: you must add a full line of space before and after lists for them to show up correctly
paragraph 1

paragraph 2

paragraph 1

paragraph 2

[example link](https://brilliant.org)example link
> This is a quote
This is a quote
    # I indented these lines
    # 4 spaces, and now they show
    # up as a code block.

    print "hello world"
# I indented these lines
# 4 spaces, and now they show
# up as a code block.

print "hello world"
MathAppears as
Remember to wrap math in \( ... \) or \[ ... \] to ensure proper formatting.
2 \times 3 2×3 2 \times 3
2^{34} 234 2^{34}
a_{i-1} ai1 a_{i-1}
\frac{2}{3} 23 \frac{2}{3}
\sqrt{2} 2 \sqrt{2}
\sum_{i=1}^3 i=13 \sum_{i=1}^3
\sin \theta sinθ \sin \theta
\boxed{123} 123 \boxed{123}

Comments

Unfortunately, I can't help you. Sorry !

Marcel Probst - 3 years ago

Log in to reply

That's okay! It's still reassuring to know that my idea is worth your time.

Darcy Morgan - 3 years ago

Please clearly and formally redo the first set of bullet points and the paragraph before this. Your text is both mathematically and grammatically incomprehensible. For example

  • What on Earth do you mean by pronumerals?
  • You say determine all values of X={x1,,xn}X=\{x_{1},\ldots,x_{n}\} fine, I get it, that’s the representation form, such that xx maps to {0,1}\{0,1\}. Sorry, what? Do you mean that each xi{0,1}x_{i}\in\{0,1\} or that each xix_{i} is a {0,1}\{0,1\}-valued function, and if so with what domain?
  • …for any dd in all mm This makes no sense in the context.

R Mathe - 3 years ago

Log in to reply

Thanks for the feedback. While they are admittedly obvious mistakes now that I look back, I haven't had experience writing "mathematically" like this. I've rewritten it and taken away the example case to try and make it read smoother. Please let me know if it makes sense to you now or if I missed something else.

It seems important to keep both versions of the problem as the first version seems more likely to have a polynomial-time algorithm.

Darcy Morgan - 3 years ago

Log in to reply

Your current formulation is a lot clearer. Having both versions is absolutely fine. Remaining issues:

  • I take it you really mean sequences X=xii<m{0,1}mX=\langle x_{i}\mid i<m\rangle\in \{0,1\}^{m} as opposed to sets, otherwise by the other condition there really is only one set, viz X={0,1}X=\{0,1\}.
  • What exactly is supposed to be computed. Your problem is still frustratingly vague. Are we suppose to compute the set of all nn-length {0,1}\{0,1\}-sequences XX with certain properties? Or just one, or …?
  • Let A\mathcal{A} be a given algorithm that returns upon valid input (n>m>dn>m>d in N+\mathbb{N}^{+}) an output A(n,m,d)({0,1}n)\mathcal{A}(n,m,d)\in\wp(\{0,1\}^{n}). What exactly is supposed to go in the output? All XX such that […what…]? Right now, your definition just requires all nn-length {0,1}\{0,1\}-sequences XX such that XX contains all values and that’s it. There’s nothing to compute.

R Mathe - 3 years ago

Log in to reply

@R Mathe Yes, I should be using sequences, I didn't realise there were angled brackets for that.

I've added a diagram and a more complete example to describe what I expect as a solution. What I am interested in is the lowest average case number of calls to f a successful algorithm needs and how it is affected by changing m, n and d.

Darcy Morgan - 3 years ago

Log in to reply

@Darcy Morgan Okay, I think I understand. Your original description prove that it is possible to determine all values of XX is thus thoroughly misleading. The known inputs are mm, nn, dd, the hidden inputs is the sequence XX itself. Moreover, the function is NOT to be executed as f(x0,x1,,xm1)f(x_{0},x_{1},\ldots,x_{m-1}), as per the original description, but rather as f(i0,i1,,in1)f(i_{0},i_{1},\ldots,i_{n-1}) where i0,i1,,in1{0,1,2,,m1}i_{0},i_{1},\ldots,i_{n-1}\in\{0,1,2,\ldots,m-1\} are distinct indices. The problem is thus:

Let m>n>dm>n>d be positive naturals and X=xii<m{0,1}mX=\langle x_{i}\mid i<m\rangle\in\{0,1\}^{m}. Define fX:[m]n{0,1}f_{X}:[m]^{n}\longrightarrow \{0,1\} via fX(i0,i1,,in1)=1f_{X}(i_{0},i_{1},\ldots,i_{n-1})=1 exactly in case {j<mxij=1}=d|\{j<m\mid x_{i_{j}}=1\}|=d. Note, that the map X{0,1}mfXX\in\{0,1\}^{m}\mapsto f_{X} is not necessarily injective.

Now assume one has access to just m,n,d,fm,n,d,f. How can one recover the list of all possible sequences XX via computational means, such that fX=ff_{X}=f? Theoretically one can go through (via lexical ordering) the list of all elements Y{0,1}mY\in\{0,1\}^{m} and compute each fYf_{Y} and check if fY=ff_{Y}=f. But this will have a long average running time.

Construct an algorithm, A(,,,)\mathcal{A}(\cdot,\cdot,\cdot,\cdot) which satisfies A(m,n,d,f){0,1}m\mathcal{A}(m,n,d,f)\subseteq\{0,1\}^{m} and X{0,1}m: XA(m,n,d,f)fX=f\forall{X\in\{0,1\}^{m}:~}X\in \mathcal{A}(m,n,d,f)\Leftrightarrow f_{X}=f for all m>n>dm>n>d and f:[m]n{0,1}f:[m]^{n}\longrightarrow\{0,1\}. Further let T(m,n,d,f)\mathcal{T}(m,n,d,f) be the total number of calls to the oracle ff during the execution of A(m,n,d,f)\mathcal{A}(m,n,d,f) and

T(m,n,d):=12mX{0,1}mT(m,n,d,fX)\overline{\mathcal{T}}(m,n,d):=\frac{1}{2^{m}}\sum_{X\in\{0,1\}^{m}}\mathcal{T}(m,n,d,f_{X})

which is effectively the average running time. Find an algorithm, A(,,,)\mathcal{A}(\cdot,\cdot,\cdot,\cdot), such that T(m,n,d)\overline{\mathcal{T}}(m,n,d) is minimised for all m,n,dm,n,d, if this is at all possible.


Note: [m]n=[{0,1,2,,m1}]n={A{0,1,2,,m1}A=n}[m]^{n}=[\{0,1,2,\ldots,m-1\}]^{n}=\{A\subseteq\{0,1,2,\ldots,m-1\}\mid |A|=n\}, thus [m]n=([m]lmn)|[m]^{n}|=\left(\begin{array}{c}[m]{l}m\\n\\\end{array}\right) elements.

R Mathe - 3 years ago

Log in to reply

@R Mathe Very well written, and a tremendous thank you for rewriting it in an easier to understand format.

Now that I think about it, the algorithm might not need to compute all values of XX to find the average to a reasonable degree of accuracy, but yes that is why computing all of them seemed important at the time of writing.

A concrete purpose of this problem is investigating if it is possible that m n d X:T(m,n,d)<X \exists m\ \exists n\ \exists d\ \forall X: \overline{\mathcal{T}}(m,n,d) < |X| or the more loose condition m n d X:T(m,n,d,f)<X \exists m\ \exists n\ \exists d\ \exists X: \mathcal{T}(m,n,d,f) < |X| which could lead to a multiple compression algorithm where nn, mm and dd are commonly known and the calls to fXf_{X} as decided by the algorithm is stored inside a new XX, which can be reduced using the algorithm again and so on.

Thanks again for all of your help.

Darcy Morgan - 3 years ago

Log in to reply

@Darcy Morgan not a problem. Do you want to compute the average over all ff ie

T(m,n,d):=12[m]nf{0,1}[m]nT(m,n,d,f)\overline{\mathcal{T}}(m,n,d):=\frac{1}{2^{|[m]^{n}|}}\sum_{f\in\{0,1\}^{[m]^{n}}}\mathcal{T}(m,n,d,f)

or over all XX ie

T(m,n,d):=12mX{0,1}mT(m,n,d,fX)\overline{\mathcal{T}}(m,n,d):=\frac{1}{2^{m}}\sum_{X\in\{0,1\}^{m}}\mathcal{T}(m,n,d,f_{X})

I came up with an algorithm. I’ll post it later. Not sure if it’s the most efficient, tho’.

R Mathe - 3 years ago

Log in to reply

@R Mathe Over all ff. The way I understand it currently, the average over all XX doesn't sound useful.

Darcy Morgan - 3 years ago

Log in to reply

@Darcy Morgan There are two possibilities:

  1. the problem is a kind of dd-SAT problem: one is given (coded via ff) knowledge of mm propositions and when for every nn-Tuple exactly dd are true, and wishes to prove the existence of a model, satisfying these conditions. Then we need to compute the average over all ff.

  2. the problem is of the form: a hidden XX is given and we only see fXf_{X}. We are to compute a possible value of XX. Then one needs to compute the average over all XX.

My first algorithm so far is really inefficient: T(m,n,d,f)=m!/(mn)!mn\mathcal{T}(m,n,d,f)=m!/(m-n)!\sim m^{n}. If n,dn,d are fixed then this is polynomial in X|X|. If not, then this is exponential time (or worse!). The algorithm I employ is a simple recursion and doesn’t dynamically react to partial results, which is probably why it is not very fast. I need to think about this further.

Whilst your problem isn’t exactly 33-SAT, there seem to be connections, and the latter is NP-hard, as far as I recall, so finding an efficient algorithm may be doomed. On the other hand, I’ve heard that replacing NP-problems by randomised counterparts may make more efficient algorithms possible.

R Mathe - 3 years ago

@Darcy Morgan Do you really want the algorithm to compute all or to just find any one value of XX?

R Mathe - 3 years ago
×

Problem Loading...

Note Loading...

Set Loading...