My current strategy to prove P=NP

Since I have already opened up about my intent and strategy to prove P=NP in some HN comments (https://news.ycombinator.com/item?id=36082578 and https://news.ycombinator.com/item?id=35729626), I decided I should explain it in a series of posts. After all, this was my original goal when I started the blog, to explain more the exciting possibility that P=NP (with a nice constructive algorithm), so that more people could get involved and contribute to that approach.

As for my motivation, that's probably a topic for another blog post, let's just say I think it is an extremely important problem to work on. I have been working on understanding P vs NP question as a serious hobby for about 6 years now, however, it was a discovery I made about a year ago that made me really optimistic about P=NP, and that a proof can be close. (Up to that point, I was always trying to be agnostic about the question, although I do admit I don't really buy many arguments or intuitions for P!=NP, as most of them can also be interpreted in a contrarian way.)

So, the current strategy is a sort of three step plan, outlined as follows:

  1. Prove that 2XSAT, a subclass of SAT that only allows 2-SAT and XORSAT clauses, is NP-complete, by reduction from 3-SAT.
  2. Prove that deduction system (logic) - I call these poset-linear - that governs 2XSAT is refutation complete. That means if you generate all tautologies in any theory of 2XSAT logic, and you don't reach the contradiction, then the corresponding theory is satisfiable.
  3. Find a polynomial representation of all tautologies of any 2XSAT theory, and show that all tautologies can be generated in polynomial time. 

I might detail each of the steps more in the upcoming blog posts; let me just explain each of them. I will also add a link to my work-in-progress rewritten paper (the previous blog post was, in retrospect, a rather naive attempt to use step 1 to combine algorithms for 2-SAT and XORSAT, without proper theoretical framework). The step 1 (in section 1) is done and detailed in the paper. As I was writing this blogpost (over few months in fact), step 2 turned out to be false (I used to think I almost have a proof), the poset-linear logic is not refutation complete. However, current research indicates that this problem might be fixable with adding small number of additional variables without change of semantics (this is detailed in the draft). Finally, step 3 is still a big unknown, while I have some ideas what to do, it's not described in the paper.

Step 1 - 2XSAT Decomposition

The first step, NP-completeness of 2XSAT, is what I discovered a year ago, and to my surprise, even though the result is pretty trivial, I couldn't find it mentioned in any literature. Back then, it was based on a boolean formula:

$$ x \vee y \vee z = (\neg a \vee x) \wedge (\neg b \vee \neg y) \wedge (c \vee z) \wedge (a \oplus b \oplus x \oplus \top) \wedge (b \oplus c \oplus y \oplus \top) $$

(However, since I found even more efficient methods of representing logic gates, see the WIP paper for more details.) The above equation means that we can characterize solutions to SAT instance as a projection (shedding away extra variables $a,b,c$ for each clause on literals $x,y,z$, similar to e.g. 1-IN-3-EXACT-SAT reduction) of an intersection of solutions to a XORSAT instance and a 2-SAT instance. Since both XORSAT and 2-SAT instances can be decided in P (but also "described", in the sense of finding all tautologies in the underlying logics), this opens an exciting possibility that there is a clever generalization of both algorithms.

To me, the 2XSAT reduction answered an important question - what exactly is the role of XORSAT (linear equations over $\mathbb{Z}_2^n$) in boolean satisfiability problems. I have felt for a long time that XORSAT exhibits many complications of 3-SAT; if we didn't know Gaussian elimination, we would be similarly baffled and the only thing we could do would be to try all the solutions with some kind of backtracking search. So it is natural to look at XORSAT for hints how to solve a more general problem of SAT (in particular, how to deal with "circularity" inherent in the stated instance).

The 2XSAT reduction can be interpreted in this way - the only clause you really need (if you allow enough extra variables) to represent an NP-complete problem is $(x \oplus y) \vee z$. This is very interesting to compare with 3-SAT, because $x \vee y \vee z$ only disallows one out of 8 combinations of literals $x,y,z$. However, because $$(x \oplus y) \vee z = (x \vee y \vee z) \wedge (\neg x \vee \neg y \vee z),$$ we can see that the clause $(x \oplus y) \vee z$ disallows two out of 8 combinations of literals $x,y,z$. It is argued by many that 2-SAT is simple and 3-SAT is hard for the reason that while 2-SAT clause disallows 1 out of 4 combinations, the 3-SAT clause disallows 1 out of 8. However, here we see that we can always sort of "pair up" the 3-SAT clauses in the underlying instance so that each clause has the "easy" property of 2-SAT.

Another way to look at 2XSAT is that it is pretty much just 2-SAT, but confined to an affine subspace of $\mathbb{Z}_2^n$. One can solve the linear system (the XORSAT portion of the instance) and substitute the dependent variables into 2-SAT portion of 2XSAT, and then we obtain a sort of 2-SAT instance, but instead of on original variables, we have it on some other set of vectors in $\mathbb{Z}_2^n$. It is really strange to think then that 2-SAT itself would be easy but after the linear transform (which can be done piecemeal), it would suddenly become really hard.

This perspective also explains what are actually "hard" instances of SAT. Those have really dominating XORSAT portion, the affine subspace of solutions has a really small dimension, and so these solutions are hard to find by testing variables. Resolution (which is the method used for 2-SAT) doesn't help either, because it won't "hone you into" the affine subspace where the solution is.

I will also note that I have been able to run 2XSAT reduction to transform small factoring problems (with several thousands of variables after the reduction), so I have also verified to a good extent experimentally that it works. From the current 3 point strategy, this is the result I am most certain about.

Step 2 - Refutation-completeness

So it was really the above similarity of 2XSAT to 2-SAT that spurred my current research project, and got me so excited about the 2XSAT reduction. This takes us to the step 2 of the strategy. I decided to read the original Melville Krom's 1967 paper on 2-SAT, and I was blown away. This guy, it seemed, had everything that I needed to find a polynomial algorithm for 2XSAT (note that his paper was written before NP class was even defined!).

The key idea of Krom's paper, as I interpret it, is the following. We face a decision problem of boolean satisfiability of certain class of formulas. If we want a polynomial algorithm, then clearly, we will have to be able to look at exponentially many variable assignments at once, because there are exponentially many variable assignments to begin with. So how do we efficiently represent these subsets of assignments? This question haunted me for many years, as I understood that it is the key to a polynomial algorithm for SAT.

Krom answers the question as follows. Create a custom deduction system (aka logic) with a small number of possible formulas as a finite subset of classical propositional logic (also make sure your deduction rules are sound, but that is easy). The boolean satisfiability instance will be some theory in that logic (additional axioms coming from the instance). The variable assignments are just classical interpretations of the theory, and the correct assignments are its models. Now what you also do, you prove that every theory in that small logic is refutation complete. That means, instead of checking all the variable assignments (interpretations) to verify satisfiability, you only have to generate all the true sentences in the theory, and if you don't generate a contradiction, from the refutation completeness, you know the instance (from which the theory was built) has to be satisfiable. So despite the fact that there can be exponentially many models, we can control the amount of true sentences to be checked by the choice of the deduction system.

So this is what Krom does for 2-SAT. He constructs a simple deduction system (so simple he doesn't even give it a formal name) that governs 2-SAT, which only has formulas of the form $x \vee y$ for any literals $x,y$, and the deduction rule (which represents resolution, but I call it transitivity in my paper) $$ x \vee y,\ \neg y \vee z\ \vdash\ x \vee z.$$ Then he proves that this deduction system is refutation complete, and also proves a nice criterion (existence of two chains of resolutions that implies both $x \vee x$ and $\neg x \vee \neg x$ for some variable $x$) for detecting the satisfiability of 2-SAT theories. However, he doesn't really need the criterion to show polynomial decidability of 2-SAT, because, as you can see, the total number of possible formulas in this deduction system is pretty small (quadratic in number of variables), so if we generate all of them using the deduction rule, from the refutation completeness, we can also decide the instance (though doing this is much less efficient than using the criterion).

It's also worth noting that using a refutation complete deduction system to prove satisfiability is a kind of non-deterministic algorithm. We don't really know in what order we arrive at true sentences for the given theory (deriving a true sentence can be seen as a "step" of the algorithm), but we know, from the refutation completeness, that if the system is not satisfiable, we will get a contradiction eventually. We can control the algorithm being polynomial by keeping the deduction system small (relative to number of variables involved). I think, in retrospect, it is very fitting to approach the search for a polynomial algorithm for NP by considering this class of algorithms, that are based on deduction systems and non-deterministic.

The second step of the strategy is then an adaptation of the Krom's approach to 2XSAT. First, we need to define our logic. After some experimentation, I picked the logic that is similar to 2-SAT, but instead of usual literals we allow for what I call generalized literals - any combination of literals joined by the XOR operation (with optional constant $\top$). I denote these generalized literals with greek letters and they are essentially linear equations over $\mathbb{Z}_2^n$. In our logic, the allowed formulas are disjunctions of two generalized literals. I call this logic poset-linear ("poset" refers to implications represented by the disjunctions and "linear" to the linear equations).

The logic also needs deduction rules. In classical propositional logic, there is only modus ponens rule, and everything is done with it and substitution into "logical axioms" (IMHO these should also be called deduction rules but it's a matter of taste). Here we only have limited ability to substitute (we can substitute for another generalized literal), so we will only have deduction rules for everything. After some experimentation, I arrived at the following set of deduction rules (greek letters $\alpha, \beta, \gamma$ can be arbitrary generalized literals):

  • Identity (I): $$\vdash\ \neg \alpha \vee \alpha$$
  • Commutativity (C): $$\alpha \vee \beta \ \vdash\ \beta \vee \alpha$$
  • Linearity (L): $$\alpha \vee \beta,\ \alpha \vee \gamma \ \vdash\ \alpha \vee \neg (\beta \oplus \gamma)$$
  • Transitivity (T): $$\alpha \vee \beta,\ \neg \beta \vee \gamma \ \vdash\ \alpha \vee \gamma$$
It's easy to verify that the rules are semantically sound, since each rule corresponds to a schema of tautologies in classical propositional logic. 

We can also see how the poset-linear logic generalizes logics of 2-SAT and XORSAT in a straightforward way. If we only consider formulas with normal literals, then rule (L) cannot be used. What we get, especially with the rule (T), is the logic of 2-SAT.

On the other hand, if we only consider formulas that look like $\bot \vee \alpha$ for any generalized literal, these are just linear equations. Then rule (T) becomes useless, and the rule (L) becomes equivalent transformation of the linear system, with which we can build any linear combination of the original linear equations. This is essentially the "logic" of XORSAT.

Unfortunately, the poset-linear logic has exponential number of formulas due to linearity. While there are ways to construct a polynomial logic representing 2XSAT by restricting number of variables per each generalized literal, let's instead worry about this problem in step 3, and focus on proving refutation completeness of full 2XSAT.

I want to follow the typical logical proof of refutation completeness which iteratively constructs a model for a consistent theory. This relies on a version of a deduction theorem for that particular logic, which effectively lets us prove some statements by contradiction. (Krom uses same technique in his 2-SAT proof, but since his proof is so short, he doesn't bother to state it explicitly.)

The analogue of deduction theorem for 2XSAT is as follows. Assume we have some theory $\Gamma$ (really just a set of formulas), and we extend it with another axiom $\bot \vee \xi$, and we find that after extending, we can now derive a formula $\bot \vee \phi$ (where $\xi$ and $\phi$ are some generalized literals, i.e. the formulas involved are linear equations). The deduction theorem then states, assuming the above, we can derive formula $\neg \xi \vee \phi$ from only theory $\Gamma$.

Unfortunately, my attempt to proof the deduction theorem for 2XSAT was hampered by finding a counterexample. Consider a theory $$\Gamma=\{ \phi\vee\neg(\psi_1\oplus\xi), \neg\psi_1\vee\neg(\psi_2\oplus\xi), \ldots, \neg\psi_n\vee\neg(\phi\oplus\xi)\}$$. This theory satisfies the assumption of the deduction theorem, because if we also assume $\bot\vee\xi$, we can easily deduce $\phi\vee\phi$. However, $\neg\xi\vee\phi$ (although it is a tautology) is impossible to prove in our logic, in fact, we can't do much with these axioms, because the generalized literals in all these formulas do not match each other, which is required to apply rules (L) and (T).

There is a potential way to resolve this problem, by adding more variables. I allude to it in the attached paper. This will probably lead to a study of deduction systems that allow not just 2, but 3 generalized literals in each clause (like 3XSAT). I have started some research on it and I will try to follow up.

As an aside, why I remain cautiously optimistic about P=NP, a similar situation arises in XORSAT. It doesn't seem that logical system underlying XORSAT, with a limit of only 3 variables per XORSAT clause, is refutably complete. This is because one can find linear systems that require adding together equations with arbitrary number of variables. To construct such a system, we can take a 3-regular undirected graph with minimal cycle of length k, and define a variable for each edge, and equation for each vertex. The decision whether such a system is solvable then requires constructing equations with at least k+1 variables. Despite this, XORSAT has a polynomial algorithm. It remains to be seen whether, by utilizing generalized literals, we can get by with only using a fixed number of ORs per formula (2 wasn't enough, let's try 3).

Step 3 - The Decision Procedure

As for the step 3 of the strategy, this is a really unknown terrain, and I have worked things out only partially. My original plan was to represent sets of 2XSAT formulas, which are closed to the (L) rule.
 
We can notice that generalized literals on $n$ variables correspond to linear equations (affine hyperplanes) in $\mathbb{Z}_2^n$, and these form a vector space of dimension $n+1$. In this vector space, the equation $0=0$ (zero vector) corresponds to true constant $\top$, and the contradictory equation $0=1$ corresponds to false constant $\bot$. The vector addition in this vector space is just usual addition of linear equations. 
 
The set of true formulas $\alpha\vee\beta$ for a fixed $\beta$ that is closed to (L) rule can be represented as a linear subspace of the vector space. This subspace has a basis, so despite the set being possibly exponential, we can represent it $O(n^2)$ space. We can then consider sets (which I call "butterfly sets") of formulas $\alpha\vee\beta$ where both $\alpha$ and $\beta$ is drawn from a different linear subspace, so each butterfly set is represented as a pair of sets of independent vectors. We can then reformulate deduction rules as operations with butterfly sets that produce other butterfly sets. Also note that from the logical standpoint (thanks to rule (I)), any vector subspace of generalized literals which contains the vector $\bot$ can be identified with the vector space of all possible generalized literals (i.e. from a contradiction anything follows).

Under the assumption of refutation completeness, the question of satisfiability then becomes a question of generating all the true formulas in a given theory, and figuring whether formula $\bot\vee\bot$ is among the true formulas. Because there is exponentially many formulas in a poset-linear deduction system, we will not do that by handling individual formulas, but by handling the butterfly sets. We only need to prove that we can cover any set of true formulas (provable in the theory we have been given) with a small (polynomial in number of variables) number of butterfly sets.

The actual algorithm for deciding an instance would then work as follows. Generate initial set of butterfly sets, which cover the axioms of the given theory in poset-linear deduction system (also in this stage, add the axioms from the (I) rule, because that way we don't have to worry about it anymore). Then for each combination of butterfly sets, generate new butterfly sets using the (L) and (T) rules, constantly checking if the new butterfly sets aren't already covered by or cover the existing butterfly sets, cleaning up the set of butterfly sets to be pairwise disjoint. Repeat this process until no new butterfly sets can be generated. If the system is not satisfiable, we should end up with only one butterfly set, which is a pair of full vector spaces (representing the formula $\bot\vee\bot$).
 
And if the system is satisfiable, we get a decent representation of it, which can actually be utilized to solve new instances. For example, if we run this process on a SAT instance that corresponds to a multiplier of two integers larger than 1 of a fixed bit size, we get some butterfly sets that relate how is input correlated with the output. We can then solve the factoring problem by simply specializing this representation for the integer we want to have factored, in other words, just do an intersection with a linear subspace, and we get our result! This is similar to when you factorize a system of linear equations, you can then solve different right hand sides cheaper, than solving the system for each of them.
 
(There are additional optimizations that can be done on the algorithm by using special properties of formulas $\bot\vee\alpha$ that correspond to XORSAT clauses. We can weed those out and keep them separate, substitute the dependent variables into remaining butterfly sets, effectively changing the basis to be aligned with the butterfly sets. This will also ensure that the two vector spaces of every butterfly set are pairwise disjoint.)
 
This procedure can be also seen as a generalization of 2-SAT algorithm, where we compute transition-closure of the implication graph. Here, instead of nodes that represent literals, we have all possible vector subspaces (literals are just a special case, 1-dimensional vector spaces aligned with the standard basis). Instead of edges, we have butterfly sets. Instead of the application of (T) rule, we check if the vector spaces of two butterfly sets intersect in a certain way. (Smarter 2-SAT algorithms group nodes together if they form a strongly-connected component; the analogy here is the above optimization of weeding out the formulas that represent linear equations.)
 
I should also note that this technique is also a generalization of the belief/survey propagation algorithms for SAT. With BP/SP, you generate only a certain types of true formulas (represented by messages being passed between nodes), not all of them. In particular, unlike the formulas in poset-linear logic which use generalized literals, the formulas in BP/SP only have a bounded number of variables they pertain to.

Summary

I think the approach that I am taking to P vs NP (or rather, attempt to discover a polynomial algorithm) is based on two main, IMHO underutilized, ideas:

1. Linear equations can be used to represent some of the constraints (the XORSAT portion of SAT) and thus can be utilized in representing exponential number of solutions and deciding some instances. (This is AFAICT my observation. Why is XOR usually not considered to be a basic logical connective, if linearity is so beautiful to work with? I think the whole circuit depth concept is grossly misleading.)

2. Representing SAT instances as logical systems, and converting satisfiability to refutation-completeness. The insight that we can represent large number of solutions (models of the theory) using a small number of true formulas, and using the size of the provable theories as a bound of the nondeterministic algorithm. This comes from Krom's paper on 2-SAT, although I am not aware of other people stating this as explicitly as I do.
 
The full strategy combines these two ideas, casting a wide net to catch a potential polynomial algorithm for SAT. I have a draft paper with some more details on poset-linear logic, and other stuff mentioned above; I am probably gonna rewrite it in a more general way. I would like to invite more people to comment on this approach and research the problem from these directions.

Komentáře

Populární příspěvky z tohoto blogu

Will trilinear logic help with the completeness obstacle?

A new SAT solving algorithm