In my graduate-level complexity theory class, I came across the concept of a P-splinter, an incredibly niche complexity class that is used in a few papers. Intuitively, a P-splinter is a language where a PTIME function can enumerate its elements.
This concept was defined by Lane A. Hemachandra—who is also teaching the course—and his coauthors in their 1989 paper “Polynomial-Time Functions Generate SAT: On P-Splinters.” They established that SAT is a P-bi-splinter. In other words, a single PTIME function can enumerate both \(\text{SAT}\) and \(\overline{\text{SAT}}\).
Precisely, a language \(L\) is a P-splinter if there exists a polynomial-time computable function \(f\) and a valid input string \(x\) such that
\[L = \{ x, f(x), f(f(x)), f(f(f(x))), \ldots \}.\]
Note that the enumeration need not be in lexicographic order.
A language \(L\) is a P-bi-splinter if, for the same function \(f\), there exists another valid input \(y\) such that
\[\overline{L} = \{ y, f(y), f(f(y)), f(f(f(y))), \ldots \}.\]
Hemachandra et al. proved that SAT is a P-bi-splinter. While I do not completely understand their proof—complexity theory is quite foreign to my main research focus—the splinter function is constructed using the following intuition:
- Within PTIME, attempt to check if a particular boolean formula is satisfiable. If successful, then enumerate it!
- If we fail to terminate in PTIME, then enumerate an easy element of SAT. Save any intermediate work on the hard problem for later.
Interestingly, despite this being an old result, state-of-the-art LLMs get this problem wrong. (For now—by writing this, I am inadvertently helping the next generation of LLMs improve via memorization.) I tested the five models used to create the Humanity's Last Exam benchmark (GPT-4o, Sonnet 3.5, Gemini Pro 1.5, OpenAI-o1, OpenAI-o3). None got it right! That was surprising.
I suspect that since P-splinters is a niche definition, LLMs do not have the capacity to memorize it. As such, LLMs have to resort to reasoning the problem on-the-fly, which they fail at—as most humans would, too.
My prompt was:
A P-splinter is a language L such that a PTIME function f and some valid input x exists, that satisfy
L = {x, f(x), f(f(x)), ... }.
Is SAT a P-splinter? Answer Yes or No.
OpenAI-o1, OpenAI-4o, and Sonnet-3.5 made the mistake of assuming that a language that is a P-splinter must be in P. This is not true, as the P-splinter function for SAT does not enumerate elements of SAT up to length \(n\) in PTIME, nor does it guarantee lexicographic ordering.
Exact Answer: No.
Explanation: A language L is called a P-splinter if there is a single string x and a polynomial-time computable function f such that L is exactly the orbit {x, f(x), f(f(x)), …}. For SAT to be a P-splinter, we would need a polynomial-time algorithm f that, given any SAT formula in the orbit, produces the “next” SAT formula in that same orbit, enumerating precisely all satisfiable formulas (and no unsatisfiable ones). Constructing such an f in polynomial time would effectively solve SAT in polynomial time, which is unknown (and would imply P = NP if it exists). Under standard assumptions (P ≠ NP), SAT is therefore believed not to be a P-splinter.
OpenAI-o3 made the mistake that a P-splinter must be sparse. If true, then SAT would not be a P-splinter, since it is dense. However, this assumption is false.
Exact Answer: No.
Explanation: A P-splinter language is one whose entire set of strings is generated by starting with a fixed input x and iteratively applying a PTIME-computable function f. This process produces a single “chain” of strings that, especially when f is designed to avoid repetition, is extremely sparse—typically only one string per each (or only a few strings per) input length. In contrast, SAT (the Boolean satisfiability problem) is NP-complete and is known to be “dense” in the sense that it has exponentially many different formulas of a given length that could be satisfiable. Thus, SAT cannot be captured by the sparse, single-chain structure that a P-splinter requires.
Gemini just entered an infinite loop of generating the same paragraph over and over and over again, reminiscent of the poem poem poem attack.
Let's consider a function f that takes a formula φ and returns a formula that is satisfiable if and only if φ is satisfiable, and also encodes the fact that φ is in SAT. This would allow us to generate a sequence of formulas that are all in SAT. [repeated ad infinitum]