What DeepSeek R1’s Math Struggle Teaches Us About Human Thinking

Dong Liang
Jan 28, 2025

Update Feb 2:

o3-mini-high became the first model I have tested that solved the problem on the first try (3m9s).
The thinking process (just headlines in o1 and o3) also became a little more detailed, with short paragraphs. Some of these show brilliant mathematical thinking. I will list a few here that I think are significant steps toward clear thinking.


Evaluating pair boundaries
I’m mapping out the number of contiguous pairs, k, in the lineup. Considering cases from k=0 to k=3, I assess their impact on the arrangement and boundaries.
Assessing k values
I’m examining the feasibility of k=0 for contiguous pairs. Even if k=0, alternating L and R shoes will create at least one boundary, though it changes the distribution.
Pinpointing feasible k values
OK, let’s think it through: k=0 doesn’t work since it throws off the arrangement. Valid k values are 1, 2, or 3. For k=3, three pairs set up three blocks.
Confirming k=3 feasibility
I’m checking if k=3 works by placing three blocks of adjacent pairs. This setup ensures no mismatched L and R shoes touch each other.

Update Jan 31:

Today I got access to o3-mini and I tested the same question with it.

The initial run (1 minute 33 seconds) gave a result of 54. The analysis was on track, and the result was close.
However, when I asked it to break the solution down so I could see exactly where it went wrong, it gave me an answer of 72. This kind of laziness is characteristic of LLMs. It shows that o3-mini has not solved the essential problem of context.

Recent discussions surrounding DeepSeek R1 extend beyond its technical capabilities, touching on algorithmic efficiency over computational power, AI geopolitics, and reinforcement learning implementations.

What sets R1 apart is its explicit reasoning process, which I observed closely while testing it on a 2024 AMC 10B counting problem. While R1, like other foundation models, failed to arrive at the correct solution, its approach revealed something fascinating: when presented with corrections, it maintained its original reasoning with a human-like conviction. This contrasts sharply with typical AI behavior of readily accepting corrections.

The model’s problem-solving approach provides a window into human cognition. Its process encompasses the full spectrum of mathematical problem-solving: sudden insights, logical deductions, eureka moments, and periods of cognitive gridlock — all remarkably similar to human mathematical reasoning.

This parallel opens new avenues in learning theory, particularly in understanding how confidence, intuition, reasoning capacity, computational efficiency, self-correction, and metacognition contribute to intellectual growth. While these insights could enhance human cognitive development, their greatest potential may lie in AI optimization, where systematic improvements could yield more substantial and scalable results.

Here is the math problem that DeepSeek R1, like every other LLM before it, failed to solve:

Three different pairs of shoes are placed in a row so that no left shoe is next to a right shoe from a different pair. In how many ways can these six shoes be lined up?

(AMC 10B problem 20)

In my first submission of the problem to DeepSeek R1, the model spent 359 seconds processing the problem before delivering an incorrect solution. Below, I’ll present the initial segment of its reasoning process, as the complete output was extremely long.

DeepSeek’s Thinking Process and My Annotation

Okay, let’s see. I need to figure out how many ways three different pairs of shoes can be arranged in a row, with the condition that no left shoe is next to a right shoe from a different pair. Hmm. So, there are three pairs, each pair has a left and a right shoe. Total six shoes. Without any restrictions, the number of ways would be 6! = 720. But we have this restriction: no left shoe can be adjacent to a right shoe from a different pair.

First, let me make sure I understand the problem correctly.

This is very good and very human-like!

Each pair is different, so the left shoes are distinct and the right shoes are distinct. So, for example, if we have pairs A, B, C, then the left shoes are A_left, B_left, C_left, and the right shoes are A_right, B_right, C_right. The rule is that if a left shoe is next to a right shoe, they must be from the same pair. So, A_left can be next to A_right, but not next to B_right or C_right. Similarly, A_right can be next to A_left, but not B_left or C_left. However, left shoes can be next to left shoes, and right shoes can be next to right shoes. So, for example, A_left next to B_left is allowed, and A_right next to B_right is allowed. The restriction is only when a left and a right shoe are adjacent; they have to be from the same pair.

Although this is a bit wordy, if this analysis came from a human student, I would be quite happy with it.

So, the problem is similar to arranging the shoes such that any two shoes of different types (left and right) that are adjacent must form a pair. That is, you can’t have a left shoe adjacent to a right shoe of a different pair. So, how do we count the number of such permutations?

One approach might be to model this as a permutation with certain adjacency constraints. Maybe use inclusion-exclusion? But inclusion-exclusion can get complicated. Alternatively, maybe think recursively? Let me try to break it down.

This represents a critical juncture in problem-solving where multiple potential approaches emerge. The time investment required for each path makes the initial selection crucial. Human experts often rely on intuition developed through experience to identify promising directions.

At this stage, inclusion-exclusion and recursion naturally present themselves as candidates. However, experienced problem solvers typically resist the urge to immediately decompose these approaches. Instead, they first generate and evaluate a comprehensive list of possible strategies based on available information, allowing for more efficient path selection.

First, note that each pair’s shoes can either be placed together or separated, but if they are separated, the left shoe must not be adjacent to any right shoe from another pair. Wait, but since all the other shoes are either left or right from different pairs, perhaps the key is that whenever a left and a right shoe are adjacent, they must be from the same pair.

This is absolutely true.

So the arrangement must consist of blocks where each block is either a single shoe or a pair of shoes (left and right) from the same pair. But if a pair is separated, their left and right shoes must not be adjacent to any opposite shoe of another pair. Hmm, maybe not. Wait, maybe the entire arrangement can be broken into blocks where each block is either a left shoe followed by a right shoe of the same pair, or a right shoe followed by a left shoe of the same pair, or single shoes. But single shoes would have to be all left or all right? Because if you have a single left shoe next to a single right shoe, that would violate the rule if they are from different pairs. So actually, maybe the entire arrangement must be such that left and right shoes are either in a pair block or are separated by shoes of the same type.

At this point, the model’s thinking lacks clarity. The relationship can be distilled to a fundamental observation: a left shoe must be adjacent to either its matching right shoe or another left shoe; no other possibilities exist. A skilled problem solver would immediately recognize this constraint’s significance and pursue its logical consequence: determining the maximum number of possible left-right adjacencies in the lineup. Occasional lack of clarity is not fatal, but if one is not aware of the decline and does not act to get back on track, it becomes a problem.

Wait, let me think again. Suppose we have an arrangement where there are some pairs placed together (either left-right or right-left) and some individual shoes. The individual shoes must be arranged such that no left shoe is adjacent to a right shoe. So, all the individual left shoes must be grouped together, and all the individual right shoes must be grouped together, and the pair blocks can be interspersed between them. Hmm, maybe not exactly, but perhaps in some way where left and right individual shoes are separated by pair blocks or same-type shoes.

Alternatively, maybe the entire arrangement must be a sequence where all the left shoes come first, followed by all the right shoes, or vice versa, but that’s too restrictive. Because pair blocks can be placed in between. Wait, no. For example, you could have a pair block (left-right) followed by another pair block (left-right), which would have left-right-left-right, which might be allowed, but actually, in that case, the right shoe from the first pair is adjacent to the left shoe from the second pair, which is a right and a left shoe from different pairs. Wait, but according to the problem statement, the restriction is that no left shoe is next to a right shoe from a different pair. So, in this case, a right shoe from the first pair is next to a left shoe from the second pair. Since they are different pairs, that would be a violation. So, actually, even two adjacent pair blocks would result in such a conflict. Wait, so if you have a left-right pair followed by another left-right pair, the right shoe of the first pair is adjacent to the left shoe of the second pair, which are different pairs, so that’s not allowed. Similarly, a right-left pair followed by a right-left pair would have the left shoe of the first pair adjacent to the right shoe of the second pair, which is also different. So, actually, pair blocks cannot be adjacent to each other? Because that would result in a left shoe adjacent to a right shoe from a different pair. Therefore, pair blocks must be separated by shoes of the same type. So, between two pair blocks, there must be left shoes or right shoes, but not a mix. Hmm, this is getting complicated.

This passage exemplifies inefficient problem-solving: a meandering thought process filled with hesitations and reversals that ultimately loses direction. Again, while such cognitive gridlock is common in human thinking, recognizing it is essential in mathematical training and in problem solving in general. Like a robot vacuum trapped in a corner, we must first recognize that we are stuck. The solution lies in stepping back, reaffirming established facts, and reassessing the situation with a fresh perspective. This requires the thinking process to have a higher-level monitoring and control function.

Wait, let’s try to model this properly. Let’s denote L1, L2, L3 as the left shoes of pairs 1, 2, 3, and R1, R2, R3 as the right shoes. The problem states that in the arrangement, if a left shoe is adjacent to a right shoe, they must be from the same pair. So, adjacent pairs can only be (Li, Ri) or (Ri, Li). Otherwise, adjacent shoes must be both left or both right. Therefore, the entire arrangement must be such that all transitions between left and right shoes are only within the same pair. So, for example, you could have a sequence like L1, R1, L2, R2, L3, R3, which is alternating left and right shoes of the same pairs. Or R1, L1, R2, L2, R3, L3. But also, you could have sequences where multiple left shoes are together or multiple right shoes are together, as long as any adjacent left-right or right-left shoes are from the same pair.

I think this notation is excellent and should have been adopted from the very beginning, but it is not too late. There is also one very insightful comment here: “the entire arrangement must be such that all transitions between left and right shoes are only within the same pair.” After the first sign of frustration, we now have a moment of clarity, and one should seize this opportunity to proceed slowly and carefully so as not to get lost again.
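
To make the notation concrete, here is a minimal Python sketch of the adjacency rule. The function name `is_valid` and the string encoding of shoes ("L1", "R3") are my own assumptions for illustration, not part of R1's output. Run on the model's own alternating example, the check fails, because R1 ends up next to L2.

```python
def is_valid(lineup):
    """Return True if no left shoe sits next to a right shoe of another pair.

    Shoes are strings such as "L1" or "R3": side letter, then pair number.
    (This encoding is an assumption made for the sketch.)
    """
    for a, b in zip(lineup, lineup[1:]):
        different_side = a[0] != b[0]
        different_pair = a[1:] != b[1:]
        if different_side and different_pair:
            return False
    return True


# The alternating example from the model's reasoning above.
print(is_valid(["L1", "R1", "L2", "R2", "L3", "R3"]))  # False: R1 touches L2
# A lineup in which every left-right contact is within one pair.
print(is_valid(["L1", "R1", "R2", "L2", "L3", "R3"]))  # True
```

A checker like this turns the "transitions only within the same pair" insight into something that can be tested on any candidate lineup.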

So, perhaps the arrangement can be thought of as starting with a sequence of left shoes, followed by a pair block (Li, Ri or Ri, Li), then a sequence of left or right shoes, etc. Wait, maybe not. Alternatively, think of the entire arrangement as a series of “runs” where each run is either a sequence of left shoes, a sequence of right shoes, or a pair block (Li, Ri or Ri, Li). But the pair block would have to separate runs of left and right shoes.

At this point, we are very, very close to the true path. An intuitive sense of how close we are to the real solution of a problem is the most powerful weapon. Does DS R1 have this sense?

Wait, maybe another approach. Let’s consider the problem as a graph where each shoe is a node, and edges connect shoes that can be adjacent. Then, the problem reduces to counting the number of Hamiltonian paths in this graph. But Hamiltonian path counting is generally hard, but maybe the graph has a particular structure here.

What happened? At this critical moment, when the solution was nearly within grasp, the reasoning veers off into an unrelated direction. While such divergence might appear creative on the surface, it’s ultimately counterproductive. This illustrates another crucial aspect of problem-solving: the need for clear criteria for when to persist versus when to backtrack, rather than being arbitrarily drawn to new ideas. The fact that AI replicates this characteristically human tendency to chase tangential thoughts is particularly revealing about the nature of cognitive processes.
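
For reference, the graph formulation the model mentions is small enough here to check by exhaustive search. The sketch below is an illustration under my own assumptions: the shoe encoding and the helper names `compatible` and `count_paths` are hypothetical and do not come from R1's output. It builds the graph in which two shoes are joined when they may sit side by side, then counts Hamiltonian paths by depth-first search.

```python
from itertools import product

# Six shoes: side letter followed by pair number (an assumed encoding).
shoes = [side + str(pair) for side, pair in product("LR", "123")]


def compatible(a, b):
    """Two shoes may touch if they share a side or belong to the same pair."""
    return a[0] == b[0] or a[1:] == b[1:]


# Adjacency lists of the graph: an edge joins every compatible pair of shoes.
graph = {s: [t for t in shoes if t != s and compatible(s, t)] for s in shoes}


def count_paths(current, visited):
    """Count the ways to extend a path ending at `current` to all six shoes."""
    if len(visited) == len(shoes):
        return 1
    return sum(
        count_paths(nxt, visited | {nxt})
        for nxt in graph[current]
        if nxt not in visited
    )


# Each lineup is a Hamiltonian path read left to right, so sum over all starts.
total = sum(count_paths(start, {start}) for start in shoes)
print(total)  # 60
```

With only six nodes the search is instant, so the idea is workable as a check. It does not, however, explain why the count comes out as it does, which is what the "runs" line of thought was about to deliver.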

Alternatively, maybe we can model this using recurrence relations. Let’s think about the number of valid sequences of length n, but in our case, n is 6. However, since each pair is unique, maybe we need to think in terms of permutations.

Alternatively, think of the problem as arranging the shoes such that between any two pair blocks, there’s a run of left or right shoes. Wait, this is getting too vague. Maybe we can break it down based on how many pairs are kept together and how many are separated.

Each pair can either be “clumped” as a pair (either LR or RL) or separated. But if they are separated, then their individual shoes must not be adjacent to a shoe of the opposite type from another pair. Hmm. So, if a pair is separated, then its left shoe can only be adjacent to left shoes or the right shoe of its own pair, and similarly for the right shoe.

Wait, actually, if a pair is separated, their left and right shoes can be placed anywhere, but they cannot be adjacent to a shoe of the opposite type from another pair. So, if we have a left shoe from pair A somewhere in the sequence, it can be adjacent to left shoes from any pair or the right shoe from pair A, but not the right shoes from pairs B or C. Similarly, a right shoe from pair A can be adjacent to right shoes from any pair or the left shoe from pair A.

Therefore, the key is that each left shoe must only be adjacent to left shoes or its corresponding right shoe, and each right shoe must only be adjacent to right shoes or its corresponding left shoe.

Much of the above falls into the non-productive category, as it repeats earlier steps without introducing new insights. While some statements are true, they are interspersed with less relevant points. In essence, this line of thinking is not making meaningful progress.

From here, DeepSeek R1 spent another thirteen thousand words (which I will not include here) working on the problem, trying different approaches: dividing arrangements into “blocks” of pairs, defining a recursive function f(n) to compute the number of valid arrangements for n pairs, counting Hamiltonian paths in a graph where nodes represent shoes and edges indicate valid transitions, and subtracting invalid arrangements from the total number of permutations. Each of these approaches encountered challenges of its own and failed to provide a definitive solution.

The Final Answer

The final answer given by DS R1 looks like this:

To solve the problem of arranging three different pairs of shoes such that no left shoe is next to a right shoe from a different pair, we need to consider the constraints and possible valid arrangements.

1. **Understanding the Constraints**:
- Each pair consists of a left shoe (L1, L2, L3) and a right shoe (R1, R2, R3).
- No left shoe can be adjacent to a right shoe from a different pair.
- Left shoes can be adjacent to other left shoes, and right shoes can be adjacent to other right shoes.

2. **Possible Arrangements**:
- Each pair can either be placed together (as LR or RL) or separated.
- If separated, the left and right shoes must be placed in separate groups (all lefts together and all rights together).

3. **Case Analysis**:
- **Case 1: All pairs are separated**:
  - All left shoes are grouped together, and all right shoes are grouped together.
  - Number of ways: \(2 \times 3! \times 3! = 2 \times 6 \times 6 = 72\).
- **Case 2: One pair is together, and the other two are separated**:
  - Choose which pair to keep together (3 choices), arrange the pair (2 ways), and arrange the remaining left and right shoes.
  - Number of ways: \(3 \times 2 \times 2! \times 2! = 3 \times 2 \times 2 \times 2 = 24\).
- **Case 3: Two pairs are together, and one is separated**:
  - Choose which pairs to keep together (3 choices), arrange the pairs (2² ways), and arrange the remaining left and right shoes.
  - Number of ways: \(3 \times 2^2 \times 1! \times 1! = 3 \times 4 \times 1 \times 1 = 12\).
- **Case 4: All three pairs are together**:
  - Arrange each pair in 2 ways (LR or RL) and permute the three pairs.
  - Number of ways: \(2^3 \times 3! = 8 \times 6 = 48\).

4. **Summing All Cases**:
- Total number of valid arrangements: \(72 + 24 + 12 + 48 = 96\).

Thus, the number of ways to line up the six shoes under the given constraints is \(\boxed{96}\).

As you can see, cases 1, 3, and 4 are all incorrect! The correct case counts are 0, 24, 24, and 12, for a total of 0 + 24 + 24 + 12 = 60.
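
The correct counts are easy to confirm by brute force, since there are only 720 lineups to try. The following is a minimal sketch and not part of R1's output; the shoe labels and the helper names `is_valid` and `pairs_together` are my own assumptions. It tallies valid lineups by the number of pairs kept together, which corresponds to the model's Cases 1 through 4, and also re-adds the model's own four case counts.

```python
from collections import Counter
from itertools import permutations

# Side letter plus pair number; this labeling is an assumption of the sketch.
shoes = ["L1", "L2", "L3", "R1", "R2", "R3"]


def is_valid(lineup):
    """True if every left-right contact is between shoes of the same pair."""
    return all(a[0] == b[0] or a[1] == b[1] for a, b in zip(lineup, lineup[1:]))


def pairs_together(lineup):
    """Number of pairs whose two shoes are next to each other."""
    return sum(a[0] != b[0] and a[1] == b[1] for a, b in zip(lineup, lineup[1:]))


# Tally valid lineups by how many pairs are kept together (0 to 3).
by_case = Counter(pairs_together(p) for p in permutations(shoes) if is_valid(p))

print([by_case[k] for k in range(4)])  # [0, 24, 24, 12]
print(sum(by_case.values()))           # 60
print(72 + 24 + 12 + 48)               # 156, the sum of the model's four cases
```

The tally gives 0, 24, 24 and 12, and the last line shows that the model's four case counts add up to 156 rather than the 96 it reported.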

A list of solutions to this problem is available on AoPS (Art of Problem Solving).

Drawing from our quick analysis of DeepSeek R1’s problem-solving behavior, several significant insights emerge:

1. The model exhibits remarkably human-like cognitive patterns in mathematical problem-solving, including:
- Getting stuck in inefficient thinking loops
- Being distracted by tangential ideas at critical moments
- Showing resistance to corrections
- Demonstrating both moments of insight and periods of confusion

2. From a learning theory perspective, the model’s behavior illuminates key aspects of effective problem-solving:
- The importance of strategic approach selection before detailed analysis
- The value of recognizing when one is stuck in unproductive thinking
- The need for structured criteria for when to persist versus backtrack
- The role of metacognition in monitoring one’s problem-solving process

Intriguingly, the model’s verbose thinking process serves as a mirror for human cognition, suggesting that these patterns might be inherent to complex problem-solving rather than unique to biological or artificial intelligence. This parallel offers valuable opportunities for both AI development and human education, particularly in developing more effective problem-solving strategies and metacognitive skills.
