Optimal Strategy for Connect 4

(2swap.github.io)

227 points | by marvinborner 3 days ago

11 comments

sillysaurusx 10 hours ago

I'm surprised no one linked to his video on the topic. I can't overstate how high quality it is. The graphs are simply beautiful, and it made me think he had a whole production team behind him. That he was able to do cutting-edge work like this (it's new, which qualifies) while creating a work of art is incredible.
"I Solved Connect 4" https://www.youtube.com/watch?v=KaljD3Q3ct0

[-]
- lifthrasiir 9 hours ago
  
  > [I]t made me think he had a whole production team behind him.
  He has a program instead: https://github.com/2swap/swaptube/blob/1a0d5369d523536d48e4/... [1].
  [1] Note that this commit ID is visible from the video itself: https://youtu.be/KaljD3Q3ct0?t=1002
- laserbeam 6 hours ago
  
  1. The video is amazing. It does deserve a shoutout.
  2. I actually prefer when HN links to articles rather than videos. I kind of expect interesting reads, not watches when I open this forum.
- echelon 9 hours ago
  
  This guy's entire channel is amazing. You have to watch all of it. It's beautiful, mesmerizing, and academically delightful.
- CamelCaseName 8 hours ago
  
  Great channel, thanks for sharing.
  I'm not clear how they came to the steady state solution, or how it is memorizable, or is it just intended to be brute forced at that point?
  At some point long ago, I was bored and memorized a whole bunch of openings and intuitive rules, and would end up with a 90%+ win rate. Lots of fun.
  
  [-]
  - vscode-rest 7 hours ago
    
    Author used flash cards to memorize the steady state positions.
    Deriving the steady state solutions was most interesting to me as well, author just described it as “a genetic algorithm”.
tromp 11 hours ago

> which is fundamentally different from existing strong and weak solutions
It doesn't seem fundamentally different from Victor Allis' solution, but 2swap managed to generalize and streamline the rules available for static solutions, while also picking the winning moves that reduce the overall tree size.
> A weak solution can be visualized in a way that a strong solution (14tb uncompressed, 350gb compressed) cannot.
That is using an overly strict interpretation of strong solution. My database of all roughly 68000 8-ply positions allows for computing the best move from any position within seconds and takes only 12KB compressed (using one trit per 8-ply position, 5 trits per byte, using remaining 256-3^5=13 values for run length encoding).
[1] https://tromp.github.io/c4/c4.html

[-]
- adito 6 hours ago
  
  From 350gb compressed to 12kb is pretty impressive. What makes that particular strong solution needs so much space? It's from one of the paper mentioned right?
  Edit: oh, found it, it's from this repository: https://github.com/sidneycadot/connect4/blob/main/7x6/README...
ajkjk 3 hours ago

I've been intrigued about the thing they call the "data product" for a long time: the fact that there's a sort of equivalence between precomputing position analysis and doing no runtime analysis vs knowing nothing and doing everything at runtime. It is a general property of a lot of algorithms that you can precompute various amounts of computation in order to reduce runtime complexity. It's also the difference between "telling an LLM to do a task" and "telling an LLM to write a bunch of code to do a task", which are also the two ends of a spectrum.
Especially interesting is the fact that the optimal strategy, where on the spectrum you go, is affected by the effectiveness of your algorithms on each side. The more efficiently you can cognitively compress precomputed decisions, the more it makes sense to precompute; the more efficiently you can apply techniques for move selection at runtime. And the two interact a lot: for instance in chess, there are simple heuristics for a lot of endgames, meaning that the state space you explore at runtime can terminate as soon as it gets to an endgame that you have memorized a solution for.
I wonder if anyone knows anyone examining this phenomenon in generality?

[-]
- hydrolox 3 hours ago
  
  isn't this very similar or a case of trading space for time or vice versa (e.g in algorithm analysis)
  
  [-]
  - ajkjk 3 hours ago
    
    That is certainly an example of it, but I feel like what I'm talking about is more general. In algorithms you tend to be able to say concretely exactly how much space vs time you're trading off. But in a game like Connect 4 or chess it's a lot harder to say exactly how much you gain from, say, adding one more heuristic to your board analysis, or memorizing one more opening subtree. I don't know how to think about that mathematically, and I would like to. It's related, I guess, to what the OP says about finding a graph which is information-theoretically small rather than graph-theoretically small.
    Generally it's interesting to contemplate that there may exist heuristics which no one has thought of yet that dramatically improve your overall performance. Actually I thought of an example of this. When AlphaZero and Leela (the ML-based chess engines) started beating Stockfish (the preeminent graph-search chess engine), one of the early things people noticed was that they loved to push their a- and h- pawns down the board, which was, I guess, a sorta passive move that people had systematically undervalued before (cf [1]; I remember seeing a good youtube video about it also but I can't recall what it was). Which means that just having this heuristic was worth some amount of ELO the whole time, just, no one had realized it! Presumably the optimal play for a human in chess, whose mental capacity is limited, is a certain mix of memorizing fixed positions vs heuristics, and the interactions between those are what's interesting.
    [1]: https://www.reddit.com/r/chess/comments/ljp151/why_is_h4_pla...
Someone 3 days ago

FTA: “As a motivating example: player 1 (hereafter dubbed "Red") can win by playing in the center column on the first move and then following the weak solution's suggestions, but would not be guaranteed to win if the first disc is played elsewhere. The weak solution contains no information about what would happen in the other columns- As far as Red cares, it would be redundant to learn those branches, since they are not good.”
I don’t think that “since they are not good” is necessary for a weak solution. Even if every first move were winning, it still would be redundant to learn how to win for every possible opening move.
A weak solution gives you a guaranteed way to move from START to a win, whatever counterplay, not all ways to go from START to a win, whatever counterplay.
cachius 11 hours ago

This guy’s videos are awesome. He also has one on Klotski and the double pendulum. Beautiful graph animations.

[-]
- contraposit 11 hours ago
  
  I also liked the one on lambda calculus. I hope one day we will be able to find interpretation of what it actually means for PLUS Times Plus. Maybe this is how we will explore nonstandard arithmetic.
  What is PLUS times PLUS?
  https://www.youtube.com/watch?v=RcVA8Nj6HEo
  
  [-]
  - sillysaurusx 10 hours ago
    
    Godel's incompleteness theorem lets you turn PLUS into a number, do some operations on it, and then turn it back into a symbol. So PLUS times PLUS already has a definite answer. Perhaps not a sensible one, but a definite one.
    
    [-]
    - pxeger1 9 hours ago
      
      You're talking about Gödel encoding, not Godel's incompleteness theorem.
    - contraposit 7 hours ago
      
      Yes it could just simply be a syntactic sugar for a complex operation taking in 4 numbers. But this reminds me of Mirror Symmetry between two theories in String Theory where complex calculations in one theory gets mapped to simple calculation in another theory. Similarly we might have translation dictionary between standard arithmetic and non-standard arithmetic where complex calculation in standard arithmetic becomes easy calculation in non-standard arithmetic.
- jychang 11 hours ago
  
  OH it's that guy.
  His double pendulum video was orgasmic.
  Edit: Oh wait, no, I was thinking of the Drew's Campfire double pendulum video. That video was extra interesting because the creator is not a typical content producer. He just has a few videos without any views, then dropped what might be one of the best videos of all time, and then went back to his technical videos.
  [1] https://www.youtube.com/watch?v=8jVogdTJESw&t=212s
  
  [-]
  - satvikpendem 6 hours ago
    
    That one is a great video as well, both are good to watch back to back as they go well together.
  - jonahx 5 hours ago
    
    Not sure why this is being downvoted, but I watched the recommended video in a single riveted sitting. Absolutely amazing.
donkeyboy 7 hours ago

Sadly this doesnt have a simple way to know how to win besides the ForceEven approach and offering an Anki Deck. I wish there was some more intuition or guidance around winning that humans can memorize

[-]
- pfg_ 1 hour ago
  
  This is the first human-learnable weak solution for connect 4. Surely it can be improved.
tadasv 4 hours ago

for those who are into books i'd recommend taking a look at alphago simplified by mark liu. It has rule based strategies + DL for connect for and other simple games.
anant_who 7 hours ago

This is the clearest breakdown of Connect 4 I’ve seen.
sipsi 7 hours ago

sounds like 5 dimensional chess to me

Connect 4 is one of those games that looks simple until you realize it was solved in 1988 and the optimal strategy still takes serious compute to run in real time.

[-]

tromp 5 hours ago

Not that serious, since you can solve the 7x6 game from scratch on a typical 2013 single core in a few minutes (and in 67s on an Apple M2):

    $ time ./SearchGame < input.root 
    Fhourstones 3.2 (C)
    Boardsize = 7x6
    Using 8306069 transposition table entries of size 8.
    
    Solving 0-ply position after  . . .
    44+26
    45+28
    42+27
    47+26
    4-29
    +29
    score = 5 (+)  work = 29
    1479113766 pos / 139494 msec = 10603.4 Kpos/sec
    - 0.249  < 0.125  = 0.027  > 0.191  + 0.408
    
    real    2m24.904s

ineedasername 6 hours ago

No, no-- I've seen the movie and I'm pretty sure it was established that the only winning move was not to play. Not ruling out the possibility I'm misremembering, there was more than one game in the movie, it could have been Galaga?

[-]
- tialaramex 2 minutes ago
  
  Tic-Tac-Toe is the existing game the machine realises can't be won, and then Global Thermonuclear War is the next game it simulates and discovers also cannot be won according to the metrics it is using.
  Connect 4 is a win for the first player