Notes on the new DeepSeek-R1
Before anything, let’s bow to Richard Sutton; he was so early for this. Pure RL, neither Monte-Carlo tree search (MCTS) nor Process