rasbt/reasoning-from-scratch

Daily Information Dashboard · 2026-03-02
Category: Open-source project
Source: github_search
Score: 1
Published: 2026-03-02T01:46:14Z

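AI Summary

rasbt/reasoning-from-scratch provides the code and notebooks accompanying *Build a Reasoning Model (From Scratch)*. It walks step by step through adding reasoning capabilities to an open-source base LLM (Qwen3), helping readers understand how reasoning models are built and evaluated.
#GitHub #repo #open-source #Qwen3 #inference-time scaling #self-refinement #GRPO

Content Excerpt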

Build A Reasoning Model (From Scratch)

This repository contains the code for developing an LLM reasoning model and is the official code repository for the book *Build a Reasoning Model (From Scratch)*.

<br>
<br>

<a href="https://mng.bz/lZ5B"><img src="https://sebastianraschka.com/images/reasoning-from-scratch-images/cover.webp?123" width="250px"></a>

(Printed in color.)

<br>

In *Build a Reasoning Model (From Scratch)*, you will learn and understand how a reasoning large language model (LLM) works.

Reasoning is one of the most exciting and important recent advances in improving LLMs, but it’s also one of the easiest to misunderstand if you only hear the term reasoning and read about it in theory. This is why this book takes a hands-on approach. We will start with a pre-trained base LLM and then add reasoning capabilities ourselves, step by step in code, so you can see exactly how it works.

The methods described in this book walk you through the process of developing your own small-but-functional reasoning model for educational purposes. It mirrors the approaches used in creating large-scale reasoning models such as DeepSeek R1, GPT-5 Thinking, and others. In addition, this book includes code for loading the weights of existing, pretrained models.

- Link to the official source code repository
- Link to the book at Manning (the publisher's website)
- Link to the book page on Amazon.com (TBD)
- ISBN 9781633434677

<br>
<br>

To download a copy of this repository, click on the Download ZIP button or clone it from your terminal with `git clone https://github.com/rasbt/reasoning-from-scratch.git`.

<br>
**Tip:**
Chapter 2 provides additional tips on installing Python, managing Python packages, and setting up your coding environment.

<br>
<br>
Table of Contents (In Progress)

Troubleshooting Guide

| Chapter Title | Main Code |
| ----------------------------------------------------------- | ------------------------------------------------------------ |
| Ch 1: Understanding Reasoning Models | No code |
| Ch 2: Generating Text with a Pre-trained LLM | - ch02_main.ipynb<br/>- ch02_exercise-solutions.ipynb |
| Ch 3: Evaluating Reasoning Models | - ch03_main.ipynb<br/>- ch03_exercise-solutions.ipynb |
| Ch 4: Improving Reasoning with Inference-Time Scaling | - ch04_main.ipynb<br/>- ch04_exercise-solutions.ipynb |
| Ch 5: Inference-Time Scaling via Self-Refinement | - ch05_main.ipynb<br/>- ch05_exercise-solutions.ipynb |
| Ch 6: Training Reasoning Models with Reinforcement Learning | - ch06_main.ipynb<br/>- ch06_exercise-solutions.ipynb |
| Ch 7: Improving GRPO for Reinforcement Learning | - ch07_main.ipynb<br/>- ch07_exercise-solutions.ipynb |
| Ch 8: Distilling Reasoning Models for Efficient Reasoning | TBA |
| Appendix A: References and Further Reading | No code |
| Appendix B: Exercise Solutions | Code and solutions are in each chapter's subfolder |
| Appendix C: Qwen3 LLM Source Code | - chC_main.ipynb |
| Appendix D | TBA |
| Appendix E | TBA |
| Appendix F: Common Approaches to LLM Evaluation | - chF_main.ipynb |
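
As a rough illustration of the idea behind the GRPO chapters listed above, the sketch below computes group-relative advantages: several answers are sampled for the same prompt, each receives a scalar reward, and each reward is normalized against the mean and standard deviation of its own group. This follows the commonly published GRPO formulation and is not taken from the book or this repository. The function name `group_relative_advantages`, the `eps` value, and the example rewards are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group.

    rewards: scalar rewards for several answers sampled from ONE prompt.
    eps: small constant (an assumption) that avoids division by zero
         when every answer in the group received the same reward.
    """
    mu = mean(rewards)
    # Population standard deviation of the group; some implementations
    # use the sample standard deviation instead.
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical example: 4 sampled answers, reward 1.0 if correct else 0.0.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Correct answers end up with positive advantages and incorrect ones with negative advantages, so no separate value model is needed to provide a baseline. When all answers in a group receive the same reward, every advantage is zero and that group contributes no learning signal.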

<br>
&nbsp;

The mental model below summarizes the main techniques covered in this book.

<img src="https://sebastianraschka.com/images/reasoning-from-scratch-images/mental-model.webp" width="650px">

<br>

&nbsp;
Companion Book

Please note that *Build A Reasoning Model (From Scratch)* is a standalone book focused on methods to improve LLM reasoning.

In this book, we work with a pre-trained open-source base LLM (Qwen3), on top of which we code reasoning methods from scratch. This includes inference-time scaling, reinforcement learning, and distillation.

However, if you are interested in understanding how a conventional base LLM is implemented, you may like my previous book, *Build a Large Language Model (From Scratch)*.

<a href="https://amzn.to/4fqvn0D"><img src="https://sebastianraschka.com/images/LLMs-from-scratch-images/cover.jpg?123" width="120px"></a>

- Amazon link
- Manning link
- GitHub repository

<br>
&nbsp;
Hardware Requirements

The code in the main chapters of this book is designed to run mostly on consumer hardware within a reasonable timeframe and does not require specialized server hardware. This approach ensures that a wide audience can engage with the material. Additionally, the code automatically utilizes GPUs if they are available. Chapters 2-4 work well on both CPUs and GPUs. For chapters 5 and 6, a GPU is recommended if you want to replicate the results in those chapters.
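
Automatic GPU use of the kind described above is typically a small device-selection step. The following is a minimal sketch of one common PyTorch pattern, not the repository's actual code. The helper name `pick_device` is hypothetical, and the `ImportError` fallback is only there so the snippet also runs where PyTorch is not installed.

```python
def pick_device():
    """Return a device string, preferring a GPU when one is usable."""
    try:
        import torch  # optional here so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"

    if torch.cuda.is_available():  # NVIDIA GPUs (CUDA)
        return "cuda"
    # Apple Silicon GPUs; guarded because older PyTorch builds lack this backend
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"Using device: {device}")
```

The resulting string can then be passed to calls such as `model.to(device)`, so the same notebook code runs unchanged on CPU-only and GPU machines.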

(Please see the setup_tips doc for additional recommendations.)

&nbsp;
Exercises

Each chapter of the book includes several exercises. The solutions are summarized in Appendix B, and the corresponding code notebooks are available in the main chapter folders of this repository (for example, ch02/01_main-chapter-code/ch02_exercise-solutions.ipynb).

&nbsp;
Bonus Material

Several folders contain optional materials as a bonus for interested readers:

- **Chapter 2: Generating Text with a Pre-trained LLM**
  - Optional Python Setup and Cloud GPU Recommendations
  - Using a GPU-optimized version of the LLM
  - Using torch.compile() on Windows
  - Run inference and chat with the model
- **Chapter 3: Evaluating LLMs**
  - MATH-500 Verifier Scripts
  - Advanced Parser (hybrid LaTeX parser)
- **Chapter 4: Improving Reasoning with Inference-Time Scaling**
  - Inference Scaling on MATH-500 (CoT prompting, self-consistency)
- **Chapter 5: Inference-Time Scaling Via Self-Refinement**
  - More Inference Scaling on MATH-500 (Best-of-N, self-refinement)
- **Chapter 6: Training Reasoning Models with Reinforcement Learning**
  - GRPO scripts with a batched mode
- **Chapter 7: Improving GRPO for Reinforcement Learning**
  - Advanced GRPO scripts (including DeepSeek-V3.2-, Olmo3-, and GDPO-style training)
- **Appendix F: Common Approaches to LLM Evaluation**
  - MMLU Evaluation Methods
  - LLM leaderboards
  - LLM-as-a-judge
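
Self-consistency, mentioned in the Chapter 4 bonus material above, can be summarized as sampling several responses to the same question and keeping the final answer that appears most often. The sketch below shows only that voting step, assuming the final answers have already been extracted and normalized to strings. The function name `majority_vote` and the sample data are hypothetical and are not taken from the repository's scripts.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer and its share of the votes.

    answers: final answers extracted from several sampled responses
             to the same question (strings, already normalized).
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    # On a tie, most_common keeps the answer that was seen first.
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)


# Hypothetical final answers from 5 sampled chain-of-thought responses.
sampled = ["42", "41", "42", "42", "7"]
answer, share = majority_vote(sampled)
print(answer, share)  # prints: 42 0.6
```

The vote share doubles as a rough confidence signal: a low share suggests the samples disagree and the question may warrant more samples or a different method.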

&nbsp;
Questions, Feedback, and Contributing to This Repository

For common problems, please see the Troubleshooting Guide.…