MIN GPT Copy

open source
Jan 2026 - Jan 2026
Python · PyTorch · GPT · Transformer · NLP · Machine Learning

Educational PyTorch re-implementation of GPT training and inference (fork of Karpathy's minGPT)

A clean, readable PyTorch re-implementation of GPT (training and inference) based on Andrej Karpathy's minGPT. The core model is ~300 lines of code: a standard Transformer decoder (masked self-attention, feed-forward, layer norm) with a BPE tokeniser matching OpenAI's GPT encoding. Includes demo notebooks for a sorting task and GPT-2 text generation, a character-level language model project, and an addition task trained from scratch.
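The heart of the ~300-line model is the masked (causal) self-attention layer. A minimal sketch in the spirit of mingpt/model.py, simplified for illustration (joint q/k/v projection, no dropout; class and variable names here are illustrative, not the repo's exact API):

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Masked multi-head self-attention: each position may only
    attend to itself and earlier positions (simplified sketch)."""

    def __init__(self, n_embd: int, n_head: int, block_size: int):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)   # joint q, k, v projection
        self.proj = nn.Linear(n_embd, n_embd)      # output projection
        # lower-triangular mask forbids attending to future tokens
        mask = torch.tril(torch.ones(block_size, block_size))
        self.register_buffer("mask", mask.view(1, 1, block_size, block_size))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape to (batch, heads, tokens, head_dim)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # scaled dot-product attention with the causal mask applied
        att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))
        att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y)

x = torch.randn(2, 8, 32)  # (batch, tokens, embedding)
attn = CausalSelfAttention(n_embd=32, n_head=4, block_size=16)
print(attn(x).shape)       # same (batch, tokens, embedding) shape out
```

In the full model this layer is stacked with a feed-forward MLP and layer norms to form each Transformer block.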

~300-line educational PyTorch GPT implementation for learning Transformer internals.

• Model (mingpt/model.py): decoder-only Transformer with masked self-attention heads, feed-forward layers, layer norm, and configurable depth/width. Supports the gpt2, gpt2-medium, gpt2-large, and gpt2-xl presets.
• BPE tokeniser (mingpt/bpe.py): byte-pair encoding matching OpenAI's GPT-2 vocabulary (50,257 tokens).
• Trainer (mingpt/trainer.py): generic PyTorch training loop with the AdamW optimiser and configurable learning rate, batch size, and max_iters.
• Projects: adder (trains GPT to add numbers), chargpt (character-level LM on arbitrary text), demo.ipynb (sorting example), and generate.ipynb (GPT-2 text generation from a prompt).
• Installation: pip install -e . installs mingpt as an importable library.
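The gpt2 … gpt2-xl presets differ only in depth and width. A quick sanity check of the published GPT-2 family sizes, with a back-of-the-envelope parameter count (the exact config dict layout in mingpt/model.py is an assumption; the sizes follow OpenAI's published GPT-2 configurations):

```python
# Published GPT-2 family sizes; exact preset names follow the repo's
# description, the dict layout here is illustrative.
VOCAB_SIZE, BLOCK_SIZE = 50257, 1024

PRESETS = {
    "gpt2":        dict(n_layer=12, n_head=12, n_embd=768),
    "gpt2-medium": dict(n_layer=24, n_head=16, n_embd=1024),
    "gpt2-large":  dict(n_layer=36, n_head=20, n_embd=1280),
    "gpt2-xl":     dict(n_layer=48, n_head=25, n_embd=1600),
}

def approx_params(n_layer, n_head, n_embd):
    """Each block holds ~4*d^2 attention weights + ~8*d^2 MLP weights,
    plus token and position embeddings (the output head is weight-tied)."""
    return 12 * n_layer * n_embd**2 + (VOCAB_SIZE + BLOCK_SIZE) * n_embd

for name, cfg in PRESETS.items():
    print(f"{name:12s} ~{approx_params(**cfg)/1e6:.0f}M params")
# gpt2 lands at ~124M, matching the smallest released GPT-2 checkpoint
```

This estimate ignores biases and layer-norm gains, which contribute well under 1% of the total.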