What is the Absolute Zero Reasoner (AZR)?
Absolute Zero Reasoner (AZR), developed by researchers from Tsinghua University, the Beijing Institute for General Artificial Intelligence, and Pennsylvania State University, is an artificial intelligence model designed to develop reasoning abilities on its own. Its defining feature is that it generates tasks for itself and then solves them. The learning signal comes from checking those solutions with an objective, external mechanism (in this case, a code executor), so no human-prepared training data is required. The model is trained under a variant of RLVR (Reinforcement Learning with Verifiable Rewards) called “Absolute Zero,” meaning its development is driven by rewards computed from the verifiable results of its own work. Sounds complicated? Let me explain!
In short, AZR proposes tasks chosen to maximize its own learning progress, solves them, and improves from the feedback. Most importantly, it does this without any external, human-prepared data. It’s a bit like giving AI a sandbox and a shovel: it starts building increasingly complex castles, learning from every grain of sand.
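To make the propose, solve, verify loop a bit more concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: `propose_task`, `solve`, the `skill` number, and the update rule are all hypothetical stand-ins for a real language model and a real reinforcement learning update. The only part meant to mirror the idea described above is that the reward comes from actually executing code rather than from human labels.

```python
import random


def propose_task(rng):
    """Hypothetical proposer: emits a tiny program plus an input for it."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    program = f"def f(x):\n    return {a} * x + {b}\n"
    return program, rng.randint(0, 10)


def execute(program, x):
    """Code executor: runs the program to obtain the ground-truth output."""
    namespace = {}
    exec(program, namespace)  # toy only; a real system would sandbox this
    return namespace["f"](x)


def solve(program, x, rng, skill):
    """Hypothetical solver: right with probability `skill`, otherwise guesses.

    It peeks at the executor only to simulate a model that has learned the task.
    """
    if rng.random() < skill:
        return execute(program, x)
    return rng.randint(0, 100)


def verifiable_reward(program, x, answer):
    """Reward is 1.0 only if the answer matches what the executor returns."""
    return 1.0 if answer == execute(program, x) else 0.0


def self_play(steps=200, seed=0):
    rng = random.Random(seed)
    skill = 0.1  # assumed starting ability of the toy solver
    rewards = []
    for _ in range(steps):
        program, x = propose_task(rng)
        answer = solve(program, x, rng, skill)
        reward = verifiable_reward(program, x, answer)
        rewards.append(reward)
        # Stand-in for a policy update: verified successes raise skill.
        skill = min(1.0, skill + 0.01 * reward)
    return rewards, skill


if __name__ == "__main__":
    rewards, skill = self_play()
    print(f"mean reward: {sum(rewards) / len(rewards):.2f}, final skill: {skill:.2f}")
```

Notice that no dataset appears anywhere in the loop: every task is generated on the fly, and the executor alone decides whether an answer earns a reward.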





