<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Decoding on Baam's Techlog</title><link>https://baampark.github.io/tags/decoding/</link><description>Recent content in Decoding on Baam's Techlog</description><generator>Hugo -- 0.128.0</generator><language>en-us</language><lastBuildDate>Tue, 03 Jun 2025 15:28:55 -0400</lastBuildDate><atom:link href="https://baampark.github.io/tags/decoding/index.xml" rel="self" type="application/rss+xml"/><item><title>LLM Decoding: Inference in Autoregressive Language Models</title><link>https://baampark.github.io/posts/2025-06-03_llm_decoding/</link><pubDate>Tue, 03 Jun 2025 15:28:55 -0400</pubDate><guid>https://baampark.github.io/posts/2025-06-03_llm_decoding/</guid><description>Most large language models (LLMs) today are autoregressive. Before LLMs, NLP was fragmented: problems like text classification, translation, summarization, and question answering each needed their own models, datasets, and training tricks. Then came GPT-2, and everything changed. GPT-2 is an autoregressive model trained purely on text generation, predicting the next word in a sequence, a process called decoding. Surprisingly, this simple setup made it capable of handling a wide range of NLP tasks, often without fine-tuning.</description></item></channel></rss>