🛠️ Steven Gong

Jul 16, 2025, 1 min read

Policy

Categorical Policy

Learned this from OpenAI's Spinning Up: https://spinningup.openai.com/en/latest/spinningup/rl_intro.html

A categorical policy is like a classifier over discrete actions.

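  • It’s essentially the same idea used to train a next-token autoregressive model in the LLM world: the network outputs one logit per discrete choice (an action here, a token there), and a softmax turns those logits into a probability distribution to sample from.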
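
To make the classifier analogy concrete, here is a minimal sketch in plain Python. It assumes the logits come from some network that is not shown, and the helper names (`softmax`, `sample_action`, `log_prob`) and the example logits are made up for illustration, not taken from Spinning Up.

```python
import math
import random


def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, rng=random):
    """Sample an action index from the categorical distribution."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def log_prob(logits, action):
    """Log-likelihood of `action`, i.e. log(softmax(logits)[action])."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return logits[action] - log_z


# Hypothetical logits for 3 discrete actions (in practice, a network's output).
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
action = sample_action(logits, random.Random(0))  # seeded for repeatability
print(probs, action, log_prob(logits, action))
```

The log-probability of the chosen index is the quantity both settings work with: a language model maximizes it for the observed next token, and a policy-gradient method weights it by how good the sampled action turned out to be.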

