SlowFast

Two-pathway 3D CNN for video recognition (Feichtenhofer et al., ICCV 2019, https://arxiv.org/pdf/1812.03982.pdf). Inspired by retinal P-cells (slow, color, detail) vs. M-cells (fast, motion):

Slow pathway: low frame rate ($\sim$2 fps), high channel capacity; captures spatial semantics
Fast pathway: high frame rate ($\sim$16 fps), $\sim$1/8 the channels; captures motion

Lateral connections fuse fast→slow features. The asymmetric channel split keeps the fast pathway cheap despite its 8× temporal resolution.
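
The frame-rate and channel asymmetry can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 64-frame raw clip and Slow stride `tau = 16` are assumed values (only the 8× frame-rate ratio `alpha` and the 1/8 channel ratio `beta` come from the description above). The cost model, conv cost proportional to frames × input channels × output channels, is a back-of-envelope assumption that ignores the lateral connections and any layers that do not follow the channel ratio.

```python
from fractions import Fraction


def pathway_frame_indices(num_frames=64, tau=16, alpha=8):
    """Return (slow, fast) frame indices sampled from one raw clip.

    tau   -- temporal stride of the Slow pathway (assumed value)
    alpha -- how many times denser the Fast pathway samples frames
    """
    if tau % alpha != 0:
        raise ValueError("tau must be divisible by alpha")
    slow = list(range(0, num_frames, tau))           # every tau-th frame
    fast = list(range(0, num_frames, tau // alpha))  # alpha times as many frames
    return slow, fast


def fast_to_slow_cost_ratio(alpha=8, beta=Fraction(1, 8)):
    """Rough Fast/Slow conv-cost ratio under the assumed cost model.

    Cost ~ frames * channels_in * channels_out, so the Fast pathway pays
    alpha times more for frames but beta**2 times less for channels.
    """
    return alpha * beta ** 2


slow, fast = pathway_frame_indices()
print(len(slow), len(fast))       # 4 32
print(fast_to_slow_cost_ratio())  # 1/8
```

With 30 fps source video, strides of 16 and 2 correspond to roughly 2 fps and 15 fps, in line with the figures above. Under this crude model the Fast pathway costs about one eighth of the Slow pathway even though it sees eight times as many frames, because the channel reduction enters quadratically.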

CS231n 2025 Lec 10 lists SlowFast (with Nonlocal block) at 79.8% Kinetics-400 top-1 accuracy, placing it between I3D (74.2%) and the modern ViT-style video models (MViTv2-L 86.1%, VideoMAE V2-g 90%).

Action Classification
Convolutional Neural Network

🛠️ Steven Gong
