PointNet
Paper: https://arxiv.org/abs/1612.00593
There is a need for 3D deep learning.
However, 3D data has several possible representations:
- Point Cloud
- Mesh
- Volumetric
- Projected View RGB(D)
A point cloud is the closest representation to raw sensor data, and it is canonical: the other representations can be converted to and from point clouds.
Invariance: the network must be permutation invariant. Among the symmetric aggregators, max pooling gives the best performance.
PointNet offers a unified approach to various 3D recognition tasks:
- Classification
- Part Segmentation
- Semantic Segmentation
The key idea is symmetric (permutation-invariant) functions, whose output does not depend on the order of their inputs:
- Like the max function, or the sum function
Farthest point sampling is used to select centroids (in the follow-up, PointNet++).
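A minimal NumPy sketch of greedy farthest point sampling, to make the centroid-selection step concrete. The function name, the `start_index` parameter, and the toy data are illustrative assumptions, not taken from the paper or the lecture.

```python
import numpy as np

def farthest_point_sampling(points, num_samples, start_index=0):
    """Greedily pick indices so each new point is farthest from those already chosen."""
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    if not 1 <= num_samples <= n:
        raise ValueError("num_samples must be between 1 and the number of points")
    selected = np.empty(num_samples, dtype=int)
    selected[0] = start_index
    # Squared distance from every point to its nearest selected centroid so far.
    min_sq_dist = np.sum((points - points[start_index]) ** 2, axis=1)
    for i in range(1, num_samples):
        selected[i] = int(np.argmax(min_sq_dist))
        new_sq_dist = np.sum((points - points[selected[i]]) ** 2, axis=1)
        min_sq_dist = np.minimum(min_sq_dist, new_sq_dist)
    return selected

# Toy cloud on a line: x = 0, 1, 10, 10.5, 5 (hypothetical data).
line = np.array([[0, 0, 0], [1, 0, 0], [10, 0, 0], [10.5, 0, 0], [5, 0, 0]])
centroids = farthest_point_sampling(line, 3)  # indices [0, 3, 4]
```

Starting from index 0, the sketch picks the far end of the line first and then the midpoint, which is the "spread out" behavior that makes the method useful for choosing centroids.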
Walkthrough (CS231n 2025 Lec 15)
The invariances a point-cloud network must satisfy
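A point cloud is an unordered set $\{x_i\}_{i=1}^N$ with $x_i \in \mathbb{R}^3$ (possibly with RGB). Two invariances:
- Permutation invariance: $f(x_1, \dots, x_N) = f(x_{\pi(1)}, \dots, x_{\pi(N)})$ for any permutation $\pi$. The ordering in the input list is arbitrary.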
- Sampling invariance: output should depend only on the underlying geometry, not on which subset of points was sampled.
Symmetric-function decomposition
A function is symmetric if it's invariant to permutations of its inputs. Simple examples: $\max(x_1, \dots, x_N)$, $\sum_i x_i$. PointNet uses the factorization $f(x_1, \dots, x_N) = \gamma\big(g(h(x_1), \dots, h(x_N))\big)$:
- $h$: shared per-point MLP.
- $g$: a symmetric aggregator (PointNet picks max pool over the points per feature channel).
- $\gamma$: final MLP producing the task output.
This is provably a universal approximator over symmetric continuous set functions.
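The decomposition can be sketched in a few lines of NumPy: a shared per-point MLP, a max over the point axis, and a final layer. The layer sizes, the random untrained weights, and the function names are assumptions for illustration; this is not the full PointNet architecture (no input/feature transforms, no training).

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical layer sizes and random weights, for illustration only.
W_h1, b_h1 = rng.normal(size=(3, 16)), np.zeros(16)
W_h2, b_h2 = rng.normal(size=(16, 32)), np.zeros(32)
W_gamma, b_gamma = rng.normal(size=(32, 4)), np.zeros(4)

def h(points):
    """Shared per-point MLP: the same weights are applied to every row (point)."""
    return relu(relu(points @ W_h1 + b_h1) @ W_h2 + b_h2)

def g(features):
    """Symmetric aggregator: max over the point axis, per feature channel."""
    return features.max(axis=0)

def gamma(global_feature):
    """Final layer mapping the global feature to task outputs (e.g. class scores)."""
    return global_feature @ W_gamma + b_gamma

def pointnet_forward(points):
    return gamma(g(h(points)))

cloud = rng.normal(size=(128, 3))                  # N = 128 points in R^3
shuffled = cloud[rng.permutation(len(cloud))]      # same set, different order
same_output = np.allclose(pointnet_forward(cloud), pointnet_forward(shuffled))
```

Because the only step that mixes information across points is the max over the point axis, reordering the rows of `cloud` leaves the output unchanged, so `same_output` evaluates to `True`.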
Point-cloud distances (for generation / reconstruction)
When the prediction is itself a point cloud, you need a permutation-invariant loss:
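- Chamfer: $d_{CD}(S_1, S_2) = \sum_{x \in S_1} \min_{y \in S_2} \|x - y\|_2^2 + \sum_{y \in S_2} \min_{x \in S_1} \|x - y\|_2^2$. Cheap; the nearest-neighbor matching in each term is one-way (asymmetric in effect), so it is bad at penalizing density mismatches.
- Earth Mover's (EMD): $d_{EMD}(S_1, S_2) = \min_{\phi : S_1 \to S_2} \sum_{x \in S_1} \|x - \phi(x)\|_2$ over bijections $\phi$ (requires equal size). Expensive, but geometrically meaningful.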
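Both distances can be sketched directly from their definitions. The Chamfer version here assumes the common convention of summing squared nearest-neighbor distances in both directions (other write-ups average instead). The EMD version brute-forces every bijection, so it is only usable on tiny sets; practical implementations rely on an assignment solver or an approximation. Function names are illustrative.

```python
import itertools
import numpy as np

def pairwise_sq_dists(a, b):
    """Matrix of squared Euclidean distances, shape (len(a), len(b))."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def chamfer_distance(a, b):
    """Sum of squared nearest-neighbor distances, a -> b plus b -> a."""
    d = pairwise_sq_dists(a, b)
    return d.min(axis=1).sum() + d.min(axis=0).sum()

def emd_brute_force(a, b):
    """Minimum total distance over all bijections; O(n!) so tiny inputs only."""
    if len(a) != len(b):
        raise ValueError("EMD over bijections requires equally sized point sets")
    d = np.sqrt(pairwise_sq_dists(a, b))
    rows = np.arange(len(a))
    return min(d[rows, list(perm)].sum()
               for perm in itertools.permutations(range(len(b))))

a = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
b = np.array([[0.0, 1.0, 0.0], [2.0, 1.0, 0.0]])   # `a` shifted by 1 along y
cd = chamfer_distance(a, b)    # 4.0: four nearest-neighbor matches, each 1^2
emd = emd_brute_force(a, b)    # 2.0: best bijection moves each point by 1
```

Both functions return 0 when one cloud is a reordering of the other, which is the permutation invariance the loss needs.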
Graph extensions
EdgeConv (Wang et al. TOG 2019) treats points as graph nodes and $k$-NN neighborhoods as edges, which lets the per-point $h$ also see local geometric context, not just the point itself.
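A rough NumPy sketch of one EdgeConv-style layer: build a $k$-NN graph, form an edge feature from each point and its offset to a neighbor, apply a shared layer, and max over the neighbors. The single linear layer plus ReLU, the random weights, and the choice to exclude a point from its own neighborhood are simplifying assumptions of this sketch.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each point (excluding itself)."""
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(sq_dist, np.inf)       # a point is not its own neighbor here
    return np.argsort(sq_dist, axis=1)[:, :k]

def edge_conv(points, k, weight, bias):
    """Per-point features from max-pooled edge features over k-NN edges."""
    neighbors = points[knn_indices(points, k)]                        # (n, k, d)
    centers = np.broadcast_to(points[:, None, :], neighbors.shape)    # (n, k, d)
    # Edge input: the point itself plus the offset to its neighbor.
    edge_inputs = np.concatenate([centers, neighbors - centers], axis=-1)  # (n, k, 2d)
    edge_features = np.maximum(edge_inputs @ weight + bias, 0.0)      # (n, k, c)
    return edge_features.max(axis=1)                                  # (n, c)

rng = np.random.default_rng(0)
points = rng.normal(size=(20, 3))
weight, bias = rng.normal(size=(6, 8)), np.zeros(8)   # hypothetical sizes: 2*3 -> 8
features = edge_conv(points, k=4, weight=weight, bias=bias)           # shape (20, 8)
```

The max over neighbors keeps each point's feature independent of neighbor order, and permuting the input points simply permutes the rows of `features`.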
Source
CS231n 2025 Lec 15 slides ~63–75 (permutation / sampling invariance, symmetric-function decomposition $f = \gamma(g(h(x_1), \dots, h(x_N)))$, max-pool aggregator, Chamfer + EMD, graph-on-points / EdgeConv).