PointNet

https://arxiv.org/abs/1612.00593

There is a need for 3D deep learning.

However, 3D data can be represented in several ways:

  • Point Cloud
  • Mesh
  • Volumetric
  • Projected View RGB(D)

The point cloud is the representation closest to raw sensor data, and it is canonical.

Invariance: the network needs permutation invariance, since the points in a cloud have no fixed order. Max pooling gives the best performance.

PointNet offers a unified approach to various 3D recognition tasks:

  • Classification
  • Part Segmentation
  • Semantic Segmentation

The paper talks about symmetric functions, i.e. functions that are invariant to the order of their inputs:

  • Like the max function, or the sum function
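To make the idea concrete, here is a minimal sketch of aggregating per-point features with a symmetric function. The names `point_feature` and `pool` are hypothetical, and `point_feature` is only a hand-written stand-in for whatever per-point function a network would learn. The point is just that applying the same function to every point and then reducing with max or sum gives the same result for any ordering of the points.

```python
import random

def point_feature(p):
    # Hypothetical stand-in for a learned per-point function;
    # returns a small feature vector for one (x, y, z) point.
    x, y, z = p
    return [x + y, y * z, x - z]

def pool(points, reduce_fn):
    # Apply the same function to every point, then aggregate each
    # feature dimension with a symmetric function (max or sum).
    features = [point_feature(p) for p in points]
    return [reduce_fn(column) for column in zip(*features)]

cloud = [(0.0, 1.0, 2.0), (1.0, 0.5, -1.0), (-2.0, 3.0, 0.5), (0.25, 0.25, 4.0)]

# Same points, different order.
shuffled = cloud[:]
random.Random(0).shuffle(shuffled)

max_original = pool(cloud, max)
max_shuffled = pool(shuffled, max)
sum_original = pool(cloud, sum)
sum_shuffled = pool(shuffled, sum)

print(max_original == max_shuffled)  # order does not affect max
print(sum_original == sum_shuffled)  # order does not affect sum
```

The coordinates are chosen to be exactly representable in floating point, so the sum comparison is exact here. With arbitrary floats, sum is still symmetric mathematically but can differ in the last bits depending on order.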

Farthest point sampling is used to select centroids: each new centroid is the point farthest from the centroids already chosen.
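A rough sketch of greedy farthest point sampling, assuming Euclidean distance and an arbitrary starting index (`start=0` is an assumption here; implementations often pick the first centroid at random). The function name and signature are illustrative, not taken from any particular codebase.

```python
import math

def farthest_point_sampling(points, k, start=0):
    # Greedy selection: begin at index `start` (an arbitrary choice here),
    # then repeatedly take the point whose distance to its nearest
    # already-chosen centroid is largest.
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [start]
    # min_dist[i] = distance from point i to its closest chosen centroid.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        min_dist = [min(d, math.dist(p, points[nxt]))
                    for d, p in zip(min_dist, points)]
    return chosen

points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.1, 0.0, 0.0),
    (0.0, 5.0, 0.0),
    (0.0, 0.0, 3.0),
]
centroids = farthest_point_sampling(points, 3)
print(centroids)  # indices of the selected centroids
```

Keeping a running `min_dist` list means each new centroid costs one pass over the points, so selecting k centroids from n points takes O(n·k) distance computations.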