Optical Character Recognition (OCR)
It seems that OCR systems usually split the task into two stages:
- Text Detection (using some sort of Semantic Segmentation)
- Text Recognition
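A toy sketch of the two-stage split, to make the data flow concrete. Everything here is an assumption for illustration: the detector is a simple row-projection heuristic standing in for a real segmentation model, and `recognize` is a hypothetical callable that a real recognizer would replace.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # binary image as rows of pixels: 1 = ink, 0 = background
Box = Tuple[int, int]    # (top_row, bottom_row_exclusive) of one text line


def detect_text_lines(image: Image) -> List[Box]:
    """Stage 1 (stand-in): find horizontal bands of rows that contain ink."""
    boxes: List[Box] = []
    start = None
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                  # a text band begins
        elif not has_ink and start is not None:
            boxes.append((start, y))   # the band ended on the previous row
            start = None
    if start is not None:              # band runs to the bottom of the image
        boxes.append((start, len(image)))
    return boxes


def run_ocr(image: Image, recognize: Callable[[Image], str]) -> List[str]:
    """Stage 2: crop each detected band and hand it to the recognizer."""
    return [recognize(image[top:bottom]) for top, bottom in detect_text_lines(image)]
```

The point of the split is that the recognizer only ever sees one cropped text region at a time, so either stage can be swapped out independently.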
TrOCR (a text recognition model) inference docs: https://huggingface.co/docs/transformers/model_doc/trocr#inference
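A minimal sketch of the recognition stage with TrOCR, following the pattern on the linked page. Assumptions: the `microsoft/trocr-base-handwritten` checkpoint, the `transformers`, `torch` and `Pillow` packages being installed, and a hypothetical input file passed to `main`. TrOCR expects an image of a single text line, and this checkpoint is not trained on LaTeX, so for my use case it would probably need fine-tuning or a different checkpoint.

```python
def load_trocr(checkpoint: str = "microsoft/trocr-base-handwritten"):
    """Load the processor and model (downloads weights on first use)."""
    # Imported here so the rest of the file works without transformers installed.
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    return processor, model


def recognize_text(image, processor, model) -> str:
    """Run one single-line image through the processor and model."""
    # The processor resizes and normalizes the image into a tensor batch.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    # The decoder generates token ids autoregressively.
    generated_ids = model.generate(pixel_values)
    # Decode ids back to a string; the batch holds one image, so take item 0.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


def main(path: str) -> None:
    """Example entry point; `path` is a hypothetical image of one text line."""
    from PIL import Image

    processor, model = load_trocr()
    image = Image.open(path).convert("RGB")
    print(recognize_text(image, processor, model))
```

Calling `main("line.png")` (file name made up) would print the recognized text for that one line.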
I need this for a project that detects LaTeX.
Image to LaTeX
Image to LaTeX, writing my notes here.