🛠️ Steven Gong

May 06, 2025, 1 min read

Vision-Language Model (VLM)

Resources

  • https://huggingface.co/blog/vision_language_pretraining

Models

  • PaliGemma
  • LLaVA
  • Google Gemini

Prismatic VLM is another example.

In robotics, we fine-tune these VLMs into vision-language-action models (VLAs), which output robot actions instead of only text.
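
One common way to fine-tune a VLM for robot control is to have the language model emit actions as discrete tokens. The sketch below is a toy illustration of that idea and is not taken from this note: the bin count, the normalized action range, and the function names `action_to_tokens` and `tokens_to_action` are all assumptions chosen for the example.

```python
# Toy sketch: map continuous robot actions to discrete token ids and back.
# NUM_BINS, LOW, HIGH and both function names are illustrative assumptions.

NUM_BINS = 256          # assumed number of bins per action dimension
LOW, HIGH = -1.0, 1.0   # assumed normalized action range


def action_to_tokens(action):
    """Quantize each action dimension into a bin id in [0, NUM_BINS - 1]."""
    tokens = []
    for value in action:
        clipped = min(max(value, LOW), HIGH)          # keep value inside the range
        fraction = (clipped - LOW) / (HIGH - LOW)     # rescale to [0, 1]
        tokens.append(min(int(fraction * NUM_BINS), NUM_BINS - 1))
    return tokens


def tokens_to_action(tokens):
    """Map bin ids back to the center value of each bin."""
    width = (HIGH - LOW) / NUM_BINS
    return [LOW + (t + 0.5) * width for t in tokens]


if __name__ == "__main__":
    action = [-1.0, 0.0, 0.3, 1.0]
    tokens = action_to_tokens(action)
    print(tokens)
    print(tokens_to_action(tokens))
```

Under these assumptions, the fine-tuning targets are token ids like the ones printed here, so the VLM can be trained with its usual next-token objective. Decoding returns bin centers, so the recovered action differs from the original by at most half a bin width.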
