Universal Character Set Transformation Format, 8-bit (UTF-8)

Internally, web browsers use 4-byte “wide characters”

  • in C/C++ it’s the wchar_t type (4 bytes on Linux and macOS, 2 bytes on Windows)
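
A minimal C sketch of what "wide" means in practice: every character in a wide string occupies sizeof(wchar_t) bytes, whatever its value. The helper name wide_bytes is hypothetical (not a standard function), and because the width of wchar_t is platform-dependent, the code computes it rather than assuming 4.

```c
#include <stddef.h>
#include <wchar.h>

/* Bytes occupied by a wide string, excluding the terminating L'\0'.
 * wide_bytes is a hypothetical helper used only for illustration. */
size_t wide_bytes(const wchar_t *s)
{
    /* Fixed-width storage: each character costs sizeof(wchar_t) bytes,
     * even plain ASCII letters that would fit in a single byte. */
    return wcslen(s) * sizeof(wchar_t);
}
```

On a platform where wchar_t is 4 bytes, wide_bytes(L"abc") returns 12, although the same three ASCII letters need only 3 bytes in a variable-width encoding.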

UTF-8 uses a multi-byte variable-width encoding: