tiktoken
This is what OpenAI uses. It uses Byte Pair Encoding (BPE) under the hood. Essentially, you start with a base tokenization scheme with a very small vocabulary (256 tokens, one per byte). Then you iteratively merge the most common adjacent pair of tokens into a new token, repeating until the vocabulary reaches the target size.
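A toy sketch of that training loop (illustrative only, not tiktoken's actual implementation; function and variable names are made up):

```python
from collections import Counter

def train_bpe(text: str, num_merges: int):
    """Toy BPE: start from raw bytes, repeatedly merge the most common adjacent pair."""
    ids = list(text.encode("utf-8"))        # base vocabulary: one token per byte (0-255)
    merges = {}                             # (token_a, token_b) -> new token id
    next_id = 256
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))  # count adjacent token pairs
        if not pairs:
            break
        pair = max(pairs, key=pairs.get)    # most frequent pair gets merged
        merges[pair] = next_id
        # replace every occurrence of the pair with the new token id
        new_ids, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
                new_ids.append(next_id)
                i += 2
            else:
                new_ids.append(ids[i])
                i += 1
        ids = new_ids
        next_id += 1
    return ids, merges
```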
GPT-2 has a vocabulary of ~50k tokens (50,257).
The GPT-4 base model uses ~100k tokens (the cl100k_base encoding).
You can see this directly with the tiktoken library:
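A minimal check, assuming the tiktoken Python package is installed (the "gpt2" and cl100k_base encoding names are the standard ones tiktoken ships):

```python
import tiktoken

# GPT-2 encoding: ~50k tokens
gpt2_enc = tiktoken.get_encoding("gpt2")
print(gpt2_enc.n_vocab)        # 50257

# GPT-4 base encoding (cl100k_base): ~100k tokens
gpt4_enc = tiktoken.encoding_for_model("gpt-4")
print(gpt4_enc.n_vocab)        # 100277

tokens = gpt4_enc.encode("Byte Pair Encoding merges common byte sequences.")
print(tokens)                  # list of token ids
print(gpt4_enc.decode(tokens)) # round-trips back to the original string
```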
What uses tiktoken?