Had a great tip on Discord about tokenizers, which pointed me to the quicktour documentation: https://huggingface.co/docs/tokenizers/python/latest/quicktour.html#using-a-pretrained-tokenizer

And sure enough, this seems to work:

>>> import tokenizers
>>> from tokenizers import Tokenizer
>>> tokenizer = Tokenizer.from_pretrained("TheBloke/Llama-2-70B-fp16")
Downloaded 1.76MiB in 0s
>>> tokenizer.encode("hello world")
Encoding(num_tokens=3, attributes=[ids, type_ids, tokens, offsets, attention_mask, special_tokens_mask, overflowing])
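To get an actual count out of that Encoding object, its ids attribute (one of the attributes listed in the repr above) holds one entry per token, so its length is the token count. A small continuation of the same session:

>>> encoded = tokenizer.encode("hello world")
>>> len(encoded.ids)
3

That 3 matches the num_tokens=3 shown in the Encoding repr.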