Improving our tokenization