Ramsri Goutham
May 12, 2020


Hi Sajjad Dehqani, the preprocessing is done in the tokenizer step: encoding = tokenizer.encode_plus(question, context).

If you want, you can print encoding["input_ids"] or tokenizer.convert_ids_to_tokens(encoding["input_ids"]) to see the preprocessed input tokens.
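To give a rough idea of what that step produces, here is a toy sketch in plain Python. It assumes a BERT-style tokenizer, which frames a question/context pair as [CLS] question [SEP] context [SEP]. The helper name `encode_pair` is made up for illustration, whitespace splitting stands in for the real subword tokenization, and the ids come from a throwaway vocabulary, so the numbers will not match what an actual tokenizer returns.

```python
# Toy illustration only: whitespace splitting stands in for subword tokenization,
# and `encode_pair` is a hypothetical helper, not a library function.

def encode_pair(question, context):
    """Mimic the layout a BERT-style tokenizer produces for a text pair."""
    q_tokens = question.lower().split()
    c_tokens = context.lower().split()

    # Special tokens frame the two segments: [CLS] question [SEP] context [SEP]
    tokens = ["[CLS]"] + q_tokens + ["[SEP]"] + c_tokens + ["[SEP]"]

    # Segment ids: 0 for [CLS], the question and the first [SEP]; 1 for the rest
    token_type_ids = [0] * (len(q_tokens) + 2) + [1] * (len(c_tokens) + 1)

    # Throwaway vocabulary (an assumption): ids are assigned in order of first appearance
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    input_ids = [vocab[tok] for tok in tokens]

    return {
        "tokens": tokens,
        "input_ids": input_ids,
        "token_type_ids": token_type_ids,
    }


encoding = encode_pair("Who wrote it?", "Ramsri wrote it.")
print(encoding["tokens"])
print(encoding["input_ids"])
print(encoding["token_type_ids"])
```

Printing the tokens next to the ids like this is the same check as the convert_ids_to_tokens call above: it lets you confirm where the special tokens land and which part of the input belongs to the question versus the context.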
