This page presents information on AI from the perspective of the University of Ottawa. Researchers at uOttawa have been involved in AI research for years. After ChatGPT launched in 2022, the university and the Teaching and Learning Support Service quickly followed up with guidance on generative AI as it relates to academic integrity, teaching, and learning. As a proudly bilingual institution, uOttawa also has a particular interest in how generative AI tools perform differently across languages.
Most well-known generative AI tools are trained primarily on English-language data. For example, Common Crawl's web crawl data, which is used as part of the training corpus of most large language models, consists of more than 45% English data and less than 5% French data, while all of its identified North American Indigenous languages combined account for less than one thousandth of a percent. In its Educator FAQ, OpenAI acknowledges that ChatGPT "is skewed towards Western views and performs best in English. Some steps to prevent harmful content have only been tested in English." While language models such as Cedille are being developed for French and other languages, the most well-known generative AI tools continue to overrepresent English and underrepresent other languages. Such linguistic imbalances are of particular concern to a bilingual institution such as the University of Ottawa.