Generative AI in Teaching and Learning: Implications

The purpose of this online resource is to assist faculty, instructors, and graduate students in developing an understanding of the potential benefits, complexities, and dilemmas associated with the use of generative AI tools in higher education, and of how this understanding can be applied in their own teaching.

> How do we define 'generative AI' and its impact on higher education?

> What policies exist around the use of 'generative AI' tools?

> What kinds of 'generative AI' tools exist for teaching and learning purposes?

> What is the impact on course design when considering generative AI tools?

> What are a few examples of assignment design that employ generative AI?

> How do we address assessment concerns around generative AI tools in higher education?

What is generative artificial intelligence? 

Generative artificial intelligence (AI) is a term for various models that can produce text, images, video, and other media based on the input data they are given. ChatGPT is one of many such tools that use generative AI to produce text. As a user, you can ask tools such as ChatGPT questions and get seemingly credible answers back. Because ChatGPT is the most widely used service for producing text with generative AI, it will frequently be used as an example in this Toolkit. Other tools, such as Bing Chat from Microsoft and Bard from Google, function similarly to ChatGPT.

Use cases for generative AI models include:

  • Respond to questions and prompts.
  • Analyze, improve, and summarize text.
  • Write computer code or LaTeX.
  • Translate text from one language to another.
  • Create new ideas, prompts, or suggestions for a topic or theme.
  • Generate text with specific attributes, such as tone, sentiment, or formality (see the sketch below).
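
As a minimal sketch of the first and last use cases, the example below sends a prompt to a generative model programmatically, using OpenAI's Python client as one example; the model name and prompts are placeholders chosen for illustration, and other providers expose similar interfaces.

```python
# A minimal sketch of querying a generative model; the model name and
# prompts below are illustrative placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable is set

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder; any available chat model works
    messages=[
        # A system message can set attributes such as tone or formality.
        {"role": "system", "content": "Answer in a formal, academic tone."},
        {"role": "user", "content": "Summarize three uses of generative AI in teaching."},
    ],
)

print(response.choices[0].message.content)
```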

How are generative AI tools trained on data?

Before a generative AI tool can produce content, it must first be trained. Training takes place by feeding the model large amounts of data. During training, connections are identified in the data, and it is these connections that enable the model to produce content that is perceived as innovative or original.

The data used during training will, to a very large extent, influence the results you get, and this is something you have to be aware of when you receive a response from a generative model. In the same way that our opinions and attitudes can be shaped by the information we have, generative models will produce content in line with the data they are trained on. As a user of these services, you can never be sure what data has been used in the training unless the service has made this available in a way that can be reviewed. Although generative AI tools can be useful, domain expertise in the relevant subject area is absolutely necessary to assess the reliability of the content these tools produce.
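
To make this concrete, here is a toy sketch, far simpler than how models like ChatGPT are actually built and using an invented one-sentence "corpus": it "trains" a character-level bigram model by counting which character follows which, then generates new text by sampling from those counts. Change the training text and the generated output changes with it, illustrating how a model's output mirrors its training data.

```python
import random
from collections import defaultdict

# Toy illustration only: real generative models are vastly larger, but
# the principle is the same -- statistical connections learned from the
# training data determine what gets generated.

def train(text):
    """Count which characters follow each character in the training text."""
    counts = defaultdict(list)
    for current, following in zip(text, text[1:]):
        counts[current].append(following)
    return counts

def generate(counts, start, length=80):
    """Generate text by repeatedly sampling an observed next character."""
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:  # dead end: no observed continuation
            break
        out.append(random.choice(followers))
    return "".join(out)

corpus = "generative models produce content in line with the data they are trained on. "
model = train(corpus)  # the "training data" determines everything
print(generate(model, "g"))
```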

For more information on how data can, and has, influenced artificial intelligence systems to produce biased or erroneous results, consider the following article: Rise of AI Puts Spotlight on Bias in Algorithms.

Below are a few implications of the use of these tools:

  • The LLM exists as a result of the data it is trained on

Even if you know what data a model is trained on, it does not necessarily follow that the model produces meaningful results. Briefly put, a model trained on high-quality data will most often produce high-quality content; similarly, a model trained on low-quality data will produce low-quality content.

As a user of these services, you do not necessarily know the quality of the data or how it is processed. If you blindly trust the results of a generative model, you therefore run the risk of relying on unreliable answers and sources. Indeed, there are now reports that document research support for university libraries being impacted by requests for non-existent books and articles.

  • In most cases, you have little control over what happens to the data you submit 

While newer tools are being created that allow users to retain control over their input data and prompts, many of the most widely available generative models on the internet send user information to servers whose location is not necessarily known. The US and the EU are developing strict regulations governing what companies can and cannot do with your information, but for online services there is no guarantee that they operate in the US/EU/EEA or in accordance with these regulations.

Unless the company that supplies the model you use has committed to handling your data in an appropriate, transparent way, you run the risk that your input will be used to train other models or, for various reasons, end up where it was never intended. It is therefore very important that users remain aware of what data they send, just as with most other services on the internet.

  • Language models have a limited scope 

At first glance, it may seem that ChatGPT and similar language models think and reason much as humans do. There are nevertheless several things that ChatGPT simply cannot do because it is a language model. For example, it cannot reliably recall facts, and it will often present "hallucinated" factual errors in a very convincing way.

It is for similar reasons that we must learn to employ these tools for the specific tasks to which they are best suited, ensuring that the humans using the output of these tools in higher education remain the final arbiters of meaning. As the MLA citation guidance for ChatGPT makes clear, ChatGPT and similar tools are not "authors" and do not possess human agency, meaning that they are unable to weigh reasonable alternatives or think logically.

Even if a chatbot is trained on extensive amounts of data from around the world, this does not necessarily mean that it is a reliable source of information. Every time you ask the chatbot a question, it generates several candidate answers in the background, calculates the probability that each is the optimal one, and presents the most likely. The chatbot therefore does not function as a search engine with access to all the information it was trained on, and must not be confused with searching for information in a database.
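
As a rough illustration of this point, the sketch below uses invented probabilities for a handful of candidate answers; a real model assigns likelihoods to individual tokens rather than to whole answers, but the principle is the same: the answer is sampled by probability, not looked up in a database.

```python
import random

# Invented numbers for illustration: candidate continuations of the
# prompt "The capital of Australia is", with assumed probabilities.
candidates = {
    "Canberra": 0.55,
    "Sydney": 0.30,      # plausible-sounding but wrong answers can
    "Melbourne": 0.15,   # carry real probability mass too
}

# The chatbot samples by probability rather than retrieving a stored
# fact, so a wrong answer can be produced with complete confidence.
answer = random.choices(
    population=list(candidates),
    weights=list(candidates.values()),
)[0]
print(f"The capital of Australia is {answer}.")
```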