Evaluating platforms

Whether a tool is good or bad depends on its use and the expectation of the result. On this page, we provide a number of questions that you should consider when using generative AI.

Five key points to evaluate your use

  1. The scope of sources the platform covers

    Depending on your specific use, it may be more or less important to have a correct, trustworthy, and complete answer. For example, if you are working on tasks where you need to know the exact scope of sources, general platforms such as ChatGPT may struggle to provide this information beyond the fact that they have 'scraped' and read large amounts of publicly available text from the internet. If you are simply using generative AI for inspiration, however, this may matter less.

    Other platforms will be able to provide more precise information about the data that underlies their language model, including considerations about how material is processed to end up in a language model.

  2. Content cut-off date

    Many platforms have a content cut-off date: a point in time after which no content has been included in their language model. This is often important for assessing how up-to-date a result is. It can also be relevant to consider whether you can get a sense of how a field has developed over time. Recent assumptions and results may be drowned out by older ones, because the older material dominates the language model's training data.

  3. Norms and values

    Platforms generate their responses (whether text or images) based on the training data that forms the basis of their language model. There may be several factors that make a given result biased. This can manifest itself in several ways:

    • The language model is trained on data that represents a particular point of view, culture, set of stereotypical perceptions, or value system, or the data has been pre-cleaned in accordance with ethical norms defined by the platform.
    • The platform is coded to suppress selected statements in the source material if they go against its ethical and moral norms. If you are investigating skepticism about climate change, for example, will a given platform actually allow these views to emerge?

    Read more about some of these issues in Assessing Cross-Cultural Alignment between ChatGPT and Human Societies: An Empirical Study (Cao et al., C3NLP 2023).

  4. Can the platform generate fictional content?

    Some platforms use language models and algorithms that fill gaps in the knowledge needed to generate an output with new, "creative" content that does not exist in reality. This phenomenon is often referred to as "hallucination". Not all platforms behave this way, however, so it is important to research whether the platform you want to use does. A Google search such as "How does xyz platform work", the service's FAQ, or similar resources will often provide the answer.

  5. Do you have enough knowledge to evaluate the result?

    You should always ask yourself: do I have enough knowledge to evaluate whether an output from a generative AI platform is correct? If you have limited knowledge of a subject, it can be tempting to ask generative AI for an introduction or overview. But given the challenges described above, this is not without risks. You should therefore always seek out other sources that can verify the output created by generative AI.