Module 01 of 12 · 45 min read · Beginner

What AI is (and isn't) for an analyst

Definitions, history, why now: scaling, data, compute. The supervised/unsupervised/RL split, and where LLMs sit.

Learning objectives

By the end of this module, you should be able to:

  • State a precise, useful working definition of AI vs machine learning vs deep learning
  • Distinguish supervised, unsupervised, and reinforcement learning, with one analyst-relevant example for each
  • Explain the three forces (scale of compute, scale of data, transformer architecture) that drove the 2017-2024 leap
  • Recognise where modern LLMs sit in the broader AI landscape

Artificial intelligence is the goal: build systems that do things humans associate with intelligence. Machine learning is the dominant modern approach: instead of programming the rules, you let the machine infer them from data. Deep learning is one family of machine-learning models that use neural networks with many layers — the family that has driven nearly every breakthrough since 2012.

The three flavours of machine learning

Supervised learning: you have labelled examples (for instance, loan applications labelled 'defaulted' or 'paid'), and the model learns to predict the label for new cases. Most analyst work falls here: credit scoring, churn prediction, fraud detection.
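
As a rough illustration of the supervised workflow, here is a minimal sketch. It assumes a made-up dataset of six loan applications with two invented features (debt-to-income ratio and share of late payments). The nearest-centroid classifier is one of the simplest supervised methods and is used here only for illustration; the names `fit_centroids` and `predict` are hypothetical, and real credit models use far more data and features.

```python
from collections import defaultdict
from math import dist

# Hypothetical labelled examples: (debt_to_income, share_of_late_payments) -> outcome.
# All numbers are invented for illustration.
TRAINING = [
    ((0.10, 0.00), "paid"),
    ((0.20, 0.10), "paid"),
    ((0.15, 0.05), "paid"),
    ((0.70, 0.50), "defaulted"),
    ((0.85, 0.60), "defaulted"),
    ((0.60, 0.40), "defaulted"),
]


def fit_centroids(examples):
    """Learn one average feature vector (centroid) per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Predict the label whose centroid is closest to the new example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAINING)
print(predict(centroids, (0.20, 0.10)))  # a low-debt applicant
print(predict(centroids, (0.80, 0.45)))  # a high-debt applicant
```

With these invented numbers, the first applicant lands nearest the 'paid' centroid and the second nearest the 'defaulted' centroid. The point is the shape of the task: labelled examples go in, and a rule for labelling new cases comes out.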

Unsupervised learning: no labels. The model finds structure in the data — clusters of similar customers, anomalies in transaction patterns. Useful for exploration.
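
To make "finding structure without labels" concrete, here is a small sketch of k-means clustering on one invented feature (monthly transaction counts for six hypothetical customers). The function name `kmeans_1d` and the starting centroids are assumptions chosen for illustration; practical segmentation uses many features and a library implementation.

```python
def kmeans_1d(values, centroids, max_iters=100):
    """Cluster numbers by repeatedly assigning each to its nearest centroid
    and moving every centroid to the mean of its assigned members."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iters):
        # Assignment step: index of the nearest centroid for each value.
        labels = [
            min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            for v in values
        ]
        # Update step: mean of each cluster (keep the old centroid if empty).
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(sum(members) / len(members) if members else old)
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return labels, centroids


# Invented monthly transaction counts for six hypothetical customers.
transactions = [2, 3, 4, 40, 42, 45]
labels, centres = kmeans_1d(transactions, centroids=[2, 45])
print(labels)
print(centres)
```

No customer was ever labelled 'light user' or 'heavy user'; the two groups come purely from how the numbers bunch together. Deciding what the groups mean is still the analyst's job.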

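Reinforcement learning: an agent takes actions in an environment, receives rewards, and learns a policy (a rule for choosing actions) that maximises reward over time. It is the technology behind AlphaGo, behind robot control, and behind RLHF (reinforcement learning from human feedback), the step applied after pre-training to make LLMs helpful.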
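
The act, observe reward, update loop can be sketched with the simplest reinforcement-learning setting, a multi-armed bandit. This is a simplification: there are no states and no long-term planning, only three hypothetical actions with invented success rates that the agent cannot see. The epsilon-greedy rule and the name `run_bandit` are illustrative choices, not how AlphaGo or RLHF are implemented.

```python
import random


def run_bandit(true_rates, n_pulls=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly pick the action with the best estimated
    reward, but explore a random action a fraction epsilon of the time."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms        # how often each action was tried
    estimates = [0.0] * n_arms   # running average reward per action
    for _ in range(n_pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                           # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        # The environment returns reward 1 with the action's hidden success rate.
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the average reward for the chosen action.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


# Invented success rates for three hypothetical actions (hidden from the agent).
TRUE_RATES = [0.2, 0.5, 0.8]
counts, estimates = run_bandit(TRUE_RATES)
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("pulls per action:", counts)
print("estimated best action:", best_arm)
```

Nobody tells the agent which action is correct. It learns only from the rewards it receives, which is the key difference from supervised learning.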

Why now? The three forces that compounded

  • Compute: GPUs originally designed for video games turned out to be ideal for the matrix multiplications neural networks need. NVIDIA's market cap reflects that.
  • Data: the internet generated trillions of tokens of human text. The pre-training datasets for modern LLMs include large fractions of the publicly accessible web.
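  • Architecture: the 2017 transformer paper introduced an architecture that processes every position in a sequence in parallel, using attention in place of the step-by-step recurrence of recurrent networks. That made training on very large datasets practical. Every modern LLM (GPT, Claude, Gemini, Llama) is built on the transformer.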

What an LLM actually does

A large language model is, mechanistically, a function: take a sequence of tokens, produce a probability distribution over the next token. Sampling from that distribution iteratively produces text. That's it. Everything impressive that LLMs do — answering questions, writing code, summarising documents — emerges from being very, very good at this one task at scale.
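
The "tokens in, next-token distribution out, sample, repeat" loop can be sketched with a toy model. The example below is a deliberately crude stand-in: it counts which word follows which in a three-sentence invented corpus (a bigram model) and looks only at the last token, whereas a real LLM uses a neural network conditioned on the whole context. The names `train_bigram`, `next_token_distribution`, and `generate` are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Invented toy corpus; tokens are simply whitespace-separated words.
CORPUS = "the loan was paid . the loan was late . the client paid ."


def train_bigram(text):
    """Count, for every token, how often each other token follows it."""
    tokens = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def next_token_distribution(counts, context):
    """Map a token sequence to a probability distribution over the next token.
    This toy version uses only the last token, and assumes that token
    appeared in the corpus with at least one follower."""
    followers = counts[context[-1]]
    total = sum(followers.values())
    return {token: n / total for token, n in followers.items()}


def generate(counts, prompt, n_new, rng):
    """Sample one token at a time, appending each to the growing context."""
    tokens = list(prompt)
    for _ in range(n_new):
        probs = next_token_distribution(counts, tokens)
        choices, weights = zip(*probs.items())
        tokens.append(rng.choices(choices, weights=weights)[0])
    return tokens


model = train_bigram(CORPUS)
print(next_token_distribution(model, ["the"]))
print(" ".join(generate(model, ["the"], 5, random.Random(0))))
```

In this corpus "the" is followed by "loan" twice and "client" once, so the distribution after "the" puts two thirds on "loan" and one third on "client". An LLM does the same kind of thing with a vastly richer function for computing the distribution.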

The honest definition

A modern LLM is a next-token predictor pre-trained on a large fraction of the public web, then fine-tuned (including with reinforcement learning from human feedback) to make its outputs helpful, harmless, and honest. That's the whole machine; the rest is engineering around it.

What it isn't

It isn't a database of facts. It isn't a reasoning engine in the formal-logic sense. It doesn't 'understand' in the human sense, despite often producing outputs that read as if it does. These distinctions matter when you deploy it for analyst work — they tell you where to expect it to fail.

Exercise

For each of the following analyst problems, identify which flavour of machine learning is most appropriate (supervised, unsupervised, or reinforcement), and briefly justify the choice:

  1. Predicting whether a SACCO loan applicant will default within 12 months.
  2. Grouping 200,000 M-Pesa users into customer segments to target with different products.
  3. Training an algorithmic trading agent that learns to execute large orders with minimal market impact.
  4. Using a labelled set of 5,000 Kenyan court rulings to build a classifier that flags new cases as likely-appealable.

Key takeaways

  • Artificial intelligence is the goal; machine learning is the dominant approach; deep learning is one family of ML models
  • Supervised learning needs labels; unsupervised learning finds structure; RL learns by reward. Most analyst work is supervised learning
  • The 2017 'Attention Is All You Need' paper introduced the transformer, the architecture behind every major LLM today
  • Today's LLMs are remarkable in one specific way: they predict the next token, scaled to billions of parameters and trillions of training tokens

Further reading

  1. Attention Is All You Need
     Vaswani et al. · NeurIPS · 2017
     The transformer paper that started the modern era.

  2. Artificial Intelligence: A Modern Approach (4th Edition)
     Stuart Russell & Peter Norvig · Pearson · 2020

  3. Deep Learning
     Ian Goodfellow, Yoshua Bengio & Aaron Courville · MIT Press · 2016

LeadAfrik · Public Economics Hub