
AI & Machine Learning

A 70-year arc that stalled twice, then accelerated past most of its critics. Below: dates, names, and the equations that built modern AI.



Key sections

01 The long path from neuron to network.
02 1943–1958: The neuron, formalized.
03 The two AI winters.
04 1986: Backpropagation, popularized.
05 Convolutions and the GPU.
06 2012: AlexNet and the spark.
07 2017: Attention is all you need.
08 Scaling laws.
09 The modern LLM stack.
10 Multimodal & tool use.
11 Agents.
12 Alignment & safety.
13 Watch this.

Page data

Canonical: https://shipslides.com/d/technology-ai-and-ml
Category: Technology
Size: 153.1 KB
Updated: 2026-05-17
LLM text: https://shipslides.com/d/technology-ai-and-ml/llms.txt

Presentation Transcript

Detailed slide-by-slide text content extracted from this presentation.

Slide 01

The long path from neuron to network.

Deck 01 / Modern editorial
A 70-year arc that stalled twice, then accelerated past most of its critics.
Below: dates, names, and the equations that built modern AI.
Figure 1. Procedurally seeded image, picsum.photos. Decorative.

Slide 02

1943–1958: The neuron, formalized.

In 1943, Warren McCulloch and Walter Pitts proposed a binary threshold model of the
neuron — a logic gate with weighted inputs. Fifteen years later Frank Rosenblatt built the
Mark I Perceptron at the Cornell Aeronautical Laboratory, a 400-photocell machine that could learn
to distinguish marked cards. The New York Times announced an "embryo of an electronic computer
that the Navy expects will be able to walk, talk, see, write, reproduce itself."
Figure 2. The Rosenblatt perceptron: y = step(Σ wᵢxᵢ + b).
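The formula in Figure 2 can be tried directly. The following is a minimal sketch, not Rosenblatt's hardware or exact procedure: it uses the classic error-correction update on the logical AND function, and the names `train_perceptron`, `predict`, the learning rate, and the epoch count are illustrative assumptions.

```python
def step(z):
    """Binary threshold: fire (1) only when the weighted sum is positive."""
    return 1 if z > 0 else 0


def predict(w, b, x):
    """y = step(sum_i w_i * x_i + b), as in Figure 2."""
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_perceptron(samples, epochs=20, lr=1):
    """Error-correction rule: nudge weights by (target - prediction) * input."""
    w = [0] * len(samples[0][0])
    b = 0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(w, b, x)
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b


# Toy data (assumed for illustration): logical AND is linearly separable.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
print([predict(w, b, x) for x, _ in AND])  # [0, 0, 0, 1]
```

Because AND is linearly separable, the updates stop changing after a few passes. The same loop run on XOR never settles on a correct set of weights.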

Slide 03

The two AI winters.

In 1969 Marvin Minsky and Seymour Papert published Perceptrons, showing that a single-layer
perceptron cannot represent XOR. Funding dried up. A second winter followed in the late 1980s and early
1990s when expert systems failed to scale economically.
"There is no reason to suppose that any of these virtues carry over to the many-layered version."
— Minsky & Papert, Perceptrons, 1969 (later revised)

Slide 04

1986: Backpropagation, popularized.

Rumelhart, Hinton, and Williams' Nature paper "Learning representations by back-propagating errors"
showed that gradient descent, with gradients computed by the chain rule, could train multi-layer networks. The math had been
derived by Seppo Linnainmaa in 1970 and applied to neural networks by Werbos in 1974, but the 1986 paper made it stick.
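# a tiny pure-Python sketch: fit y = W * x with a single weight
xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]  # toy data, assumed for illustration
W, lr, N = 0.0, 0.05, 100
for epoch in range(N):
    y_hat = [W * x for x in xs]                                    # forward pass
    loss = sum((p - t) ** 2 for p, t in zip(y_hat, ys)) / len(xs)  # mean squared error
    grads = sum(2 * (p - t) * x for p, t, x in zip(y_hat, ys, xs)) / len(xs)  # chain rule
    W -= lr * grads                                                # gradient descent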
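To show the chain rule passing through a hidden layer, here is a hedged pure-Python sketch of a small two-layer network trained on XOR. It is not the 1986 paper's setup: the layer width, tanh and sigmoid activations, learning rate, seed, and the name `train_xor` are all assumptions chosen to keep the example short.

```python
import math
import random


def train_xor(hidden=4, lr=0.5, epochs=2000, seed=0):
    """Train a 2-input, `hidden`-unit, 1-output network on XOR.

    Returns (predict, losses), where losses[i] is the mean squared error
    measured at the start of epoch i.
    """
    rng = random.Random(seed)
    X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    Y = [0.0, 1.0, 1.0, 0.0]
    W1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j])
             for j in range(hidden)]
        z = sum(W2[j] * h[j] for j in range(hidden)) + b2
        return h, 1.0 / (1.0 + math.exp(-z))  # sigmoid output

    losses = []
    for _ in range(epochs):
        gW1 = [[0.0, 0.0] for _ in range(hidden)]
        gb1 = [0.0] * hidden
        gW2 = [0.0] * hidden
        gb2 = 0.0
        loss = 0.0
        for x, y in zip(X, Y):
            h, y_hat = forward(x)
            loss += (y_hat - y) ** 2 / len(X)
            # Output layer: dL/dz = dL/dy_hat * dy_hat/dz (sigmoid derivative).
            dz = 2 * (y_hat - y) / len(X) * y_hat * (1 - y_hat)
            gb2 += dz
            for j in range(hidden):
                gW2[j] += dz * h[j]
                # Hidden layer: back through W2[j], then through tanh.
                dh = dz * W2[j] * (1 - h[j] ** 2)
                gW1[j][0] += dh * x[0]
                gW1[j][1] += dh * x[1]
                gb1[j] += dh
        losses.append(loss)
        # Gradient descent step on every parameter.
        for j in range(hidden):
            W1[j][0] -= lr * gW1[j][0]
            W1[j][1] -= lr * gW1[j][1]
            b1[j] -= lr * gb1[j]
            W2[j] -= lr * gW2[j]
        b2 -= lr * gb2

    return (lambda x: forward(x)[1]), losses


predict_xor, losses = train_xor()
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print([round(predict_xor(x), 2) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])
```

With these settings the loss should fall steadily, and the outputs typically approach 0, 1, 1, 0, though the exact numbers depend on the seed and learning rate.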

Slide 05

Convolutions and the GPU.

Yann LeCun's LeNet-5 (1998) read handwritten digits with convolutional layers: local receptive fields,
weight sharing, pooling. The technique waited for hardware: in 2009 Raina, Madhavan and Ng showed
GPUs could train deep networks up to 70× faster than CPUs.
Figure 3. The classical convolutional pipeline (LeNet-5 family).
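The three ingredients named above (local receptive fields, weight sharing, pooling) fit in a few lines. This is a minimal pure-Python sketch rather than LeNet-5 itself; the function names `conv2d` and `max_pool`, the toy image, and the edge-detecting kernel are assumptions for illustration.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation: one shared kernel slides over every patch."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keep the strongest response per window."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
vertical_edge = [[-1, 1], [-1, 1]]  # responds where brightness rises left to right

feature_map = conv2d(image, vertical_edge)
print(feature_map)            # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
print(max_pool(feature_map))  # [[2]]
```

The same four kernel weights are reused at every position, which is why a convolutional layer needs far fewer parameters than a fully connected one over the same image.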

Slide 06

2012: AlexNet and the spark.

Krizhevsky, Sutskever, and Hinton's AlexNet cut the ImageNet top-5 error rate from 25.8% the previous year to 15.3%.
Two NVIDIA GTX 580s, ReLU activations, dropout, and 60M parameters. The result was so far ahead of
the field that the deep-learning revolution effectively dates from this paper.
Year   Top-5 error   Model
2010   28.2%         NEC-UIUC (SIFT + SVM)
2011   25.8%         Xerox
2012   15.3%         AlexNet
2014   6.7%          GoogLeNet
2015   3.6%          ResNet-152

Slide 07

2017: Attention is all you need.

Vaswani et al. dropped recurrence entirely. Self-attention computes a weighted average of values,
with weights from scaled dot-products of queries and keys.
Attention(Q, K, V) = softmax( Q · Kᵀ / √dₖ ) · V
Because it parallelizes across sequence positions, the transformer scaled to GPT-3's 175B parameters by 2020
and beyond. Every modern frontier model (GPT, Gemini, Claude, Llama) is a transformer or close descendant.
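The attention formula can be followed step by step in plain Python. This is a single-head sketch with no masking, batching, or learned projections, and it uses nested lists where a real implementation would use tensors; the function names are assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Scaled dot product of this query with every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[c] for w, v in zip(weights, V))
                    for c in range(len(V[0]))])
    return out


# Two identical keys get equal weight, so the output is the mean of the values.
print(attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[1.0, 2.0], [3.0, 4.0]]))
# [[2.0, 3.0]]
```

Each query row is processed independently of the others, which is the property that lets the computation run in parallel across sequence positions.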

Slide 08

Scaling laws.

Kaplan et al. (2020) and Hoffmann et al. (2022, "Chinchilla") found that loss falls as a power law in
parameters, data, and compute. The Chinchilla update: for a fixed compute budget, scale parameters and
training tokens in roughly equal proportion (~20 tokens per parameter).
Figure 4. Stylized loss vs. compute (Kaplan/Hoffmann scaling).
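As a worked example of the ~20 tokens per parameter rule, the sketch below splits a compute budget between parameters and tokens. It relies on the common approximation that training costs about 6 FLOPs per parameter per token (C ≈ 6·N·D), which is an assumption not stated in the deck; the function name and defaults are illustrative.

```python
import math


def chinchilla_allocation(compute_flops, tokens_per_param=20.0,
                          flops_per_param_token=6.0):
    """Solve C = 6 * N * D with D = 20 * N for parameters N and tokens D."""
    n_params = math.sqrt(compute_flops / (flops_per_param_token * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


n, d = chinchilla_allocation(5.76e23)
print(f"{n / 1e9:.0f}B parameters, {d / 1e12:.2f}T tokens")
```

For a budget of 5.76e23 FLOPs this gives roughly 69B parameters and 1.4T tokens, the same ballpark as the 70B-parameter Chinchilla model trained on about 1.4T tokens. Because N grows with the square root of C, a 100× larger budget implies only about 10× more parameters and 10× more tokens.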

Slide 09

The modern LLM stack.

Pretraining
Self-supervised next-token prediction on web text, code, books, and licensed corpora. Trillions of tokens.
SFT
Supervised fine-tuning on curated demonstrations. Teaches the model the desired output format and tone.
RLHF / RLAIF
Reinforcement learning from human or AI preferences. PPO, DPO, or constitutional methods.
Inference
KV-cache, speculative decoding, quantization, MoE routing. The serving layer is now a research field of its own.
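The pretraining objective in the first stage is ordinary cross-entropy on the next token. Here is a minimal sketch assuming a toy four-token vocabulary and hand-written logits in place of a real model's output; `next_token_loss` is a hypothetical helper name.

```python
import math


def next_token_loss(logits, targets):
    """Mean cross-entropy, where logits[t] scores every vocab item at position t."""
    total = 0.0
    for row, target in zip(logits, targets):
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))  # log-sum-exp
        total += log_z - row[target]  # -log softmax(row)[target]
    return total / len(targets)


# Self-supervision: the labels are just the input sequence shifted by one.
tokens = [2, 0, 1, 3]
inputs, targets = tokens[:-1], tokens[1:]

uniform = [[0.0, 0.0, 0.0, 0.0] for _ in inputs]  # a model that knows nothing
print(next_token_loss(uniform, targets))          # ln(4), about 1.386
```

A model that spreads probability evenly over the vocabulary scores ln(vocab size); training pushes the loss below that by putting more mass on the token that actually comes next.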

Slide 10

Multimodal & tool use.

CLIP (2021) tied images and text into a shared embedding space. By 2024 frontier models were natively
multimodal, taking text, images, and audio as input, and some could generate images or audio as well as text.
Tool use (function calling, browsing, code execution) turned chatbots into agents that can act.
CLIP · DALL-E · Sora · Gemini · Claude
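A shared embedding space means an image and a caption can be compared with a plain similarity score. The sketch below illustrates only that matching step: the three-dimensional vectors are made up by hand, whereas a CLIP-style model would produce them with trained image and text encoders, and `best_caption` is a hypothetical helper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_vec, caption_vecs):
    """Pick the caption whose embedding points most nearly the same way."""
    return max(caption_vecs, key=lambda text: cosine(image_vec, caption_vecs[text]))


# Hand-made embeddings (assumed): stand-ins for encoder outputs.
captions = {
    "a photo of a dog": [1.0, 0.0, 0.0],
    "a photo of a cat": [0.0, 1.0, 0.0],
}
image_embedding = [0.9, 0.1, 0.0]

print(best_caption(image_embedding, captions))  # a photo of a dog
```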

Slide 11

Agents.

An agent is a model in a loop with tools and memory. The 2025–2026 wave — Claude with computer use,
OpenAI Operator, Devin, AutoGPT descendants — pushed reliability past the threshold for real work:
software engineering, research, customer support, ops.
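# schematic: env and model are placeholders
history, done = [], False
obs = env.observe()
while not done:
    thought, action = model(obs, history)   # decide from the observation and memory
    history.append((obs, thought, action))  # record what was seen before acting
    obs, done = env.act(action)             # assumed: env.act returns (next_obs, done)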
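The loop can be made concrete with a toy environment. Everything here is hypothetical scaffolding: `CounterEnv`, `rule_policy`, and `run_agent` are invented names, the "tools" are just increment and decrement, and a hard-coded rule stands in for the model call. A step cap is added so the loop always terminates.

```python
from dataclasses import dataclass


@dataclass
class CounterEnv:
    """Toy environment: reach `target` using the tools 'inc' and 'dec'."""
    target: int
    value: int = 0

    def observe(self):
        return {"value": self.value, "target": self.target}

    def act(self, action):
        if action == "inc":
            self.value += 1
        elif action == "dec":
            self.value -= 1
        done = self.value == self.target
        return self.observe(), done


def rule_policy(obs, history):
    """Stand-in for a model call: returns (thought, action)."""
    if obs["value"] < obs["target"]:
        return "below target, increment", "inc"
    return "at or above target, decrement", "dec"


def run_agent(env, policy, max_steps=20):
    """Observe, decide, act, remember; stop when done or out of steps."""
    history = []
    obs = env.observe()
    for _ in range(max_steps):
        thought, action = policy(obs, history)
        history.append((obs, thought, action))
        obs, done = env.act(action)
        if done:
            break
    return history, obs


history, final_obs = run_agent(CounterEnv(target=3), rule_policy)
print([action for _, _, action in history])  # ['inc', 'inc', 'inc']
print(final_obs)                             # {'value': 3, 'target': 3}
```

Swapping `rule_policy` for a function that calls a language model, and `CounterEnv` for a browser or shell wrapper, gives the general shape of the systems named above, minus the error handling and safeguards a real deployment would need.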

Slide 12

Alignment & safety.

The technical problem: train a system whose behavior matches human intent under distribution shift.
Key concepts include reward hacking, deceptive alignment, eval-gaming, and scalable oversight. The field
draws from RL, mechanistic interpretability, and formal verification.
"The genie does what you ask, not what you want."— folk maxim of the alignment community

Slide 13

Watch this.

Watch: transformers explained
Open problems
Sample-efficient continual learning without catastrophic forgetting.
Robust mechanistic interpretability of large transformers.
Scalable oversight of superhuman models.
Energy and water cost of inference at planetary scale.
