Curated library

Videos

Watch the clearest companion videos without browsing everything at once. Pick a path, continue where you left off, or use the filters when you know what you need.

Foundation

Build confidence with AI basics, prompting, privacy, hallucinations, and everyday use.

Practitioner

Learn workflows for meetings, writing, research, no-code tools, and repeatable business tasks.

Builder

Go deeper into agents, RAG, MCP, structured outputs, evals, APIs, and local AI.

Strategic

Cover governance, EU AI Act readiness, build-vs-buy decisions, ROI, and private AI choices.
63 results

Viewing learning path: Builder

4 minutes

Introducing EmbeddingGemma: The Best-in-Class Open Model for On-Device Embeddings

Google for Developers. The video introduces multilingual text embeddings that can run locally and support semantic search and RAG without sending every document to a hosted API. For Estonian companies, that is a useful technical complement to the article's internal-knowledge-search pattern: multilingual retrieval is valuable only when it also respects data locality, permissions and source authority.

Understand why multilingual embeddings matter for private internal search and where local retrieval can reduce data-exposure risk.

Intermediate · AI for Business
18 minutes

AWS re:Invent 2025 - Implementing Human-in-the-Loop Controls for Multi-Agent AI Systems (CNS428)

AWS Events. This lightning talk names the business moments where human control is needed: high-stakes decisions, irreversible actions, regulatory requirements, trust-building phases, ambiguous edge cases and graceful degradation. It also shows concrete implementation mechanisms such as MCP elicitations, Step Functions callback waits and approval nodes.

See how approval gates can be implemented as explicit workflow checkpoints rather than informal manual review after something goes wrong.

Intermediate · Automations
7 minutes

Unlock Better RAG & AI Agents with Docling

IBM Technology. Explains the ingestion side of RAG and agents: preparing PDFs and other files so document structure, tables and layout survive into downstream retrieval. That supports the article's warning that RAG quality and safety begin before embedding, especially when parsing complex business documents.

Understand why document parsing, structure preservation and ingestion quality gates matter before building RAG over PDFs and mixed file formats.

Advanced · AI Safety & Data Privacy
20 minutes

Permissions & Access Control for RAG - a Deep Dive Tutorial

Paragon. Walks through the production RAG permission problem and compares tool-calling, namespaces, ACL tables and relationship-based permissions. That directly supports the article's core rule: retrieval must only return sources the current user is allowed to see, and source-system permissions cannot be treated as an afterthought.

Evaluate practical access-control patterns for company knowledge RAG before indexing sensitive internal documents.

Advanced · AI Safety & Data Privacy
48 minutes

How to Build Reliable AI Agents (Context + Evals Explained) | Tobias Leong, Axium

Arize AI. Explains why production agents fail when the system lacks the right context, evaluation data, tracing and domain expertise. It maps well to the article's failure-mode register because it makes reliability an engineering loop: separate retrieval from reasoning, define expected outcomes, evaluate tool calls, and trace failures before changing models.

Design AI workflows around context, evals and observability so production failures can be named, measured and fixed.

Advanced · AI Safety & Data Privacy
42 minutes

Vertical AI Agents Could Be 10X Bigger Than SaaS

Y Combinator. The Lightcone hosts work through why vertical AI agents — not horizontal wrappers — are the defensible shape for application-layer companies, with concrete examples and a clear-eyed take on which categories the model providers will eat. That is the anti-moat trap the article warns about, expressed as a positive playbook.

Assess when vertical AI agents create real defensibility and when they are only thin wrappers.

Advanced · AI for Business
34 minutes

How AI is Reinventing Software Business Models ft. Bret Taylor of Sierra

Sequoia Capital. Bret Taylor walks through the shift from per-seat SaaS to outcomes-based pricing — what to anchor on (resolution, CSAT, NPS), why incumbents struggle to follow, and how vertical specialisation creates pricing power. It directly mirrors the article's pricing and margin sections.

Evaluate AI product pricing and specialization around measurable outcomes rather than seat counts.

Advanced · AI for Business
32 minutes

Fast LLM Serving with vLLM and PagedAttention

Anyscale. Walks through why naive LLM serving wastes 60–80% of KV-cache memory, how PagedAttention borrows OS-style paging to fix that, and why continuous batching produces the up-to-24× throughput numbers the article uses in its math. After this, the article's "you'll be lucky to hit 50% utilisation" line stops feeling abstract.

Understand why serving engines, batching and KV-cache memory dominate self-hosted inference economics.

Advanced · Private / Local AI
56 minutes

Build Hour: Prompt Caching

OpenAI. OpenAI's own Build Hour on prompt caching covers the 1024-token threshold, the prefix-stability requirement, audio caching at a 99% discount for realtime, and time-to-first-token impacts at long inputs. Useful when you are sizing the engineering effort to actually hit the cache reliably on your production prompts.

Use prompt caching only when stable prefixes, latency and cost behavior match the workload.

Advanced · AI for Business
19 minutes

Is This the End of RAG? Anthropic's NEW Prompt Caching

Prompt Engineering. Walks through Anthropic's prompt caching against Gemini's context caching with concrete latency-and-cost reductions per use case (long-document chat, few-shot, multi-turn). The breakdown of cache-write surcharge vs. cache-read discount is exactly what the article assumes when it talks about when caching pays off.

Compare Anthropic's prompt caching with Gemini's context caching and weigh the cache-write surcharge against the cache-read discount for each use case.

Advanced · AI for Business
17 minutes

Defending LLM - Prompt Injection

LiveOverflow. Walks through the actual defence-in-depth playbook — taint analysis on LLM output, restricting expected output shapes, user isolation, few-shot scaffolds, fine-tuning, temperature 0 for determinism, redundancy for critical paths. It matches the article's defence-stack section almost item for item.

Review prompt-injection defenses such as taint analysis, output-shape restrictions, user isolation, deterministic settings and redundant checks for critical paths.

Advanced · AI Safety & Data Privacy
13 minutes

Attacking LLM - Prompt Injection

LiveOverflow. Frames prompt injection as a classic injection attack against systems that mix instructions and untrusted data — with a concrete content-moderation example where an attacker frames an innocent user. The mental shift from "the model is the target" to "the application is the target" is exactly the move the article opens with.

Model prompt injection as untrusted-data mixing and design boundaries around tool use.

Advanced · AI Safety & Data Privacy
5 minutes

Claude has taken control of my computer...

Fireship. The clearest short explanation on YouTube of the screenshot–action–screenshot loop, including the honest failure modes (Claude wandering off to look at Yellowstone, token burn, latency per step). Fireship is light on production detail by design — read the article for that — but it leaves you with the right intuition for why these systems are expensive and brittle before you commit one to your stack.

Understand why screenshot-based computer use is powerful, slow, expensive and brittle compared with API-native automation.

Advanced · Automations
44 minutes

Building Brain-Like Memory for AI | LLM Agent Memory Systems

Adam Lucek. A longer implementation pass through the cognitive-science-inspired categories — episodic, semantic, working, procedural — wired into an agent in code. Worth watching after the LangChain conceptual video if you want a more opinionated mental model and a working example to crib from.

See how episodic, semantic, working and procedural memory can be wired into an agent in code.

Advanced · Automations
7 minutes

Memory for agents (conceptual video)

LangChain. Short, no-code walkthrough of the short-term-vs-long-term split, the three shapes long-term memory tends to take (instructions, profile, list of objects), and the hot-path-versus-background trade-off for when to write. The article's memory-architecture section assumes exactly this taxonomy.

Separate short-term and long-term memory decisions and decide when agent memory should be written.

Advanced · Automations
22 minutes

Context Engineering for Agents

LangChain. Lance Martin's framework — write, select, compress, isolate — with concrete examples of when to summarise an action history, when to offload state to files, and when to spin up sub-agents purely to protect the parent's context. Maps almost directly onto the article's section on managing 1M-token windows in practice.

Apply write, select, compress and isolate patterns to manage agent context deliberately.

Advanced · Prompt Engineering
8 minutes

Context Rot: How Increasing Input Tokens Impacts LLM Performance

Chroma. Kelly Hong walks through Chroma's research on 18 models: why needle-in-haystack scores are misleading, how performance degrades with ambiguity and distractors, and why even simple string-repetition tasks degrade past 500 tokens. Short, evidence-based, and exactly the case the article needs you to take seriously before getting to the engineering moves.

Understand how long context can fail under ambiguity and distractors, then design tests around that risk.

Advanced · Prompt Engineering
66 minutes

CrewAI Tutorial: Complete Crash Course for Beginners

aiwithbrandon. The same kind of build, but in CrewAI's role-goal-backstory style — agents as team members, tasks as deliverables, the framework hiding the execution loop. Watch it immediately after the LangGraph course; the contrast in how much the framework decides for you is exactly what the article is asking you to weigh.

See how CrewAI's role-goal-backstory style models agents as team members and tasks as deliverables while the framework hides the execution loop.

Advanced · Automations
190 minutes

LangGraph Complete Course for Beginners – Complex AI Agents with Python

freeCodeCamp.org. A long, code-along build through LangGraph's state graphs, nodes, edges, conditional routing, checkpoints, and tool use. By the end you have enough feel for the typed-state, "every transition is explicit" model that the article's comparison to CrewAI and to direct-API code stops being abstract.

Work through LangGraph's state graphs, nodes, edges, conditional routing, checkpoints and tool use in a code-along build.

Advanced · Automations
18 minutes

Tips for building AI agents

Anthropic. Three Anthropic engineers walk through the most common pitfalls they see: agents that don't know when to stop, over-prompting in the system prompt instead of fixing the environment, and the cost of multi-agent designs nobody actually needed. Useful right after Barry's talk; you will recognise the same patterns from a different angle.

Recognize common agent-building pitfalls before adding multiple agents, complex prompts or hidden state.

Advanced · Automations
15 minutes

How We Build Effective Agents: Barry Zhang, Anthropic

AI Engineer. Barry Zhang on three rules — don't build an agent when a workflow would do, keep the loop as simple as possible, and "think like your agent" (sit in its context window and notice that it is making decisions in the dark between screenshots). The simplicity argument and the "is this task even worth an agent" checklist are exactly the discipline the article asks for.

Design simpler agent loops with clear stopping rules, task boundaries and human control points.

Advanced · Automations
59 minutes

Developing an LLM: Building, Training, Finetuning

Sebastian Raschka. Sebastian Raschka's slower walkthrough of where fine-tuning sits in the broader LLM training pipeline — instruction tuning, classification fine-tuning, parameter-efficient methods, and the trade-offs the article calls out before recommending LoRA. Good calibration before you start, especially if your team is debating whether fine-tuning is even the right step.

Place fine-tuning inside the broader training pipeline and decide when it is better than prompting or RAG.

Advanced · Private / Local AI
157 minutes

Fine Tuning LLM Models – Generative AI Course

freeCodeCamp.org. Long, theory-then-code course covering quantisation, LoRA, QLoRA, and full PEFT on Llama 2 and Gemma — on hardware most developers actually have. It is the closest thing to a "shadow somebody who has done this" experience on YouTube and lines up with the article's "you don't need a cluster" claim with concrete VRAM budgets.

Learn quantization, LoRA, QLoRA and PEFT on Llama 2 and Gemma using hardware most developers already have.

Advanced · Private / Local AI
16 minutes

Graph RAG: Improving RAG with Knowledge Graphs

Prompt Engineering. A focused walkthrough of Microsoft's GraphRAG — entity extraction, community summaries, query-focused summarization — set up on a local machine with cost notes. Watch it for the graph-RAG section of the article specifically; the cost discussion is the part most write-ups skip.

Understand the Microsoft-style GraphRAG flow: entity extraction, communities, summaries and query-focused synthesis.

Advanced · AI for Business

Showing 24 of 63