Curated library

Videos

Watch the clearest companion videos without browsing everything at once. Pick a path, continue where you left off, or use the filters when you know what you need.

4 minutes

Introducing EmbeddingGemma: The Best-in-Class Open Model for On-Device Embeddings

Google for Developers. The video introduces multilingual text embeddings that can run locally and support semantic search and RAG without sending every document to a hosted API. For Estonian companies, that is a useful technical complement to the article's internal-knowledge-search pattern: multilingual retrieval is valuable only when it also respects data locality, permissions and source authority.

Understand why multilingual embeddings matter for private internal search and where local retrieval can reduce data-exposure risk.

Intermediate · AI for Business
32 minutes

How to Build Human-Centered AI Workflows in Localization with Shashi Bhushan

Crowdin. Shashi Bhushan starts with workflow mapping rather than tool selection, then covers source-text quality, human review, AI proofreading, glossary checks, product-team involvement, pilots and privacy constraints. That is almost exactly the operating model the article recommends for Estonian teams working across Estonian, English, Russian, Finnish and customer-specific terminology.

Learn how to introduce AI into localization without removing human ownership of meaning, tone, terminology and final approval.

Intermediate · AI for Business
18 minutes

AWS re:Invent 2025 - Implementing Human-in-the-Loop Controls for Multi-Agent AI Systems (CNS428)

AWS Events. This lightning talk names the business moments where human control is needed: high-stakes decisions, irreversible actions, regulatory requirements, trust-building phases, ambiguous edge cases and graceful degradation. It also shows concrete implementation mechanisms such as MCP elicitations, Step Functions callback waits and approval nodes.

See how approval gates can be implemented as explicit workflow checkpoints rather than informal manual review after something goes wrong.

Intermediate · Automations
17 minutes

12-Factor Agents: Patterns of reliable LLM applications — Dex Horthy, HumanLayer

AI Engineer. Dex Horthy explains why reliable agent systems are mostly disciplined software around a few LLM calls: own the prompt, own the context window, keep control flow deterministic and use tool calls to contact humans when the workflow needs judgment. That maps directly to the article's approval, exception and escalation patterns.

Learn how to design AI workflows that can pause, resume, ask for human judgment and keep business state separate from model guesses.

Intermediate · Automations
7 minutes

Unlock Better RAG & AI Agents with Docling

IBM Technology. Explains the ingestion side of RAG and agents: preparing PDFs and other files so document structure, tables and layout survive into downstream retrieval. That supports the article's warning that RAG quality and safety begin before embedding, especially when parsing complex business documents.

Understand why document parsing, structure preservation and ingestion quality gates matter before building RAG over PDFs and mixed file formats.

Advanced · AI Safety & Data Privacy
59 minutes

From Hype to Habit: How Tech Companies Are Scaling AI Beyond the Experimental

Propeller Consulting. Discusses governance, operating discipline, workforce adoption and ROI measurement as connected parts of scaling AI beyond experiments. That fits the article's maturity model because adoption is treated as changed work with owners and metrics, not as tool usage or workshop attendance.

Connect AI adoption maturity to workflow-level measurement, governance, operating health and sustained behavior change.

Advanced · AI for Business
41 minutes

Private AI vs. Cloud: How Enterprise Leaders Can Make Smarter Build-or-Buy Decisions

World Wide Technology. Connects build-or-buy choices to business outcomes, workload placement, cloud economics, data sovereignty, security, infrastructure readiness and hybrid operating models. That makes it a useful strategic companion for deciding when to buy a tool, extend a platform, build a thin custom layer or own more of the deployment stack.

Make AI build-vs-buy decisions around outcome, data control, workload economics, infrastructure readiness and operational ownership.

Advanced · AI for Business
20 minutes

Permissions & Access Control for RAG - a Deep Dive Tutorial

Paragon. Walks through the production RAG permission problem and compares tool-calling, namespaces, ACL tables and relationship-based permissions. That directly supports the article's core rule: retrieval must only return sources the current user is allowed to see, and source-system permissions cannot be treated as an afterthought.

Evaluate practical access-control patterns for company knowledge RAG before indexing sensitive internal documents.

Advanced · AI Safety & Data Privacy
48 minutes

How to Build Reliable AI Agents (Context + Evals Explained) | Tobias Leong, Axium

Arize AI. Explains why production agents fail when the system lacks the right context, evaluation data, tracing and domain expertise. It maps well to the article's failure-mode register because it makes reliability an engineering loop: separate retrieval from reasoning, define expected outcomes, evaluate tool calls, and trace failures before changing models.

Design AI workflows around context, evals and observability so production failures can be named, measured and fixed.

Advanced · AI Safety & Data Privacy
35 minutes

AI Code Generation: Wins, Fails and the Future

IBM Technology. Discusses the uneven "barbell" shape of AI coding performance, architecture ownership, agent orchestration, context limits, open-source versus proprietary tooling and why models can solve hard tasks while still failing ordinary engineering details. That supports the article's rule that tests and human review remain the shipping gate.

Build a realistic mental model for when repository-aware coding agents help and where senior engineering control is still required.

Advanced · AI for Business
37 minutes

VMware Private AI Foundation Capabilities and Features Update from Broadcom

Tech Field Day. Shows private AI as layered infrastructure: controlled compute, isolated environments, Kubernetes, inference containers, model governance, self-service provisioning, GPU sharing and monitoring. That maps directly to the article's warning that privacy depends on deployment boundaries, logs, access and operations, not on the word "local."

Evaluate private AI as an infrastructure and governance decision instead of defaulting to either SaaS or self-hosting by instinct.

Advanced · Private / Local AI
6 minutes

AI Voice Agents: How They Actually Work & Why They Sound So Human

CX Foundation. Breaks voice agents into the practical pipeline: speech recognition, language model, business-system APIs, text-to-speech and interruption handling. That gives the article's rollout framework a concrete technical foundation before readers choose Twilio, Retell, Vapi, LiveKit or another platform.

Recognize the core architecture of a voice agent and the failure points that affect customer trust in real calls.

Advanced · Automations
33 minutes

The AI Engineer's Guide to Surviving the EU AI Act

GOTO Conferences. Connects the EU AI Act to data quality, MLOps, documentation and post-deployment monitoring. That makes it a good companion for the article's SME governance baseline: the work starts with knowing the system, data, owner, purpose and controls, not with buying a compliance platform.

Understand why AI Act readiness depends on practical AI system inventory, data governance, engineering controls and operational ownership.

Advanced · AI Safety & Data Privacy
42 minutes

Vertical AI Agents Could Be 10X Bigger Than SaaS

Y Combinator. The Lightcone hosts work through why vertical AI agents — not horizontal wrappers — are the defensible shape for application-layer companies, with concrete examples and a clear-eyed take on which categories the model providers will eat. That is the anti-moat trap the article warns about, expressed as a positive playbook.

Assess when vertical AI agents create real defensibility and when they are only thin wrappers.

Advanced · AI for Business
34 minutes

How AI is Reinventing Software Business Models ft. Bret Taylor of Sierra

Sequoia Capital. Bret Taylor walks through the shift from per-seat SaaS to outcomes-based pricing — what to anchor on (resolution, CSAT, NPS), why incumbents struggle to follow, and how vertical specialisation creates pricing power. It directly mirrors the article's pricing and margin sections.

Evaluate AI product pricing and specialization around measurable outcomes rather than seat counts.

Advanced · AI for Business
32 minutes

Fast LLM Serving with vLLM and PagedAttention

Anyscale. Walks through why naive LLM serving wastes 60–80% of GPU memory, how PagedAttention borrows OS-style paging to fix that, and why continuous batching produces the 24× throughput numbers the article uses in its math. After this, the article's "you'll be lucky to hit 50% utilisation" line stops feeling abstract.

Understand why serving engines, batching and KV-cache memory dominate self-hosted inference economics.

Advanced · Private / Local AI
56 minutes

Build Hour: Prompt Caching

OpenAI. OpenAI's own Build Hour on prompt caching covers the 1024-token threshold, the prefix-stability requirement, audio caching at 99% discount for realtime, and time-to-first-token impacts at long inputs. Useful when you are sizing the engineering effort to actually hit the cache reliably on your production prompts.

Use prompt caching only when stable prefixes, latency and cost behavior match the workload.

Advanced · AI for Business
19 minutes

Is This the End of RAG? Anthropic's NEW Prompt Caching

Prompt Engineering. Walks through Anthropic's prompt caching against Gemini's context caching with concrete latency-and-cost reductions per use case (long-document chat, few-shot, multi-turn). The breakdown of cache-write surcharge vs. cache-read discount is exactly what the article assumes when it talks about when caching pays off.

Compare Anthropic's prompt caching with Gemini's context caching, and weigh the cache-write surcharge against the cache-read discount for long-document chat, few-shot and multi-turn workloads.

Advanced · AI for Business
17 minutes

Defending LLM - Prompt Injection

LiveOverflow. Walks through the actual defence-in-depth playbook — taint analysis on LLM output, restricting expected output shapes, user isolation, few-shot scaffolds, fine-tuning, temperature 0 for determinism, redundancy for critical paths. It matches the article's defence-stack section almost item for item.

Review prompt-injection defenses such as taint analysis, output-shape restrictions, user isolation, deterministic settings and redundant checks for critical paths.

Advanced · AI Safety & Data Privacy
13 minutes

Attacking LLM - Prompt Injection

LiveOverflow. Frames prompt injection as a classic injection attack against systems that mix instructions and untrusted data — with a concrete content-moderation example where an attacker frames an innocent user. The mental shift from "the model is the target" to "the application is the target" is exactly the move the article opens with.

Model prompt injection as untrusted-data mixing and design boundaries around tool use.

Advanced · AI Safety & Data Privacy
8 minutes

Anthropic's Claude Computer Use Is A Game Changer | YC Decoded

Y Combinator. Garry Tan walks through what computer use actually changes for the unautomatable long tail of software: legacy apps, internal portals, anything without an API. The framing here is exactly the article's "browser is the universal interface" argument, with a more business-realistic view of where it pays off first.

Decide where browser or computer-use agents might be commercially useful despite their operational risk.

Advanced · Automations
5 minutes

Claude has taken control of my computer...

Fireship. The clearest short explanation on YouTube of the screenshot–action–screenshot loop, including the honest failure modes (Claude wandering off to look at Yellowstone, token burn, latency per step). Fireship is light on production detail by design — read the article for that — but it leaves you with the right intuition for why these systems are expensive and brittle before you commit one to your stack.

Understand why screenshot-based computer use is powerful, slow, expensive and brittle compared with API-native automation.

Advanced · Automations
44 minutes

Building Brain-Like Memory for AI | LLM Agent Memory Systems

Adam Lucek. A longer implementation pass through the cognitive-science-inspired categories — episodic, semantic, working, procedural — wired into an agent in code. Worth watching after the LangChain conceptual video if you want a more opinionated mental model and a working example to crib from.

See how episodic, semantic, working and procedural memory categories can be wired into an agent in code, with a working example to adapt.

Advanced · Automations
7 minutes

Memory for agents (conceptual video)

LangChain. Short, no-code walkthrough of the short-term-vs-long-term split, the three shapes long-term memory tends to take (instructions, profile, list of objects), and the hot-path-versus-background trade-off for when to write. The article's memory-architecture section assumes exactly this taxonomy.

Separate short-term and long-term memory decisions and decide when agent memory should be written.

Advanced · Automations

Showing 24 of 157

Freshly reviewed

Recently checked videos and companion picks from the AI Expert library.