Evaluations in Agentic Workflows – n8n Builders Berlin (Live Demo)
🔗 Video link: https://www.youtube.com/watch?v=1a-bYvfKoLc
🆔 Video ID: 1a-bYvfKoLc
📅 Published: 2025-11-13T14:56:00Z
📺 Channel: n8n
⏱️ Duration (ISO): PT27M47S
⏱️ Duration (formatted): 00:27:47
📊 Statistics:
– Views: 499
– Likes: 31
– Comments: 2
🏷️ Tags:
Recorded at the Advanced Track of n8n Builders Berlin, this talk features JP van Oosten, who leads the AI team at n8n, explaining how he uses evaluations to make AI workflows more reliable.
In the talk, JP walks through:
➡️ Why evaluations matter for AI workflows
➡️ How to handle inconsistent LLM outputs, context drift, and edge cases
➡️ Using evaluations while building, before deployment, and in production
➡️ Comparing models and prompts (including simple A/B-style comparisons)
➡️ Tracking metrics like correctness, helpfulness, and token usage (a toy summary is sketched after this list)
➡️ Working with evaluation triggers, data tables, and metrics inside n8n
➡️ Using LLM-as-a-judge with reference answers (also sketched below)
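The A/B-style comparison and metric tracking mentioned above boil down to simple aggregation over per-case results. A minimal TypeScript sketch, assuming an illustrative result shape (variant, correct, tokens) rather than n8n's actual data-table schema:

```typescript
// Toy A/B summary over evaluation results. Field names are illustrative,
// not n8n's actual data-table schema.
type EvalRow = { variant: "A" | "B"; correct: boolean; tokens: number };

function summarize(rows: EvalRow[]): void {
  for (const variant of ["A", "B"] as const) {
    const subset = rows.filter((r) => r.variant === variant);
    const passRate = subset.filter((r) => r.correct).length / subset.length;
    const avgTokens = subset.reduce((sum, r) => sum + r.tokens, 0) / subset.length;
    console.log(
      `Prompt ${variant}: pass=${(passRate * 100).toFixed(1)}%, avg tokens=${avgTokens.toFixed(0)}`,
    );
  }
}

// Example run with made-up numbers:
summarize([
  { variant: "A", correct: true, tokens: 420 },
  { variant: "A", correct: false, tokens: 380 },
  { variant: "B", correct: true, tokens: 510 },
  { variant: "B", correct: true, tokens: 495 },
]);
```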
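For LLM-as-a-judge with reference answers, the idea is to have a second model grade each workflow output against a known-good answer. A hedged sketch of that pattern; the endpoint URL, model name, and response field are placeholders, not a real API:

```typescript
// Minimal LLM-as-a-judge sketch. The endpoint, model name, and response
// shape are placeholders; swap in your provider's actual chat API.
type Verdict = { correct: boolean; rationale: string };

async function judgeAnswer(
  question: string,
  candidate: string,
  reference: string,
): Promise<Verdict> {
  const prompt = [
    "You are a strict evaluator. Compare the candidate answer to the reference.",
    `Question: ${question}`,
    `Reference answer: ${reference}`,
    `Candidate answer: ${candidate}`,
    'Respond with JSON only: {"correct": true|false, "rationale": "..."}',
  ].join("\n");

  // Hypothetical HTTP call to a chat-completion endpoint.
  const res = await fetch("https://api.example.com/v1/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "judge-model", input: prompt }),
  });
  const body = await res.json();
  return JSON.parse(body.output) as Verdict;
}
```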
Good for: anyone building AI workflows or agents in n8n who wants a clearer, more systematic way to test and monitor them.
Chapters:
00:00 Intro
02:00 Why AI Evaluations Matter
04:50 Evaluation Methods
06:22 How to Use Evaluations in n8n
06:49 Pre-Deployment Checks
07:41 Monitoring in Production
11:20 Live Demo
21:05 Q&A
👤 Connect with JP on LinkedIn: https://www.linkedin.com/in/jpvoosten/
#n8n #aiworkflows #AIEvaluations #automation #workflowautomation #llm #aiagents