Reasoning-models

Published on
2026년 4월 15일
LLM 완전 가이드 — Transformer·Attention·RLHF·RAG·Agent·Evaluation (Season 2 Ep 6, 2025)
llm transformer attention rlhf dpo rag agent evaluation ai-engineering reasoning-models season-2
LLM을 "프롬프트에 답하는 블랙박스"로만 쓰면 임계점에서 막힌다. Transformer의 Attention이 실제로 어떻게 토큰 관계를 계산하는지, Pre-training → SFT → RLHF → DPO 파이프라인이 왜 이 순서로 설계됐는지, RAG 1/2/3세대의 차이와 Agentic RAG의 본질, Agent 설계(ReAct, Plan-and-Execute, Multi-Agent)의 근본 패턴, 그리고 LLM 평가가 왜 미해결 문제인지까지 — 블랙박스를 뜯어보는 한 편. Season 2의 여섯 번째, 2025년 엔지니어의 필수 교양.
Published on
2026년 4월 12일
OpenAI RFT with Custom Graders: A Practical Guide for Product and Platform Teams
ai-platform openai rft reinforcement-fine-tuning custom-graders evals reasoning-models 2026-04 2026-04-12
A practical guide to OpenAI reinforcement fine-tuning with custom graders, including when to use it, how to prepare data, how to evaluate checkpoints, and how to roll it out safely.

LLM 완전 가이드 — Transformer·Attention·RLHF·RAG·Agent·Evaluation (Season 2 Ep 6, 2025)