
🏗️ System Design Day 8

Topic: Database Indexing & Query Optimization

Category: Fundamentals · Beginner · Foundation Phase


🌍 Real-World Scenario

Imagine you're designing Twitter's search feature. A user looks up a tweet in a table of 500 million records. Without an index, the database must scan row by row, which can take minutes. With an index, the same query can return in milliseconds.


🏛️ ASCII Architecture Diagram

User Query: "SELECT * FROM tweets WHERE user_id = 42 AND created_at > '2026-01-01'"

WITHOUT INDEX:                 WITH INDEX:
┌─────────────────────┐        ┌─────────────────────┐
│ Full Table Scan     │        │ B-Tree Index        │
│ Row 1: user_id=1    │        │ (user_id, date)     │
│ Row 2: user_id=15   │        │        Root         │
│ Row 3: user_id=42   │        │       /    \        │
│ ...                 │        │    Node    Node     │
│ Row 500M: ???       │        │    /  \    /  \     │
│ ❌ 500M reads       │        │   L1  L2  L3  L4    │
└─────────────────────┘        │ ✅ ~log(N) reads    │
                               └─────────────────────┘

Index Storage:
┌──────────┬──────────┬─────────────────┐
│ user_id  │ date     │ row_pointer →   │
│ 42       │ 2026-01  │ page 1042, r3   │
│ 42       │ 2026-02  │ page 2891, r7   │
└──────────┴──────────┴─────────────────┘

⚖️ Key Tradeoffs (Why design it this way?)

Indexes Speed Reads, Slow Writes

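| Metric          | Without Index        | With Index                         |
|-----------------|----------------------|------------------------------------|
| SELECT query    | O(N) full table scan | O(log N) B-Tree traversal          |
| INSERT / UPDATE | Fast ⚡              | Slower (index must be maintained)  |
| Storage         | -                    | Larger (the index takes space)     |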

Why B-Tree? It keeps data sorted, so it supports range queries (BETWEEN, >) as well as equality lookups, which fits the vast majority of business use cases.

Column order in composite indexes matters:

-- Index on (user_id, created_at)
-- ✅ Can use: WHERE user_id = 42 AND created_at > '2026-01-01'
-- ✅ Can use: WHERE user_id = 42
-- ❌ Cannot use: WHERE created_at > '2026-01-01' (alone)
-- Leftmost prefix rule!
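
The leftmost-prefix rule can be checked empirically. The sketch below is one way to do that, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in for the production database. The `tweets` schema, the index name `idx_user_date`, and the `plan` helper are illustrative assumptions, and other engines word their query plans differently.

```python
import sqlite3

# In-memory database; schema and index name are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tweets (id INTEGER PRIMARY KEY, user_id INTEGER,
                         created_at TEXT, body TEXT);
    CREATE INDEX idx_user_date ON tweets (user_id, created_at);
""")

def plan(sql):
    """Return SQLite's query plan for `sql` as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text

both = plan("SELECT * FROM tweets WHERE user_id = 42 AND created_at > '2026-01-01'")
user_only = plan("SELECT * FROM tweets WHERE user_id = 42")
date_only = plan("SELECT * FROM tweets WHERE created_at > '2026-01-01'")

for label, text in [("both columns", both),
                    ("user_id only", user_only),
                    ("created_at only", date_only)]:
    print(f"{label:16} -> {text}")
```

In SQLite's output, a plan that mentions `USING INDEX idx_user_date` means the index narrowed the lookup, while a plain `SCAN` means every row was visited.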

⚠️ Common Mistakes (Don't fall into these traps)

  1. Over-indexing: Adding an index to every column? Writes become painfully slow. Cases of a single INSERT taking 10 seconds have been seen in production.

  2. Function on an indexed column: WHERE YEAR(created_at) = 2026 can't use a plain index on created_at! Use a range instead: WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01'. (A half-open range also avoids dropping rows from late on December 31, which BETWEEN '2026-01-01' AND '2026-12-31' would miss on a timestamp column.)

  3. Ignoring EXPLAIN: If you never run EXPLAIN SELECT ..., how do you even know whether the index is used?

  4. N+1 query problem: Querying the database inside a loop means 100 queries where 1 JOIN would do.
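
Two of these mistakes are easy to reproduce on a laptop. The following is a minimal sketch, again assuming SQLite via Python's sqlite3 module; the tables, the index name `idx_tweets_created`, and the helper functions are hypothetical, and SQLite's `strftime` stands in for `YEAR()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tweets (id INTEGER PRIMARY KEY, user_id INTEGER,
                         created_at TEXT, body TEXT);
    CREATE INDEX idx_tweets_created ON tweets (created_at);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 101)])
# One tweet per user; odd ids land in 2026, even ids in 2025.
conn.executemany(
    "INSERT INTO tweets (user_id, created_at, body) VALUES (?, ?, ?)",
    [(i, f"{2025 + i % 2}-06-15 12:00:00", f"tweet {i}") for i in range(1, 101)],
)

def plan(sql):
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Function on the indexed column vs. an equivalent half-open range.
func_sql = "SELECT * FROM tweets WHERE strftime('%Y', created_at) = '2026'"
range_sql = ("SELECT * FROM tweets "
             "WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01'")
func_rows = conn.execute(func_sql).fetchall()
range_rows = conn.execute(range_sql).fetchall()
print("function:", plan(func_sql))
print("range:   ", plan(range_sql))

def n_plus_one():
    """One query for the user list, then one more query per user."""
    queries = 1
    total = 0
    for (uid,) in conn.execute("SELECT id FROM users").fetchall():
        total += conn.execute(
            "SELECT COUNT(*) FROM tweets WHERE user_id = ?", (uid,)
        ).fetchone()[0]
        queries += 1
    return total, queries

def single_join():
    """The same per-user counts fetched with a single JOIN."""
    rows = conn.execute(
        "SELECT u.id, COUNT(t.id) FROM users u "
        "LEFT JOIN tweets t ON t.user_id = u.id GROUP BY u.id"
    ).fetchall()
    return sum(count for _, count in rows), 1

print("N+1 (tweets, queries): ", n_plus_one())
print("JOIN (tweets, queries):", single_join())
```

Both query shapes return the same rows, but only the range form lets the planner search the index; likewise both counting strategies agree on the total while differing in round trips.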


🧒 ELI5 (Even a kid can follow)

An index is like a book's table of contents. Without it, you'd flip from page 1 to the last page to find "indexing." With a table of contents, you jump right to page 283. Database indexes do the same thing: instead of paging through all the data, they skip straight to what you need.


💻 Algorithms Day 8

Problem: #36 Valid Sudoku · 🟡 Medium

Pattern: Arrays & Hashing

🔗 LeetCode #36 · 📹 NeetCode Video


🌍 Real-World Analogy

Imagine you're a Sudoku referee. You don't need to solve the puzzle, only verify that the current state is valid (no duplicate numbers in any row, column, or 3×3 box). It's like checking a parking lot: you're not looking for empty spots, you're confirming that no two cars share the same space.


🧩 Problem

Given a 9×9 Sudoku board, determine whether it is valid. Rules:

  • No digit 1-9 repeats within a row
  • No digit 1-9 repeats within a column
  • No digit 1-9 repeats within a 3×3 box
  • Empty cells are represented by '.'


💡 Key Insight

Use hash sets to track the digits seen so far, covering all three structures in a single pass: rows, columns, and 3×3 boxes.

Key formula: which 3×3 box does cell (r, c) belong to? → box_id = (r // 3) * 3 + (c // 3)

Box IDs:
┌───────┬───────┬───────┐
│ 0 1 2 │ 3 4 5 │ 6 7 8 │  ← column index c
│ box 0 │ box 1 │ box 2 │
├───────┼───────┼───────┤
│ box 3 │ box 4 │ box 5 │
├───────┼───────┼───────┤
│ box 6 │ box 7 │ box 8 │
└───────┴───────┴───────┘

r=0,c=0: (0//3)*3+(0//3) = 0*3+0 = box 0 ✅
r=1,c=4: (1//3)*3+(4//3) = 0*3+1 = box 1 ✅
r=4,c=7: (4//3)*3+(7//3) = 1*3+2 = box 5 ✅
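
To see the formula cover the whole board rather than three sample cells, a small sketch can print the box id of every cell. The helper name `box_id` and the `grid` variable are illustrative, not part of the original solution.

```python
def box_id(r: int, c: int) -> int:
    """Map a cell (r, c) on a 9x9 board to its 3x3 box index, 0 through 8."""
    return (r // 3) * 3 + (c // 3)

# Box id for every cell, laid out like the board itself.
grid = [[box_id(r, c) for c in range(9)] for r in range(9)]
for row in grid:
    print(" ".join(str(b) for b in row))
```

Integer division by 3 collapses rows 0-2, 3-5, and 6-8 into band 0, 1, and 2; multiplying the band by 3 and adding the column band numbers the boxes left to right, top to bottom.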

🐍 Python Solution

from collections import defaultdict

def isValidSudoku(board):
    # Use sets for each row, col, box
    rows = defaultdict(set)   # rows[r] = set of digits seen in row r
    cols = defaultdict(set)   # cols[c] = set of digits seen in col c
    boxes = defaultdict(set)  # boxes[b] = set of digits seen in box b

    for r in range(9):
        for c in range(9):
            val = board[r][c]
            if val == '.':
                continue  # Skip empty cells

            box_id = (r // 3) * 3 + (c // 3)

            # Check for duplicates
            if val in rows[r] or val in cols[c] or val in boxes[box_id]:
                return False

            # Record this value
            rows[r].add(val)
            cols[c].add(val)
            boxes[box_id].add(val)

    return True

🔍 Code Trace

Using a small example focusing on the top-left 3×3 box (box 0):

board[0] = ["5","3",".",".","7",".",".",".","."]
board[1] = ["6",".",".","1","9","5",".",".","."]
board[2] = [".","9","8",".",".",".",".","6","."]

Processing r=0, c=0: val="5"
  box_id = (0//3)*3 + (0//3) = 0
  "5" not in rows[0]={}, cols[0]={}, boxes[0]={} → OK
  rows[0]={"5"}, cols[0]={"5"}, boxes[0]={"5"}

Processing r=0, c=1: val="3"
  box_id = (0//3)*3 + (1//3) = 0
  "3" not in rows[0]={"5"}, cols[1]={}, boxes[0]={"5"} → OK
  rows[0]={"5","3"}, cols[1]={"3"}, boxes[0]={"5","3"}

Processing r=1, c=0: val="6"
  box_id = (1//3)*3 + (0//3) = 0
  "6" not in rows[1]={}, cols[0]={"5"}, boxes[0]={"5","3"} → OK
  boxes[0]={"5","3","6"}

Processing r=2, c=1: val="9"
  box_id = (2//3)*3 + (1//3) = 0
  "9" not in boxes[0]={"5","3","6"} → OK

Processing r=2, c=2: val="8"
  box_id = 0
  "8" not in boxes[0]={"5","3","6","9"} → OK
  boxes[0]={"5","3","6","9","8"}

Final result: True ✅ (valid board)
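
The trace can be replayed end to end. The sketch below is a compact variant of the same idea that keeps one set of tagged tuples instead of three dictionaries; the name `is_valid_sudoku` is hypothetical, and padding the three traced rows with six empty rows is an assumption made only so the board is 9×9.

```python
def is_valid_sudoku(board):
    """Return True if no digit repeats in any row, column, or 3x3 box."""
    seen = set()
    for r in range(9):
        for c in range(9):
            val = board[r][c]
            if val == ".":
                continue  # empty cells never conflict
            keys = [("row", r, val),
                    ("col", c, val),
                    ("box", (r // 3) * 3 + c // 3, val)]
            if any(key in seen for key in keys):
                return False
            seen.update(keys)
    return True

traced_rows = [
    ["5", "3", ".", ".", "7", ".", ".", ".", "."],
    ["6", ".", ".", "1", "9", "5", ".", ".", "."],
    [".", "9", "8", ".", ".", ".", ".", "6", "."],
]
# Assumption: fill the remaining six rows with empty cells.
board = traced_rows + [["."] * 9 for _ in range(6)]
print(is_valid_sudoku(board))

# Put a second "3" into box 0 without touching row 2's or column 0's digits.
broken = [row[:] for row in board]
broken[2][0] = "3"
print(is_valid_sudoku(broken))
```

The second board differs from the first only by a duplicate inside box 0, so it isolates the box check from the row and column checks.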

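⏱️ Complexity

|            | Time                          | Space                                         |
|------------|-------------------------------|-----------------------------------------------|
| Complexity | O(9²) = O(81) = O(1)          | O(9²) = O(1)                                  |
| Notes      | Fixed 81 cells, constant time | At most 81 digits tracked (each in three sets) |

The board is always 9×9, so this is technically O(1): the input size is fixed!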


🔄 Pattern Recognition

Once you master the "hash set deduplication" pattern, you can apply it to:


🧒 ELI5

Imagine 9 sheets of paper, one per row. When you see a number, write it on that row's sheet. If the sheet already has that number, the board is invalid! Do the same for columns and 3×3 boxes.


🗣️ Soft Skills Day 8

Question: How do you approach working with ambiguous requirements?

Category: Ambiguity · Senior/Staff Level · Foundation Phase


🎯 Why This Matters

In big tech, ambiguity is the norm, not the exception. PMs can't think through every detail, and stakeholders don't always know what they really want. Navigating ambiguity gracefully is one of the core abilities that separates senior engineers from staff engineers.


⭐ STAR Breakdown

Situation: Describe a real scenario where the requirements were unclear.

Task: What were you assigned? What were you accountable for?

Action: What specific steps did you take to clarify the requirements and move the work forward?

Result: Quantify the impact. How much time was saved, what rework was avoided, which team was unblocked?


❌ Bad Answer vs ✅ Good Answer

❌ Bad Answer

"When requirements are unclear, I wait for the PM to clarify everything before starting."

Problem: Passive waiting shows no ownership or initiative, which is a red flag in interviews.


✅ Good Answer (structured)

S: We were designing a new notification system. The PM only said "users should receive important notifications," with no definition of "important" and no frequency or channel requirements.

T: I was the tech lead and had to deliver a technical plan within two weeks, but the requirements were too vague to start from.

A: I did three things:

  1. Wrote an assumptions list: documented my "default understanding" and sent it to the PM for confirmation ("I'm assuming notifications cover order status changes and system alerts. Is that right?")
  2. Separated reversible from irreversible decisions: channel choice (email/push) is easy to change and the database schema is hard to change, so I spent more alignment time on the hard-to-change parts
  3. Used a spike + timebox: spent one day on technical research to validate the assumptions, rather than discovering two weeks later that the direction was wrong

R: Delivered the plan 4 days early and avoided a potential database redesign caused by misunderstanding "important notifications" (an estimated 1.5 weeks of rework).


👑 Senior/Staff Tips

  1. Distinguish "critical & irreversible" ambiguity from "can start anyway." Not all ambiguity has to be resolved before you begin.

  2. Ask about constraints rather than desires. Instead of "What do you want?", ask "What are the constraints?" (deadline, budget, systems that can't be touched). "What can't change?" uncovers real requirements faster.

  3. Two-ask rule: If you've asked the same person the same question twice with no answer, change the format: write an RFC, schedule an alignment meeting, or escalate upward.

  4. Let data clarify: "Let's run a small experiment to validate the assumption" is more persuasive than "I'm not sure what we need."


🎯 Key Takeaways

  • 🔑 Ambiguity is daily life in engineering; proactive clarification is professional maturity
  • 🔑 Separate requirements that must be clarified from those you can handle with sensible defaults
  • 🔑 Write assumptions down, send them out for confirmation, and keep the record
  • 🔑 Act first, refine later; don't wait for 100% clarity

Move forward on reversible decisions, pause on irreversible ones.


🧒 ELI5

If a teacher says "draw something beautiful" without details, the smart move is to ask: "Can it be an animal? What colors? How big?" Then start drawing and adjust once it's done, instead of sitting frozen and waiting for the teacher to specify every brushstroke.


🎨 Frontend Day 8

Topic: CSS Animations & Transitions

Category: CSS Fundamentals · Week 2 · Foundation Phase


🤔 What's the Output?

.box {
  width: 100px;
  background: blue;
  transition: width 2s, background 0.5s;
}

.box:hover {
  width: 300px;
  background: red;
}

On hover, which change happens first (visually)?

A. Width changes first

B. Background changes first, then width

C. Both start together; background finishes first

D. Both start and finish at the same time

(Answer at the end)


📐 Transition vs Animation

Transition: "A to B"

A triggered change from one state to another.

/* Basic syntax */
.button {
  background: blue;
  transform: scale(1);

  /* property | duration | timing-function | delay */
  transition: background 0.3s ease-in-out,
              transform 0.2s ease;
}

.button:hover {
  background: darkblue;
  transform: scale(1.05); /* slight scale-up */
}

Animation: "Continuous Loop"

No trigger required; it can run automatically and repeat indefinitely.

/* Step 1: Define keyframes */
@keyframes pulse {
  0%   { transform: scale(1);    opacity: 1; }
  50%  { transform: scale(1.1);  opacity: 0.7; }
  100% { transform: scale(1);    opacity: 1; }
}

/* Step 2: Apply animation */
.loading-icon {
  /* name | duration | timing | delay | iteration | direction */
  animation: pulse 1.5s ease-in-out 0s infinite alternate;
}

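⏱️ Timing Functions: Giving Motion Soul

ease (default): slow start, speeds up, slow end
ease-in:        slow start, fast end → commonly used for elements leaving the view
ease-out:       fast start, slow end → commonly used for elements entering the view
ease-in-out:    slow at both ends, fast in the middle → feels the most natural
linear:         constant speed → good for spinning loaders
cubic-bezier(): fully custom!

Visual:
ease:     ___/‾‾‾
ease-in:  __/‾‾‾‾
ease-out: ‾‾‾‾\__
linear:   ///////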
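
The keyword timing functions are shorthand for specific cubic Bézier curves, so their behavior can be compared numerically. The sketch below evaluates a curve by bisection; the control points are the ones the CSS specification assigns to each keyword, while the `cubic_bezier` helper and its bisection approach are illustrative rather than how any browser necessarily implements easing.

```python
def cubic_bezier(x1, y1, x2, y2):
    """Build an easing function from CSS-style control points.

    The curve runs from (0, 0) to (1, 1); input is elapsed-time fraction t,
    output is progress fraction.
    """
    def coord(s, p1, p2):
        # One coordinate of the Bezier curve at curve parameter s.
        return 3 * (1 - s) ** 2 * s * p1 + 3 * (1 - s) * s ** 2 * p2 + s ** 3

    def ease(t):
        # x(s) is monotonic for control points in [0, 1], so bisect for s.
        lo, hi = 0.0, 1.0
        for _ in range(50):
            mid = (lo + hi) / 2
            if coord(mid, x1, x2) < t:
                lo = mid
            else:
                hi = mid
        return coord((lo + hi) / 2, y1, y2)

    return ease

linear = cubic_bezier(0.0, 0.0, 1.0, 1.0)
ease = cubic_bezier(0.25, 0.1, 0.25, 1.0)
ease_in = cubic_bezier(0.42, 0.0, 1.0, 1.0)
ease_out = cubic_bezier(0.0, 0.0, 0.58, 1.0)
ease_in_out = cubic_bezier(0.42, 0.0, 0.58, 1.0)

# Progress after 25%, 50%, and 75% of the duration.
for name, fn in [("linear", linear), ("ease", ease), ("ease-in", ease_in),
                 ("ease-out", ease_out), ("ease-in-out", ease_in_out)]:
    print(f"{name:12}", [round(fn(t), 3) for t in (0.25, 0.5, 0.75)])
```

Halfway through the duration, ease-in has covered less than half the distance and ease-out more than half, which is the "slow start" versus "slow end" difference described in the list.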

🔥 Real Code Comparison

❌ No Transition (Jarring)

.menu {
  display: none; /* Appears and disappears instantly: bad UX */
}

✅ Smooth transition with opacity + visibility

.menu {
  opacity: 0;
  visibility: hidden;
  transition: opacity 0.3s ease, visibility 0.3s ease;
}

.menu.active {
  opacity: 1;
  visibility: visible;
}
/* Note: toggling display: none does not animate with a plain transition. */
/* Use opacity + visibility instead. */

⚡ Performance Gotcha

/* ❌ Triggers layout reflow: slow! */
.bad { transition: width 0.3s, height 0.3s, margin 0.3s; }

/* ✅ Triggers only compositing: fast! */
.good { transition: transform 0.3s, opacity 0.3s; }
/*     transform and opacity can usually be handled by the compositor, */
/*     often on the GPU, so they are the cheapest properties to animate */

Rule: Prefer transform and opacity for animations; avoid width/height/margin/top/left, which trigger layout reflow.


🧩 Mini Challenge

Build a loading spinner in CSS using only @keyframes and border-radius.

/* Try completing this */
.spinner {
  width: 40px;
  height: 40px;
  border: 4px solid #f3f3f3;
  border-top: 4px solid #3498db;
  border-radius: 50%;
  /* Add rotation animation here */
  animation: ??? 1s linear infinite;
}

@keyframes ??? {
  /* Define rotation */
}

/* Answer:
animation: spin 1s linear infinite;
@keyframes spin {
  0%   { transform: rotate(0deg); }
  100% { transform: rotate(360deg); }
}
*/

📝 Quiz Answer Explanation

Correct answer: C. Both start together; background finishes first.

timeline (hover starts at t=0):
t=0ms:   ■ width transition starts (2000ms duration)
t=0ms:   ■ background transition starts (500ms duration)
t=500ms: ✅ background = red (DONE)
t=2000ms:✅ width = 300px (DONE)

With transition: width 2s, background 0.5s, both transitions start at the same moment on hover but have different durations: background completes in 500ms while width takes the full 2000ms.


🧒 ELI5

Transitions are like a dimmer switch: you flip it and the light slowly gets brighter or dimmer. Animations are like a spinning pinwheel: it keeps spinning on its own without you touching it.


🤖 AI Day 8

Topic: Training vs Fine-Tuning vs Prompting

Category: Foundations · Foundation Phase


🌍 Intuitive Explanation

Think of an LLM as a fresh college graduate:

  • Pre-training = completing a full undergraduate degree and picking up nearly all general knowledge (reading the entire internet)
  • Fine-tuning = attending a professional program (such as medical school) to specialize in one domain
  • Prompting = handing the graduate a work briefing that says what to do today and how

The three approaches differ sharply in cost, flexibility, and results.


⚙️ How It Works

1. Pre-Training (from scratch)

Massive text corpus          →  Foundation model
(internet + books + code)       (GPT-4, Llama, Claude)

Cost: millions of dollars + weeks of GPU time
Objective: predict the next token
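
"Predict the next token" can be made concrete at toy scale. The sketch below is a count-based bigram model, not how a neural LLM is trained; the tiny corpus and the helper names `train_bigram`, `next_token_probs`, and `avg_nll` are assumptions for illustration. It does use the same kind of objective, the average negative log-likelihood of each next token.

```python
import math
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each previous token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev):
    """Estimated probability of each next token after `prev`."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}

def avg_nll(counts, tokens):
    """Average negative log-likelihood of each next token (lower is better)."""
    pairs = list(zip(tokens, tokens[1:]))
    loss = sum(-math.log(next_token_probs(counts, prev)[nxt])
               for prev, nxt in pairs)
    return loss / len(pairs)

corpus = "the cat sat on the mat the cat ate".split()
model = train_bigram(corpus)

probs = next_token_probs(model, "the")
prediction = max(probs, key=probs.get)
print("P(next | 'the') =", probs)
print("most likely next token:", prediction)
print("training loss:", round(avg_nll(model, corpus), 3))
```

Pre-training a real model minimizes the same kind of loss, but with a neural network over trillions of tokens instead of a lookup table over nine words.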

2. Fine-Tuning (continue training an existing model)

Foundation model + domain data → Specialized model

Examples:
• GPT-4 + legal documents → legal assistant
• Llama + medical records → medical diagnosis model
• Code Llama = Llama + code datasets

Cost: hundreds to tens of thousands of dollars
Key parameter: keep the learning rate low (1e-5 to 1e-4) to prevent "catastrophic forgetting"
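
The link between learning rate and forgetting can be illustrated with a deliberately tiny example. The sketch below is a toy, not a real fine-tuning run: the "model" is a single weight `w`, task A and task B are made-up linear targets, and the learning rates are chosen for this toy rather than taken from the 1e-5 to 1e-4 range that applies to large models.

```python
def mse(w, xs, ys):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(w, xs, ys, lr, steps):
    """Plain full-batch gradient descent on the MSE loss."""
    for _ in range(steps):
        grad = sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

xs = [0.5, 1.0, 1.5, 2.0]
task_a = [2.0 * x for x in xs]  # "pre-training" task (hypothetical)
task_b = [3.0 * x for x in xs]  # "fine-tuning" task (hypothetical)

# Pre-train until the weight fits task A.
w_pre = gradient_descent(0.0, xs, task_a, lr=0.1, steps=200)

# Fine-tune on task B for the same few steps with two learning rates.
w_small = gradient_descent(w_pre, xs, task_b, lr=0.01, steps=5)
w_large = gradient_descent(w_pre, xs, task_b, lr=0.2, steps=5)

for name, w in [("pre-trained", w_pre), ("small lr", w_small), ("large lr", w_large)]:
    print(f"{name:12} w={w:.3f}  loss on A={mse(w, xs, task_a):.3f}  "
          f"loss on B={mse(w, xs, task_b):.3f}")
```

With the same step budget, the small learning rate stays close to the pre-trained weight and keeps task A's loss low, while the large one fits task B almost fully and loses task A. Real fine-tuning faces the same tension across billions of weights.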

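3. Prompting (near-zero cost, no training)

User: "You are a professional legal assistant, ..."
The base model adjusts its behavior based on the context
No training, just clever input formatting

Techniques:
• Zero-shot: ask directly
• Few-shot: give a few examples
• Chain-of-thought: "Think step by step..."
• RAG: plug in an external knowledge base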
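
RAG and chain-of-thought are both just ways of assembling the input. The sketch below shows the shape of that assembly without calling any model; the word-overlap `retrieve` function is a crude stand-in for a real vector search, and the documents, function names, and prompt wording are all hypothetical.

```python
def retrieve(question, documents, top_k=1):
    """Rank documents by how many lowercase words they share with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(documents,
                    key=lambda doc: len(q_words & set(doc.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def build_messages(question, documents):
    """Assemble chat messages: retrieved context plus a step-by-step cue."""
    context = "\n".join(retrieve(question, documents))
    return [
        {"role": "system",
         "content": "Answer using only this context:\n" + context},
        {"role": "user",
         "content": question + "\nThink step by step."},
    ]

# Hypothetical knowledge base for illustration.
knowledge_base = [
    "Refunds are processed within 5 business days.",
    "Shipping is free for orders over 50 dollars.",
]
messages = build_messages("How long do refunds take?", knowledge_base)
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list has the same structure as the `messages` argument used by chat-style APIs, so the model weights never change; only the context it sees does.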

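📊 Comparison Table

|             | Pre-train              | Fine-tune                                 | Prompting            |
|-------------|------------------------|-------------------------------------------|----------------------|
| Cost        | 💰💰💰💰 Very high     | 💰💰 Medium                               | 💰 Nearly free       |
| Time        | Weeks                  | Hours to days                             | Instant              |
| Data        | Trillions of tokens    | Hundreds to tens of thousands of examples | 0 to a few examples  |
| Performance | General foundation     | Best within a domain                      | Flexible but capped  |
| Updates     | Very hard              | Requires re-fine-tuning                   | Instant              |
| Use case    | Building a large model | Specialized domains                       | 99% of applications  |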

💻 Runnable Python Snippet

# pip install openai

from openai import OpenAI

client = OpenAI()  # Reads OPENAI_API_KEY from env

# ============================================================
# Technique 1: Zero-shot prompting
# ============================================================
zero_shot = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Classify this tweet as positive/negative/neutral: 'I love this new feature!'"}
    ]
)
print("Zero-shot:", zero_shot.choices[0].message.content)

# ============================================================
# Technique 2: Few-shot prompting (3 examples teach the format)
# ============================================================
few_shot = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Classify tweets. Answer with just: positive, negative, or neutral."},
        {"role": "user",   "content": "The food was amazing!"},
        {"role": "assistant", "content": "positive"},
        {"role": "user",   "content": "Worst customer service ever."},
        {"role": "assistant", "content": "negative"},
        {"role": "user",   "content": "Package arrived."},
        {"role": "assistant", "content": "neutral"},
        # Now the real question:
        {"role": "user",   "content": "I love this new feature!"},
    ]
)
print("Few-shot:", few_shot.choices[0].message.content)
# Output: "positive" — much more consistent format!

🎯 When to Use Which?

Your problem                                               Recommended approach
"I want to build a general-purpose chatbot"              → Prompting (system prompt)
"Our support bot must understand our proprietary jargon" → Fine-tuning (100-1000 examples)
"Make the model write code in our team's style"          → Fine-tuning or RAG + Prompting
"Build the next GPT-4 from scratch"                      → Pre-training (don't)

Rule of thumb: try prompting first → fine-tune if that falls short → never pre-train from scratch

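🧠 2026 Cutting Edge

  • LoRA / QLoRA: efficient fine-tuning that trains only a small number of extra parameters, which can bring many models within reach of a single consumer GPU (A100 → RTX 4090)
  • RLHF: reinforcement learning from human feedback, used to make models "safer" (one of ChatGPT's secrets)
  • DPO: Direct Preference Optimization, simpler than RLHF with comparable results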
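
The "very few parameters" claim for LoRA follows from simple arithmetic: instead of updating a full d×k weight matrix W, LoRA trains two thin matrices B (d×r) and A (r×k) and adds their product to the frozen W. The sketch below uses plain Python lists; the layer size 4096 and rank 8 are illustrative assumptions, and `lora_param_counts` and `lora_forward` are hypothetical helper names, not a library API.

```python
def lora_param_counts(d, k, r):
    """Trainable parameters: full update of a d x k matrix vs. rank-r LoRA."""
    full = d * k
    lora = r * (d + k)  # B is d x r, A is r x k
    return full, lora

def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, scale=1.0):
    """Compute W x + scale * B (A x); W stays frozen, only A and B train."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]

full, lora = lora_param_counts(d=4096, k=4096, r=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")

# Tiny forward pass: B starts at zero, so the adapted layer initially
# behaves exactly like the frozen base layer.
W = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]   # d=2, k=3
A = [[0.5, 0.5, 0.5]]                    # r=1, k=3
B_init = [[0.0], [0.0]]                  # d=2, r=1
x = [1.0, 1.0, 1.0]
print(lora_forward(W, A, B_init, x))

B_trained = [[1.0], [2.0]]               # pretend values after training
print(lora_forward(W, A, B_trained, x))
```

For this assumed layer the adapter holds well under 1% of the parameters a full update would touch, which is where the memory savings come from.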

🧒 ELI5

Imagine teaching a dog new tricks:

  • Pre-training = raising the dog from a puppy, so it learns all the basic skills
  • Fine-tuning = a month of dedicated training on one trick, "play dead"
  • Prompting = telling it before each command, "You're a very smart dog, and I'd like you to..."

The simplest approach is always to try talking first; only reach for dedicated training when that really isn't enough.