Imagine you're at a startup. Your entire product lives in one codebase. As you scale to millions of users, you face the classic question: should you break it apart into microservices? When? How?
Monolith = one kitchen that cooks everything. Simple, fast to start. Microservices = a food court where each stall specializes. Great at scale, but way more management. Start with one kitchen; split when it gets too crowded.
💻 Algorithms Day 14
#42 Trapping Rain Water (Hard) — Two Pointers
🧩 Two Pointers (5/5) — building on the template from earlier days in this block
Key twist vs earlier problems: instead of comparing sums/areas, we compare leftMax and rightMax. The side with the smaller max can be finalized because its limiting wall is known.
- If leftMax < rightMax: the water level on the left is already determined (capped by leftMax), so compute the water at position l and advance l += 1.
- Otherwise: handle the right side symmetrically.
(The code compares height[l] with height[r]; when height[l] < height[r] the running right max is at least height[r] > height[l], so left_max is the binding limit — same invariant.)
Python code
from typing import List

class Solution:
    def trap(self, height: List[int]) -> int:
        l, r = 0, len(height) - 1
        left_max, right_max = 0, 0
        water = 0
        while l < r:
            if height[l] < height[r]:
                # left side is bounded by left_max
                if height[l] >= left_max:
                    left_max = height[l]
                else:
                    water += left_max - height[l]
                l += 1
            else:
                # right side is bounded by right_max
                if height[r] >= right_max:
                    right_max = height[r]
                else:
                    water += right_max - height[r]
                r -= 1
        return water
🔍 Quick trace
Example: [0,1,0,2,1,0,1,3,2,1,2,1]
Start: l=0, r=11, left_max=0, right_max=0, water=0
height[0]=0 < height[11]=1 → settle the left: height[0] >= left_max, so left_max stays 0, l=1
height[1]=1 vs height[11]=1 → not smaller, take the else branch: right_max=1, r=10
height[1]=1 < height[10]=2 → settle the left: left_max=1, l=2
height[2]=0 < left_max=1 → water += 1-0 = 1, l=3
... continuing this way, the total accumulates to water=6
Why it works: the side with the smaller boundary max is the limiting factor, so we can safely finalize water there without knowing the exact interior structure.
Imagine filling water between blocks. A spot’s water level is capped by the shorter of the tallest block on its left and right. Two pointers walk inward, tracking those tallest blocks and adding water as you go.
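That "shorter of the two tallest walls" rule can be cross-checked directly with an O(n²) brute force that computes min(maxLeft, maxRight) − height[i] at every position (the helper name here is illustrative, not from the original solution):

```python
# Brute force: water at i is capped by the shorter of the tallest
# walls to its left and right (both sides include position i itself).
def trap_bruteforce(height):
    return sum(
        max(0, min(max(height[:i + 1]), max(height[i:])) - height[i])
        for i in range(len(height))
    )

print(trap_bruteforce([0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1]))  # 6
```

It agrees with the two-pointer answer on the traced example, which is a handy way to test the O(n) version against random inputs.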
🗣️ Soft Skills Day 14
Tell me about a time you drove a large cross-team initiative
For cross-team initiatives, the hardest part is rarely the technical design—it’s alignment, dependencies, cadence, and communication overhead. Interviewers want evidence you can lead without formal authority.
⭐ STAR structure (aim for a 90-second answer)
S — Situation: What was the initiative? How large was its scope and impact? Which teams were involved?
T — Task: What did you own specifically? What were the goals and success criteria (SLOs, migration percentage, cost, launch date)?
A — Action: What did you personally do to drive it: alignment, decision-making, cadence, unblocking dependencies?
R — Result: What was the measurable outcome, and what did you learn?
It’s like a big group project: first define what “success” means, split work and owners, check progress regularly, test the risky parts early, and keep everyone moving together.
🤖 AI Day 14
LoRA & QLoRA — Efficient Fine-Tuning
Mode: CONCEPT · Category: Training · Read time: 2 min
Full fine-tuning is rewriting the whole book: powerful but expensive and risky. LoRA freezes the base model and learns small low-rank “adapter” matrices (patches). QLoRA further quantizes the base model (e.g., 4-bit) to cut VRAM dramatically while still training LoRA adapters.
LoRA factorizes the weight update as ΔW = B·A, where B (d×r) and A (r×k) have small rank r. You train only B and A (far fewer parameters). QLoRA quantizes the base weights (often 4-bit) and trains LoRA adapters on top, using careful compute dtypes/quantization tricks to keep quality.
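To make the savings concrete, here is a back-of-the-envelope parameter count for a single 4096×4096 layer at rank r = 8 (illustrative numbers, not taken from any specific model):

```python
d, k, r = 4096, 4096, 8      # illustrative layer shape and LoRA rank

full_params = d * k          # entries updated by full fine-tuning of this one layer
lora_params = d * r + r * k  # trainable entries in B (d x r) and A (r x k)

# B @ A still produces a full (d, k) update, but with 256x fewer trainables
print(full_params, lora_params, full_params // lora_params)  # 16777216 65536 256
```

The product B·A has the full layer's shape, so the adapter can nudge every weight while training only a sliver of the parameters.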
✅ When to use
- You need task/domain/style adaptation on a budget
- You want modular adapters you can swap (one base, many LoRAs)
- You’re constrained by VRAM (single GPU / smaller GPUs)
🧪 Runnable snippet (≤15 lines)
The snippet below shows the minimal idea of loading a LoRA adapter (training runs are usually longer and involve more code).
# pip install -U transformers peft torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "gpt2"                    # demo base model
lora_path = "./my_lora_adapter"  # your saved LoRA adapter folder

tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)
model = PeftModel.from_pretrained(model, lora_path)

prompt = "Write a short product update:"
print(tok.decode(model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=40)[0]))
📚 References
LoRA paper (arXiv): https://arxiv.org/abs/2106.09685
QLoRA paper (arXiv): https://arxiv.org/abs/2305.14314
Hugging Face PEFT docs: https://huggingface.co/docs/peft/index
LoRA is a small attachable add-on that changes behavior without rewriting the whole brain. QLoRA compresses the brain to save memory, then still learns those small add-ons.