
🏗️ System Design Day 10

Topic: Consistent Hashing

Estimated reading time: 3 minutes


Scenario

Imagine you're designing a distributed cache (like a Redis cluster) with 10 servers storing data for millions of users.

One day, server #3 goes down. With simple modulo hashing, key % 10 becomes key % 9, so roughly 90% of keys map to a different server and must be reassigned!

Consistent hashing reassigns only about 1/N of the keys (N = number of servers): the ones the failed server owned. That's the magic.
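To see where the 90% figure comes from, here is a small sketch. It assumes keys are sequential integers and that the 9 surviving servers are simply renumbered 0 to 8; `moved_fraction` is an illustrative helper, not part of any library.

```python
def moved_fraction(num_keys: int, old_n: int, new_n: int) -> float:
    """Fraction of integer keys whose server index changes under key % N."""
    moved = sum(1 for key in range(num_keys) if key % old_n != key % new_n)
    return moved / num_keys


# Shrinking from 10 servers to 9 after one failure.
print(f"{moved_fraction(90_000, 10, 9):.0%} of keys change servers")  # prints "90% of keys change servers"
```

A key keeps its server only when key % 10 equals key % 9, which happens for 9 out of every 90 consecutive integers.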


Architecture Diagram

Hash Ring (0° to 360°)

                 0°
                 │
    Server A     │     Server B
     (90°)       │      (180°)
            ┌────┴────┐
      ──────┤  RING   ├──────
            └────┬────┘
    Server D     │     Server C
     (315°)      │      (270°)
                 │
                360°

Key "user:123" hashes to 210° → goes to Server C (next clockwise)
Key "user:456" hashes to 95°  → goes to Server B (next clockwise)

Virtual Nodes:
┌─────────────────────────────────────────┐
│ Physical: A B C D                       │
│ Virtual:  A1 B1 C1 D1 A2 B2 C2 D2 ...   │
│ (150 virtual nodes per physical node)   │
└─────────────────────────────────────────┘

Data Flow:

  1. Hash the key to a position on the ring
  2. Walk clockwise to the first server node
  3. Read from or write to that server
  4. On failure: only the failed server's keys migrate to the next node clockwise
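The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production ring: `HashRing`, `ring_hash`, and the `node#i` virtual-node labels are made-up names, and BLAKE2b from the standard library stands in for MurmurHash or FNV-1a, which would need a third-party package or a hand-written implementation.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a 64-bit ring position (BLAKE2b is a stdlib stand-in)."""
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class HashRing:
    """Minimal consistent-hash ring; names and defaults are illustrative."""

    def __init__(self, nodes=(), vnodes: int = 150):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (ring_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring entry at or after the key's position; wrap past the end.
        idx = bisect.bisect_left(self._ring, (ring_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["A", "B", "C", "D"])
keys = [f"user:{i}" for i in range(10_000)]
before = {k: ring.get_node(k) for k in keys}

ring.remove_node("C")  # simulate a failure
after = {k: ring.get_node(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Only keys that lived on C are reassigned; every other key stays put.
assert all(before[k] == "C" for k in moved)
print(f"{len(moved) / len(keys):.1%} of keys moved")
```

The sorted list plus `bisect` gives O(log V) lookups for V virtual nodes. Removing a node only deletes its ring entries, which is why keys owned by other nodes never change owner.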

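Key Tradeoffs

Why this design?

| Simple Hash | Consistent Hash |
|---|---|
| key % N: simple but fragile | Ring mapping: tolerates node changes |
| Adding or removing a node → large-scale reassignment | Adding or removing a node → only ~1/N of keys affected |
| Uneven hot spots are hard to handle | Virtual nodes balance the load |

Virtual Nodes:

Each physical node gets many virtual positions on the ring (typically 100-200), which smooths out uneven data distribution.

CAP Perspective:

Consistent hashing helps availability (A) when nodes fail or partitions occur (P), but consistency (C) still needs additional mechanisms such as quorum reads.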


Common Mistakes

Too few virtual nodes → uneven distribution → hot spots

Ignoring node weights → a new server with more memory should own more virtual nodes; otherwise powerful servers sit underutilized

Bad hash function → a poorly distributed hash (for example, summing character codes) clusters keys on the ring. MD5 distributes evenly but costs more CPU than non-cryptographic hashes.

✅ Use MurmurHash or FNV-1a with 150-200 virtual nodes per server as a sensible production starting point.
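A rough way to check the first two mistakes yourself is to build rings with different virtual-node counts and weights, then measure each server's share of sample keys. Everything here is an assumption for illustration: the helper names, the `user:i` sample keys, the 2x weight for server D, and BLAKE2b as a standard-library stand-in for MurmurHash or FNV-1a.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(value: str) -> int:
    """64-bit ring position; BLAKE2b is a stdlib stand-in hash."""
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def build_ring(weights: dict[str, int], base_vnodes: int) -> list[tuple[int, str]]:
    """Place base_vnodes * weight virtual nodes on the ring for each server."""
    ring = []
    for node, weight in weights.items():
        for i in range(base_vnodes * weight):
            ring.append((ring_hash(f"{node}#{i}"), node))
    ring.sort()
    return ring


def load_share(ring: list[tuple[int, str]], num_keys: int = 20_000) -> dict[str, float]:
    """Fraction of sample keys that land on each server."""
    positions = [pos for pos, _ in ring]
    counts = Counter()
    for i in range(num_keys):
        idx = bisect.bisect_left(positions, ring_hash(f"user:{i}")) % len(ring)
        counts[ring[idx][1]] += 1
    return {node: counts[node] / num_keys for node in sorted(counts)}


equal = {"A": 1, "B": 1, "C": 1, "D": 1}
print("1 vnode each:   ", load_share(build_ring(equal, 1)))
print("150 vnodes each:", load_share(build_ring(equal, 150)))

# Hypothetical: server D has twice the memory, so it gets twice the vnodes.
weighted = {"A": 1, "B": 1, "C": 1, "D": 2}
print("D weighted 2x:  ", load_share(build_ring(weighted, 150)))
```

With one virtual node per server the shares depend entirely on where four random points happen to fall, so expect them to be lopsided; with 150 each they should sit much closer to 25%.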


Real-World Usage

  • Amazon DynamoDB: internal partition routing
  • Apache Cassandra: token-based consistent hashing
  • Memcached / Twemproxy: client-side consistent hashing
  • Nginx upstream hash: hash $request_uri consistent


🧒 ELI5

Imagine kids standing in a circle, each responsible for a range of colors. When a toy arrives, check its color and walk clockwise to the kid responsible for it. If one kid leaves, only that kid's toys need reassigning. Everyone else stays put!


💻 Algorithms Day 11

#167 Two Sum II — Input Array Is Sorted · 🟡 Medium

Estimated reading time: 4 minutes


🧩 Two Pointers Pattern (2/5): Building on the Day 10 Template

This is the 2nd problem in our Two Pointers block (5 total). The previous problem, Valid Palindrome, used two pointers to check for palindromes; today we use the same framework to find a pair in a sorted array.

All 5 problems:

  1. ✅ #125 Valid Palindrome (Easy) — Day 10
  2. 👈 #167 Two Sum II (Medium) — TODAY
  3. #15 3Sum (Medium)
  4. #11 Container With Most Water (Medium)
  5. #42 Trapping Rain Water (Hard)

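Template Recap:

def two_pointer_pair(arr, target):
    """Return 0-indexed positions of the pair in sorted arr that sums to target, or None."""
    left, right = 0, len(arr) - 1
    while left < right:
        total = arr[left] + arr[right]
        if total == target:
            return [left, right]
        elif total < target:
            left += 1    # need a bigger sum
        else:
            right -= 1   # need a smaller sum
    return None  # no pair found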

vs. Valid Palindrome:

| | Valid Palindrome | Two Sum II |
|---|---|---|
| Move when | chars match (a mismatch means not a palindrome) | sum ≠ target |
| Shrink | both sides toward the middle | whichever side adjusts the sum |
| Core | compare chars | adjust sum magnitude |

Problem

🔗 LeetCode #167 · 🟡 Medium

📹 NeetCode Video

Real-World Analogy:

You have a sorted price list and want to find the two items whose prices add up to exactly your budget, target.

Problem:

Given a 1-indexed array sorted in non-decreasing order, find two numbers that sum to target and return their 1-indexed positions. Each input has exactly one solution.

Input:  numbers = [2, 7, 11, 15], target = 9
Output: [1, 2]  (numbers[0] + numbers[1] = 2 + 7 = 9)

💡 Mapping to Template

arr[left] + arr[right] in the template corresponds to numbers[left] + numbers[right] here.

Why does sorting enable two pointers?

Key insight: with a sorted array, if `sum < target`, we **know** we need a bigger value → move the left pointer right. If `sum > target`, we need a smaller value → move the right pointer left. Unsorted arrays can't support this reasoning!

🐍 Python Solution + Trace

def twoSum(numbers: list[int], target: int) -> list[int]:
    left, right = 0, len(numbers) - 1  # start at both ends

    while left < right:                 # stop when the pointers meet
        current_sum = numbers[left] + numbers[right]

        if current_sum == target:
            return [left + 1, right + 1]  # convert to 1-indexed
        elif current_sum < target:
            left += 1   # need a bigger number
        else:
            right -= 1  # need a smaller number

    return []  # unreachable when a solution is guaranteed

Trace with numbers = [2, 7, 11, 15], target = 9:

Step 1: left=0, right=3 → 2+15=17 > 9 → right=2
Step 2: left=0, right=2 → 2+11=13 > 9 → right=1
Step 3: left=0, right=1 → 2+7=9 == 9 → return [1, 2] ✅

Complexity:

  • ⏱ Time: O(n), since the two pointers together move at most n - 1 times
  • 💾 Space: O(1), no extra data structures

vs. Brute Force: nested loops take O(n²). The gap grows with n: for n = 10,000, two pointers make at most about 10^4 moves, while brute force may check up to about 5 × 10^7 pairs.
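To make the comparison concrete, the sketch below counts how many pair checks each approach performs on a worst-case input for brute force, where the answer is the last pair tried. The function names and the `checks` counters are illustrative additions, not part of the LeetCode solution.

```python
def two_sum_brute(numbers: list[int], target: int) -> tuple[list[int], int]:
    """Nested-loop search; returns (1-indexed answer, number of pair checks)."""
    checks = 0
    for i in range(len(numbers)):
        for j in range(i + 1, len(numbers)):
            checks += 1
            if numbers[i] + numbers[j] == target:
                return [i + 1, j + 1], checks
    return [], checks


def two_sum_pointers(numbers: list[int], target: int) -> tuple[list[int], int]:
    """Two-pointer search; returns (1-indexed answer, number of pair checks)."""
    left, right = 0, len(numbers) - 1
    checks = 0
    while left < right:
        checks += 1
        current_sum = numbers[left] + numbers[right]
        if current_sum == target:
            return [left + 1, right + 1], checks
        elif current_sum < target:
            left += 1
        else:
            right -= 1
    return [], checks


n = 2_000
numbers = list(range(n))        # 0, 1, ..., n - 1 (sorted, distinct)
target = (n - 2) + (n - 1)      # only the two largest values reach this sum

print(two_sum_brute(numbers, target))     # ([1999, 2000], 1999000)
print(two_sum_pointers(numbers, target))  # ([1999, 2000], 1999)
```

On this input brute force needs n(n-1)/2 checks while two pointers need n - 1, a ratio of n/2 that keeps growing as the array gets longer.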


Pattern Connections

Within this pattern block:

  • #15 3Sum (next): same two-pointer idea plus an outer loop. Fix one element, then two-pointer the rest.
  • #11 Container With Most Water: left and right move based on which height is smaller. Same structure!
  • #42 Trapping Rain Water: two pointers plus a running max from each side. The most complex variation.
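As a preview of how the same skeleton carries over, here are minimal sketches of the next two problems. These follow the standard two-pointer approaches and are meant as illustrations, not as the write-ups that later days will present; the function names are our own.

```python
def three_sum(nums: list[int]) -> list[list[int]]:
    """#15: fix one element, then run Two Sum II on the rest (target = 0)."""
    nums = sorted(nums)
    result = []
    for i in range(len(nums) - 2):
        if i > 0 and nums[i] == nums[i - 1]:
            continue  # skip duplicate anchors
        left, right = i + 1, len(nums) - 1
        while left < right:
            total = nums[i] + nums[left] + nums[right]
            if total == 0:
                result.append([nums[i], nums[left], nums[right]])
                left += 1
                right -= 1
                while left < right and nums[left] == nums[left - 1]:
                    left += 1  # skip duplicate second elements
            elif total < 0:
                left += 1
            else:
                right -= 1
    return result


def max_area(height: list[int]) -> int:
    """#11: always move the pointer at the shorter line inward."""
    left, right = 0, len(height) - 1
    best = 0
    while left < right:
        best = max(best, (right - left) * min(height[left], height[right]))
        if height[left] < height[right]:
            left += 1
        else:
            right -= 1
    return best


print(three_sum([-1, 0, 1, 2, -1, -4]))       # [[-1, -1, 2], [-1, 0, 1]]
print(max_area([1, 8, 6, 2, 5, 4, 8, 3, 7]))  # 49
```

Both keep the `while left < right` loop from the template; only the rule for which pointer moves changes.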

Recognize these signals for two pointers:

  • ✅ Sorted array
  • ✅ "Find pair that sums to X"
  • ✅ O(1) space required
  • ✅ Palindrome check
  • ✅ "Remove duplicates in-place"


🧒 ELI5

You and a friend stand at opposite ends of a row of numbers and call out the sum of your two numbers. Too small → the left person steps right (bigger number). Too big → the right person steps left (smaller number). Exact match → you win!

🗣️ Soft Skills Day 10

Topic: Proactiveness (Finding Problems Proactively)

"Tell me about a time you identified and solved a problem before others noticed"

Estimated reading time: 2 minutes


Why This Matters

This is one of the core questions that distinguishes senior engineers from junior ones.

Junior: waits for tasks to be assigned and escalates problems when they surface.

Senior: proactively monitors system health, finds issues before they blow up, and gets them fixed early.

What interviewers want to hear: proactivity, systems thinking, quantified impact.


STAR Breakdown

✅ Strong Answer Structure

Situation:

"While doing a routine code review of our payment service, I noticed a pattern that looked fine but was actually dangerous: a race condition that would cause duplicate charges under high concurrency."

Task:

"No one had flagged it, and there were no alerts. But I knew that if it went unaddressed, it would almost certainly trigger during peak traffic like Black Friday."

Action:

"I first wrote a reproduction script, using k6 to simulate concurrent requests and prove the bug existed. Then I proposed three fix options and evaluated the performance impact of each, aligned with the PM to delay a minor feature release so the fix could go first, and finally resolved it with database-level idempotency locks."

Result:

"During peak traffic two weeks later, our records showed 847 requests hitting the idempotency guard. We estimated $15K in prevented chargebacks and avoided potential payment-compliance issues."
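If the interviewer digs into the "database-level idempotency" part of the answer, it helps to be able to sketch the idea. The story above doesn't say how the guard was built; one common approach, assumed here, is a unique constraint on an idempotency key so the database itself rejects a duplicate charge. The table, column, and function names are hypothetical, and SQLite stands in for whatever database the payment service actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE charges ("
    "idempotency_key TEXT PRIMARY KEY, "
    "amount_cents INTEGER NOT NULL)"
)


def charge_once(conn: sqlite3.Connection, idempotency_key: str, amount_cents: int) -> bool:
    """Record a charge; return False if this key was already processed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO charges (idempotency_key, amount_cents) VALUES (?, ?)",
                (idempotency_key, amount_cents),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate request hit the guard


print(charge_once(conn, "order-42", 1999))  # True: first attempt is recorded
print(charge_once(conn, "order-42", 1999))  # False: the retry is rejected
```

Because the uniqueness check happens inside the database, two concurrent requests with the same key cannot both succeed, which is what closes the race condition.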

❌ Bad vs ✅ Good

❌ Weak:

"I found a bug, reported it to my manager, and they fixed it."

→ No ownership, no impact: this is reactive, not proactive.

✅ Strong:

Show: how you proactively discovered the problem (not told) → quantified the potential risk → independently drove the fix → measured the impact

Senior/Staff Level Tips

🔥 Systematic vs. accidental proactiveness:

Ordinary proactiveness means stumbling onto a problem. Staff-level proactiveness builds systems for finding them:

  • Regularly audit monitoring and alert coverage
  • Maintain a tech-debt backlog and drive quarterly reviews
  • Lead Game Days / chaos engineering to expose hidden risks deliberately

🔥 Communicate risk early:

Don't just fix silently. After finding a problem, share the risk assessment and fix progress upward so decision-makers stay informed.


Key Takeaways

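  1. 🔍 Discover proactively: describe how you found it (code review, monitoring, reading logs, intuition)
  2. 📊 Quantify the risk: "if we don't fix this, the impact is X" is far more persuasive than "I think something's wrong"
  3. ⚙️ Drive it independently: show you can push a fix end to end without being chased
  4. 📈 Put numbers on the result: the loss prevented matters more than the technical details of the fix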


🧒 ELI5

It's like playing a game where everyone else is busy fighting monsters, but you spot a hidden trap on the map early. You steer around it before your teammates fall in, and you tell everyone it's there. That's proactiveness!


🎨 Frontend Day 10

Topic: React useEffect — Side Effects & Cleanup

Estimated reading time: 2 minutes


Real Scenario

You're building a real-time stock dashboard. You need to:

  1. Subscribe to a WebSocket stream when the component mounts
  2. Unsubscribe when the component unmounts (otherwise the connection leaks memory)
  3. Switch subscriptions when the ticker changes

This is a classic use case for useEffect.


Code Example

import { useEffect, useState } from 'react';

interface StockData {
  price: number;
  change: number;
}

function StockTicker({ symbol }: { symbol: string }) {
  const [data, setData] = useState<StockData | null>(null);

  useEffect(() => {
    // 1. Setup: runs after render
    console.log(`Subscribing to ${symbol}`);
    const ws = new WebSocket(`wss://stocks.example.com/${symbol}`);
    
    ws.onmessage = (event) => {
      setData(JSON.parse(event.data));
    };

    // 2. Cleanup: runs before next effect OR on unmount
    return () => {
      console.log(`Unsubscribing from ${symbol}`);
      ws.close(); // ← THIS IS CRITICAL
    };
  }, [symbol]); // 3. Dependency array: re-run when symbol changes

  if (!data) return <div>Loading...</div>;
  return <div>{symbol}: ${data.price} ({data.change}%)</div>;
}

What's the output order?

Scenario: symbol changes from "AAPL" to "GOOG"

A) "Subscribing to GOOG"
B) "Unsubscribing from AAPL" → "Subscribing to GOOG"
C) "Subscribing to GOOG" → "Unsubscribing from AAPL"
D) Nothing is printed

<details>

<summary>Show Answer</summary>

The answer is B: "Unsubscribing from AAPL" prints first, then "Subscribing to GOOG".

React's execution order:

  1. The symbol prop changes → re-render
  2. React runs the previous effect's cleanup (closes the old WebSocket)
  3. React runs the new effect (opens the new WebSocket)

That is why cleanup is returned from the effect: React calls it at the right time.

</details>


❌ Common Mistakes vs ✅ Correct Approach

❌ Forgetting cleanup:

useEffect(() => {
  const interval = setInterval(fetchData, 1000);
  // ❌ No cleanup! Interval runs forever after unmount
}, []);

✅ Always clean up:

useEffect(() => {
  const interval = setInterval(fetchData, 1000);
  return () => clearInterval(interval); // ✅
}, []);

❌ Missing dependencies:

useEffect(() => {
  fetchUser(userId); // ❌ userId used but not in deps
}, []); // stale closure! always fetches original userId

✅ Correct dependencies:

useEffect(() => {
  fetchUser(userId); // ✅
}, [userId]); // re-runs whenever userId changes

Three useEffect Patterns

// 1. Run ONCE on mount (componentDidMount equivalent)
useEffect(() => {
  initAnalytics();
  return () => cleanup(); // runs on unmount
}, []); // empty deps = run once

// 2. Run on every render (rarely needed)
useEffect(() => {
  document.title = `Count: ${count}`;
}); // no deps array = every render

// 3. Run when specific values change
useEffect(() => {
  fetchUserProfile(userId);
}, [userId]); // run when userId changes

When to Use / When NOT to

✅ Good fits for useEffect:

  • API data fetching (preferably wrapped by React Query or SWR)
  • WebSockets and event listeners
  • Third-party library integration (maps, charts)
  • Browser APIs (localStorage, document.title)

❌ Don't use useEffect for:

  • Derived state (compute it during render, with useMemo if the computation is expensive)
  • Event handling (put the logic in the event handler)
  • Data transformation for rendering (calculate it directly in the component)

React Team's Take:

"You Might Not Need an Effect": many Effects can be eliminated. Think twice before reaching for one.

📚 References

  1. React Docs: Synchronizing with Effects
  2. React Docs: You Might Not Need an Effect
  3. Dan Abramov: A Complete Guide to useEffect

🧒 ELI5

useEffect is like a room sensor that turns the light on when you enter (the component mounts) and off when you leave (the component unmounts). When you switch rooms (deps change), it turns off the old room's light before turning on the new one.


🤖 AI Day 11 — News Roundup

March 26, 2026

Estimated reading time: 2 minutes


📰 This Week in AI

Sources: Web search results from March 2026


1. 🚀 OpenAI Releases GPT-5.4: The "Digital Co-Worker" Era Arrives

Source: riskinfo.ai | juliangoldie.co.uk

GPT-5.4 launched March 5 with native computer-use capabilities: the AI can directly operate real software environments (Excel, documents, web pages) rather than only generating text, moving toward a "digital co-worker" role. It reached 91% accuracy on a legal-document benchmark.

Why you should care:

For engineers, this means AI agents are shifting from question-answering tools to co-workers that can operate a GUI on their own. Future AI coding assistants may edit files, run tests, and open PRs directly in your IDE instead of only offering suggestions.


2. 🛡️ Anthropic Refuses Military Contract, Is Designated a "Supply Chain Risk"

Source: radicaldatascience.wordpress.com

The US Department of Defense asked Anthropic to remove Claude's safety guardrails prohibiting use in autonomous weapons. After Anthropic refused, the DoD designated the company a "supply chain risk". It is one of the most direct conflicts yet between AI safety and national security.

Why you should care:

The value choices AI companies make now carry real commercial consequences. This standoff will shape where the "red lines" for future AI systems are drawn.


3. 🎮 NVIDIA GTC 2026: AI Factories and Edge Computing Lead the Next Wave

Source: nvidianews.nvidia.com | vtnetzwelt.com

At GTC on March 16, NVIDIA promoted the "AI factory" concept: compute centers that both produce AI tokens and act as flexible grid assets that modulate power use. It also released Nemotron 3 Super, designed for complex agentic systems.

Why you should care:

AI infrastructure cost is one of the industry's biggest variables. Integrating AI factories with the energy grid could significantly lower compute costs and make more AI features viable.


4. 📊 76% of Enterprises: Ready for AI Agents? Not Yet

Source: ey.com

EY's 2026 AI Sentiment Report: 76% of enterprises admit their operations are not yet ready to support agentic AI. The main blockers are unstructured workflows, unclear context handoffs, and trust gaps.

Why you should care:

This directly affects where you should focus as an engineer. The most valuable skill over the next 1-3 years: designing system architectures that work well with AI agents, with clear API interfaces, auditable workflows, and idempotent operations.




🧒 ELI5

This week in AI: someone built a super-capable robot helper that can actually use your computer for you; one company refused to let its AI be turned into a weapon and fell out of favor with the government; and a survey found most companies want AI helpers but haven't tidied the house for them yet.