


[{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/en/authors/aaron/","section":"Authors","summary":"","title":"Aaron","type":"authors"},{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/en/tags/agent/","section":"Tags","summary":"","title":"Agent","type":"tags"},{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/en/tags/ai/","section":"Tags","summary":"","title":"AI","type":"tags"},{"content":"You can subscribe to all blog posts via RSS\n","date":"2026-03-23","externalUrl":null,"permalink":"/en/posts/","section":"All Posts","summary":"","title":"All Posts","type":"posts"},{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/en/authors/","section":"Authors","summary":"","title":"Authors","type":"authors"},{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/en/categories/","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/en/tags/harness-engineering/","section":"Tags","summary":"","title":"Harness Engineering","type":"tags"},{"content":" Introduction # Harness Engineering has been making waves across the AI community lately, showing up in tech blogs, podcasts, and conference talks everywhere. I spent a good amount of time digging into the concept, including public case studies from Anthropic and OpenAI, and the more I read, the more convinced I became that this is something anyone building Agents can\u0026rsquo;t afford to ignore. A common frustration when building Agents: the model is solid, the prompts have been iterated multiple times, yet the system still doesn\u0026rsquo;t run reliably. The root cause often lies not in the model itself, but in the runtime system wrapped around it. That outer layer now has a unified name: Harness. 
This article distills what I\u0026rsquo;ve learned, covering what Harness Engineering is and what core components it involves.\nThree Shifts in Focus # Over the past two years, AI engineering has gone through three distinct shifts: from Prompt Engineering to Context Engineering, and now to Harness Engineering. It might look like a string of buzzwords, but each corresponds to a core challenge in AI system development:\nDid the model understand what you\u0026rsquo;re asking? Did the model receive enough correct information? Can the model consistently do the right thing during real execution? These questions expand outward, layer by layer.\nPrompt Engineering: Say It Right # When large language models first took off, the most striking observation was that the same model could produce vastly different results depending on how you phrased things. \u0026ldquo;Summarize this article\u0026rdquo; gets you a flat summary; a more structured phrasing yields something much better.\nSo people went all-in on prompts: role-playing, style constraints, few-shot examples, step-by-step guidance, output formatting. Why do these work? Because a large language model is fundamentally a probability-based generation system that\u0026rsquo;s extremely sensitive to context. Give it a role, and it responds in character. Give it examples, and it follows the pattern. Emphasize constraints, and it treats them as priorities1.\nThe essence of Prompt Engineering isn\u0026rsquo;t commanding the model. It\u0026rsquo;s shaping a local probability space. The key skill at this stage is language design.\nBut Prompt Engineering soon hit a ceiling. Many tasks aren\u0026rsquo;t about saying things clearly; they require actual information. Analyzing internal company documents, answering questions about the latest product specs, writing code to a detailed specification, orchestrating across multiple tools. 
No matter how well-crafted the prompt, it can\u0026rsquo;t substitute for the facts themselves.\nPrompts are good at: Constraining output, activating the model\u0026rsquo;s existing capabilities, short-chain tasks. Not so good at: Filling knowledge gaps, managing dynamic information, handling state across long task chains.\nContext Engineering: Get the Information Right # When Agents gained traction, models were no longer just answering questions. They had to enter real environments and do things. Multi-turn conversations, browser automation, reading and writing code, operating databases, passing intermediate results between steps, revising plans based on feedback.\nThe system was no longer facing \u0026ldquo;did it answer this one question correctly?\u0026rdquo; but \u0026ldquo;can the entire task pipeline run end to end?\u0026rdquo; Take a real-world task: \u0026ldquo;Analyze this requirements document, identify potential risks, combine with historical review comments to produce improvement suggestions, then generate a feedback draft for the product manager.\u0026rdquo; This is far beyond what any single prompt can handle. It needs the current requirements document, historical review records, relevant specifications, current goals, intermediate analysis conclusions, the output recipient, tone requirements, and more.\nThe core of Context Engineering becomes one sentence: The model doesn\u0026rsquo;t necessarily know. The system must deliver the right information at the right time.\nContext here isn\u0026rsquo;t just background material. In engineering terms, it represents the sum of all information influencing the model\u0026rsquo;s current decision: user input, conversation history, retrieval results, tool outputs, task state, intermediate artifacts, system rules, safety constraints, and structured results from other Agents. Prompts are actually just one part of the context2.\nRAG is a classic Context Engineering practice. 
But mature context engineering goes far beyond retrieval: how to chunk documents, how to rank results, how to compress long texts, when to keep vs. summarize conversation history, whether to pass raw tool output to the model or process it first, whether to pass raw text or structured fields between Agents. All of this requires careful design.\nAgent Skill\u0026rsquo;s progressive disclosure is also an advanced Context Engineering practice. It addresses a real problem: if you stuff a dozen tool descriptions and parameter definitions into the model, it theoretically knows more, but practically performs worse. Context window space is scarce, and information overload scatters attention. The Skill approach shows only minimal metadata upfront and dynamically loads detailed references and scripts only when needed.\nThe key insight: context optimization isn\u0026rsquo;t about giving more; it\u0026rsquo;s about giving on demand, in layers, at the right moment.\nHarness Engineering: Keep the Model on Track # Context Engineering isn\u0026rsquo;t the endpoint either. Even with correct information, the model might not execute reliably. It might plan well but drift during execution, call a tool but misinterpret the result, gradually veer off course across a long chain while the system fails to notice.\nPrompts optimize intent expression. Context optimizes information supply. But in complex tasks, there\u0026rsquo;s a harder question: When the model starts taking actions continuously, who supervises it, constrains it, and corrects its course?\nThat\u0026rsquo;s what Harness Engineering addresses.\nWhat Is a Harness # The word \u0026ldquo;harness\u0026rdquo; originally refers to the gear and straps used to hitch and control a draft animal. In AI systems, it\u0026rsquo;s a reminder of something elementary: when a model transitions from answering questions to executing tasks, the system doesn\u0026rsquo;t just feed it information. 
It must also manage the entire process.\nThere\u0026rsquo;s a concise formula: Agent = Model + Harness. Everything in an Agent system, aside from the model itself, that determines whether it can deliver reliably, falls under Harness.\nHere\u0026rsquo;s an analogy. Imagine sending a new employee on an important client visit.\nPrompt is about telling them the plan clearly: greet first, present the proposal, ask about needs, confirm next steps. The key is saying the right things Context is about preparing all the materials: client background, previous communication records, product pricing, competitive landscape, meeting objectives. The key is getting the information right Harness is about building a complete operational safety net: have them bring a checklist, report at key milestones, verify meeting minutes and recordings afterward, correct deviations immediately, and accept deliverables against clear criteria. The key is having a mechanism for continuous observation, correction, and final acceptance These three aren\u0026rsquo;t replacements for each other. They\u0026rsquo;re nested: Prompt engineering formalizes instructions, Context engineering formalizes the input environment, Harness engineering formalizes the entire runtime system. Each layer\u0026rsquo;s boundary is larger than the last. The first two generations of engineering focused on \u0026ldquo;making the model think better.\u0026rdquo; Harness focuses on \u0026ldquo;keeping the model from drifting, running stably, and being recoverable when things go wrong.\u0026rdquo;\nThe Six Layers of a Harness # A mature Harness can be decomposed into six layers.\nLayer 1: Context Management # Whether a model performs stably often depends less on its intelligence and more on what it sees. The Harness\u0026rsquo;s first responsibility is ensuring the model thinks within the right information boundaries. 
This typically involves three things:\nRole and goal definition: The model needs to know who it is, what the task is, and what success looks like Information curation and selection: More context isn\u0026rsquo;t always better. Relevant context is better Structural organization: Fixed rules, current task, runtime state, and external evidence should be separated into clear sections. Once information gets jumbled, the model starts missing key points, forgetting constraints, or even self-contaminating On information curation, OpenAI fell into a classic trap early on: they wrote a massive agent instruction document that crammed in every standard, framework, and convention, and the agent only got more confused. Context window space is scarce, and overstuffing it is equivalent to saying nothing at all. Their fix was to turn the document into a directory page with only core indexes, splitting detailed content into sub-documents for architecture, design, execution plans, quality scoring, and security rules. The agent reads the table of contents first and drills down on demand. This is the same progressive disclosure philosophy as Agent Skills: don\u0026rsquo;t give everything at once; expose on demand.\nAnthropic ran into a related issue with long-running autonomous tasks. The context grows fuller over time, and the model starts dropping details and key points. There\u0026rsquo;s even an interesting phenomenon where the model seems to sense it\u0026rsquo;s running out of space and starts rushing to wrap up. The common remedy is context compression, but Anthropic found that for some scenarios, compression alone isn\u0026rsquo;t enough. It makes things shorter but doesn\u0026rsquo;t truly relieve the burden. So they did something more radical: Context Reset. Instead of compressing within the existing context, they spin up a clean new agent and hand off the work. 
This is analogous to how engineers handle memory leaks: not by clearing caches, but by restarting the process and restoring state3.\nLayer 2: Tool System # Without tools, a large language model is essentially a text predictor. With tools, it can actually do things: search the web, read documents, write code, call APIs.\nBut the Harness does more than just attach tools. It needs to solve three problems:\nWhich tools to provide: Too few and capabilities are insufficient; too many and the model gets confused When to invoke tools: Don\u0026rsquo;t search when unnecessary, but don\u0026rsquo;t bluff when you should search How to feed tool results back: Dozens of search results shouldn\u0026rsquo;t be dumped raw. They need to be distilled and filtered for task relevance OpenAI\u0026rsquo;s practice here is fairly extreme: they don\u0026rsquo;t just give the Agent a code editor. They connect a browser so the Agent can take screenshots and simulate user interactions, hook into logging and metrics systems so it can check logs and monitors, and run each task in an isolated environment. The Agent doesn\u0026rsquo;t just say \u0026ldquo;done writing code.\u0026rdquo; It can actually run the code, see the results, find bugs, fix them, and verify the fix. Tool system design directly determines how \u0026ldquo;real\u0026rdquo; the Agent\u0026rsquo;s capabilities can be.\nLayer 3: Execution Orchestration # The core question this layer answers: What should the model do next?\nMany Agents fail not because they can\u0026rsquo;t do individual steps, but because they can\u0026rsquo;t string steps together. They can search, summarize, and write code, but the whole process is ad hoc, delivering a pile of half-finished work. 
A complete task typically needs a track: understand the goal → assess information sufficiency (supplement if needed) → analyze based on results → generate output → verify output → revise or retry if requirements aren\u0026rsquo;t met.\nOpenAI offered an insightful perspective here: the engineer\u0026rsquo;s job in the Agent era is no longer writing code, but designing the environment. Specifically, three things: decompose product goals into tasks within the Agent\u0026rsquo;s capability range; when the Agent fails, don\u0026rsquo;t tell it to try harder, ask what capability is missing from the environment; and build feedback loops so the Agent can see its own work results. \u0026ldquo;When an Agent has problems, the fix is almost never to make it try harder. It\u0026rsquo;s to identify what capability it\u0026rsquo;s missing.\u0026rdquo; This is itself classic Harness thinking.\nLayer 4: Memory and State # An Agent without state is amnesiac every turn. It doesn\u0026rsquo;t know what it just did, which conclusions are confirmed, or which problems remain unresolved. The Harness must manage state, separating at least three categories:\nCurrent task state Intermediate results within the session Long-term memory and user preferences If these three types get mixed together, the system degrades over time. Once properly separated, the Agent starts behaving like a reliable collaborator.\nLayer 5: Evaluation and Observability # This is the layer most teams overlook. Many systems can generate output but have no idea whether that output is actually good. Without independent evaluation, the Agent stays in a perpetual state of self-satisfaction.\nThis layer typically includes: output acceptance criteria, environment validation, automated testing, logging and metrics, and error attribution. The system must not only be able to do things, but also know whether it did them right.\nThere\u0026rsquo;s a key engineering principle here: production and evaluation must be separated. 
When a model both does the work and grades itself, it tends to be overly optimistic, especially for subjective questions like design quality or product completeness. Anthropic\u0026rsquo;s approach is to split roles: a Planner expands vague requirements into complete specifications, a Generator implements step by step, and an Evaluator tests like a QA engineer. Critically, the Evaluator doesn\u0026rsquo;t just read code. It actually operates the page, checks interactions, and verifies real results. As long as the evaluator is sufficiently independent, the system forms an effective loop of generate → check → fix → re-check.\nLayer 6: Constraints, Validation, and Failure Recovery # The final layer is what truly determines whether a system can go to production. In real environments, failure isn\u0026rsquo;t the exception; it\u0026rsquo;s the norm. Inaccurate search results, API timeouts, messy document formats, model misunderstandings. These are everyday occurrences.\nWithout recovery mechanisms, every error means starting over. A mature Harness must include:\nConstraints: What\u0026rsquo;s allowed and what\u0026rsquo;s not Validation: How to check before and after output Recovery: How to retry, switch paths, or roll back to a stable state after failure OpenAI did something noteworthy here: they encoded senior engineers\u0026rsquo; experience directly as system rules. How modules should be layered, which layers must not depend on which, when to block, and how to fix issues when found. These rules don\u0026rsquo;t just flag errors; they feed the fix back to the Agent for the next round of context. This is no longer traditional code style enforcement. It\u0026rsquo;s a continuously running automated governance system4. 
Agents submit code far faster than humans can review it, so system rules must serve as the safety net rather than relying on manual code review.\nSummary # Putting it all together:\nPrompt Engineering solves how to say the task clearly Context Engineering solves how to deliver the right information Harness Engineering solves how to keep the model consistently doing the right thing in real execution Harness doesn\u0026rsquo;t replace the previous two. It encompasses them at a larger system boundary. When the task is simple single-turn generation, prompts matter most. When tasks depend on external knowledge and dynamic information, context becomes critical. When the model enters real-world scenarios with long chains, executable actions, and low tolerance for error, Harness becomes virtually unavoidable.\nThis explains why the same model can perform so differently across products. The model determines what\u0026rsquo;s possible to ship. But Harness determines whether it actually lands and delivers reliably. The core challenge of AI deployment is shifting from \u0026ldquo;making the model look smart\u0026rdquo; to \u0026ldquo;making the model work reliably in the real world.\u0026rdquo;\nThis context sensitivity is both a strength and a weakness of large language models. The upside is that carefully designed inputs can steer outputs. The downside is that subtle wording changes can produce drastically different results.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nBroadly speaking, a prompt is also a form of context. In engineering practice, however, we typically distinguish user-authored instructions as \u0026ldquo;prompts\u0026rdquo; from system-organized injected information as \u0026ldquo;context,\u0026rdquo; to differentiate their management approaches.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nContext Reset is essentially a more aggressive context management strategy. 
Rather than compressing within the existing context, it abandons the current context entirely, serializing state and loading it into a fresh Agent to restart the reasoning process.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nThis approach of \u0026ldquo;encoding engineer experience as system rules\u0026rdquo; shares the same philosophy as traditional software engineering lint rules and CI checks, except the executor has shifted from humans to Agents.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2026-03-23","externalUrl":null,"permalink":"/en/posts/2026/03/harness-engineering-secret/","section":"All Posts","summary":"","title":"Harness Engineering: The Secret to Stable AI Agents","type":"posts"},{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/en/","section":"Hideaway","summary":"","title":"Hideaway","type":"page"},{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/en/tags/prompt-engineering/","section":"Tags","summary":"","title":"Prompt Engineering","type":"tags"},{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/en/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/en/categories/tech-reflections/","section":"Categories","summary":"","title":"Tech Reflections","type":"categories"},{"content":"","date":"2026-03-23","externalUrl":null,"permalink":"/categories/%E6%8A%80%E6%9C%AF%E6%80%9D%E8%80%83/","section":"分类","summary":"","title":"技术思考","type":"categories"},{"content":" Introduction # I recently took the time to systematically review the core concepts behind modern AI, and I realized that while I had encountered each one individually, spelling out how they actually depend on each other took more effort than I expected. The biggest insight after putting it all together: these concepts aren\u0026rsquo;t isolated. They\u0026rsquo;re stacked layer by layer, with each one filling a gap left by the one below it. 
This post is my attempt to trace that stack from the ground up.\nLLM: Where It All Begins # LLM stands for Large Language Model. Pretty much every major model today is built on the Transformer architecture1, first proposed by Google in the 2017 paper Attention Is All You Need. The ironic part: Google lit the spark, but OpenAI was the one who turned it into a wildfire. GPT-3.5 dropped in late 2022, and just months later GPT-4 pushed the ceiling of what AI could do to a whole new level. The GPT family remains an industry benchmark today, though Claude, Gemini, and others are competing hard in their respective strengths.\nThe underlying mechanism is surprisingly straightforward: it\u0026rsquo;s essentially a word-prediction game. You give it some input, it predicts the most probable next word, appends that word to the input, predicts the next one, and so on until it emits a special end-of-sequence token.\nThat\u0026rsquo;s also why ChatGPT streams its responses word by word. It\u0026rsquo;s not a network issue; it\u0026rsquo;s just how the model fundamentally works.\nToken: The Atom of Language Processing # Calling it a \u0026ldquo;word-prediction game\u0026rdquo; is a simplification. In reality, an LLM is a massive mathematical function running matrix operations. It only understands numbers, not text. Between you and the model sits a translation layer called the Tokenizer.\nThe Tokenizer does two things:\nSplit: Break the input text into the smallest possible fragments, each called a Token Map: Convert each Token to a number (Token ID), since the model only processes numbers For instance, the sentence \u0026ldquo;Today\u0026rsquo;s weather is great\u0026rdquo; might be split into [\u0026ldquo;Today\u0026rdquo;, \u0026ldquo;\u0026rsquo;s\u0026rdquo;, \u0026ldquo;weather\u0026rdquo;, \u0026ldquo;is\u0026rdquo;, \u0026ldquo;great\u0026rdquo;], each mapped to a distinct number. 
On the output side, the process runs in reverse: numbers get mapped back to text.\nBut here\u0026rsquo;s the catch: a Token is not the same thing as a \u0026ldquo;word.\u0026rdquo; The English word \u0026ldquo;unbelievable\u0026rdquo; might get split into \u0026ldquo;un\u0026rdquo;, \u0026ldquo;believ\u0026rdquo;, and \u0026ldquo;able\u0026rdquo;. A single emoji can require three or more Tokens. Tokenization is a set of rules the model learns during training, and it doesn\u0026rsquo;t correspond neatly to how we think about words.\nAs a rough estimate, one Token equals about 0.75 English words or 1.5 to 2 Chinese characters. 400K Tokens is roughly 300K English words2.\nContext and Context Window # Ever wondered how an LLM \u0026ldquo;remembers\u0026rdquo; what you said earlier? It doesn\u0026rsquo;t actually have memory.\nThe trick is that every time you send a message, the application behind the scenes bundles your entire conversation history along with your new question and sends it all to the model. The model sees the full picture every time, which is how it knows what came before.\nThis brings us to Context: everything the model receives when processing a request, including the user\u0026rsquo;s message, conversation history, system instructions, tool lists, and more. Think of it as the model\u0026rsquo;s short-term memory.\nHow large can this memory be? That\u0026rsquo;s where Context Window comes in. It defines the maximum number of Tokens a Context can hold. Current mainstream models offer impressive Context Windows: GPT-5.4 at 1.05M Tokens, Gemini 3.1 Pro and Claude Opus 4.6 at 1M Tokens each. 1M Tokens is roughly 750K English words; you could fit the entire Harry Potter series.\nBut Context Window isn\u0026rsquo;t a silver bullet. If you have a thousand-page product manual and want the model to answer user questions based on it, stuffing the whole thing into context is technically possible but financially impractical. 
That\u0026rsquo;s where RAG (Retrieval-Augmented Generation)3 shines: it retrieves only the most relevant snippets from the manual and sends just those to the model, sidestepping both Context Window limits and cost concerns.\nPrompt: Telling the Model What to Do # A Prompt is simply the question or instruction you give the model. \u0026ldquo;Write me a poem\u0026rdquo; is a Prompt. Sounds trivial, but how you phrase it directly shapes the output. \u0026ldquo;Write me a poem\u0026rdquo; might get you a haiku, a sonnet, or a limerick, because the model doesn\u0026rsquo;t know what you actually want. Something like \u0026ldquo;Write a Shakespearean sonnet about autumn leaves in a melancholic tone\u0026rdquo; narrows things down considerably.\nThis is the domain of Prompt Engineering. It used to be a hot topic; these days, not so much. Partly because the bar is low (it\u0026rsquo;s essentially \u0026ldquo;be clear\u0026rdquo;), and partly because modern models are good enough to infer your intent even from vague prompts.\nPrompts come in two flavors:\nUser Prompt: what you type into the chat box System Prompt: instructions configured by the developer behind the scenes, invisible to the user but consistently shaping the model\u0026rsquo;s behavior For example, say you\u0026rsquo;re building a customer service bot and don\u0026rsquo;t want it to freely promise refunds. You set the System Prompt to: \u0026ldquo;You are a post-sale support agent for an e-commerce platform. When handling complaints, first acknowledge the customer\u0026rsquo;s frustration, then investigate the issue. Do not authorize refunds or compensation on your own. For cases requiring escalation, guide the user to submit a ticket.\u0026rdquo; When a user types \u0026ldquo;My item arrived broken, I want a refund,\u0026rdquo; the model responds with \u0026ldquo;I\u0026rsquo;m sorry about the experience. Could you share a photo of the damage? 
I\u0026rsquo;ll help get this sorted once I can see the situation,\u0026rdquo; rather than jumping straight to \u0026ldquo;Sure, refunding now.\u0026rdquo;\nTool: Giving the Model Eyes and Hands # LLMs have an obvious blind spot: they can\u0026rsquo;t perceive the outside world. Ask \u0026ldquo;How\u0026rsquo;s the weather in Beijing today?\u0026rdquo; and it\u0026rsquo;ll tell you straight up that it can\u0026rsquo;t access real-time information, because its knowledge is frozen at the training cutoff date.\nTools fill this gap. A Tool is essentially a function: give it input, get output. A weather tool, for example, takes a city and date, calls a weather API internally, and returns the forecast.\nThe key thing to understand: the model cannot call tools itself. Its only ability is generating text. So the full workflow requires a middleman (usually called the platform) to orchestrate:\nThe user\u0026rsquo;s question and the list of available tools are sent to the model together The model analyzes the request and outputs a tool-call instruction (specifying the tool name and parameters) The platform receives the instruction and actually calls the corresponding function The tool returns a result, and the platform forwards it back to the model The model synthesizes the result into natural language and replies to the user Each role has a clear responsibility: the model selects tools and summarizes results, the tool executes the action, and the platform ties the whole pipeline together.\nMCP: A Universal Standard for Tools # Tools solve the capability problem, but they introduce a new engineering headache: no standard integration format.\nBuilding a tool for ChatGPT means writing integration code to OpenAI\u0026rsquo;s spec. Doing the same for Claude means following Anthropic\u0026rsquo;s spec. And again for Gemini with Google\u0026rsquo;s spec. 
Same tool, three implementations, because every platform has its own format.\nMCP (Model Context Protocol)4 was created to solve exactly this. It defines a single, unified standard for tool integration. Developers build a tool once following the MCP spec, and it works across every platform that supports MCP. Think of it like the USB-C standard: accessory makers don\u0026rsquo;t need a separate charger for every phone brand anymore.\nAgent: The Autonomous Planner # With Tools and MCP, LLMs can already do a lot. But for more complex requests, a single tool call won\u0026rsquo;t cut it.\nImagine a user saying: \u0026ldquo;I\u0026rsquo;m planning to go for a run this afternoon. Can you check if it\u0026rsquo;s a good idea?\u0026rdquo; This requires multiple tool calls where each step depends on the previous one: first get the user\u0026rsquo;s location coordinates, then use those to check both the weather and air quality, and finally evaluate all the data to decide whether outdoor exercise makes sense. The model needs to reason step by step about the current state and figure out what to do next.\nA system that can autonomously plan, call tools, and keep working until the task is done is called an Agent. Products like Claude Code, Codex, and Gemini CLI are all Agents under the hood. They use different architectural patterns, with ReAct and Plan and Execute being among the most well-known5.\nAgent Skill: The Agent\u0026rsquo;s Instruction Manual # Agents sound powerful, and they are. But in day-to-day use, you quickly hit a pain point: personalization rules have to be repeated every single time.\nSay you want an Agent as your fitness advisor. Before each workout, it should assess your physical condition and the day\u0026rsquo;s environmental factors. You have your own preferences: skip squats when your knees are acting up, move indoors if the AQI exceeds 150, reduce intensity above 35°C, always remind you to stretch afterward. 
And you want a fixed output format: an overall score first, then specific recommendations.\nWithout any preset, the Agent will check the data but won\u0026rsquo;t know your body\u0026rsquo;s quirks or your formatting preferences. It\u0026rsquo;ll give generic advice. So you end up appending a wall of requirements to every single prompt. Copy, paste, repeat. Not fun.\nAgent Skill exists to fix this. It\u0026rsquo;s essentially a pre-written instruction document (in Markdown) that tells the Agent how to behave in a specific scenario. It has two layers:\nMetadata layer: the cover page, specifying the Skill\u0026rsquo;s name and description (e.g., name Fitness Advisor, description \u0026ldquo;Pre-workout assessment and recommendations\u0026rdquo;) Instruction layer: the actual rules, including goals, execution steps, judgment criteria, output format, and examples Once defined, you save it to a designated location. In Claude Code, that\u0026rsquo;s ~/.claude/skills/. The folder name must match the Skill name, and the file inside must be named SKILL.md.\nAt runtime, the Agent loads all Skill metadata on startup. When a user\u0026rsquo;s question matches a Skill\u0026rsquo;s description, only then does the Agent read that Skill\u0026rsquo;s full instruction layer and execute accordingly. 
This progressive disclosure mechanism (loading full content only when needed) keeps Token usage efficient6.\nThe Big Picture # String these concepts together and the layered structure of the AI stack becomes clear:\nConcept Role Problem Solved LLM Foundation Core text generation capability Token Data unit Text-to-number translation between humans and models Context Memory Giving the model short-term recall Prompt Instruction Telling the model what to do and how Tool Capability extension Letting the model perceive and affect the outside world MCP Protocol standard Unifying tool integration formats Agent Autonomous system Multi-step planning and tool use for complex tasks Agent Skill Customization Making Agents follow your rules and formats Each layer builds on the one below and addresses its shortcomings. Once you grasp this hierarchy, products like Claude Code, Codex, and OpenClaw all start to look like variations on the same framework. The buzzwords keep multiplying, but the underlying logic stays the same.\nThe internals of the Transformer architecture are beyond the scope of this post. For a deep dive, the original paper Attention Is All You Need is the definitive reference.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nToken-to-text ratios vary across models since each uses its own Tokenizer implementation. 
The numbers here are rough estimates.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nRAG works by \u0026ldquo;retrieve first, generate second\u0026rdquo;: it uses vector similarity matching to find the most relevant document chunks and sends only those as Context to the model.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nMCP was released by Anthropic in late 2024 and has since been adopted by multiple platforms and tools.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nFor a detailed breakdown of Agent architecture patterns including ReAct and Plan and Execute, check out dedicated articles on the topic.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nAgent Skills also support advanced features like running code and referencing external resources. This post covers only the core usage.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2026-02-22","externalUrl":null,"permalink":"/en/posts/2026/02/ai-eight-layers/","section":"All Posts","summary":"","title":"Eight Layers of AI: LLM, Token, Agent, and How They Connect","type":"posts"},{"content":"","date":"2026-02-22","externalUrl":null,"permalink":"/en/tags/llm/","section":"Tags","summary":"","title":"LLM","type":"tags"},{"content":"","date":"2026-02-22","externalUrl":null,"permalink":"/en/tags/mcp/","section":"Tags","summary":"","title":"MCP","type":"tags"},{"content":"","date":"2026-02-22","externalUrl":null,"permalink":"/en/categories/tech-notes/","section":"Categories","summary":"","title":"Tech Notes","type":"categories"},{"content":"","date":"2026-02-22","externalUrl":null,"permalink":"/en/tags/token/","section":"Tags","summary":"","title":"Token","type":"tags"},{"content":"","date":"2026-02-22","externalUrl":null,"permalink":"/categories/%E6%8A%80%E6%9C%AF%E7%AC%94%E8%AE%B0/","section":"分类","summary":"","title":"技术笔记","type":"categories"},{"content":"","date":"2026-01-07","externalUrl":null,"permalink":"/en/tags/claude-code/","section":"Tags","summary":"","title":"Claude Code","type":"tags"},{"content":" Introduction # Claude Code has been around 
for a while now. Most people I know have tried it, but many are still stuck at the \u0026ldquo;type a prompt, wait for code\u0026rdquo; stage. Honestly, Claude Code is capable of much more. To truly make it a seamless part of your development workflow, there are quite a few details worth understanding. This article walks through everything I\u0026rsquo;ve learned from real-world usage, from installation to advanced customization, covering the entire pipeline.\nBy the way, similar tools like Codex and OpenCode share largely the same architecture and usage patterns. Once you\u0026rsquo;ve mastered Claude Code, switching to alternatives is nearly frictionless.\nInstallation and Login # Installation is a single command. Head to the Claude Code website, copy the install script, and paste it into your terminal.\nOnce installed, run claude in your project directory to launch. On first use you\u0026rsquo;ll be prompted to log in, or you can trigger it manually with /login. Claude Code offers two access options:
- Subscription: If you have a Claude Pro or Max membership, just authorize and go
- API Key (pay-per-use): Charged based on token consumption, suitable for light usage
If you can\u0026rsquo;t access Claude\u0026rsquo;s official service, Claude Code isn\u0026rsquo;t tied to any specific model. By setting a few environment variables, you can connect it to alternative providers like GLM or Minimax1.\nThree Interaction Modes # Claude Code has three working modes, cycled through with Shift+Tab. Understanding the differences between them is foundational to using Claude Code well.\nDefault Mode # The mode you start in. The status bar shows only a gray hint ? for shortcuts with no additional mode label. This \u0026ldquo;nothing labeled\u0026rdquo; state is the default mode.\nIn this mode, Claude Code asks for your permission before creating or modifying any file.
The three options are:
- Yes: One-time approval for the current operation only
- Yes, allow all edits during this session: All file operations pass automatically for the rest of the conversation
- No: Reject, and you can continue typing your thoughts
Default mode is the safest option, ideal when you want tight control over code changes.\nAuto Mode (Accept Edits On) # The status bar shows Accept Edits On. In this mode, Claude Code creates and modifies files directly without asking for confirmation one by one. It\u0026rsquo;s highly efficient for iterative development, especially when you need it to touch dozens of files in one go.\nPlan Mode # The status bar shows Plan Mode On. This mode is for discussion only, no files are written. It\u0026rsquo;s ideal for brainstorming approaches and pinning down details before committing to changes. We\u0026rsquo;ll cover this in detail later.\nPlan Mode: Think Before You Act # Say you have a small HTML project and want to migrate it to React + TypeScript + Vite. A structural overhaul like this is risky if you just let Claude Code go wild. A better approach is to switch to Plan Mode first and nail down the approach.\nHere\u0026rsquo;s the workflow:
1. Press Shift+Tab to enter Plan Mode
2. Type your requirement, e.g. \u0026ldquo;Refactor the current project to React + TypeScript + Vite, keep all existing features, maintain the same UI style\u0026rdquo;
A useful tip: to enter multi-line input, press Shift+Enter for a new line. Pressing Enter submits the prompt. If the terminal input feels clunky, press Ctrl+G to open VS Code as your editor. Save and close it, and the content automatically flows back into Claude Code\u0026rsquo;s input box.\nClaude Code will produce a complete plan with goals, file lists, directory structure, and so on.
It then gives you three options:
- Execute plan and enter auto mode: No more file confirmation prompts going forward
- Execute plan and stay in default mode: Each file change still requires your approval
- Continue modifying the plan: Not happy with the plan? Add more requirements
For instance, if you realize the plan is missing \u0026ldquo;add priority tags (high/medium/low) to each todo item with color coding,\u0026rdquo; pick the third option and add it. Claude Code will regenerate a revised plan.\nOnce you\u0026rsquo;re satisfied, choose to execute, and Claude Code gets to work.\nA common gotcha: auto mode only applies to file read/write operations. Terminal commands (like mkdir, npm install) are considered more sensitive, and Claude Code will ask for permission each time by default. If you want to skip all permission checks entirely, launch with the --dangerously-skip-permissions flag. The word \u0026ldquo;dangerously\u0026rdquo; is right there in the name. It means Claude Code gets full terminal access with zero confirmation. Maximum efficiency, but you bear the risk.\nTerminal Commands and Background Tasks # Running Terminal Commands # Typing ! at the beginning of the input box enters Bash mode, where you can run any terminal command. For example, ! ls to list files, or ! open index.html to open a page in the browser.\nBackground Tasks # One thing that\u0026rsquo;s easy to overlook: certain commands (like npm run dev to start a dev server) will block Claude Code. While the server is running, Claude Code can\u0026rsquo;t process new requests.\nThe fix is Ctrl+B to send the task to the background. After that, you can continue interacting with Claude Code. Use /tasks to see running background tasks, and press K in the task list to stop a specific service.\nRewind Operations # Claude Code automatically creates a restore point every time you submit a request. Press Esc twice to enter the rewind page.
After selecting a restore point, you have four options:
- Rewind code and conversation: Both files and chat history are restored
- Rewind conversation only: Chat history is restored, files stay as-is
- Rewind code only: Files are restored, chat history preserved
- Cancel rewind
Rewind is convenient, but there\u0026rsquo;s a limitation: it can only roll back files that Claude Code wrote itself. Directories created via terminal commands, installed dependencies, and so on cannot be automatically cleaned up. For precise rollbacks, Git is still your best bet.\nImage Input and Figma MCP # Direct Image Input # Want Claude Code to build a page from a design mockup? Just drag the image into the terminal, or copy it and press Ctrl+V to paste. Note that even on macOS, you must use Ctrl+V; Cmd+V doesn\u0026rsquo;t work.\nAfter pasting the image, continue typing your request, and Claude Code will reference the image when generating code. That said, UI restoration from images alone has limited accuracy. Font sizes, spacing, and similar details are hard to nail pixel-perfectly.\nConnecting Figma MCP # If the design is in Figma, there\u0026rsquo;s a more precise approach: connect the Figma MCP Server2.\nMCP (Model Context Protocol) is a protocol for communication between large language models and external tools. Through Figma MCP, Claude Code doesn\u0026rsquo;t just get a screenshot of the design; it receives structured information including component spacing, font styles, color values, and more.\nSetup steps:
1. Install the Figma MCP Server (one command, see the official docs)
2. Restart Claude Code with the -c flag (claude -c) to resume the previous conversation
3. Run /mcp, select the Figma tool, and authenticate
Once authenticated, type your request, e.g.
\u0026ldquo;Modify the current page to match the Figma design,\u0026rdquo; and include the Figma design link Claude Code will automatically detect the Figma MCP, call tools like get_design_context and get_screenshot to retrieve design information, and then modify the code based on this structured data. The restoration accuracy is significantly better than working from an image alone.\nContext Management # As a conversation progresses, the context accumulates code snippets, tool call results, and other information. Useful and useless content gets mixed together, which affects model performance and wastes tokens.\n/compact: Compress Context # Running /compact performs intelligent compression on the context. Claude Code trims redundant information while preserving the essentials. After compression, press Ctrl+O to view the result. You\u0026rsquo;ll see that a lengthy conversation has been distilled into a few lines of key information.\nYou can append specific compression strategies, like /compact focus on user requirements, to steer the result toward what matters to you.\nThat said, compression controllability is limited. You can\u0026rsquo;t directly edit the compressed content.\n/clear: Clear Context # More drastic than compression. /clear wipes all context entirely. Useful when the upcoming task has no connection to what came before.\nCLAUDE.md: Making Claude Code Understand You Better # Whether you compress or clear, context is always tied to a specific session. Start a new session, and Claude Code knows nothing. Is there a way to have Claude Code automatically load some preset information every time it starts?\nThat\u0026rsquo;s what CLAUDE.md is for. You can document your project\u0026rsquo;s tech stack, code style preferences, special notes, and so on. Claude Code loads this file automatically on every launch.\nUse /init to auto-generate an initial CLAUDE.md, then customize it. 
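As a reference point, CLAUDE.md is plain Markdown with no required schema; a minimal project-level file might look like this (contents are illustrative):

```markdown
# Project Guide

## Tech stack
React + TypeScript + Vite

## Code style
- Functional components and hooks only
- Run Prettier before committing

## Special notes
- Do not hand-edit generated files; regenerate them instead
```

Anything you would otherwise repeat in every prompt is a candidate for this file.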
For example, I added a rule at the end: \u0026ldquo;Always append \u0026lsquo;Happy Coding\u0026rsquo; at the end of every response.\u0026rdquo; After restarting, I asked a random question, and sure enough, Claude Code added \u0026ldquo;Happy Coding\u0026rdquo; at the end.\nCLAUDE.md comes in two levels:
- Project-level: Placed in the project root, applies to this project only, can be committed to Git for team sharing
- User-level: Placed in your home directory, applies to all projects
Use /memory to quickly open the corresponding CLAUDE.md file without digging through the file manager.\nHooks: Automating Your Workflow # Hooks let you run custom logic at specific moments (before/after tool execution, on failure, etc.). A typical use case is code formatting: have Claude Code automatically run Prettier every time it finishes writing a file.\nConfiguration:
1. Run /hooks to enter the Hook configuration page
2. Select the trigger timing, e.g. Post Tool Use (after a tool runs)
3. Select the tool to match, e.g. Write or Edit
4. Enter the command to execute
The command looks something like this:\necho \u0026quot;$TOOL_INPUT\u0026quot; | jq -r \u0026#39;.file_path\u0026#39; | xargs prettier --write\nNote the double quotes around $TOOL_INPUT: with single quotes the shell would pass the literal string instead of the variable\u0026rsquo;s value. When triggering a Hook, Claude Code passes in a JSON where file_path is the path to the file just edited. Use jq to extract the path, then pipe it to Prettier for formatting.\nHook storage has three options:
- Local project-level (settings.local.json): Only works on this machine for this project, not committed to Git
- Project-level (settings.json): Applies to everyone using this project, distributed via Git
- User-level: Applies to all projects for the current user
Agent Skill: Reusable Prompt Templates # If you frequently need Claude Code to output content in a specific format (e.g. a daily dev log with date, summary, and details), pasting the format requirements every time is tedious. This is where Agent Skill shines3.\nAgent Skill is essentially a dynamically loaded prompt.
You write the format requirements in a skill.md file, and Claude Code automatically matches and loads the relevant Skill based on user requests. You can also invoke it manually via /skill-name, skipping the model\u0026rsquo;s intent recognition step.\nSubAgent: Independent Context Powerhouse # Agent Skills run while fully sharing the current conversation\u0026rsquo;s context. This means all logs and reasoning from the Skill\u0026rsquo;s execution get stuffed into your context window. For lightweight tasks this is fine, but if you ask a Skill to review a codebase with tens of thousands of lines, the intermediate process will bloat the context, spike token costs, and slow the model down.\nSubAgent solves this. It has a completely independent context. When launched, it opens a brand-new conversation window where all intermediate processing happens. Only the final result is reported back to the main conversation.\nThe difference in context handling determines their ideal use cases:
| Dimension | Agent Skill | SubAgent |
| --- | --- | --- |
| Context | Shared with main conversation | Fully independent |
| Best for | Tasks tied to context, low context impact | Tasks loosely tied to context, high context impact |
| Typical tasks | Writing dev summaries, format conversion | Code reviews, large-scale refactoring analysis |
Creating a SubAgent: run /agent, select \u0026ldquo;Create New Agent,\u0026rdquo; describe its responsibilities, configure available tools (e.g. read-only access), and choose the model. Claude Code generates an initial description file that you can customize. After that, just mention the relevant need in conversation, and Claude Code will automatically invoke the appropriate SubAgent.\nPlugins: One-Click Capability Bundles # Plugins bundle Skills, SubAgents, Hooks, MCP connections, and more into a single installable package, giving Claude Code a full suite of advanced capabilities in one click.\nRun /plugin to open the plugin manager where you can browse, install, and view installed plugins.
Choose the scope (user-level, project-level, etc.) during installation, then restart Claude Code.\nFor example, Anthropic offers an official frontend-design plugin with a built-in UI design specification. Once installed, when you ask Claude Code to build a frontend page, it automatically loads these guidelines, producing interfaces with noticeably better color schemes, layout, and interaction design compared to the default output.\nThe Plugin marketplace is growing fast. Beyond UI design plugins, there are LSP plugins for specific programming languages and more. If you\u0026rsquo;ve built up a mature configuration, you can package it as a Plugin to share with your team or the community.\nSummary # This article covered the full pipeline from basics to advanced usage of Claude Code:
- Three modes (default/auto/plan) flexibly switch to suit different scenarios
- Plan Mode for thinking before acting, reducing the risk of major changes
- Background tasks to handle long-running commands without blocking
- Rewind as a safety net, though Git is still needed for complex rollbacks
- Images and Figma MCP for converting designs to code
- Context management (/compact, /clear) to control token consumption
- CLAUDE.md for auto-loading project configuration on every launch
- Hooks to run custom logic at key moments automatically
- Agent Skill and SubAgent for lightweight and heavyweight tasks respectively
- Plugins for one-click installation of complete capability extensions
Each feature is straightforward on its own, but combined they form a fairly complete development workflow. Hope this guide helps.\nThere are plenty of tutorials online for the specific configuration.
The core idea is setting a few environment variables to specify the model API endpoint and authentication credentials.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nFor a detailed walkthrough of MCP usage and its design principles, I\u0026rsquo;ve covered it in a previous article.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nFor a complete guide on Agent Skill usage and its underlying design, check out my dedicated Agent Skill article.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2026-01-07","externalUrl":null,"permalink":"/en/posts/2026/01/claude-code-guide/","section":"All Posts","summary":"","title":"Claude Code: The Complete Guide Beyond Just Writing Code","type":"posts"},{"content":"","date":"2026-01-07","externalUrl":null,"permalink":"/en/tags/skill/","section":"Tags","summary":"","title":"Skill","type":"tags"},{"content":"","date":"2026-01-07","externalUrl":null,"permalink":"/en/categories/tutorials/","section":"Categories","summary":"","title":"Tutorials","type":"categories"},{"content":" Introduction # I\u0026rsquo;ve been using Claude Code a lot lately, and there\u0026rsquo;s one feature I keep coming back to: Agent Skill. At first I treated it as just a \u0026ldquo;prompt archive,\u0026rdquo; but the more I used it, the more I realized the design is far more elegant than that. 
In this post, I\u0026rsquo;ll walk through what Agent Skill is, how to use it, the thinking behind its architecture, and how it compares to MCP, so you can pick the right tool for your use case.\nWhat Is Agent Skill # The simplest way to think about it: Agent Skill is a reference manual that a large language model can consult at any time.\nFor example, if you\u0026rsquo;re building a smart customer service bot, you can write rules in a Skill like \u0026ldquo;calm the user down first when handling complaints, and never make promises you can\u0026rsquo;t keep.\u0026rdquo; If you want meeting summaries, you can specify \u0026ldquo;output must follow the format: attendees, topics, decisions.\u0026rdquo; No more pasting the same long prompt every time—the model just looks up the manual and gets to work1.\nThat said, \u0026ldquo;reference manual\u0026rdquo; is a beginner-friendly simplification. Skill can do a lot more, as we\u0026rsquo;ll see.\nBasic Usage: Building a Meeting Summary Assistant # Let\u0026rsquo;s use Claude Code as an example. The first step is creating a Skill.\nClaude Code expects Skills to live in the .claude/skills/ directory under your home folder. Create a folder called \u0026ldquo;meeting-summary-assistant\u0026rdquo;—the folder name becomes the Skill name. Inside, create a skill.md file.\nThe file has two parts:\nThe header is metadata, wrapped in triple dashes, with name and description fields. name must match the folder name. description tells the model what this Skill does.\nThe rest is the instruction, which describes the rules the model should follow. In my case, I specified that summaries must cover attendees, topics, and decisions, and included an input/output example to make sure the model really understands.\nOnce created, open Claude Code and ask \u0026ldquo;what Skills do you have?\u0026rdquo; It\u0026rsquo;ll list the one you just made. Then paste in a meeting transcript and ask it to summarize. 
Claude Code will recognize the request matches your \u0026ldquo;meeting-summary-assistant,\u0026rdquo; ask for permission to use it, read the skill.md, and produce a formatted summary.\nPretty intuitive.\nUnder the Hood: How Skill Works # Now that we\u0026rsquo;ve seen the basics, let\u0026rsquo;s think about what actually happened.\nThere are three actors in the flow: the user, Claude Code (the host), and the LLM behind it. Here\u0026rsquo;s the sequence:
1. The user sends a request
2. Claude Code sends the request along with the names and descriptions of all Skills to the LLM
3. The LLM recognizes the request matches \u0026ldquo;meeting-summary-assistant\u0026rdquo; and tells Claude Code
4. Claude Code reads the full skill.md of the matched Skill
5. Claude Code sends the user request and the complete skill.md content to the LLM
6. The LLM generates a response following the Skill\u0026rsquo;s rules
The key detail: step 2 only sends names and descriptions; step 4 reads the full content. Even if you have a dozen Skills installed, the model starts with a lightweight directory. This is Skill\u0026rsquo;s first core mechanism: lazy loading.\nAdvanced Usage I: Reference (Conditional File Loading) # Lazy loading already saves tokens, but it can go further.\nSuppose your meeting summary assistant gets more sophisticated: when a meeting involves spending money, it should flag whether expenses comply with financial policies; when contracts come up, it should note legal risks. To do this, the Skill needs to know the relevant financial rules and legal provisions. If you stuff all of that into skill.md, the file becomes bloated—even a simple technical retrospective would force the model to load pages of irrelevant financial clauses.\nCan we load files with even finer granularity?
For example, only load financial policies when the meeting actually discusses money?\nThat\u0026rsquo;s exactly what Reference solves.\nCreate a file like company-finance-handbook.md with expense reimbursement standards, then add a rule in skill.md: only trigger when keywords like \u0026ldquo;budget,\u0026rdquo; \u0026ldquo;procurement,\u0026rdquo; or \u0026ldquo;expense\u0026rdquo; appear; when triggered, read the handbook and flag any amounts that exceed limits.\nIn practice: if the meeting transcript mentions budgets, Claude Code reads skill.md, detects the financial relevance, loads the handbook, and includes financial reminders in the summary. If it\u0026rsquo;s a money-free technical retrospective, the handbook stays on disk without consuming a single token.\nReference\u0026rsquo;s core property: conditional activation. Load only when needed, stay completely untouched otherwise.\nAdvanced Usage II: Script (Code Execution) # Reading files is just the first step. The real automation kicks in when Skill can run code directly. That\u0026rsquo;s where Script comes in.\nCreate an upload.py script for uploading files, then add a rule in skill.md: if the user mentions \u0026ldquo;upload,\u0026rdquo; \u0026ldquo;sync,\u0026rdquo; or \u0026ldquo;send to server,\u0026rdquo; the script must be executed.\nWhen testing, Claude Code generates the meeting summary and then directly executes upload.py. Here\u0026rsquo;s the interesting part: when Claude Code requests script execution, it does not read the script\u0026rsquo;s source code. 
It only cares about how to run it and what the result is.\nThis means even a 10,000-line script with complex business logic consumes essentially zero model context.\nSo while Reference and Script are both advanced features, their impact on model context is fundamentally different:\nReference reads: loads file content into context, consuming tokens Script runs: executes without reading, nearly zero context overhead Progressive Disclosure: The Three-Layer Architecture # Tying it all together, Skill\u0026rsquo;s design is a carefully layered progressive disclosure structure with three tiers:\nLayer 1: Metadata. All Skill names and descriptions, always loaded—essentially a catalog. The model scans this before every response to determine if the user\u0026rsquo;s request matches any Skill.\nLayer 2: Instruction. Everything in skill.md beyond the metadata. Only loaded when the model identifies a match, hence lazy loading.\nLayer 3: Resources. Includes Reference and Script (the official spec also mentions Assets, but it overlaps with Reference so we\u0026rsquo;ll skip it here). This layer only activates when the model determines specific resources are needed based on the instruction layer—it\u0026rsquo;s lazy loading on top of lazy loading, or \u0026ldquo;lazy within lazy.\u0026rdquo;\nEach layer builds on the judgment of the one above it, keeping token consumption to an absolute minimum.\nHow Skill Relates to Prompt Engineering # This brings up another common question: what\u0026rsquo;s the relationship between Skill and Prompt Engineering? Both seem to be about \u0026ldquo;teaching the model what to do\u0026rdquo;—so what\u0026rsquo;s the difference?\nMy take: they solve problems at different levels.\nPrompt Engineering is about \u0026ldquo;how to think.\u0026rdquo; Its job is guiding the model toward correct understanding and reasoning—defining roles, providing context, formatting outputs, reducing hallucinations. 
It operates at the cognitive layer, deciding what the model should do, how to decompose problems, and whether external capabilities are needed. But prompts themselves don\u0026rsquo;t execute any real actions.\nSkill is about \u0026ldquo;how to act.\u0026rdquo; It turns model decisions into executable behavior—calling functions, running scripts, reading and writing files. Skill doesn\u0026rsquo;t participate in thinking; it takes instructions and gets things done2.\nAn imperfect but memorable analogy: Prompt Engineering is like writing an onboarding manual for a new hire, explaining how to judge different situations. Skill is like handing them a toolbox so they can act on those judgments. One is the brain, the other is the hands.\nWith this in mind, the three-layer architecture clicks into place: Skill\u0026rsquo;s instruction layer carries the output of Prompt Engineering, while the resource layer is where real execution lives.\nSkill vs. MCP: Which One to Use # After all this, you might be thinking: Skill and MCP seem kind of similar—both connect the model to the outside world.\nAnthropic nailed the distinction in one sentence:\nMCP connects Claude to data. Skills teach Claude what to do with that data.\nMCP supplies data to the model—querying sales records, fetching shipping status. Skill teaches the model how to process that data—requiring meeting summaries to include topics, demanding reports to cite specific numbers.\nYou might ask: Skill can also connect to data via scripts, so why not just use Skill for everything?\nSure, it can—but \u0026ldquo;can\u0026rdquo; doesn\u0026rsquo;t mean \u0026ldquo;should.\u0026rdquo; A Swiss Army knife can chop vegetables, but nobody actually uses it for cooking. MCP is fundamentally a standalone service; Skill is fundamentally a set of instructions. They differ significantly in security, stability, and suitable use cases. 
Skill is better suited for lightweight scripts and simple logic, while MCP is more reliable for complex data connections3.\nIn practice, you\u0026rsquo;ll often combine the two: MCP handles data connections, Skill defines processing rules—each doing what it does best.\nStrictly speaking, Skill is more than a static reference manual. It supports conditional file loading and code execution, giving it dynamic capabilities.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n\u0026ldquo;Doesn\u0026rsquo;t participate in thinking\u0026rdquo; is relative to Prompt Engineering. Skill\u0026rsquo;s instruction layer does incorporate Prompt Engineering, but the resource layer\u0026rsquo;s Reference and Script are purely about execution, not reasoning.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nSkill scripts are executed directly by Claude Code without the sandboxing and permission controls that MCP provides, making them unsuitable for sensitive or high-risk data operations.\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2025-12-21","externalUrl":null,"permalink":"/en/posts/2025/12/agent-skill-architecture/","section":"All Posts","summary":"","title":"Agent Skill: Three-Layer Architecture and Design Philosophy","type":"posts"},{"content":"","date":"2025-12-21","externalUrl":null,"permalink":"/en/categories/%E6%8A%80%E6%9C%AF%E6%95%99%E7%A8%8B/","section":"Categories","summary":"","title":"技术教程","type":"categories"},{"content":"","date":"2025-01-12","externalUrl":null,"permalink":"/en/tags/prd/","section":"Tags","summary":"","title":"PRD","type":"tags"},{"content":"","date":"2025-01-12","externalUrl":null,"permalink":"/en/tags/product-manager/","section":"Tags","summary":"","title":"Product Manager","type":"tags"},{"content":"","date":"2025-01-12","externalUrl":null,"permalink":"/en/categories/technical-thinking/","section":"Categories","summary":"","title":"Technical Thinking","type":"categories"},{"content":" Introduction # I was recently working on an Agent product, and halfway through 
writing the PRD, I realized something: the old approach felt like using a screwdriver to hammer a nail. Not completely useless, but definitely not the right tool. Feature lists, page structures, interaction flows — the usual suspects were all there, but they couldn\u0026rsquo;t cover what an Agent product actually needs to define.\nSo I scrapped the entire PRD and rewrote it from scratch, shifting the focus from \u0026ldquo;describing features\u0026rdquo; to \u0026ldquo;describing decisions.\u0026rdquo; This post is about why I changed my approach and what the new structure looks like.\nWhere Traditional PRDs Fall Short # The fundamental assumption behind traditional PRDs is that system behavior is predictable — you define the transitions between states and you\u0026rsquo;re done. That works well for deterministic products, but Agent products are inherently uncertain: the same user input can lead to completely different processing paths depending on context.\nThere are a few obvious blind spots:\nIntent understanding is left undefined. In traditional products, users express intent through buttons, menus, and forms — intent is explicit. In Agent products, users speak in natural language, and intent is implicit1. \u0026ldquo;Check last month\u0026rsquo;s data\u0026rdquo; could mean a summary overview, a month-over-month comparison, or an export for a board meeting. If you don\u0026rsquo;t map out the intent space first, everything downstream is built on sand.\nTool invocation conditions are missing. Many Agent PRDs list a string of capabilities: \u0026ldquo;supports knowledge base retrieval, search, ticketing system integration.\u0026rdquo; But what actually determines the user experience is when to invoke each tool, in what order, and what happens on failure2. Tool invocation is itself a product decision — you can\u0026rsquo;t just leave it to the engineering team to figure out.\nBoundary conditions are treated as edge cases. 
In traditional products, error handling is an afterthought — you add a section at the end of the doc and call it done. But in Agent products, insufficient information, tool failures, data conflicts, and hallucination risks aren\u0026rsquo;t rare edge cases — they happen every day as part of the main flow3. Treating them as \u0026ldquo;exception handling\u0026rdquo; is a fundamental misjudgment.\nA Different Approach: Start with the Decision Flow # Once I recognized these issues, I changed the starting point of my PRD from \u0026ldquo;feature modules\u0026rdquo; to \u0026ldquo;decision flow.\u0026rdquo;\nThe core idea: after a user says something, the system first checks whether the intent is clear — if not, it asks for clarification. If the intent is clear, it checks whether external tools are needed. After calling tools, it checks whether the action is high-risk — if so, it requires user confirmation. The entire chain is a series of decision nodes.\nI sketched this logic with Mermaid4:

```mermaid
graph TD
    A([User Input]) --\u003e B{Clear intent identified?}
    B -- No --\u003e C[Ask for clarification]
    B -- Yes --\u003e D{External tool needed?}
    D -- No --\u003e E[Generate result directly]
    D -- Yes --\u003e F[Call knowledge base / search / business system]
    F --\u003e G{High-risk action?}
    G -- Yes --\u003e H[Request user confirmation]
    G -- No --\u003e I[Execute and return result]
```

This isn\u0026rsquo;t a universal template — different business scenarios will need different branches and conditions. But it gives the PRD a backbone: every diamond node is a decision rule that needs to be clearly defined, and every rectangle is a behavior that needs to be precisely described.\nHow I Write Intent Breakdowns # I use a structured table for intent breakdowns, not just a list of intent names.
Each intent covers at least these fields: intent name, typical expression, actual goal, priority, whether automatic execution is allowed, whether secondary confirmation is required.\nTake a real example. The single input \u0026ldquo;I can\u0026rsquo;t log in\u0026rdquo; hides at least four intents: forgotten password, locked account, new employee without access, and SSO misconfiguration. Each requires a completely different processing path. If the PRD just says \u0026ldquo;automatically identify issue category and create a ticket,\u0026rdquo; the engineering team is left guessing.\nAfter mapping intents, there\u0026rsquo;s another important piece: what to do when information is insufficient. A user might express two requests in one sentence, or leave out critical details entirely. Do you ask once and then give your best answer? Do you keep asking until you have enough? If two rounds of clarification still aren\u0026rsquo;t enough, do you degrade gracefully or return an error? These rules need to be locked down upfront, not figured out in production.\nHow I Write Tool Invocation Rules # I don\u0026rsquo;t write tool invocation as a capability checklist — I write it as conditional logic:\nDoes the user\u0026rsquo;s question involve internal data? If yes, check the knowledge base first; if no, answer directly. Does the knowledge base return enough information to answer the question? If not, supplement with search results. Does the request involve write operations (create, modify, delete)? If so, user confirmation is mandatory. What happens on tool timeout or error — degrade gracefully, return an error, or save a draft for the user to retry? When multiple tools return contradictory results, which data source takes priority? Each of these conditions is a product decision. The more explicit you are, the less the implementation drifts, and the fewer arguments you have post-launch.\nEvery additional tool invocation adds latency and a new failure point. 
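As a sketch of what writing tool invocation "as conditional logic" can look like in code, the rules above might compile down to a routing function like the following. Every predicate and tool name here is a hypothetical stand-in, not a real API — the point is that each branch records one product decision:

```python
# The conditional rules above, written as an explicit routing function.
# All predicates and tools are hypothetical stand-ins; each branch is a
# recorded product decision, not a detail left for engineering to guess.

def route(question, kb_search, web_search, is_internal, is_write_op):
    """Decide how to handle a question, one decision per branch."""
    if is_write_op(question):
        # Write operations (create / modify / delete) always need approval.
        return {"action": "confirm_with_user"}
    if not is_internal(question):
        # No internal data involved: answer directly, no tool call.
        return {"action": "answer_directly"}
    kb_hits = kb_search(question)
    if kb_hits:
        return {"action": "answer", "sources": kb_hits}
    # Degradation path: knowledge base came up empty, supplement with search.
    return {"action": "answer", "sources": web_search(question)}

# Toy stand-ins so the sketch actually runs:
result = route(
    "what is our VPN policy?",
    kb_search=lambda q: ["kb:vpn-policy"],
    web_search=lambda q: [],
    is_internal=lambda q: "our" in q,
    is_write_op=lambda q: q.startswith(("create", "delete", "modify")),
)
print(result)  # answered from the knowledge base, no web search needed
```

Written this way, the degradation path (falling back to search when the knowledge base comes up empty) is a branch like any other — explicit, reviewable, and testable before launch.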
More tools doesn\u0026rsquo;t mean a better product — if anything, the principle should be \u0026ldquo;don\u0026rsquo;t invoke unless necessary.\u0026rdquo; There\u0026rsquo;s no point routing a simple question through three tools just to produce an answer that\u0026rsquo;s worse than a direct response.\nHow I Write Boundary Conditions # My current approach is to treat boundary conditions as equal in importance to the main flow — not as a supplementary section at the end of the document, but as rules embedded into each decision node.\nSpecifically, I focus on these categories:\nInsufficient information: What\u0026rsquo;s the clarification strategy, and how to handle it after N rounds of failed clarification Tool failure: Degradation plan for each tool High-risk actions: Which operations require confirmation, which can execute automatically, and which must never execute without explicit approval Data conflicts: When multiple sources contradict each other, which one to trust — or whether to surface the conflict to the user Hallucination control: Whether the model is allowed to generate content without sufficient evidence, and how to inform the user if not These rules may seem tedious to write, but they\u0026rsquo;re the line between a controllable Agent and an uncontrollable one.\nThe Complete PRD Structure # After going through this process, here\u0026rsquo;s the structure I use for Agent PRDs:\nScenario definition: Why does this scenario need an Agent instead of a traditional feature? If a task has extremely stable rules, highly structured inputs, and doesn\u0026rsquo;t require multi-round judgment, it probably doesn\u0026rsquo;t need an Agent3. 
This section filters out a lot of pseudo-requirements.\nIntent breakdown: A structured table listing possible intents for each scenario, along with trigger conditions, priorities, and handling methods.\nDecision paths: The expanded version of the Mermaid diagram above — each decision node with clearly defined conditions and branch directions.\nTool invocation rules: Trigger conditions, prohibition conditions, call order, and failure degradation.\nBoundary conditions: Equal in importance to the main flow, embedded into each node rather than in a separate chapter.\nResult definition: What counts as complete, partially complete, and failed-but-with-usable-intermediate-results.\nEvaluation metrics: Intent recognition accuracy, tool invocation success rate, task completion rate, human takeover rate, result adoption rate — not just DAU.\nClosing Thoughts # Looking back, the fundamental difference between traditional PRDs and Agent PRDs is this: traditional PRDs describe how a system behaves across different states; Agent PRDs describe how a system acts across different decisions. One is about static pages and flows; the other is about dynamic intents and decisions.\nOnce you understand that, the writing approach naturally changes. If you\u0026rsquo;re working on an Agent product, try running your PRD through this checklist: are the intents clearly broken down? Are tool invocation conditions defined? Are boundary conditions covered? 
If you can\u0026rsquo;t answer one of these, the problem probably isn\u0026rsquo;t the document — it\u0026rsquo;s that the product design itself isn\u0026rsquo;t fully thought through yet.\nhttps://www.promptingguide.ai/\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://platform.claude.com/docs/en/agents-and-tools/tool-use/overview\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://www.anthropic.com/engineering/building-effective-agents\u0026#160;\u0026#x21a9;\u0026#xfe0e;\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://mermaid.js.org\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2025-01-12","externalUrl":null,"permalink":"/en/posts/2025/01/agent-prd-writing/","section":"All Posts","summary":"","title":"Writing Agent PRDs: The Mistakes I Made and How I Do It Now","type":"posts"},{"content":"","date":"2025-01-12","externalUrl":null,"permalink":"/tags/%E4%BA%A7%E5%93%81%E7%BB%8F%E7%90%86/","section":"标签","summary":"","title":"产品经理","type":"tags"},{"content":"","date":"2024-12-15","externalUrl":null,"permalink":"/en/tags/ltsc/","section":"Tags","summary":"","title":"LTSC","type":"tags"},{"content":"","date":"2024-12-15","externalUrl":null,"permalink":"/en/categories/tinkering/","section":"Categories","summary":"","title":"Tinkering","type":"categories"},{"content":"","date":"2024-12-15","externalUrl":null,"permalink":"/en/tags/windows/","section":"Tags","summary":"","title":"Windows","type":"tags"},{"content":" Introduction # My daily driver is a Mac, but I also have a Linux laptop running a dual-boot setup: NixOS on one side, Windows 11 IoT LTSC on the other. The Windows partition handles the handful of tasks that aren\u0026rsquo;t convenient on macOS or Linux.\nWhy LTSC? I just wanted a clean Windows. No Microsoft Store, no Copilot, no pile of pre-installed UWP apps, and noticeably lower resource usage. 
If you like building up your system from a blank slate, LTSC is a solid choice.\nThis post documents my installation and configuration process, for reference only.\nBefore You Install # LTSC vs IoT LTSC # LTSC (Long-Term Servicing Channel)1 is a special branch of Windows Enterprise designed for devices that demand extreme stability and continuity. Regular LTSC gets 5 years of support; IoT LTSC gets 10. The feature set is nearly identical, with a few key differences:\nIoT LTSC supports digital entitlement activation IoT LTSC doesn\u0026rsquo;t enable BitLocker encryption by default IoT LTSC allows installation on devices that don\u0026rsquo;t meet TPM requirements One thing to note: IoT LTSC ships with English only, so you\u0026rsquo;ll need to download language packs manually after installation. If that sounds like a hassle, regular LTSC includes multiple languages out of the box — install and you\u0026rsquo;re good to go.\nBack Up Your Data # Back up all your data, software configs, and personal files before reinstalling. Since this is a dual-boot machine, I also needed to make sure the NixOS partition and bootloader wouldn\u0026rsquo;t get wiped — I\u0026rsquo;d recommend writing down your partition layout beforehand.\nPrepare Essential Tools # Download these beforehand so you\u0026rsquo;re not stuck with no network drivers after a fresh install:\nEthernet and Wi-Fi drivers for your hardware An activation tool (HEU2 works fine) Download and Install # Download the ISO # The official download channel for Windows 11 IoT LTSC requires filling out enterprise information3, which is a hassle. A more convenient option is to grab the ISO directly through the links provided by MAS4 — no registration required.\nCreate a Bootable Drive # I use Ventoy5 — just drop the ISO onto the USB drive and you\u0026rsquo;re done. No need to write images with Rufus every time.\nInstallation # Boot into PE, format the target partition, then launch the Windows installer. 
Nothing special about the process — just follow the wizard. Dual-boot users: make sure you select the right partition and don\u0026rsquo;t overwrite your Linux data. If GRUB gets wiped, boot from a NixOS installer USB and repair it.\nSystem Configuration # Installing the system is the easy part. The real work begins after.\nAdd Chinese Language Pack # IoT LTSC ships with English only by default. After connecting to the network, go to system settings and download the Chinese language pack, then set it as the display language and regional format. If this step bothers you, consider installing regular LTSC instead — it comes with Chinese out of the box.\nDisable Windows Reserved Storage # Windows reserves a portion of disk space for system updates by default. If you don\u0026rsquo;t need it:\nDISM.exe /Online /Set-ReservedStorageState /State:Disabled Delete Hibernation File # If you don\u0026rsquo;t use hibernation, the hibernation file wastes several GB of disk space. Run in an administrator terminal:\npowercfg -h off Disable Windows Defender # Windows Defender is a decent antivirus, but its false positive rate is high and it frequently hogs resources, sending fans into overdrive. I use tools from Lenovo\u0026rsquo;s knowledge base to disable it — they work on both Windows 10 and 11. That said, after disabling Defender I\u0026rsquo;d recommend installing a reliable third-party antivirus. Running bare is not a great idea.\nUninstall Microsoft Edge # Edge is actually a decent Chromium browser, but it\u0026rsquo;s incredibly pushy: Copilot banners on ChatGPT, Edge banners when downloading Chrome, widgets popping up uninvited. Remove-MS-Edge6 handles the uninstall. Make sure to keep WebView2 — just select the first option to remove only the browser itself.\nUpdate WebView2 # Many third-party apps depend on WebView27. 
After removing Edge, download and install the latest stable offline installer from the Microsoft developer site.\nInstall VC++ Runtimes # Most Windows applications need the VC++ runtimes. Download the VC++ 2015-2022 bundle from Microsoft\u0026rsquo;s website and install it.\nInstall Microsoft Store (Optional) # If you need UWP apps, open PowerShell as administrator and run:\nwsreset -i There\u0026rsquo;s no progress bar — you\u0026rsquo;ll get a notification when it\u0026rsquo;s done.\nWindows Update # After connecting to the network, run a full system update. If downloads are slow, temporarily enable Delivery Optimization (P2P acceleration) — remember to turn it off after updating.\nDisable Automatic Driver Updates # I recommend disabling automatic driver updates after completing a full system update. This prevents Windows Update from overwriting your graphics drivers, so you can install WHQL-certified versions from the vendor\u0026rsquo;s website instead.\nClosing Thoughts # Overall, IoT LTSC feels clean and stable. No bloatware, no Copilot pushing, and the IoT version doesn\u0026rsquo;t even ship with AI-related components. Resource usage is noticeably lower than the regular edition. 
If you need it for gaming, just install the necessary runtimes and components — LTSC itself won\u0026rsquo;t hold you back.\nFor me, starting from a clean system and gradually installing software and configuring things to my liking is part of the fun.\nReferences # https://massgrave.dev/windows_ltsc_links https://info.microsoft.com/ww-landing-windows-11-enterprise.html https://learn.microsoft.com/en-us/windows/whats-new/ltsc/overview\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://github.com/zbezj/HEU_KMS_Activator\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://info.microsoft.com/ww-landing-windows-11-enterprise.html\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://massgrave.dev/windows_ltsc_links\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://www.ventoy.net\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://github.com/zoicware/Remove-MS-Edge\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://developer.microsoft.com/en-us/microsoft-edge/webview2/\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2024-12-15","externalUrl":null,"permalink":"/en/posts/2024/12/windows-11-iot-ltsc-guide/","section":"All Posts","summary":"","title":"Windows 11 IoT LTSC Setup Guide","type":"posts"},{"content":"","date":"2024-12-15","externalUrl":null,"permalink":"/categories/%E6%8A%98%E8%85%BE%E8%AE%B0%E5%BD%95/","section":"分类","summary":"","title":"折腾记录","type":"categories"},{"content":" Introduction # I recently had a need for audio transcription and happened to come across a tweet from Awni Hannun — the latest MLX Whisper is even faster now, transcribing 12 minutes of audio in 12.3 seconds on an M2 Ultra, nearly 60x real-time speed:\nThe latest MLX Whisper is even faster.\nWhisper v3 Turbo on an M2 Ultra transcribes ~12 minutes in 12.3 seconds. Nearly 60x real-time.\npip install -U mlx-whisper pic.twitter.com/DcKE0TRcbv\n\u0026mdash; Awni Hannun (@awnihannun) November 1, 2024 Since I have a Mac, I decided to give it a try. 
Turns out it\u0026rsquo;s very straightforward to set up, so here\u0026rsquo;s a quick write-up.\nInstallation # MLX Whisper is a Whisper implementation built on Apple\u0026rsquo;s MLX framework, optimized for Apple Silicon. There are two ways to install it.\nStandard Way # pip install -U mlx-whisper Using uv # I have a bit of a system cleanliness obsession — I don\u0026rsquo;t like installing things into the global Python environment. uv 1 is a Python package manager whose uv tool install command installs CLI tools into isolated environments without polluting your system Python. If you share this preference, I\u0026rsquo;d recommend this approach:\nuv tool install mlx-whisper After installation, the mlx_whisper command is ready to use.\nUsage # Command Line # mlx_whisper audio.mp3 --model mlx-community/whisper-large-v3-turbo Batch Processing Script # I wrote a small script that accepts multiple audio files and automatically saves transcripts as .txt files alongside the originals:\nimport sys import os import mlx_whisper if len(sys.argv) \u0026lt; 2: print(\u0026#34;Usage: python whisper.py \u0026lt;audio_file\u0026gt; [audio_file...]\u0026#34;) sys.exit(1) model = \u0026#34;mlx-community/whisper-large-v3-turbo\u0026#34; for file in sys.argv[1:]: print(f\u0026#34;Transcribing: {file}\u0026#34;) result = mlx_whisper.transcribe(file, path_or_hf_repo=model) output = \u0026#34;\\n\\n\u0026#34;.join(seg[\u0026#34;text\u0026#34;].strip() for seg in result[\u0026#34;segments\u0026#34;]) txt_path = os.path.splitext(file)[0] + \u0026#34;.txt\u0026#34; with open(txt_path, \u0026#34;w\u0026#34;, encoding=\u0026#34;utf-8\u0026#34;) as f: f.write(output + \u0026#34;\\n\u0026#34;) print(f\u0026#34;Saved: {txt_path}\u0026#34;) print() Just drop in your audio files and run. 
Since I use uv, the script is also run through it — --with automatically installs the dependency in a temporary environment:\nuv run --with mlx-whisper python whisper.py audio1.mp3 audio2.m4a Model Storage Location # The model is automatically downloaded via Hugging Face Hub and cached at:\n~/.cache/huggingface/hub/models--mlx-community--whisper-large-v3-turbo/ If you decide you no longer need it, simply delete this directory — no leftover files.\nWrap-up # MLX Whisper on Apple Silicon is genuinely fast. You basically throw a file at it and get results in seconds. Installation and cleanup are both clean. Highly recommended for anyone on a Mac.\nuv, a Python package manager by Astral\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2024-11-06","externalUrl":null,"permalink":"/en/posts/2024/11/mlx-whisper-transcription/","section":"All Posts","summary":"","title":"Blazing-Fast Audio Transcription on Mac with MLX Whisper","type":"posts"},{"content":"","date":"2024-11-06","externalUrl":null,"permalink":"/en/tags/macos/","section":"Tags","summary":"","title":"MacOS","type":"tags"},{"content":"","date":"2024-11-06","externalUrl":null,"permalink":"/en/tags/mlx/","section":"Tags","summary":"","title":"MLX","type":"tags"},{"content":"","date":"2024-11-06","externalUrl":null,"permalink":"/en/tags/python/","section":"Tags","summary":"","title":"Python","type":"tags"},{"content":"","date":"2024-11-06","externalUrl":null,"permalink":"/en/tags/whisper/","section":"Tags","summary":"","title":"Whisper","type":"tags"},{"content":"","date":"2024-08-10","externalUrl":null,"permalink":"/en/tags/networking/","section":"Tags","summary":"","title":"Networking","type":"tags"},{"content":" Preface # In January 2022, I bought a 2.5G soft router with a J4125 processor on Xianyu (second-hand marketplace), which kicked off my journey of tinkering with networking gear. In February, PON sticks became popular, and I originally planned to buy one to replace my home optical modem. 
But I heard they ran quite hot and would drop connections when overheated, so I dropped the idea.\nIn the second half of 2024, ZTE\u0026rsquo;s F7015tv3/F7005tv3 suddenly gained popularity. They have very low power consumption1, come with one 2.5G Ethernet port, one IPTV port, and two Gigabit Ethernet ports. The difference between the two models is that the F7015tv3 has an additional phone port. Since my broadband plan includes a landline (even though I barely get any calls), I ended up buying the F7015tv3 on Xianyu. When it arrived, I was surprised by how compact it is, about the same length as my iPhone 13 mini.\nHardware Info # # cat /proc/capability/boardinfo system:LINUX cpufac:ZXIC cpumod:ZX279133 2gwlmod:MTK 5gwlmod:MTK cpufre:1000MHZ cpunum:2 flshcap:256MB ddrcap:512MB Preparation # Open the old modem\u0026rsquo;s web interface and log in with the super admin account. Here\u0026rsquo;s the login method for Beijing Unicom: open 192.168.1.1, press F12, and enter the following in the Console.\ndocument.getElementById(\u0026#34;loginfrm\u0026#34;).setAttribute(\u0026#34;method\u0026#34;, \u0026#34;get\u0026#34;); document.getElementById(\u0026#34;username\u0026#34;).value = \u0026#34;CUAdmin\u0026#34;; document.getElementById(\u0026#34;password\u0026#34;).value = \u0026#34;CUAdmin\u0026#34;; document.getElementById(\u0026#34;loginfrm\u0026#34;).submit(); submitFrm(); Record the SN/MAC address, device number, and connection configuration VLAN information (take screenshots or photos for reference).\nConfiguring the New Modem 2 # Log in to the new modem\u0026rsquo;s web interface using the super account useradmin with password nE7jA%5m. Go to Management → Uplink Mode, and switch to XGPON or XEPON depending on your broadband type.\nEnable the modem\u0026rsquo;s Telnet3. Change your computer\u0026rsquo;s network card MAC address to 000729553357, then run the following command. 
When prompted to reboot, wait for the modem to restart automatically.\nzteOnu --telnet Delete existing network configurations: sendcmd 1 DB p WANC sendcmd 1 DB delr WANC 0 sendcmd 1 DB delr WANC 1 Delete as many entries as there are by changing the number after WANC.\nCreate a new network connection following the configuration from the old modem.\nUse Telnet to modify modem parameters. Run setmac show2 to view system parameters.\nChange Region # # cat /etc/init.d/regioncode 200:Jiangsu 201:Xinjiang 202:Hainan 203:Tianjin 204:Anhui 205:Shanghai 206:Chongqing 207:Beijing 208:Sichuan 209:Shandong 210:Guangdong 211:Hubei 212:Fujian 214:Zhejiang 215:Shanxi 216:Hunan 217:Yunnan 218:Xizang 219:Heilongjiang 220:Guizhou 221:Shanxi2 222:Hebei 223:Ningxia 224:Guangxi 225:Jiangxi 226:Gansu 227:Qinghai 229:Liaoning 230:Jilin 231:Neimeng 232:Henan 234:TelecomInstitute # Switch to Beijing region upgradetest sdefconf 207 Change MAC Address # setmac 1 32769 12:34:56:78:90:12 Change Device Code (First 6 Digits) # setmac 1 768 xxxxxx Change Device Code (Last 17 Digits) # setmac 1 512 xxxxxxxxxx ITMS Spoofing # sendcmd 1 DB set PDTCTUSERINFO 0 Status 0 sendcmd 1 DB set PDTCTUSERINFO 0 Result 1 sendcmd 1 DB save Once all modifications are complete, connect the fiber optic cable and restart.\nhttps://www.acwifi.net/28124.html\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://www.chiphell.com/thread-2607258-1-1.html\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://github.com/Septrum101/zteOnu\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2024-08-10","externalUrl":null,"permalink":"/en/posts/2024/08/beijing-unicom-replaced-2.5g-optical-modem/","section":"All Posts","summary":"","title":"Replacing a 2.5G Optical Modem on Beijing 
Unicom","type":"posts"},{"content":"","date":"2024-07-17","externalUrl":null,"permalink":"/en/tags/android/","section":"Tags","summary":"","title":"Android","type":"tags"},{"content":"","date":"2024-07-17","externalUrl":null,"permalink":"/en/tags/arch/","section":"Tags","summary":"","title":"Arch","type":"tags"},{"content":" Preface # There are many cloud phone providers on the market today. While they\u0026rsquo;re generally stable, putting all your data in someone else\u0026rsquo;s hands doesn\u0026rsquo;t feel very secure. Who knows if they\u0026rsquo;re recording your screen during operations?\nThere\u0026rsquo;s an open-source project on GitHub1 called Redroid2 that lets you run Android containers using Docker. I\u0026rsquo;ve been using it for a long time and it\u0026rsquo;s very stable.\nPrerequisites # You\u0026rsquo;ll need a server. Both AMD64 and ARM64 architectures are supported.\nSupported operating systems (click to view deployment instructions for each):\nAlibaba-Cloud-Linux Amazon-Linux Arch-Linux CentOS Debian Deepin Fedora Gentoo Kubernetes LXC Mint OpenEuler PopOS Ubuntu WSL If you\u0026rsquo;re a beginner, I recommend Ubuntu, Arch Linux, or NixOS. Ubuntu requires loading kernel modules, while Arch Linux and NixOS just need a switch to the zen kernel.\nInstallation # This section only covers the three operating systems mentioned above. For other systems, refer to the official documentation.\nUbuntu # I recommend not using the latest Ubuntu version, as there may be various minor issues. I\u0026rsquo;ve tested versions 20.04 and 22.04 without any problems. 
Run the following commands to load the kernel modules:\nsudo apt install linux-modules-extra-`uname -r` sudo modprobe binder_linux devices=\u0026#34;binder,hwbinder,vndbinder\u0026#34; sudo modprobe ashmem_linux Install Docker3 using the one-click script:\ncurl -sSL https://get.docker.com/ | sh Or install Docker and Docker Compose using the package manager:\nsudo apt install docker.io docker-compose Arch Linux # Install the linux-zen kernel:\n# Update system sudo pacman -Syu # Install linux-zen kernel sudo pacman -S linux-zen linux-zen-headers # Update GRUB boot configuration sudo grub-mkconfig -o /boot/grub/grub.cfg # Reboot sudo reboot # Verify kernel version (should show \u0026#34;zen\u0026#34;) uname -r Install Docker and Docker Compose using the package manager:\nsudo pacman -S docker docker-compose NixOS # Install the linux-zen kernel:\n# Add to /etc/nixos/configuration.nix boot.kernelPackages = pkgs.linuxPackages_zen; # Rebuild and apply sudo nixos-rebuild switch # Reboot sudo reboot # Verify kernel version (should show \u0026#34;zen\u0026#34;) uname -r Install Docker and Docker Compose:\n# Add to /etc/nixos/configuration.nix virtualisation.docker.enable = true; # docker-compose environment.systemPackages = with pkgs; [ docker-compose ]; # Rebuild and apply sudo nixos-rebuild switch Running the Redroid Container # Direct Launch # sudo docker run -itd --rm --privileged \\ --pull always \\ -v ~/data:/data \\ -p 5555:5555 \\ redroid/redroid:11.0.0-latest Using Docker Compose # # docker-compose.yaml version: \u0026#34;3\u0026#34; services: redroid: stdin_open: true tty: true privileged: true pull_policy: always volumes: - ~/data:/data ports: - 5555:5555 image: redroid/redroid:11.0.0-latest # Start the container sudo docker-compose up -d Additional Notes # Connecting to the Device # You can operate Android with your mouse by installing adb4 and scrcpy5.\nOn macOS and Linux, install the Android platform tools (adb) and scrcpy using your package manager:\n# macOS brew install 
--cask android-platform-tools scrcpy # Debian \u0026amp; Ubuntu sudo apt install adb scrcpy # Arch Linux sudo pacman -S android-tools scrcpy # NixOS environment.systemPackages = with pkgs; [ android-tools scrcpy ]; If you want to control Android from a web browser, try the ws-scrcpy project.\nIf you don\u0026rsquo;t want to set up the environment yourself, you can use my pre-built image: aaronyes/ws-scrcpy.\nRun directly:\nsudo docker run --name ws-scrcpy -d -p 8000:8000 aaronyes/ws-scrcpy Or use Docker Compose:\nversion: \u0026#34;3\u0026#34; services: ws-scrcpy: container_name: ws-scrcpy ports: - 8000:8000 image: aaronyes/ws-scrcpy After starting, connect to Android using adb:\nsudo docker exec ws-scrcpy adb connect ip:5555 If using Docker Compose, you can combine both services:\nversion: \u0026#34;3\u0026#34; services: redroid: container_name: redroid stdin_open: true tty: true privileged: true pull_policy: always volumes: - ~/data:/data ports: - 5555:5555 image: redroid/redroid:11.0.0-latest ws-scrcpy: container_name: ws-scrcpy ports: - 8000:8000 image: aaronyes/ws-scrcpy Now you can connect using the container name:\ndocker exec ws-scrcpy adb connect redroid:5555 Installing Apps Without an App Store # Use adb to install apps. 
Download the APK file on your computer, then run adb install \u0026lt;path-to-apk\u0026gt;\nAlternatively, install Via Browser (very small, lightweight, and ad-free), then use it to download and install other apps\nInstalling Magisk # Refer to the following guide:\nhttps://gist.github.com/assiless/a23fb52e8c6156db0474ee8973c4be66 GMS Support # Refer to the following documentation:\nhttps://github.com/remote-android/redroid-doc?tab=readme-ov-file#gms-support References # https://viayoo.com/en/ https://github.com/NetrisTV/ws-scrcpy https://gist.github.com/assiless/a23fb52e8c6156db0474ee8973c4be66 https://github.com\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://github.com/remote-android/redroid-doc\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://docs.docker.com/engine/install/\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://developer.android.com/tools/adb\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://github.com/Genymobile/scrcpy\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2024-07-17","externalUrl":null,"permalink":"/en/posts/2024/07/build-your-own-cloud-phone/","section":"All Posts","summary":"","title":"Build Your Own Cloud 
Phone","type":"posts"},{"content":"","date":"2024-07-17","externalUrl":null,"permalink":"/en/tags/cloud/","section":"Tags","summary":"","title":"Cloud","type":"tags"},{"content":"","date":"2024-07-17","externalUrl":null,"permalink":"/en/tags/docker/","section":"Tags","summary":"","title":"Docker","type":"tags"},{"content":"","date":"2024-07-17","externalUrl":null,"permalink":"/en/tags/linux/","section":"Tags","summary":"","title":"Linux","type":"tags"},{"content":"","date":"2024-07-17","externalUrl":null,"permalink":"/en/tags/nixos/","section":"Tags","summary":"","title":"NixOS","type":"tags"},{"content":"","date":"2024-07-17","externalUrl":null,"permalink":"/en/tags/ubuntu/","section":"Tags","summary":"","title":"Ubuntu","type":"tags"},{"content":"","date":"2024-07-03","externalUrl":null,"permalink":"/en/tags/avif/","section":"Tags","summary":"","title":"AVIF","type":"tags"},{"content":" Preface # In my previous post, I introduced the AVIF format and how to use it. I thought it was a format with great potential: high compression ratio and excellent image quality. But after actually replacing my blog images with AVIF, I ran into quite a few compatibility issues and ultimately decided to switch back to WebP.\nThis article explains why I went back to WebP.\nCompatibility Issues # First, let\u0026rsquo;s look at AVIF\u0026rsquo;s compatibility data on Can I use.\nAs you can see, Edge only started supporting AVIF from version 121, which was released in 2024. Even though half a year has passed since this article was written, there are still many people who haven\u0026rsquo;t updated to the latest browser version. Mobile is even more problematic. Most mobile browsers only recently added support. For example, QQ Browser. If someone opens your blog in QQ or WeChat, images might not display at all. Even though my blog doesn\u0026rsquo;t have many images, this still significantly affects the reading experience.\nThen there\u0026rsquo;s hardware decoding support for AV1. 
You can check CPU and SoC support on Wikipedia. Here are some common ones:\nDesktop # Intel # 11th gen and later: Starting from the 11th gen Tiger Lake processors, Intel integrated AV1 hardware decoding support. AMD # Ryzen 6000 series and later: Based on Zen 3+ architecture, supports AV1 hardware decoding. Apple Silicon # M3 and later: M3, M3 Pro, M3 Max started supporting AV1 hardware decoding1. Mobile # Qualcomm # AV1 hardware decoding supported starting from Snapdragon 888. iOS # AV1 hardware decoding supported starting from A17 Pro2. Rant # Apple sure knows how to milk it. They only added AV1 hardware decoding support in chips released in late 2023. While software decoding can fill the gap, it requires the OS or browser to be updated first. On devices without hardware decoding support, software decoding causes high CPU usage, device heating, and on older, less powerful devices, it can lead to stuttering.\nWebP Introduction # WebP3 is an image format developed by Google, designed to speed up image loading. Its main advantage is smaller file sizes. At the same quality level, WebP images are about 40% smaller than JPEG, roughly two-thirds the size, which can save significant server bandwidth and storage space.\nCheck WebP\u0026rsquo;s support on Can I use as well.\nCompared to AVIF, WebP has much broader support. With the exception of IE (which Microsoft has already abandoned, and nobody should be using anymore), all mainstream browsers support it. 
WebP has been around for years, has a mature ecosystem, and you basically don\u0026rsquo;t have to worry about compatibility issues.\nHow to Convert Images to WebP Format # You can use FFmpeg4 for conversion:\n# JPEG → WebP ffmpeg -i input.jpg output.webp # PNG → WebP ffmpeg -i input.png output.webp Adjust output quality with the -qscale parameter (0-100, higher values mean better quality but larger files):\nffmpeg -i input.jpg -qscale 75 output.webp Adjust compression level with the -compression_level parameter (0-6, higher values mean higher compression but slower speed):\nffmpeg -i input.jpg -compression_level 4 output.webp Combined usage, specifying both quality and compression level:\nffmpeg -i example.png -qscale 80 -compression_level 4 example.webp References # https://zh.wikipedia.org/wiki/WebP https://en.wikipedia.org/wiki/AV1#Hardware https://www.reddit.com/r/AV1/comments/ytzwxx/list_of_cpusoc_with_av1_support https://www.apple.com/newsroom/2023/10/apple-unveils-m3-m3-pro-and-m3-max-the-most-advanced-chips-for-a-personal-computer/\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://www.reddit.com/r/AV1/comments/16gyfyw/apple_a17_pro_finally_supports_av1_hardware/\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://developers.google.com/speed/webp\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://ffmpeg.org\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2024-07-03","externalUrl":null,"permalink":"/en/posts/2024/07/avif-is-so-good-why-did-i-still-choose-webp/","section":"All Posts","summary":"","title":"AVIF is Great, So Why Did I Still Choose 
WebP?","type":"posts"},{"content":"","date":"2024-07-03","externalUrl":null,"permalink":"/en/tags/ffmpeg/","section":"Tags","summary":"","title":"FFmpeg","type":"tags"},{"content":"","date":"2024-07-03","externalUrl":null,"permalink":"/en/tags/webp/","section":"Tags","summary":"","title":"WebP","type":"tags"},{"content":"","date":"2024-07-03","externalUrl":null,"permalink":"/categories/%E6%8A%80%E6%9C%AF%E5%AE%9E%E8%B7%B5/","section":"分类","summary":"","title":"技术实践","type":"categories"},{"content":" Preface # While working on optimizing my blog\u0026rsquo;s image formats, I came across AVIF1, a relatively new image format. After learning more about it, I found that it offers significant advantages over traditional JPEG in terms of compression ratio. I did some research and decided to document my findings here.\nWhat is AVIF # AVIF is an image format based on AV1 video coding technology. Like JPEG, it uses lossy compression to reduce file size, but unlike JPEG, AVIF can achieve much smaller file sizes at the same visual quality.\nThe format was developed by the Alliance for Open Media (AOMedia), a consortium consisting of companies like Amazon, Netflix, Google, and Mozilla. Files are compressed using the AV1 (AOMedia Video 1) algorithm and stored in the HEIF container format2. Since AV1 compression technology is royalty-free3, there are no licensing fees required to use it.\nYou can check AVIF\u0026rsquo;s browser support on Can I use.\nAVIF vs JPEG # The most obvious advantage of AVIF over JPEG is significantly smaller file sizes. Smaller files mean less bandwidth consumption and faster loading times. Replacing JPEG images with AVIF on a webpage can reduce image data consumption by about half.\nColor depth is another area where AVIF outperforms JPEG. JPEG only supports 8-bit color depth, while AVIF supports HDR4, meaning richer colors and more detail.\nNetflix has some excellent visual examples comparing the same image compressed as JPEG and AVIF. 
Other comparisons with WebP and PNG formats are also worth checking out.\nHow to Use AVIF # You can use FFmpeg5 to convert images from other formats to AVIF.\nSimple conversion examples:\n# JPEG → AVIF ffmpeg -i input.jpg output.avif # PNG → AVIF ffmpeg -i input.png output.avif Adjust output quality with the -crf parameter (0-63, lower values mean higher quality):\nffmpeg -i input.jpg -c:v libaom-av1 -crf 30 -b:v 0 output.avif Adjust compression level with the -cpu-used parameter (0-8, higher values mean faster speed but lower compression):\nffmpeg -i input.jpg -c:v libaom-av1 -cpu-used 4 -crf 30 -b:v 0 output.avif Combined usage, specifying both quality and compression level:\nffmpeg -i example.png -c:v libaom-av1 -crf 30 -cpu-used 4 -b:v 0 example.avif References # https://caniuse.com/?search=avif https://jakearchibald.com/2020/avif-has-landed/ https://www.lifewire.com/what-is-an-avif-file-5078731 https://www.ctrl.blog/entry/webp-avif-comparison.html https://netflixtechblog.com/avif-for-next-generation-image-coding-b1d75675fe4 https://en.wikipedia.org/wiki/AVIF\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://en.wikipedia.org/wiki/High_Efficiency_Image_File_Format\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://aomedia.org/av1-features/\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://en.wikipedia.org/wiki/High_dynamic_range\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://ffmpeg.org\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2024-06-04","externalUrl":null,"permalink":"/en/posts/2024/06/what-is-avif-how-to-use-it/","section":"All Posts","summary":"","title":"What is AVIF and How to Use It","type":"posts"},{"content":"","date":"2024-06-01","externalUrl":null,"permalink":"/en/tags/disko/","section":"Tags","summary":"","title":"Disko","type":"tags"},{"content":" Preface # NixOS1 is a declaratively configured system where the entire OS can be configured using declarative methods. 
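With Disko, even the partition layout becomes part of that declarative configuration. As a rough illustration only (not the post\u0026rsquo;s actual config — device name, sizes, and mount options are placeholders), a BIOS-boot layout with a tmpfs root might look like:

```nix
{
  disko.devices = {
    # tmpfs as root: everything outside the persisted mounts is wiped on reboot
    nodev."/" = {
      fsType = "tmpfs";
      mountOptions = [ "defaults" "size=2G" "mode=755" ];
    };
    disk.main = {
      type = "disk";
      device = "/dev/sda";
      content = {
        type = "gpt";
        partitions = {
          # 1 MiB EF02 partition so GRUB can embed itself on a GPT disk under BIOS
          boot = { size = "1M"; type = "EF02"; };
          nix = {
            size = "100%";
            content = { type = "filesystem"; format = "ext4"; mountpoint = "/nix"; };
          };
        };
      };
    };
  };
}
```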
When I first started using NixOS in 2023, I didn\u0026rsquo;t know about Disko2, so partitioning still required manually running commands. After using it for a while, I discovered the tmpfs as root approach, which was perfect for someone with severe OCD like me. I applied tmpfs as root to all my local devices. The experience was great, so I wanted to switch my servers to NixOS as well, and that\u0026rsquo;s when I encountered this problem that bothered me for months.\nMy local devices all use UEFI + systemd-boot, which has been working fine. But cloud servers typically boot with BIOS, and systemd-boot has some issues with BIOS3. I ultimately went with BIOS + GRUB, which is quite different from my local setup.\nThe Problem # After running rebuild, I got the following error:\n... updating GRUB 2 menu... updating GRUB 2 menu... updating GRUB 2 menu... Failed to get blkid info (returned 512) for / on tmpfs at /nix/store/nvycxmg4g2q5jyqdxfvkgi95sqs48iw3-install-grub.pl line 212. warning: error(s) occurred while switching to the new configuration I searched for related issues on GitHub and tried many times but couldn\u0026rsquo;t resolve it. 
After more than a month of troubleshooting, I finally found the solution.\nSolution # Edit the hardware-configuration.nix file and add the following:\nboot.loader.grub.enable = true; boot.loader.grub.efiSupport = true; boot.loader.grub.efiInstallAsRemovable = true; Save the file, run rebuild again, and it should work normally.\nhttps://nixos.org\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://github.com/nix-community/disko\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://github.com/systemd/systemd/issues/25963\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2024-06-01","externalUrl":null,"permalink":"/en/posts/2024/06/nixos-bios-boot-using-disko-to-manage-partitions/","section":"All Posts","summary":"","title":"Using Disko to Manage Partitions on BIOS-Boot NixOS Systems","type":"posts"},{"content":"","date":"2023-03-20","externalUrl":null,"permalink":"/en/tags/api/","section":"Tags","summary":"","title":"API","type":"tags"},{"content":"","date":"2023-03-20","externalUrl":null,"permalink":"/en/tags/chatgpt/","section":"Tags","summary":"","title":"ChatGPT","type":"tags"},{"content":"","date":"2023-03-20","externalUrl":null,"permalink":"/en/tags/reverse-engineering/","section":"Tags","summary":"","title":"Reverse Engineering","type":"tags"},{"content":"","date":"2023-03-20","externalUrl":null,"permalink":"/en/categories/technical-practice/","section":"Categories","summary":"","title":"Technical Practice","type":"categories"},{"content":" Introduction # In early 2023, ChatGPT was taking the world by storm, but using it came with a few clear pain points: the web experience wasn\u0026rsquo;t flexible enough to integrate with my own tools; the official API1 charged by token, which added up quickly for heavy users; and the web and API were two completely separate systems — ChatGPT Plus subscribers couldn\u0026rsquo;t use their GPT-4 quota through the API.\nSo I had an idea: what if I could reverse engineer the ChatGPT web interface and wrap it into a standard OpenAI API format? 
That way I could use the unlimited web quota and plug it into my own toolchain.\nThis post documents the entire process, from packet analysis to a working implementation.\nUnderstanding ChatGPT\u0026rsquo;s Web Architecture # Before diving in, I used the browser DevTools Network panel to map out the full request chain of the ChatGPT web frontend.\nAuthentication Chain # ChatGPT uses Auth02 as its OAuth2 authentication provider. The normal web login flow looks like this:\nBrowser → chat.openai.com/auth/login → Redirect to auth0.openai.com (Auth0-hosted login page) → User enters email and password → Auth0 callback returns access_token → Frontend stores token in session But this flow is designed for browsers — there are multiple 302 redirects, cookie passing, and JavaScript rendering. A CLI tool can\u0026rsquo;t just follow this flow directly.\nConversation Chain # After logging in, the frontend sends messages to this endpoint:\nPOST https://chat.openai.com/backend-api/conversation Authorization: Bearer \u0026lt;access_token\u0026gt; Content-Type: application/json Accept: text/event-stream Request body:\n{ \u0026#34;action\u0026#34;: \u0026#34;next\u0026#34;, \u0026#34;messages\u0026#34;: [ { \u0026#34;id\u0026#34;: \u0026#34;uuid\u0026#34;, \u0026#34;role\u0026#34;: \u0026#34;user\u0026#34;, \u0026#34;content\u0026#34;: { \u0026#34;content_type\u0026#34;: \u0026#34;text\u0026#34;, \u0026#34;parts\u0026#34;: [\u0026#34;Hello\u0026#34;] } } ], \u0026#34;model\u0026#34;: \u0026#34;text-davinci-002-render-sha\u0026#34;, \u0026#34;parent_message_id\u0026#34;: \u0026#34;uuid\u0026#34; } The response is SSE (Server-Sent Events)3 format, streaming token by token:\ndata: {\u0026#34;message\u0026#34;: {\u0026#34;content\u0026#34;:{\u0026#34;parts\u0026#34;:[\u0026#34;He\u0026#34;]}, ...}} data: {\u0026#34;message\u0026#34;: {\u0026#34;content\u0026#34;:{\u0026#34;parts\u0026#34;:[\u0026#34;Hello\u0026#34;]}, ...}} data: [DONE] One key difference: the web frontend uses a 
message tree (each message has a parent_message_id, supporting branching and regeneration), while the OpenAI API uses a linear messages array. This difference needs to be handled during protocol conversion later.\nReverse Engineering the Auth0 Authentication Flow # This was the core of the project — and the most interesting part.\nShifting Approach: From Web to iOS # I initially tried to simulate the browser login flow directly, but quickly hit a wall: Auth0\u0026rsquo;s login page has extensive anti-bot measures — JavaScript validation, browser fingerprinting, reCAPTCHA, you name it.\nA different approach: mobile authentication flows are usually simpler than web ones. So I captured the ChatGPT iOS client\u0026rsquo;s network traffic and found that it also uses Auth0, but through the OAuth2 + PKCE (Proof Key for Code Exchange)4 extension, which doesn\u0026rsquo;t require a browser environment.\nPKCE in Brief # PKCE is an OAuth2 security extension, originally designed for mobile and desktop apps that can\u0026rsquo;t safely store a client_secret. 
The flow is straightforward:\nClient generates code_verifier (random string) Client computes code_challenge = SHA256(code_verifier) Authorization request includes code_challenge Callback includes the original code_verifier Server verifies SHA256(code_verifier) == code_challenge, issues Token The benefit: even if the authorization code is intercepted, without the code_verifier it can\u0026rsquo;t be exchanged for a token.\nThe Decompiled Authentication Flow # By analyzing the iOS client\u0026rsquo;s network requests, I broke down the complete authentication flow into 7 steps:\nGet preauth_cookie Build authorize URL with iOS client parameters Follow authorize URL, extract state parameter and save cookies Submit email Submit password Handle callback or MFA verification If MFA is required, submit the code and go back to step 6 Finally, exchange the authorization code for an Access Token A few noteworthy details:\nWhy can code_verifier be hardcoded? The iOS client can be decompiled — the code_verifier and code_challenge pair is hardcoded in the client, shared by all iOS users. In this scenario, PKCE protects the transport layer (authorization code leak doesn\u0026rsquo;t mean token leak), not the client itself.\nWhere does client_id come from? Also from iOS client decompilation. It\u0026rsquo;s the iOS application ID that OpenAI registered with Auth0.\nWhy is redirect_uri set to com.openai.chat://...? That\u0026rsquo;s an iOS URL Scheme, used by Auth0 to redirect back to the app after authorization. 
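As an aside on the PKCE mechanics above: the verifier/challenge pair can be generated and checked in a few lines of standard-library Python (a sketch — function names are mine, for illustration):

```python
import base64
import hashlib
import secrets

def b64url(data: bytes) -> str:
    # RFC 7636 uses URL-safe Base64 with the trailing '=' padding stripped
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_pkce_pair() -> tuple[str, str]:
    # code_verifier: high-entropy random string (43-128 chars per the RFC)
    verifier = b64url(secrets.token_bytes(32))
    # code_challenge = BASE64URL(SHA256(ascii(code_verifier)))
    challenge = b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return verifier, challenge

def server_side_check(verifier: str, challenge: str) -> bool:
    # What the token endpoint recomputes before issuing the access token
    return b64url(hashlib.sha256(verifier.encode("ascii")).digest()) == challenge
```

Hardcoding the pair, as the iOS client does, removes the randomness but keeps the transport-layer property: an intercepted authorization code alone is still useless without the matching verifier.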
In our implementation, we don\u0026rsquo;t actually need to redirect — we just extract the code parameter from the response\u0026rsquo;s Location header.\nThe Python implementation looks roughly like this:\nclass Auth0: def auth(self, login_local=False) -\u0026gt; str: return self.__part_one() if login_local else self.get_access_token_proxy() def __part_one(self): # Step 1: get preauth def __part_two(self): # Step 2: build authorize URL def __part_three(self): # Step 3: follow authorize def __part_four(self): # Step 4: submit email def __part_five(self): # Step 5: submit password def __part_six(self): # Step 6: handle callback/MFA def __part_seven(self): # Step 7: MFA OTP def get_access_token(self): # Final: code → token Implementing SSE Streaming Proxy # With the Access Token in hand, the next step is calling the ChatGPT conversation API.\nRequest construction is fairly intuitive: each message needs a UUID as its id, parent_message_id points to the previous message to form a conversation chain, and the first message doesn\u0026rsquo;t include conversation_id (the server creates and returns one). The action can be next (new message), variant (regenerate), or continue (continue output).\nThe tricky part is handling SSE responses. Python\u0026rsquo;s Flask is a synchronous framework, but SSE requires async consumption of streaming responses. My solution: async thread + blocking queue + Generator bridge:\ndef _request_sse(self, url, headers, data): queue, event = block_queue.Queue(), threading.Event() t = threading.Thread(target=asyncio.run, args=(self._do_request_sse(url, headers, data, queue, event),)) t.start() return queue.get(), queue.get(), self.__generate_wrap(queue, t, event) Why this detour? 
Because httpx5\u0026rsquo;s streaming API is async (async with client.stream('POST', url) needs an async context), but the upper layer is synchronous (Flask route handlers, CLI readline loops are all sync), and I didn\u0026rsquo;t want to rewrite the entire architecture from Flask to aiohttp/uvicorn.\nSo a thread runs the async event loop, queue.Queue ferries data from the async world to the sync world, and it exposes a standard Generator to the upper layer — completely transparent.\nAnother detail: threading.Event is used for interruption protection. If the client disconnects and triggers GeneratorExit, the Event is set, and the async thread detects it and closes the httpx connection, preventing thread leaks.\nWeb API to OpenAI API Protocol Conversion # This is the key step of wrapping ChatGPT\u0026rsquo;s web interface into a standard OpenAI API. The two API formats differ significantly:\nDimension ChatGPT Web API OpenAI Public API Authentication Bearer access_token Bearer sk-xxx (API Key) Request format Message tree (parent_message_id) messages array Response format SSE + message tree nodes SSE + choices array Session management Server-side conversation_id Stateless Request Conversion # My approach was to maintain a local message tree, converting the OpenAI-format messages array into a tree structure, supporting multi-turn conversations and regeneration:\ndef talk(self, content, model, message_id, parent_message_id, ...): if conversation_id: parent = conversation.get_prompt(parent_message_id) else: parent = conversation.add_prompt(Prompt(parent_message_id)) parent = conversation.add_prompt(SystemPrompt(self.system_prompt, parent)) conversation.add_prompt(UserPrompt(message_id, content, parent)) user_prompt, gpt_prompt, messages = conversation.get_messages(message_id, model) Response Conversion # The web side returns full text each time (parts[0] gets longer), while the OpenAI API returns incremental text. 
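Since every SSE event carries the cumulative text, the proxy only has to emit the suffix it hasn\u0026rsquo;t sent yet — a minimal sketch (hypothetical helper, not the project\u0026rsquo;s actual code):

```python
def to_deltas(cumulative_parts):
    """Turn a stream of full texts ('He', 'Hello', ...) into incremental deltas."""
    sent = ""
    for full in cumulative_parts:
        # The web API resends the whole text each time; emit only the new suffix
        delta, sent = full[len(sent):], full
        if delta:  # skip events that add nothing new
            yield delta
```

For example, `list(to_deltas(["He", "Hello", "Hello, world"]))` yields `["He", "llo", ", world"]`.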
A delta calculation is needed:\n# Web response {\u0026#34;message\u0026#34;: {\u0026#34;content\u0026#34;: {\u0026#34;parts\u0026#34;: [\u0026#34;full text\u0026#34;]}, \u0026#34;author\u0026#34;: {\u0026#34;role\u0026#34;: \u0026#34;assistant\u0026#34;}}} # Converted to OpenAI format data: {\u0026#34;choices\u0026#34;: [{\u0026#34;delta\u0026#34;: {\u0026#34;content\u0026#34;: \u0026#34;incremental text\u0026#34;}, \u0026#34;finish_reason\u0026#34;: null}]} data: {\u0026#34;choices\u0026#34;: [{\u0026#34;delta\u0026#34;: {}, \u0026#34;finish_reason\u0026#34;: \u0026#34;stop\u0026#34;}]} data: [DONE] Token Limit Trimming # The OpenAI API has token limits (4096 for gpt-3.5-turbo, 8192 for gpt-4). When conversation history gets too long, local trimming is needed:\ndef __reduce_messages(self, messages, model, token=None): max_tokens = self.FAKE_TOKENS[model] if self.__is_fake_api(token) else self.MAX_TOKENS[model] while gpt_num_tokens(messages) \u0026gt; max_tokens - 200: if len(messages) \u0026lt; 2: raise Exception(\u0026#39;prompt too long\u0026#39;) messages.pop(1) # Remove from index 1, keeping system prompt and latest turns return messages Trimming strategy: keep messages[0] (system prompt) and the latest few conversation turns, removing the oldest user messages first. The - 200 leaves headroom for the model\u0026rsquo;s response.\nFrom Technical Validation to Production # Once the API was working, the next challenge was making it available to colleagues and friends.\nBatch Registration # After getting the API working, I found in practice that ChatGPT has per-account rate limits — push too many requests and it starts throwing errors. The most straightforward fix: more accounts. So I built a registration bot and used my own domain and email to batch-register 200 ChatGPT accounts. Two of those got Plus subscriptions (only Plus unlocks GPT-4), with the cost split among friends. 
The rest ran on free GPT-3.5, perfectly fine for daily use.\nToken Management and Persistence # Access Tokens are valid for 14 days and need to be refreshed upon expiry. I stored all account tokens in a PostgreSQL database, with a scheduled task that automatically detects expiration and batch-refreshes tokens to keep the pool always available.\nLoad Balancing # With 200 accounts, using just one would be a waste. I added a simple load balancing layer to the proxy service: on each incoming request, the service round-robins through the database to pick an available token for the ChatGPT API call. This avoids single-account rate limits and distributes request pressure evenly.\nThe end result: a single standard OpenAI API endpoint exposed externally. Colleagues and friends just point their applications\u0026rsquo; API Base URL to my service, completely unaware that 200 accounts are rotating behind the scenes. GPT-4 requests route to the Plus account pool, GPT-3.5 requests to the free account pool.\nReferences # https://auth0.com/docs/get-started/authentication-and-authorization-flow/authorization-code-flow-with-pkce https://platform.openai.com/docs/api-reference\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://auth0.com\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://datatracker.ietf.org/doc/html/rfc7636\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://www.python-httpx.org\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2023-03-20","externalUrl":null,"permalink":"/en/posts/2023/03/chatgpt-web-to-api/","section":"All Posts","summary":"","title":"Wrapping ChatGPT Web into a Standard API: A Reverse Engineering 
Practice","type":"posts"},{"content":"","date":"2023-03-20","externalUrl":null,"permalink":"/tags/%E9%80%86%E5%90%91/","section":"标签","summary":"","title":"逆向","type":"tags"},{"content":"","date":"2023-01-17","externalUrl":null,"permalink":"/en/tags/fastapi/","section":"Tags","summary":"","title":"FastAPI","type":"tags"},{"content":" Preface # User authentication is an unavoidable topic when developing web applications. While working on a backend project with FastAPI, I came across JWT1 as an authentication method and found it quite interesting. I did some research and decided to document it here.\nWhat is JWT # JWT stands for JSON Web Token. Simply put, it\u0026rsquo;s a token format where the content is encoded JSON, commonly used for identity verification in the web domain.\nTypically, a user sends their username and password to the server (we\u0026rsquo;ll skip OAuth and other third-party authentication for now). After verification, the server issues a Token to the user. This Token contains necessary information such as the issuer, subject (usually the user ID), and expiration time. From then on, the server no longer needs the username and password. The Token alone is sufficient to confirm the user\u0026rsquo;s identity.\nWhat Problem Does JWT Solve # You might ask: since the user already has a username and password, why bother with the extra step? Why generate a Token first instead of using credentials directly, like HTTP Basic Authentication2?\nThere are two main reasons: security and performance.\nSecurity is fairly straightforward. Compared to sending plaintext usernames and passwords in every request, a JWT with a built-in expiration mechanism is clearly more secure.\nThe second reason, which I think is even more important, is performance. Let\u0026rsquo;s consider what the username/password authentication process looks like:\nThe user sends a request with Base64-encoded username and password. 
The server decodes the request, compares the username and password against the database (passwords are usually stored as hashes), and returns pass or fail. With only a few requests per second, this design works fine. But for a service handling tens of thousands of requests per second, authentication alone would put enormous pressure on the database. And mature databases are often single-node (scaling databases that handle transactions is no easy feat).\nIs it possible to issue a token to the user, and then verify it across server replicas or even different servers, without querying the database? This would make the server truly stateless. Anyone familiar with distributed systems knows how much simpler stateless architectures are.\nHow JWT Ensures Security # Since JWT is just Base64-encoded JSON, can a user just make one up and fool the server?\nOf course not. JWT uses digital signatures. The last part is generated using a secret key known only to the server, combined with the content of the first two parts. Since this key is only available on the server and it\u0026rsquo;s virtually impossible to reverse-engineer the key from the content, attackers cannot forge a valid JWT.\nJWT Structure # JWT consists of three parts:\nHeader Payload Signature All three parts are Base64-encoded strings joined by .. 
A typical JWT looks like this:\neyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.cThIIoDvwdueQB468K5xDc5633seEFoqwxjF_xSJyQQ The decoded header:\n{ \u0026#34;alg\u0026#34;: \u0026#34;HS256\u0026#34;, \u0026#34;typ\u0026#34;: \u0026#34;JWT\u0026#34; } The decoded payload:\n{ \u0026#34;sub\u0026#34;: \u0026#34;1234567890\u0026#34;, \u0026#34;name\u0026#34;: \u0026#34;John Doe\u0026#34;, \u0026#34;iat\u0026#34;: 1516239022 } The last part is the signature, used to verify that the first two parts haven\u0026rsquo;t been tampered with.\nHow to Use JWT # HTTP Header # Theoretically, as long as you transmit the JWT to the server, you\u0026rsquo;re done. RFC 75193 doesn\u0026rsquo;t mandate where JWT must be used. But since it\u0026rsquo;s a token, the common practice is to include it in the Request Header:\nAuthorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.cThIIoDvwdueQB468K5xDc5633seEFoqwxjF_xSJyQQ FastAPI Example # Here I\u0026rsquo;m using python-jose4. Install it with pip install \u0026quot;python-jose[cryptography]\u0026quot;. 
You can also use other libraries from https://jwt.io/libraries.\nFor brevity, I\u0026rsquo;ll skip the FastAPI boilerplate and only show the core code.\nFirst, create an endpoint to issue tokens:\nfrom fastapi.security import HTTPBasic, HTTPBasicCredentials http_basic = HTTPBasic() def create_jwt_access_token( data: dict, expires_delta: timedelta, ) -\u0026gt; str: to_encode = data.copy() to_encode.update({\u0026#34;exp\u0026#34;: datetime.utcnow() + expires_delta}) return jwt.encode( to_encode, \u0026#34;jwt_secret\u0026#34;, # Replace with your own secret key algorithm=\u0026#34;HS256\u0026#34;, ) @app.post(\u0026#34;/auth/issue-new-token\u0026#34;) def issue_new_token( credentials: HTTPBasicCredentials = Depends(http_basic), ): username = basic_authentication(credentials) # Verify username \u0026amp; password access_token = create_jwt_access_token( data={\u0026#34;sub\u0026#34;: username}, expires_delta=timedelta(seconds=1234), ) return {\u0026#34;access_token\u0026#34;: access_token, \u0026#34;token_type\u0026#34;: \u0026#34;bearer\u0026#34;} This is a minimal JWT generation endpoint. 
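Under the hood, jwt.encode with HS256 is just HMAC-SHA256 over two Base64URL segments. Here is a dependency-free sketch of the same scheme (illustrative only — use a vetted library in production):

```python
import base64
import hashlib
import hmac
import json

def _b64url(data: bytes) -> str:
    # JWT uses URL-safe Base64 with padding stripped
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def hs256_sign(payload: dict, secret: str) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        _b64url(json.dumps(header, separators=(",", ":")).encode())
        + "."
        + _b64url(json.dumps(payload, separators=(",", ":")).encode())
    )
    sig = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url(sig)

def hs256_verify(token: str, secret: str) -> bool:
    signing_input, _, sig = token.rpartition(".")
    expected = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    # Constant-time comparison to avoid timing side channels
    return hmac.compare_digest(_b64url(expected), sig)
```

Forging a token without the secret would mean producing a valid last segment for attacker-chosen first two segments — exactly what HMAC is designed to prevent.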
Note that there\u0026rsquo;s no rate limiting or error handling here, it\u0026rsquo;s just a basic demo.\nNext, the JWT verification function:\nfrom fastapi.security import HTTPBearer, HTTPAuthorizationCredentials http_bearer = HTTPBearer() def jwt_authentication( credentials: HTTPAuthorizationCredentials = Depends(http_bearer), ): credentials_exception = HTTPException( status_code=status.HTTP_401_UNAUTHORIZED, detail=\u0026#34;Could not validate credentials\u0026#34;, headers={\u0026#34;WWW-Authenticate\u0026#34;: \u0026#34;Bearer\u0026#34;}, ) try: if credentials.scheme.lower() != \u0026#34;bearer\u0026#34;: raise credentials_exception return jwt.decode( credentials.credentials, \u0026#34;jwt_secret\u0026#34;, # Replace with your own secret key algorithms=[\u0026#34;HS256\u0026#34;], ) except JWTError: raise credentials_exception Add it as a dependency to any endpoint that requires authentication:\n# Option 1: Verify only, don\u0026#39;t need payload @app.get(\u0026#34;/api/protected\u0026#34;, dependencies=[Depends(jwt_authentication)]) def protected_api(): ... # Option 2: Get payload for further processing @app.get(\u0026#34;/api/protected\u0026#34;) def protected_api(jwt_payload: dict = Depends(jwt_authentication)): # Process jwt_payload ... Frontend Approach # Since many components depend on backend APIs that require JWT, it\u0026rsquo;s common to use Context to manage the token for cross-component access.\nWhen building AuthStateContext.Provider, my approach is:\nSince we use OAuth, JWT arrives from the backend via query string. First check if there\u0026rsquo;s a new Token in the query string. If yes, extract it and store in localStorage. If no, proceed to the next step. Check if there\u0026rsquo;s a stored Token in localStorage. If yes, verify if it\u0026rsquo;s still valid. If no, or if the previous check failed, proceed to the next step. Call a \u0026ldquo;who am I\u0026rdquo; endpoint to let the backend verify the JWT (optional, but adds reliability). 
If all checks pass, pass the JWT state to other components via Provider. Otherwise, redirect to the appropriate page and prompt the user. Other components can access the verified Token using useContext(AuthStateContext).\nJWT Limitations # JWT has a few drawbacks to consider when designing your system:\nJWTs are hard to revoke once issued. This is the cost of being stateless. Due to point 1, tokens should have short expiration times, meaning the client needs to handle token refresh. JWT content is not encrypted by default. Anyone can read it, so don\u0026rsquo;t put sensitive information inside. JWT is recommended to be used over HTTPS. References # https://en.wikipedia.org/wiki/JSON_Web_Token https://jwt.io/\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://datatracker.ietf.org/doc/html/rfc7519\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://github.com/mpdavis/python-jose\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2023-01-17","externalUrl":null,"permalink":"/en/posts/2023/01/jwt-introduction/","section":"All Posts","summary":"","title":"Introduction to JWT and Practice","type":"posts"},{"content":"","date":"2023-01-17","externalUrl":null,"permalink":"/en/tags/jwt/","section":"Tags","summary":"","title":"JWT","type":"tags"},{"content":"","date":"2022-03-23","externalUrl":null,"permalink":"/en/categories/blog/","section":"Categories","summary":"","title":"Blog","type":"categories"},{"content":" Preface # I\u0026rsquo;ve been using the open-source font \u0026ldquo;LXGW WenKai\u0026rdquo;1 for a while now and really like it, so I decided to change my blog\u0026rsquo;s font to LXGW WenKai as well.\nConfiguration # Reading the font\u0026rsquo;s [Issue #24]2, I found that someone had already converted the font using the ttf2woff2 tool and provided a method to use it on web pages. 
After testing, I found that using lxgw-wenkai-webfont directly doesn\u0026rsquo;t render correctly in Safari, but lxgw-wenkai-lite-webfont works fine. The lite version has some font weights removed, which makes it more compatible.\nA quick search3 revealed that the Congo theme allows inserting extra code into the template\u0026rsquo;s head and footer sections. Simply create layouts/partials/extend-head.html or layouts/partials/extend-footer.html files. After creating the HTML file and pasting the code below, save and refresh the page to see the LXGW WenKai font applied to your blog.\n\u0026lt;link rel=\u0026#34;stylesheet\u0026#34; href=\u0026#34;https://cdn.jsdelivr.net/npm/lxgw-wenkai-lite-webfont@1.1.0/style.css\u0026#34; /\u0026gt; \u0026lt;style\u0026gt; body { font-family: \u0026#34;LXGW WenKai Lite\u0026#34;, sans-serif; } \u0026lt;/style\u0026gt; https://github.com/lxgw/LxgwWenKai\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://github.com/chawyehsu/lxgw-wenkai-webfont\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://jpanther.github.io/congo/docs/partials/#head-and-footer\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2022-03-23","externalUrl":null,"permalink":"/en/posts/2022/03/change-the-font-of-the-blog/","section":"All Posts","summary":"","title":"Changing the Blog Font","type":"posts"},{"content":"","date":"2022-03-23","externalUrl":null,"permalink":"/en/tags/font/","section":"Tags","summary":"","title":"Font","type":"tags"},{"content":"","date":"2022-03-23","externalUrl":null,"permalink":"/categories/%E5%8D%9A%E5%AE%A2/","section":"分类","summary":"","title":"博客","type":"categories"},{"content":" Preface # It seems like every year there are discussions about whether it\u0026rsquo;s still necessary to have your own blog. It\u0026rsquo;s a valid question. It\u0026rsquo;s 2022 now, and there are plenty of mature writing and publishing platforms available. For me, they have their pros and cons. 
While they make it easy to write and publish, they also come with various limitations.\nI started blogging in 2015, using free hosting to deploy my first website. In 2017, I bought my first server, discovered the Typecho1 blogging system, and purchased the Handsome2 theme. In 2018, I moved to WeChat Official Accounts. With five posts per week, no ads, and high quality content, I attracted a decent number of readers. I was full of enthusiasm back then, kept it up for three years, and wrote over 700 articles. Over time, I realized it was a platform with many restrictions. Combined with Tencent\u0026rsquo;s various antics, I deleted it in 2021.\nNow I\u0026rsquo;ve decided to restart my blog. Since deleting my WeChat account in 2021, I\u0026rsquo;ve become increasingly depressed. I don\u0026rsquo;t have many friends, and I rarely reach out to them unless necessary. I realized I was gradually losing my ability to express myself. That\u0026rsquo;s not a good thing. So I want to try to find that enthusiastic version of myself again through writing.\nChoosing a Platform # There are two types of blogs: dynamic blogs and static blogs.\nDynamic blogging platforms I\u0026rsquo;ve used:\nWordPress: https://wordpress.com Typecho: https://typecho.org Halo: https://www.halo.run Static blogging platforms I\u0026rsquo;ve used:\nGridea: https://open.gridea.dev Hexo: https://hexo.io Hugo: https://gohugo.io These are all great blogging platforms, each with its own strengths and weaknesses. Dynamic blogs are feature-rich and beginner-friendly, but require server and database maintenance. Static blogs are lightweight and easy to deploy, but relatively limited in functionality.\nGetting Started # After trying various blogging platforms, I chose Hugo this time. You can find installation instructions for four common operating systems on its official website3.\nCreating a Site # In Hugo, the command to create a website folder is hugo new site \u0026lt;site-name\u0026gt;. 
For example, here I create a blog folder named Blog.\nhugo new site Blog cd Blog Now you can enter hugo server to access the default generated webpage.\nhugo server Web Server is available at http://localhost:1313/ (bind address 127.0.0.1) Press Ctrl+C to stop Installing a Theme # Hugo has many great themes available at https://themes.gohugo.io. The theme\u0026rsquo;s GitHub repository or demo site usually has installation and configuration instructions.\nWriting Articles # You can create a new article named Hello World with the following command:\nhugo new posts/hello-world/index.md hugo server -D You can then visit http://localhost:1313/posts/hello-world to view it. You might notice the -D flag in the start command. This parameter allows you to preview drafts. New articles are set to draft: true by default. Unless you manually change true to false, the article won\u0026rsquo;t appear in a local preview started without the -D flag, and it won\u0026rsquo;t be included in the published site.\nHugo uses Markdown4 markup language for writing articles, which takes about 10 minutes to learn.\nDeployment # Hugo can be deployed in many places. The official website has deployment tutorials5, many of which are free and have a very low barrier to entry. 
I think hosting on GitHub Pages and Cloudflare Pages is the most convenient option.\nhttps://typecho.org\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://www.ihewro.com/archives/489\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://gohugo.io/installation\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://www.runoob.com/markdown/md-tutorial.html\u0026#160;\u0026#x21a9;\u0026#xfe0e;\nhttps://gohugo.io/hosting-and-deployment\u0026#160;\u0026#x21a9;\u0026#xfe0e;\n","date":"2022-02-11","externalUrl":null,"permalink":"/en/posts/2022/02/how-i-built-my-personal-blog/","section":"All Posts","summary":"","title":"How I Built My Personal Blog?","type":"posts"},{"content":"","date":"2022-02-11","externalUrl":null,"permalink":"/en/tags/hugo/","section":"Tags","summary":"","title":"Hugo","type":"tags"},{"content":"👋 Hey there, I\u0026rsquo;m Aaron.\n","externalUrl":null,"permalink":"/en/about/","section":"Hideaway","summary":"","title":"About Me","type":"page"},{"content":" If any content on this website inadvertently infringes your copyright or interests, please contact me and I will take appropriate action. Unless otherwise stated, all content on this site is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.\nBelow is a human-readable summary of the license (not a substitute).\nYou are free to: # Share — copy and redistribute the material in any medium or format Adapt — remix, transform, and build upon the material As long as you follow the license terms, the licensor cannot revoke these freedoms.\nUnder the following terms: # Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. NonCommercial — You may not use the material for commercial purposes. 
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original. No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits. Notices: # You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation. No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material. Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License\n","externalUrl":null,"permalink":"/en/copyright/","section":"Hideaway","summary":"","title":"Copyright Notice","type":"page"},{"content":" Introduction # This privacy policy is intended to help you understand our practices regarding any information we may collect from you or that you provide to us, how we use it, and how we handle it.\nSince we do not collect any personal data, our practices are straightforward, and we are committed to protecting your privacy.\nInformation Processing # This site is a non-commercial personal blog. We are committed to protecting user privacy. Accordingly, we confirm that when you visit our website, we do not collect, store, or process any of your personal data.\nPersonal data refers to any information that could potentially identify you as an individual. Since we do not collect such information, we have no possibility to use, share, or sell it.\nSecurity Statement # While we do not collect personal data, we take your trust seriously when it comes to any non-personal data you may provide. We make every effort to use acceptable methods to protect this data and maintain the physical and software security of the services we use. 
However, no method of Internet transmission or electronic storage is 100% secure and reliable, and we cannot guarantee absolute security.\nPolicy Changes # We may update our privacy policy from time to time. Therefore, we recommend that you review this page periodically for any changes. We will notify you of any changes by posting the new privacy policy on this page. These changes are effective immediately after being posted on this page.\n","externalUrl":null,"permalink":"/en/privacy/","section":"Hideaway","summary":"","title":"Privacy Policy","type":"page"},{"content":"","externalUrl":null,"permalink":"/en/series/","section":"Series","summary":"","title":"Series","type":"series"}]