<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Database on Mini Fish</title>
    <link>https://blog.minifish.org/tags/database/</link>
    <description>Recent content in Database on Mini Fish</description>
    <image>
      <title>Mini Fish</title>
      <url>https://blog.minifish.org/android-chrome-512x512.png</url>
      <link>https://blog.minifish.org/android-chrome-512x512.png</link>
    </image>
    <generator>Hugo -- 0.154.5</generator>
    <language>en-US</language>
    <copyright>Mini Fish 2014-present. Licensed under CC-BY-NC</copyright>
    <lastBuildDate>Sat, 17 Jan 2026 10:00:00 +0800</lastBuildDate>
    <atom:link href="https://blog.minifish.org/tags/database/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>OceanBase Internals: Transactions, Replay, SQL Engine, and Unit Placement</title>
      <link>https://blog.minifish.org/posts/oceanbase-internals-transaction-replay-sql-unit-placement/</link>
      <pubDate>Sat, 17 Jan 2026 10:00:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/oceanbase-internals-transaction-replay-sql-unit-placement/</guid>
      <description>&lt;h2 id=&#34;why-these-paths-matter&#34;&gt;Why These Paths Matter&lt;/h2&gt;
&lt;p&gt;OceanBase targets high availability and scalability in a shared-nothing cluster. The core engineering challenge is to make four critical subsystems work together with predictable latency and correctness:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Write transactions must be durable, replicated, and efficiently committed.&lt;/li&gt;
&lt;li&gt;Tablet replay must recover state quickly and safely.&lt;/li&gt;
&lt;li&gt;The SQL path from parsing to execution must optimize well while respecting multi-tenant constraints.&lt;/li&gt;
&lt;li&gt;Unit placement must map tenants to physical resources without fragmentation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This article focuses on motivation, design, implementation highlights, and tradeoffs, using concrete code entry points from the OceanBase codebase.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="why-these-paths-matter">Why These Paths Matter</h2>
<p>OceanBase targets high availability and scalability in a shared-nothing cluster. The core engineering challenge is to make four critical subsystems work together with predictable latency and correctness:</p>
<ul>
<li>Write transactions must be durable, replicated, and efficiently committed.</li>
<li>Tablet replay must recover state quickly and safely.</li>
<li>The SQL path from parsing to execution must optimize well while respecting multi-tenant constraints.</li>
<li>Unit placement must map tenants to physical resources without fragmentation.</li>
</ul>
<p>This article focuses on motivation, design, implementation highlights, and tradeoffs, using concrete code entry points from the OceanBase codebase.</p>
<h2 id="system-architecture">System Architecture</h2>
<p>OceanBase adopts a shared-nothing architecture in which all nodes are peers, each running its own SQL engine, storage engine, and transaction engine. Understanding the overall architecture is essential before diving into implementation details.</p>
<h3 id="cluster-zone-and-node-organization">Cluster, Zone, and Node Organization</h3>
<pre tabindex="0"><code class="language-mermaid" data-lang="mermaid">graph TB
    subgraph Cluster[&#34;OceanBase Cluster&#34;]
        subgraph Z1[&#34;Zone 1&#34;]
            N1[&#34;OBServer Node 1&#34;]
        end
        subgraph Z2[&#34;Zone 2&#34;]
            N2[&#34;OBServer Node 2&#34;]
        end
        subgraph Z3[&#34;Zone 3&#34;]
            N3[&#34;OBServer Node 3&#34;]
        end
    end
    
    subgraph Proxy[&#34;obproxy Layer&#34;]
        P1[&#34;obproxy 1&#34;]
        P2[&#34;obproxy 2&#34;]
    end
    
    Client[&#34;Client Applications&#34;] --&gt; Proxy
    Proxy --&gt; N1
    Proxy --&gt; N2
    Proxy --&gt; N3
</code></pre><p><strong>Key Concepts:</strong></p>
<ul>
<li><strong>Cluster</strong>: A collection of nodes working together</li>
<li><strong>Zone</strong>: A logical availability zone, used for high availability and disaster recovery</li>
<li><strong>OBServer</strong>: Service process on each node handling SQL, storage, and transactions</li>
<li><strong>obproxy</strong>: Stateless proxy layer routing SQL requests to appropriate OBServer nodes</li>
</ul>
<h3 id="data-organization-partition-tablet-and-log-stream">Data Organization: Partition, Tablet, and Log Stream</h3>
<pre tabindex="0"><code class="language-mermaid" data-lang="mermaid">graph TB
    subgraph Table[&#34;Table&#34;]
        P1[&#34;Partition 1&#34;]
        P2[&#34;Partition 2&#34;]
        P3[&#34;Partition 3&#34;]
    end
    
    subgraph LS1[&#34;Log Stream 1&#34;]
        T1[&#34;Tablet 1&#34;]
        T2[&#34;Tablet 2&#34;]
    end
    
    subgraph LS2[&#34;Log Stream 2&#34;]
        T3[&#34;Tablet 3&#34;]
    end
    
    subgraph LS3[&#34;Log Stream 3&#34;]
        T4[&#34;Tablet 4&#34;]
    end
    
    P1 --&gt; T1
    P1 --&gt; T2
    P2 --&gt; T3
    P3 --&gt; T4
    
    T1 --&gt; LS1
    T2 --&gt; LS1
    T3 --&gt; LS2
    T4 --&gt; LS3
</code></pre><p><strong>Key Concepts:</strong></p>
<ul>
<li><strong>Partition</strong>: Logical shard of a table (hash, range, list partitioning)</li>
<li><strong>Tablet</strong>: Physical storage object storing ordered data records for a partition</li>
<li><strong>Log Stream (LS)</strong>: Replication unit using Multi-Paxos for data consistency</li>
<li><strong>Replication</strong>: Each tablet has multiple replicas across zones, with one leader accepting writes. Log streams replicate changes across zones using the Multi-Paxos protocol.</li>
</ul>
<h3 id="multi-tenant-resource-model">Multi-Tenant Resource Model</h3>
<pre tabindex="0"><code class="language-mermaid" data-lang="mermaid">graph TB
    subgraph Tenant[&#34;Tenant&#34;]
        T1[&#34;Tenant 1 MySQL Mode&#34;]
        T2[&#34;Tenant 2 Oracle Mode&#34;]
        T3[&#34;System Tenant&#34;]
    end
    
    subgraph Pool[&#34;Resource Pool&#34;]
        RP1[&#34;Pool 1&#34;]
        RP2[&#34;Pool 2&#34;]
        RP3[&#34;Pool 3&#34;]
    end
    
    subgraph Unit[&#34;Resource Unit&#34;]
        U1[&#34;Unit 1 CPU Memory Disk&#34;]
        U2[&#34;Unit 2 CPU Memory Disk&#34;]
        U3[&#34;Unit 3 CPU Memory Disk&#34;]
    end
    
    subgraph Server[&#34;Physical Server&#34;]
        S1[&#34;Server 1&#34;]
        S2[&#34;Server 2&#34;]
        S3[&#34;Server 3&#34;]
    end
    
    T1 --&gt; RP1
    T2 --&gt; RP2
    T3 --&gt; RP3
    
    RP1 --&gt; U1
    RP2 --&gt; U2
    RP3 --&gt; U3
    
    U1 --&gt; S1
    U2 --&gt; S2
    U3 --&gt; S3
</code></pre><p><strong>Key Concepts:</strong></p>
<ul>
<li><strong>Tenant</strong>: Isolated database instance (MySQL or Oracle compatibility)</li>
<li><strong>Resource Pool</strong>: Groups resource units for a tenant across zones</li>
<li><strong>Resource Unit</strong>: Virtual container with CPU, memory, and disk resources</li>
<li><strong>Unit Placement</strong>: Rootserver schedules units to physical servers based on resource constraints</li>
</ul>
<h3 id="layered-architecture">Layered Architecture</h3>
<pre tabindex="0"><code class="language-mermaid" data-lang="mermaid">graph TB
    subgraph App[&#34;Application Layer&#34;]
        Client[&#34;Client Applications&#34;]
    end
    
    subgraph Proxy[&#34;Proxy Layer&#34;]
        ODP[&#34;obproxy Router&#34;]
    end
    
    subgraph OBServer[&#34;OBServer Layer&#34;]
        subgraph Node[&#34;OBServer Node&#34;]
            SQL[&#34;SQL Engine&#34;]
            TX[&#34;Transaction Engine&#34;]
            ST[&#34;Storage Engine&#34;]
        end
    end
    
    subgraph Storage[&#34;Storage Layer&#34;]
        subgraph LS[&#34;Log Stream&#34;]
            Tablet[&#34;Tablet&#34;]
            Memtable[&#34;Memtable&#34;]
            SSTable[&#34;SSTable&#34;]
        end
        Palf[&#34;Paxos Log Service&#34;]
    end
    
    subgraph Resource[&#34;Resource Layer&#34;]
        Tenant[&#34;Tenant&#34;]
        Unit[&#34;Resource Unit&#34;]
        Pool[&#34;Resource Pool&#34;]
        RS[&#34;Rootserver&#34;]
    end
    
    Client --&gt; ODP
    ODP --&gt; SQL
    SQL --&gt; TX
    TX --&gt; ST
    ST --&gt; LS
    LS --&gt; Palf
    Tenant --&gt; Unit
    Unit --&gt; Pool
    Pool --&gt; RS
    RS --&gt; Node
</code></pre><p><strong>Key Concepts:</strong></p>
<ul>
<li><strong>SQL Engine</strong>: Parses, optimizes, and executes SQL statements</li>
<li><strong>Transaction Engine</strong>: Manages transaction lifecycle, commit protocols, and consistency</li>
<li><strong>Storage Engine</strong>: Handles data organization, memtables, and SSTables</li>
<li><strong>Log Service</strong>: Provides Paxos-based replication and durability</li>
<li><strong>Rootserver</strong>: Manages cluster metadata, resource scheduling, and placement</li>
</ul>
<h2 id="design-overview">Design Overview</h2>
<p>At a high level, each node runs a full SQL engine, storage engine, and transaction engine. Data is organized into tablets, which belong to log streams. Log streams replicate changes through a Paxos-based log service. Tenants slice resources via unit configurations and pools, while rootserver components place those units on servers.</p>
<p>The following sections walk through each path with the relevant implementation anchors.</p>
<h2 id="architecture-diagrams">Architecture Diagrams</h2>
<h3 id="transaction-write-path">Transaction Write Path</h3>
<pre tabindex="0"><code class="language-mermaid" data-lang="mermaid">flowchart LR
  subgraph A[&#34;Transaction Write Path&#34;]
    C[&#34;Client&#34;] --&gt; S[&#34;SQL Engine&#34;]
    S --&gt; T[&#34;Transaction Context&#34;]
    T --&gt; M[&#34;Memtable Write&#34;]
    M --&gt; R[&#34;Redo Buffer&#34;]
    R --&gt; L[&#34;Log Service&#34;]
    L --&gt; P[&#34;Replicated Log&#34;]
    P --&gt; K[&#34;Commit Result&#34;]
  end
</code></pre><h3 id="tablet-replay-path">Tablet Replay Path</h3>
<pre tabindex="0"><code class="language-mermaid" data-lang="mermaid">flowchart LR
  subgraph B[&#34;Tablet Replay Path&#34;]
    L[&#34;Log Service&#34;] --&gt; RS[&#34;Replay Service&#34;]
    RS --&gt; E[&#34;Tablet Replay Executor&#34;]
    E --&gt; LS[&#34;Log Stream&#34;]
    LS --&gt; TB[&#34;Tablet&#34;]
    TB --&gt; CK[&#34;Replay Checks&#34;]
    CK --&gt; AP[&#34;Apply Operation&#34;]
    AP --&gt; ST[&#34;Updated Tablet State&#34;]
  end
</code></pre><h3 id="sql-compile-and-execute">SQL Compile and Execute</h3>
<pre tabindex="0"><code class="language-mermaid" data-lang="mermaid">flowchart LR
  subgraph C[&#34;SQL Compile and Execute&#34;]
    Q[&#34;SQL Text&#34;] --&gt; P[&#34;Parser&#34;]
    P --&gt; R[&#34;Resolver&#34;]
    R --&gt; W[&#34;Rewriter&#34;]
    W --&gt; O[&#34;Optimizer&#34;]
    O --&gt; LP[&#34;Logical Plan&#34;]
    LP --&gt; CG[&#34;Code Generator&#34;]
    CG --&gt; PP[&#34;Physical Plan&#34;]
    PP --&gt; EX[&#34;Executor&#34;]
  end
</code></pre><h3 id="unit-placement">Unit Placement</h3>
<pre tabindex="0"><code class="language-mermaid" data-lang="mermaid">flowchart LR
  subgraph D[&#34;Unit Placement&#34;]
    UC[&#34;Unit Config&#34;] --&gt; RP[&#34;Resource Pool&#34;]
    RP --&gt; PS[&#34;Placement Strategy&#34;]
    PS --&gt; CS[&#34;Candidate Servers&#34;]
    CS --&gt; CH[&#34;Chosen Server&#34;]
    CH --&gt; UN[&#34;Unit Instance&#34;]
  end
</code></pre><h2 id="write-transaction-from-memtable-to-replicated-log">Write Transaction: From Memtable to Replicated Log</h2>
<h3 id="motivation">Motivation</h3>
<p>A write transaction must be both fast and durable. OceanBase applies writes to in-memory memtables and replicates redo through a log stream. The design must keep commit latency low while supporting parallel redo submission and multi-participant transactions, which require two-phase commit (2PC).</p>
<h3 id="design-sketch">Design Sketch</h3>
<ul>
<li>Each transaction is represented by a per-LS context (<code>ObPartTransCtx</code>).</li>
<li>Redo is flushed based on pressure or explicit triggers.</li>
<li>Commit chooses one-phase or two-phase based on participants.</li>
<li>Logs are submitted via a log adapter backed by logservice.</li>
</ul>
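<p>The two decisions in this list (commit protocol and redo mode) can be sketched in a few lines. This is a minimal illustration, not OceanBase code: <code>TxState</code>, <code>choose_commit_protocol</code>, <code>choose_redo_mode</code>, and the byte threshold are hypothetical names and values invented for the example, and the real criteria may differ.</p>

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold: the article only says the switch is threshold-based.
PARALLEL_REDO_THRESHOLD_BYTES = 1 << 20


@dataclass
class TxState:
    """Toy stand-in for a transaction's commit-relevant state."""
    participants: List[str]      # log streams touched by the transaction
    pending_redo_bytes: int      # redo buffered but not yet submitted


def choose_commit_protocol(tx: TxState) -> str:
    """One participant can commit in one phase; more need two-phase commit."""
    return "1PC" if len(tx.participants) <= 1 else "2PC"


def choose_redo_mode(tx: TxState,
                     threshold: int = PARALLEL_REDO_THRESHOLD_BYTES) -> str:
    """Small transactions log serially; large ones switch to parallel logging."""
    return "parallel" if tx.pending_redo_bytes >= threshold else "serial"


small = TxState(participants=["ls_1"], pending_redo_bytes=4096)
large = TxState(participants=["ls_1", "ls_2"], pending_redo_bytes=8 << 20)
print(choose_commit_protocol(small), choose_redo_mode(small))
print(choose_commit_protocol(large), choose_redo_mode(large))
```

<p>In this sketch a single-participant transaction with little buffered redo takes the cheap path, while a large distributed transaction pays for coordination in both dimensions.</p>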
<h3 id="implementation-highlights">Implementation Highlights</h3>
<ul>
<li>Transaction context lifecycle and commit logic are in <code>src/storage/tx/ob_trans_part_ctx.cpp</code>.</li>
<li>Redo submission is driven by <code>submit_redo_after_write</code>, which switches between serial and parallel logging based on thresholds.</li>
<li>Commit decides between one-phase and two-phase commit depending on participant count.</li>
<li>The log writer (<code>ObTxLSLogWriter</code>) submits serialized logs via <code>ObITxLogAdapter</code>, which is wired to logservice (<code>ObLogHandler</code>).</li>
</ul>
<h3 id="tradeoffs">Tradeoffs</h3>
<ul>
<li><strong>Serial vs parallel redo</strong>: Serial logging is simpler and cheaper for small transactions, but parallel logging reduces tail latency for large transactions at the cost of more coordination.</li>
<li><strong>1PC vs 2PC</strong>: 1PC is fast for single-participant transactions; 2PC is required for distributed consistency but increases coordination overhead.</li>
<li><strong>In-memory batching vs durability</strong>: Larger batching improves throughput but can delay durability and increase replay time.</li>
</ul>
<h2 id="tablet-replay-reconstructing-state-safely">Tablet Replay: Reconstructing State Safely</h2>
<h3 id="motivation-1">Motivation</h3>
<p>Recovery needs to be deterministic and safe: the system must replay logs to reconstruct tablet state without violating invariants or applying obsolete data.</p>
<h3 id="design-sketch-1">Design Sketch</h3>
<ul>
<li>Logservice schedules replay tasks per log stream.</li>
<li>Tablet replay executor fetches the LS, locates the tablet, validates replay conditions, and applies the log.</li>
<li>Specialized replay executors handle different log types (e.g., schema updates, split operations).</li>
</ul>
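<p>The locate, validate, and apply steps can be illustrated with a toy replay loop. This is a sketch under stated assumptions, not the OceanBase executor: <code>Tablet</code>, <code>LogEntry</code>, and <code>replay</code> are hypothetical, and the only replay condition modeled is "skip entries at or below the tablet's applied position", which is one plausible way to avoid applying obsolete data.</p>

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Tablet:
    tablet_id: int
    applied_scn: int = 0                      # highest log position already applied
    rows: Dict[str, str] = field(default_factory=dict)


@dataclass
class LogEntry:
    scn: int                                  # position of the entry in the log
    tablet_id: int
    key: str
    value: str


def replay(tablets: Dict[int, Tablet],
           entries: List[LogEntry]) -> List[Tuple[int, str]]:
    """Apply entries in log order; return (scn, reason) for every skipped entry."""
    skipped = []
    for entry in sorted(entries, key=lambda e: e.scn):
        tablet = tablets.get(entry.tablet_id)          # locate the tablet
        if tablet is None:
            skipped.append((entry.scn, "tablet missing"))
            continue
        if entry.scn <= tablet.applied_scn:            # validate: already reflected
            skipped.append((entry.scn, "already applied"))
            continue
        tablet.rows[entry.key] = entry.value           # apply the operation
        tablet.applied_scn = entry.scn
    return skipped


tablets = {1: Tablet(tablet_id=1, applied_scn=10)}
log = [
    LogEntry(scn=5, tablet_id=1, key="a", value="old"),
    LogEntry(scn=12, tablet_id=1, key="a", value="new"),
    LogEntry(scn=11, tablet_id=2, key="b", value="x"),
]
skipped = replay(tablets, log)
print(tablets[1].rows, skipped)
```

<p>Because every applied entry advances <code>applied_scn</code>, running the same log through <code>replay</code> a second time changes nothing, which is the property that makes replay safe to retry.</p>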
<h3 id="implementation-highlights-1">Implementation Highlights</h3>
<ul>
<li>Replay orchestration lives in <code>src/logservice/replayservice/ob_log_replay_service.cpp</code>.</li>
<li>Tablet replay logic is in <code>src/logservice/replayservice/ob_tablet_replay_executor.cpp</code>.</li>
<li>Specific tablet operations are applied in dedicated executors, such as <code>ObTabletServiceClogReplayExecutor</code> in <code>src/storage/tablet/ob_tablet_service_clog_replay_executor.cpp</code>.</li>
</ul>
<h3 id="tradeoffs-1">Tradeoffs</h3>
<ul>
<li><strong>Strictness vs throughput</strong>: Replay barriers enforce ordering for correctness but can reduce parallelism.</li>
<li><strong>Tablet existence checks</strong>: Allowing missing tablets can speed recovery but requires careful validation to avoid partial state.</li>
<li><strong>MDS synchronization</strong>: Metadata state updates improve correctness but add contention via locks.</li>
</ul>
<h2 id="sql-parse-to-execute-compile-pipeline-for-performance">SQL Parse to Execute: Compile Pipeline for Performance</h2>
<h3 id="motivation-2">Motivation</h3>
<p>OceanBase supports MySQL and Oracle compatibility with rich SQL features. The compile pipeline must be fast, cache-friendly, and yield efficient execution plans.</p>
<h3 id="design-sketch-2">Design Sketch</h3>
<ul>
<li>SQL text enters the engine via <code>ObSql::stmt_query</code>.</li>
<li>Parsing produces a parse tree.</li>
<li>Resolution turns the parse tree into a typed statement tree.</li>
<li>Rewrite and optimization generate a logical plan.</li>
<li>Code generation produces a physical plan and execution context.</li>
</ul>
<h3 id="implementation-highlights-2">Implementation Highlights</h3>
<ul>
<li>Entry and query handling: <code>src/sql/ob_sql.cpp</code> (<code>stmt_query</code>, <code>handle_text_query</code>).</li>
<li>Resolver: <code>ObResolver</code> in <code>src/sql/resolver/ob_resolver.h</code>.</li>
<li>Transform and optimize: <code>ObSql::transform_stmt</code> and <code>ObSql::optimize_stmt</code> in <code>src/sql/ob_sql.cpp</code>.</li>
<li>Code generation: <code>ObSql::code_generate</code> in <code>src/sql/ob_sql.cpp</code>.</li>
</ul>
<h3 id="tradeoffs-2">Tradeoffs</h3>
<ul>
<li><strong>Plan cache vs compile accuracy</strong>: Plan caching reduces latency but may reuse suboptimal plans under changing data distributions.</li>
<li><strong>Rewrite aggressiveness</strong>: More transformations can yield better plans but increase compile cost.</li>
<li><strong>JIT and rich formats</strong>: These speed up execution for some workloads but add complexity and memory pressure.</li>
</ul>
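<p>To make the plan-cache tradeoff concrete, here is a toy pipeline that runs the compile stages in order and caches the result by normalized SQL text. It is a sketch only: <code>CompilePipeline</code> and its methods are hypothetical, the stages just wrap their input instead of doing real work, and whitespace normalization is a simplifying assumption about how a cache key might be built.</p>

```python
STAGES = ["parse", "resolve", "rewrite", "optimize", "code_generate"]


class CompilePipeline:
    """Toy compile pipeline with a plan cache keyed by normalized SQL text."""

    def __init__(self):
        self.plan_cache = {}
        self.compiles = 0        # counts full compilations, i.e. cache misses

    def compile(self, sql):
        artifact = sql
        trace = []
        for stage in STAGES:
            artifact = (stage, artifact)   # each stage wraps the previous output
            trace.append(stage)
        self.compiles += 1
        return {"plan": artifact, "trace": trace}

    def get_plan(self, sql):
        key = " ".join(sql.split())        # collapse whitespace differences
        if key not in self.plan_cache:
            self.plan_cache[key] = self.compile(sql)
        return self.plan_cache[key]


pipeline = CompilePipeline()
first = pipeline.get_plan("SELECT * FROM t WHERE id = 1")
second = pipeline.get_plan("SELECT *   FROM t  WHERE id = 1")
print(pipeline.compiles, first is second)
```

<p>The second call skips all five stages, which is the latency win. It also shows the risk: the cached plan is reused as-is even if the data distribution has changed since it was compiled.</p>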
<h2 id="unit-placement-scheduling-tenant-resources">Unit Placement: Scheduling Tenant Resources</h2>
<h3 id="motivation-3">Motivation</h3>
<p>Multi-tenancy requires predictable isolation and efficient resource utilization. Unit placement must respect CPU, memory, and disk constraints while minimizing fragmentation.</p>
<h3 id="design-sketch-3">Design Sketch</h3>
<ul>
<li>Unit config defines resource demands.</li>
<li>Resource pool groups units by tenant and zone.</li>
<li>Placement strategy scores candidate servers to pick a host for each unit.</li>
</ul>
<h3 id="implementation-highlights-3">Implementation Highlights</h3>
<ul>
<li>Resource types and pools: <code>src/share/unit/ob_unit_config.h</code>, <code>src/share/unit/ob_resource_pool.h</code>, <code>src/share/unit/ob_unit_info.h</code>.</li>
<li>Placement policy: <code>src/rootserver/ob_unit_placement_strategy.cpp</code> uses a weighted dot-product of remaining resources to choose a server.</li>
<li>Orchestration: <code>src/rootserver/ob_unit_manager.cpp</code> handles creation, alteration, and migration of units and pools.</li>
</ul>
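<p>A dot-product placement policy can be sketched as follows. The formula here is an assumption for illustration, not the one in <code>ob_unit_placement_strategy.cpp</code>: demand and remaining capacity are both normalized by server capacity, the per-resource products are weighted and summed, and the highest score wins. <code>Server</code>, <code>fits</code>, <code>score</code>, and <code>choose_server</code> are hypothetical names.</p>

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

RESOURCES = ("cpu", "memory", "disk")


@dataclass
class Server:
    name: str
    capacity: Dict[str, float]
    assigned: Dict[str, float]

    def remaining(self) -> Dict[str, float]:
        return {r: self.capacity[r] - self.assigned[r] for r in RESOURCES}


def fits(server: Server, demand: Dict[str, float]) -> bool:
    """A server is a candidate only if every resource demand fits."""
    rem = server.remaining()
    return all(demand[r] <= rem[r] for r in RESOURCES)


def score(server: Server, demand: Dict[str, float],
          weights: Dict[str, float]) -> float:
    """Weighted dot product of normalized demand and normalized remaining capacity."""
    rem = server.remaining()
    return sum(
        weights[r] * (demand[r] / server.capacity[r]) * (rem[r] / server.capacity[r])
        for r in RESOURCES
    )


def choose_server(servers: List[Server], demand: Dict[str, float],
                  weights: Dict[str, float]) -> Optional[Server]:
    """Greedy choice: best-scoring candidate, or None if nothing fits."""
    candidates = [s for s in servers if fits(s, demand)]
    if not candidates:
        return None
    return max(candidates, key=lambda s: score(s, demand, weights))


capacity = {"cpu": 16, "memory": 64, "disk": 1000}
servers = [
    Server("A", dict(capacity), {"cpu": 12, "memory": 16, "disk": 100}),
    Server("B", dict(capacity), {"cpu": 4, "memory": 48, "disk": 100}),
]
weights = {"cpu": 1.0, "memory": 1.0, "disk": 1.0}
cpu_heavy_unit = {"cpu": 4, "memory": 8, "disk": 50}
chosen = choose_server(servers, cpu_heavy_unit, weights)
print(chosen.name if chosen else None)
```

<p>With these numbers the CPU-heavy unit lands on the server with more free CPU, since the dot product rewards servers whose free capacity has the same shape as the demand. The choice is greedy, so a sequence of placements can still end up worse than a globally planned one.</p>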
<h3 id="tradeoffs-3">Tradeoffs</h3>
<ul>
<li><strong>Greedy placement vs global optimality</strong>: Dot-product scoring is efficient and practical but may not be globally optimal.</li>
<li><strong>Capacity normalization</strong>: Assuming comparable server capacities simplifies scoring but may bias placement in heterogeneous clusters.</li>
<li><strong>Latency vs stability</strong>: Fast placement decisions can lead to more churn; conservative placement improves stability but can reduce utilization.</li>
</ul>
<h2 id="closing-thoughts">Closing Thoughts</h2>
<p>These four paths demonstrate how OceanBase balances correctness, performance, and operability. The code structure follows clear separation of responsibilities: transaction logic is in <code>storage/tx</code>, replication and replay are in <code>logservice</code>, SQL compilation is in <code>sql</code>, and scheduling is in <code>rootserver</code> and <code>share/unit</code>. The tradeoffs are explicit and largely encoded in thresholds and policies, which makes tuning feasible without invasive rewrites.</p>
<p>If you are extending OceanBase, start with the entry points highlighted above and follow the call chains into the relevant subsystem. It is the fastest way to build a mental model grounded in the actual implementation.</p>
]]></content:encoded>
    </item>
    <item>
      <title>How to Compare Data Consistency between MySQL and PostgreSQL</title>
      <link>https://blog.minifish.org/posts/how-to-compare-data-consistency-between-mysql-and-postgresql/</link>
      <pubDate>Sun, 09 May 2021 18:13:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-compare-data-consistency-between-mysql-and-postgresql/</guid>
      <description>&lt;h2 id=&#34;background&#34;&gt;Background&lt;/h2&gt;
&lt;p&gt;Recently, I encountered a problem where a user wanted to synchronize data from PostgreSQL to TiDB (which uses the same protocol as MySQL) and wanted to know whether the data after synchronization is consistent. I hadn&amp;rsquo;t dealt with this kind of issue before, so I did a bit of research.&lt;/p&gt;
&lt;p&gt;Typically, to verify data consistency, you compute a checksum on both sides and compare them.&lt;/p&gt;
&lt;h2 id=&#34;tidb-mysql-side&#34;&gt;TiDB (MySQL) Side&lt;/h2&gt;
&lt;p&gt;For the verification of a specific table, the following SQL is used:&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="background">Background</h2>
<p>Recently, I encountered a problem where a user wanted to synchronize data from PostgreSQL to TiDB (which uses the same protocol as MySQL) and wanted to know whether the data after synchronization is consistent. I hadn&rsquo;t dealt with this kind of issue before, so I did a bit of research.</p>
<p>Typically, to verify data consistency, you compute a checksum on both sides and compare them.</p>
<h2 id="tidb-mysql-side">TiDB (MySQL) Side</h2>
<p>For the verification of a specific table, the following SQL is used:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-SQL" data-lang="SQL"><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> bit_xor(
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">CAST</span>(crc32(
</span></span><span style="display:flex;"><span>        concat_ws(<span style="color:#e6db74">&#39;,&#39;</span>,
</span></span><span style="display:flex;"><span>            col1, col2, col3, <span style="color:#960050;background-color:#1e0010">…</span>, colN,
</span></span><span style="display:flex;"><span>            concat(<span style="color:#66d9ef">isnull</span>(col1), <span style="color:#66d9ef">isnull</span>(col2), <span style="color:#960050;background-color:#1e0010">…</span>, <span style="color:#66d9ef">isnull</span>(colN))
</span></span><span style="display:flex;"><span>        )
</span></span><span style="display:flex;"><span>    ) <span style="color:#66d9ef">AS</span> UNSIGNED)
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">FROM</span> t;
</span></span></code></pre></div><p>Let&rsquo;s look at a specific example:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-SQL" data-lang="SQL"><span style="display:flex;"><span><span style="color:#66d9ef">DROP</span> <span style="color:#66d9ef">TABLE</span> <span style="color:#66d9ef">IF</span> <span style="color:#66d9ef">EXISTS</span> t;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> t (i INT, j INT);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span> (<span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>), (<span style="color:#66d9ef">NULL</span>, <span style="color:#66d9ef">NULL</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> bit_xor(
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">CAST</span>(crc32(
</span></span><span style="display:flex;"><span>        concat_ws(<span style="color:#e6db74">&#39;,&#39;</span>,
</span></span><span style="display:flex;"><span>            i, j,
</span></span><span style="display:flex;"><span>            concat(<span style="color:#66d9ef">isnull</span>(i), <span style="color:#66d9ef">isnull</span>(j))
</span></span><span style="display:flex;"><span>        )
</span></span><span style="display:flex;"><span>    ) <span style="color:#66d9ef">AS</span> UNSIGNED)
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">FROM</span> t;
</span></span></code></pre></div><p>The result is:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>+-------------------------------------------------------------------------------------------------------------------------------------------+
</span></span><span style="display:flex;"><span>| bit_xor(
</span></span><span style="display:flex;"><span>    CAST(crc32(
</span></span><span style="display:flex;"><span>        concat_ws(&#39;,&#39;,
</span></span><span style="display:flex;"><span>            i, j,
</span></span><span style="display:flex;"><span>            concat(isnull(i), isnull(j))
</span></span><span style="display:flex;"><span>        )
</span></span><span style="display:flex;"><span>    ) AS UNSIGNED)
</span></span><span style="display:flex;"><span>) |
</span></span><span style="display:flex;"><span>+-------------------------------------------------------------------------------------------------------------------------------------------+
</span></span><span style="display:flex;"><span>|                                                           5062371 |
</span></span><span style="display:flex;"><span>+-------------------------------------------------------------------------------------------------------------------------------------------+
</span></span><span style="display:flex;"><span>1 row in set (0.00 sec)
</span></span></code></pre></div><h2 id="postgresql-side">PostgreSQL Side</h2>
<p>The goal is simply to write the same SQL as above, but PostgreSQL provides no <code>crc32</code> or <code>isnull</code> function and has no unsigned integer types, and versions before 14 lack a built-in <code>bit_xor</code> aggregate. The straightforward solution is to fill the gaps with UDFs (User-Defined Functions).</p>
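<p>When porting, it helps to have a database-independent reference for what the query computes. The Python sketch below mirrors it under a few assumptions: values render as the same decimal text on both sides, strings are hashed as UTF-8 bytes, and <code>concat_ws</code> skips <code>NULL</code> arguments. The helper names (<code>crc32_bitwise</code>, <code>row_text</code>, <code>table_checksum</code>) are made up for this example.</p>

```python
import zlib
from functools import reduce

CRC32_POLY = 0xEDB88320  # 3988292384, the reflected CRC-32 polynomial


def crc32_bitwise(data: bytes) -> int:
    """Bit-by-bit CRC-32: XOR in each byte, then shift eight times."""
    if not data:
        return 0
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (CRC32_POLY * (crc & 1))
    return crc ^ 0xFFFFFFFF


def row_text(row) -> str:
    """Mimic concat_ws(',', col1..colN, concat(isnull(col1)..isnull(colN)))."""
    flags = "".join("1" if v is None else "0" for v in row)
    values = [str(v) for v in row if v is not None]   # NULL columns are skipped
    return ",".join(values + [flags])


def table_checksum(rows) -> int:
    """XOR of per-row CRC-32 values, so row order does not matter."""
    return reduce(
        lambda acc, row: acc ^ crc32_bitwise(row_text(row).encode("utf-8")),
        rows,
        0,
    )


rows = [(2, 3), (None, None)]
# Sanity check against the standard library's CRC-32.
assert crc32_bitwise(b"2,3,00") == zlib.crc32(b"2,3,00")
print([row_text(r) for r in rows], table_checksum(rows))
```

<p>The trailing flag string is what keeps a <code>NULL</code> column distinguishable from an absent one: without it, rows such as <code>(2, NULL)</code> and <code>(NULL, 2)</code> would both hash the text <code>2</code>.</p>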
<p>After some research, the main missing functions can be addressed with a few custom implementations.</p>
<p><code>bit_xor</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-SQL" data-lang="SQL"><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">OR</span> <span style="color:#66d9ef">REPLACE</span> <span style="color:#66d9ef">AGGREGATE</span> bit_xor(<span style="color:#66d9ef">IN</span> v bigint) (SFUNC <span style="color:#f92672">=</span> int8xor, <span style="color:#66d9ef">STYPE</span> <span style="color:#f92672">=</span> bigint);
</span></span></code></pre></div><p><code>crc32</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-SQL" data-lang="SQL"><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">OR</span> <span style="color:#66d9ef">REPLACE</span> <span style="color:#66d9ef">FUNCTION</span> crc32(text_string text) <span style="color:#66d9ef">RETURNS</span> bigint <span style="color:#66d9ef">AS</span> <span style="color:#960050;background-color:#1e0010">$$</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">DECLARE</span>
</span></span><span style="display:flex;"><span>    tmp bigint;
</span></span><span style="display:flex;"><span>    i int;
</span></span><span style="display:flex;"><span>    j int;
</span></span><span style="display:flex;"><span>    byte_length int;
</span></span><span style="display:flex;"><span>    binary_string bytea;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">BEGIN</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">IF</span> text_string <span style="color:#f92672">=</span> <span style="color:#e6db74">&#39;&#39;</span> <span style="color:#66d9ef">THEN</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">RETURN</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">END</span> <span style="color:#66d9ef">IF</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    i <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span>    tmp <span style="color:#f92672">=</span> <span style="color:#ae81ff">4294967295</span>;
</span></span><span style="display:flex;"><span>    byte_length <span style="color:#f92672">=</span> <span style="color:#66d9ef">bit_length</span>(text_string) <span style="color:#f92672">/</span> <span style="color:#ae81ff">8</span>;
</span></span><span style="display:flex;"><span>    binary_string <span style="color:#f92672">=</span> decode(<span style="color:#66d9ef">replace</span>(text_string, E<span style="color:#e6db74">&#39;\\&#39;</span>, E<span style="color:#e6db74">&#39;\\\\&#39;</span>), <span style="color:#e6db74">&#39;escape&#39;</span>);
</span></span><span style="display:flex;"><span>    LOOP
</span></span><span style="display:flex;"><span>        tmp <span style="color:#f92672">=</span> (tmp <span style="color:#f92672">#</span> get_byte(binary_string, i))::bigint;
</span></span><span style="display:flex;"><span>        i <span style="color:#f92672">=</span> i <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>;
</span></span><span style="display:flex;"><span>        j <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span>        LOOP
</span></span><span style="display:flex;"><span>            tmp <span style="color:#f92672">=</span> ((tmp <span style="color:#f92672">&gt;&gt;</span> <span style="color:#ae81ff">1</span>) <span style="color:#f92672">#</span> (<span style="color:#ae81ff">3988292384</span> <span style="color:#f92672">*</span> (tmp <span style="color:#f92672">&amp;</span> <span style="color:#ae81ff">1</span>)))::bigint;
</span></span><span style="display:flex;"><span>            j <span style="color:#f92672">=</span> j <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span>;
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">IF</span> j <span style="color:#f92672">&gt;=</span> <span style="color:#ae81ff">8</span> <span style="color:#66d9ef">THEN</span>
</span></span><span style="display:flex;"><span>                EXIT;
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">END</span> <span style="color:#66d9ef">IF</span>;
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">END</span> LOOP;
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">IF</span> i <span style="color:#f92672">&gt;=</span> byte_length <span style="color:#66d9ef">THEN</span>
</span></span><span style="display:flex;"><span>            EXIT;
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">END</span> <span style="color:#66d9ef">IF</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">END</span> LOOP;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">RETURN</span> (tmp <span style="color:#f92672">#</span> <span style="color:#ae81ff">4294967295</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">END</span>
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">$$</span> <span style="color:#66d9ef">IMMUTABLE</span> <span style="color:#66d9ef">LANGUAGE</span> plpgsql;
</span></span></code></pre></div><p><code>isnull</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-SQL" data-lang="SQL"><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">OR</span> <span style="color:#66d9ef">REPLACE</span> <span style="color:#66d9ef">FUNCTION</span> <span style="color:#66d9ef">isnull</span>(anyelement) <span style="color:#66d9ef">RETURNS</span> int <span style="color:#66d9ef">AS</span> <span style="color:#960050;background-color:#1e0010">$$</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">BEGIN</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">RETURN</span> <span style="color:#66d9ef">CAST</span>((<span style="color:#960050;background-color:#1e0010">$</span><span style="color:#ae81ff">1</span> <span style="color:#66d9ef">IS</span> <span style="color:#66d9ef">NULL</span>) <span style="color:#66d9ef">AS</span> INT);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">END</span>
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">$$</span> <span style="color:#66d9ef">LANGUAGE</span> plpgsql;
</span></span></code></pre></div><p>After creating the three UDFs above, let&rsquo;s test the previous example. Note that <code>UNSIGNED</code> must be changed to <code>BIGINT</code>, because PostgreSQL has no unsigned integer types.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-SQL" data-lang="SQL"><span style="display:flex;"><span><span style="color:#66d9ef">DROP</span> <span style="color:#66d9ef">TABLE</span> <span style="color:#66d9ef">IF</span> <span style="color:#66d9ef">EXISTS</span> t;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> t (i INT, j INT);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span> (<span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>), (<span style="color:#66d9ef">NULL</span>, <span style="color:#66d9ef">NULL</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> bit_xor(
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">CAST</span>(crc32(
</span></span><span style="display:flex;"><span>        concat_ws(<span style="color:#e6db74">&#39;,&#39;</span>,
</span></span><span style="display:flex;"><span>            i, j,
</span></span><span style="display:flex;"><span>            concat(<span style="color:#66d9ef">isnull</span>(i), <span style="color:#66d9ef">isnull</span>(j))
</span></span><span style="display:flex;"><span>        )
</span></span><span style="display:flex;"><span>    ) <span style="color:#66d9ef">AS</span> BIGINT)
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">FROM</span> t;
</span></span></code></pre></div><p>The result:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text"><span style="display:flex;"><span> bit_xor
</span></span><span style="display:flex;"><span>---------
</span></span><span style="display:flex;"><span> 5062371
</span></span><span style="display:flex;"><span>(1 row)
</span></span></code></pre></div><p>It&rsquo;s exactly the same as on the TiDB (MySQL) side.</p>
<h2 id="postscript">Postscript</h2>
<ol>
<li>I haven&rsquo;t tested more extensively; this is just a simple test.</li>
<li>UDFs are indeed a great feature that greatly enhances flexibility.</li>
</ol>
]]></content:encoded>
    </item>
    <item>
      <title>How to Read TiDB Source Code (Part 5)</title>
      <link>https://blog.minifish.org/posts/how-to-read-tidb-source-code-part-5/</link>
      <pubDate>Tue, 08 Sep 2020 11:36:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-read-tidb-source-code-part-5/</guid>
      <description>&lt;p&gt;When using TiDB, you may occasionally encounter some exceptions, such as the &amp;ldquo;Lost connection to MySQL server during query&amp;rdquo; error. This indicates that the connection between the client and the database has been disconnected (not due to user action). The reasons for disconnection can vary. This article attempts to analyze some common TiDB errors from the perspective of exception handling and code analysis. Additionally, some exceptions are not errors but performance issues due to slow execution. In the second half of this article, we will also introduce common tools for tracking performance.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>When using TiDB, you may occasionally encounter some exceptions, such as the &ldquo;Lost connection to MySQL server during query&rdquo; error. This indicates that the connection between the client and the database has been disconnected (not due to user action). The reasons for disconnection can vary. This article attempts to analyze some common TiDB errors from the perspective of exception handling and code analysis. Additionally, some exceptions are not errors but performance issues due to slow execution. In the second half of this article, we will also introduce common tools for tracking performance.</p>
<h2 id="lost-connection">Lost Connection</h2>
<p>There are generally three reasons for a Lost Connection:</p>
<ol>
<li>A timeout occurs either directly between the client and the database or at some point along the intermediate link, such as from the client to the Proxy or from the Proxy to the database.</li>
<li>A bug triggers a panic during SQL execution. TiDB can generally recover from it, which keeps the TiDB server from crashing completely, but the affected connection is closed.</li>
<li>TiDB itself crashes, often due to excessive memory use, causing an OOM (Out of Memory), or a user deliberately kills TiDB. Another possibility is an unrecovered bug, which typically appears more frequently in background threads.</li>
</ol>
<h3 id="timeout">Timeout</h3>
<h4 id="direct-timeout">Direct Timeout</h4>
<p>TiDB supports the MySQL-compatible <code>wait_timeout</code> variable, with a default value of 0, meaning no timeout is set, unlike MySQL&rsquo;s default of 8 hours.</p>
<p><img alt="lost" loading="lazy" src="/posts/images/20200908132926.webp"></p>
<p>The only place the variable is read in the code is <code>getSessionVarsWaitTimeout</code>. In the connection&rsquo;s Run loop, its value is passed to the packet IO layer; if the variable is non-zero, a read timeout is set before each <code>readPacket</code>.</p>
<p><img alt="lost" loading="lazy" src="/posts/images/20200908134545.webp"></p>
<p>If the client sends no data within the specified time, the connection is closed. A log message &ldquo;read packet timeout, close this connection&rdquo; then appears, along with the specific timeout duration.</p>
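<p>The behavior described above can be modeled in a few lines. This is a minimal Python sketch, not TiDB&rsquo;s Go code: <code>read_packet</code> is a hypothetical helper, a local socket pair stands in for the client connection, and the timeout is given in seconds.</p>

```python
import socket


def read_packet(conn, wait_timeout):
    """Read one packet, applying an idle timeout only when it is non-zero."""
    # 0 means "no timeout": the read blocks until the client sends data.
    conn.settimeout(wait_timeout if wait_timeout > 0 else None)
    try:
        return conn.recv(4096)
    except socket.timeout:
        # Message text taken from the article; the layout here is illustrative.
        print(f"read packet timeout, close this connection (timeout={wait_timeout}s)")
        conn.close()
        return None


if __name__ == "__main__":
    server, client = socket.socketpair()  # stand-in for a real client connection
    client.sendall(b"SELECT 1")
    print(read_packet(server, 0.2))  # data arrives in time
    print(read_packet(server, 0.2))  # client stays silent past the timeout
    client.close()
```

<p>The point of the sketch is that the deadline is re-armed before every read, so an idle client is dropped while an active one never is.</p>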
<h4 id="intermediate-link-timeout">Intermediate Link Timeout</h4>
<p>Another scenario is an intermediate link timeout. A normal timeout in an intermediate link (proxy) typically returns an EOF error to the database. In older versions, at least a connection closed log would be output.</p>
<p><img alt="lost" loading="lazy" src="/posts/images/20200908141322.webp"></p>
<p>On the current master branch, this log was downgraded to debug level at the product managers&rsquo; suggestion, so it is generally no longer output.</p>
<p>However, the new version adds a monitoring item called <code>DisconnectionCounter</code>, which records normal and abnormal disconnections as a supplement to the downgraded logging.</p>
<p><img alt="lost" loading="lazy" src="/posts/images/20200908141537.webp"></p>
<p><img alt="lost" loading="lazy" src="/posts/images/20200908142131.webp"></p>
<h3 id="bugs-that-are-recovered">Bugs that Are Recovered</h3>
<p>TiDB can recover from &ldquo;basically&rdquo; all panics caused by unknown bugs. However, after an array index out of range, a nil pointer dereference, or an intentional panic, it cannot guarantee correct results for the current or subsequent SQL on that connection, so terminating the current connection is the sensible choice.</p>
<p><img alt="lost" loading="lazy" src="/posts/images/20200908143849.webp"></p>
<p>When this happens, the error log &ldquo;connection running loop panic&rdquo; appears, with a <code>lastSQL</code> field recording the SQL that was executing when the error occurred.</p>
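<p>A rough model of this &ldquo;recover, log, and close only this connection&rdquo; pattern is sketched below in Python rather than Go. <code>run_connection</code> and <code>flaky_execute</code> are hypothetical names, the exception handler plays the role of Go&rsquo;s <code>recover</code>, and the log layout is only an approximation of the message quoted above.</p>

```python
def run_connection(statements, execute):
    """Run statements in order; stop and report on the first unexpected failure."""
    results = []
    for sql in statements:
        try:
            results.append(execute(sql))
        except Exception as exc:  # plays the role of recover() for a panic
            # Log the failing statement, then give up on this connection only.
            print(f'[ERROR] ["connection running loop panic"] [lastSQL="{sql}"] [err="{exc}"]')
            return results, False  # False: the connection was closed
    return results, True


def flaky_execute(sql):
    """Hypothetical executor that hits a bug on one particular statement."""
    if "bug" in sql:
        raise IndexError("index out of range")
    return f"ok: {sql}"


if __name__ == "__main__":
    print(run_connection(["select 1", "select bug", "select 2"], flaky_execute))
```

<p>Statements after the failing one are never executed, which mirrors the reasoning that results can no longer be trusted once the connection has hit such a bug.</p>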
<h3 id="panic-not-recovered">Panic Not Recovered</h3>
<p>Neither an unrecovered panic nor a crash caused by a system-level OOM leaves an entry in TiDB&rsquo;s logs. TiDB clusters managed by deployment tools such as Ansible or TiUP automatically restart a crashed TiDB server, so the only trace in the log is a new &ldquo;Welcome&rdquo; message, which is easy to overlook. However, monitoring will show TiDB&rsquo;s Uptime reset to zero, which makes the issue relatively easy to detect. Of course, it&rsquo;s better to have accompanying alerts.</p>
<p>The output of an unrecovered panic is Golang&rsquo;s default panic output, which deployment tools usually redirect to <code>tidb_stderr.log</code>. Older versions of the Ansible deployment overwrote this file on every restart; newer versions append to it.</p>
<p><img alt="lost" loading="lazy" src="/posts/images/15992142135768.webp"></p>
<p>The file still has other drawbacks, such as the lack of timestamps, which makes it hard to correlate its contents with TiDB logs. This <a href="https://github.com/pingcap/tidb/pull/18310">PR</a> splits <code>tidb_stderr.log</code> by PID, but it hasn&rsquo;t been coordinated with the deployment tools and is temporarily disabled.</p>
<p>Once you have this standard panic output, you can parse it with the panicparse tool introduced in the previous article. Typically, you only need to look at the topmost stack. The example in the image clearly shows an out-of-memory error, commonly referred to as OOM. To identify which SQL caused the OOM, check TiDB&rsquo;s logs for resource-heavy SQL, which is usually logged with the <code>expensive_query</code> tag and can be found by grepping the logs.</p>
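<p>The article leaves the grep step to the reader; as a starting point, here is a minimal Python sketch of it. The sample log lines are hypothetical and heavily simplified (real TiDB lines carry more fields); only the <code>expensive_query</code> tag itself comes from the text above.</p>

```python
def find_expensive_queries(log_lines):
    """Return the log lines that carry the expensive_query tag."""
    return [line for line in log_lines if "expensive_query" in line]


# Hypothetical, simplified log excerpt used only for illustration.
SAMPLE_LOG = [
    '[2020/09/04 10:00:01.000 +08:00] [INFO] ["new connection"] [conn=7]',
    '[2020/09/04 10:03:12.000 +08:00] [WARN] [expensive_query] [sql="select * from t1 join t2"]',
    '[2020/09/04 10:03:40.000 +08:00] [INFO] ["connection closed"] [conn=7]',
]

if __name__ == "__main__":
    for line in find_expensive_queries(SAMPLE_LOG):
        print(line)
```

<p>In practice the same filter is a one-line <code>grep</code> over the TiDB log file; the timestamps on the matching lines can then be compared with the time of the restart.</p>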
<h2 id="tracing">Tracing</h2>
<p>TiDB has supported tracing since version 2.1, but it hasn&rsquo;t been widely used. I think there are two main reasons:</p>
<ol>
<li>
<p>The initial version of tracing only supported the JSON format, requiring the output to be copied and pasted into a TiDB-specific web page at a special host port to view it. Although novel, the multiple steps involved prevented widespread adoption.</p>
<p><img alt="lost" loading="lazy" src="/posts/images/trace-view.webp"></p>
</li>
<li>
<p>Another issue is that tracing provides insight only after a problem is known. If developers suspect a problem or slow execution in advance, they must proactively add events at those points. Often, unforeseen issues cannot be covered, leaving gaps.</p>
</li>
</ol>
<p>Once the framework of tracing is in place, adding events is relatively straightforward and involves adding code like the snippet below at the desired points:</p>
<p><img alt="lost" loading="lazy" src="/posts/images/20200908165442.webp"></p>
<p>Interested readers can add events to TiDB as needed; it makes for good hands-on practice.</p>
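<p>To make the idea of &ldquo;adding an event&rdquo; concrete, the following Python sketch models a span-based tracer. It is not TiDB&rsquo;s tracing code or API: <code>Tracer</code> and <code>span</code> are hypothetical names, and the sketch only shows that an event is a named, timed region wrapped around the code of interest.</p>

```python
import time
from contextlib import contextmanager


class Tracer:
    """Toy tracer that records (name, nesting depth, seconds) per finished span."""

    def __init__(self):
        self.events = []
        self._depth = 0

    @contextmanager
    def span(self, name):
        start = time.perf_counter()
        self._depth += 1
        try:
            yield
        finally:
            self._depth -= 1
            # A span is recorded when it ends, so children appear before parents.
            self.events.append((name, self._depth, time.perf_counter() - start))


if __name__ == "__main__":
    tracer = Tracer()
    with tracer.span("session.execute"):
        with tracer.span("parse"):
            time.sleep(0.01)
        with tracer.span("compile"):
            time.sleep(0.02)
    for name, depth, seconds in tracer.events:
        print("  " * depth + f"{name}: {seconds * 1000:.1f}ms")
```

<p>Adding a new event is just one more <code>with</code> block, which is why coverage depends on developers guessing in advance where the slow spots will be.</p>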
<p>Later, tracing gained the <code>format='row'</code> and <code>format='log'</code> output formats. I personally favor <code>format='log'</code>.</p>
<h3 id="difference-between-tracing-and-explain-analyze">Difference between Tracing and Explain (Analyze)</h3>
<ol>
<li>Tracing operates at the function level, while Explain operates at the operator level. Tracing is easier to add, more granular, and does not need to be part of a plan.</li>
<li>Tracing can trace any SQL, while Explain only shows data reading parts. For example, with an Insert, Explain shows almost nothing, whereas tracing provides detailed insights from SQL parsing to the full transaction commit.</li>
</ol>
]]></content:encoded>
    </item>
    <item>
      <title>How to Read TiDB Source Code (Part 4)</title>
      <link>https://blog.minifish.org/posts/how-to-read-tidb-source-code-part-4/</link>
      <pubDate>Fri, 31 Jul 2020 10:58:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-read-tidb-source-code-part-4/</guid>
      <description>&lt;p&gt;This article will introduce some key functions and the interpretation of logs in TiDB.&lt;/p&gt;
&lt;h2 id=&#34;key-functions&#34;&gt;Key Functions&lt;/h2&gt;
&lt;p&gt;The definition of key functions varies from person to person, so the content of this section is subjective.&lt;/p&gt;
&lt;h3 id=&#34;execute&#34;&gt;execute&lt;/h3&gt;
&lt;p&gt;&lt;img alt=&#34;func&#34; loading=&#34;lazy&#34; src=&#34;https://blog.minifish.org/posts/images/20200812152326.webp&#34;&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;execute&lt;/code&gt; function is the necessary pathway for text protocol execution. It also nicely demonstrates the various processes of SQL handling.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;ParseSQL analyzes the SQL. The final implementation is in the parser, where SQL is parsed according to the rules introduced in the second article. Note that the parsed SQL may be a single statement or multiple statements. TiDB itself supports the multi-SQL feature, allowing multiple SQL statements to be executed at once.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>This article will introduce some key functions and the interpretation of logs in TiDB.</p>
<h2 id="key-functions">Key Functions</h2>
<p>The definition of key functions varies from person to person, so the content of this section is subjective.</p>
<h3 id="execute">execute</h3>
<p><img alt="func" loading="lazy" src="/posts/images/20200812152326.webp"></p>
<p>The <code>execute</code> function is the necessary pathway for text protocol execution. It also nicely demonstrates the various processes of SQL handling.</p>
<ol>
<li>
<p>ParseSQL analyzes the SQL. The final implementation is in the parser, where SQL is parsed according to the rules introduced in the second article. Note that the parsed SQL may be a single statement or multiple statements. TiDB itself supports the multi-SQL feature, allowing multiple SQL statements to be executed at once.</p>
</li>
<li>
<p>After parsing, a <code>stmtNodes</code> array is returned, which is processed one-by-one in the for loop below. The first step is to compile, where the core of compile is optimization, generating a plan. By following the <code>Optimize</code> function, you can find logic similar to logical and physical optimization found in other common databases.</p>
<p><img alt="func" loading="lazy" src="/posts/images/20200812153017.webp"></p>
</li>
<li>
<p>The last part is execution, where <code>executeStatement</code> and particularly the <code>runStmt</code> function are key functions.</p>
</li>
</ol>
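<p>The three steps above (parse, compile each statement, execute it) can be summarized as a small pipeline. The Python sketch below is only a model: every function is a hypothetical stand-in, and splitting on <code>;</code> is a deliberately naive substitute for the real parser.</p>

```python
def parse_sql(text):
    """Naive stand-in for the parser: split on ';' and drop empty statements."""
    return [stmt.strip() for stmt in text.split(";") if stmt.strip()]


def compile_stmt(stmt_node):
    """Stand-in for compilation and optimization: wrap the statement in a 'plan'."""
    return {"plan_for": stmt_node}


def run_stmt(plan):
    """Stand-in for execution: pretend to run the plan and return a result."""
    return f"executed: {plan['plan_for']}"


def execute(text):
    """Parse once, then compile and run each statement in order."""
    stmt_nodes = parse_sql(text)
    return [run_stmt(compile_stmt(node)) for node in stmt_nodes]


if __name__ == "__main__":
    print(execute("select 1; select 2;"))
```

<p>The shape to notice is that parsing happens once for the whole text, while compilation and execution happen per statement inside the loop.</p>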
<h3 id="runstmt">runStmt</h3>
<p>Judging from the call graph of <code>runStmt</code>, this function is the path almost all SQL execution takes. Except for autocommit point query statements that use the binary protocol, all other statements go through it. The function is responsible for executing SQL, excluding parsing and compilation (under the binary protocol, SQL does not need to be parsed repeatedly, and with the plan cache it does not need to be compiled repeatedly either).</p>
<p><img alt="func" loading="lazy" src="/posts/images/20200731112400.webp"></p>
<p>The core part of the <code>runStmt</code> function is as shown above. From top to bottom:</p>
<ol>
<li>
<p>checkTxnAborted</p>
<p>When a transaction is already corrupted and cannot be committed, the user must explicitly end it. During execution, a transaction may hit an error that cannot be handled and must be terminated. TiDB cannot close the transaction silently, because the user may keep executing SQL on the assumption that it is still inside the transaction. This function makes every subsequent statement return an error without being executed until the user explicitly closes the transaction with rollback or commit, after which execution returns to normal.</p>
</li>
<li>
<p>Exec</p>
<p>Execute the SQL and return the result set (rs).</p>
</li>
<li>
<p>IsReadOnly</p>
<p>After a statement is executed, TiDB needs to determine whether it is read-only. If it is not, it is stored temporarily in the transaction&rsquo;s execution history, which is used when a transaction conflict or another error requires the transaction to be retried. Read-only statements are skipped because the retry happens during the commit phase, when the only feedback the client can receive is whether the commit succeeded or failed; re-reading results at that point is meaningless.</p>
<p>This section also includes <code>StmtCommit</code> and <code>StmtRollback</code>. TiDB supports MySQL-like statement commits and rollbacks—if a statement fails during a transaction, that single statement will be atomically rolled back, while other successfully executed statements will eventually commit with the transaction.</p>
<p>In TiDB, the feature of statement commit is implemented with a two-layer buffer: both the transaction and the statement have their own buffers. After a statement executes successfully, the statement’s buffer is merged into the transaction buffer. If a statement fails, the statement’s buffer is discarded, thus ensuring the atomicity of statement commits. Of course, a statement commit may fail, in which case the entire transaction buffer becomes unusable, and the transaction can only be rolled back.</p>
</li>
<li>
<p>finishStmt</p>
<p>Once a statement has executed, should it be committed? That depends on whether the transaction was started explicitly (i.e., with <code>begin</code> or <code>start transaction</code>) and whether autocommit is enabled. <code>finishStmt</code> checks these conditions after execution and commits if appropriate. It is essentially the cleanup and checking step that follows each statement.</p>
</li>
<li>
<p>pending section</p>
<p>Some statements in TiDB do not require a transaction (e.g., the <code>set</code> statement). However, before parsing, the database does not know whether a statement requires one. Starting a transaction in TiDB has relatively high latency because it requires obtaining a timestamp from the TSO (timestamp oracle) in PD. As an optimization, TiDB obtains the timestamp asynchronously, which means one is prepared regardless of whether a transaction is eventually needed. If the statement turns out not to need it, the transaction is never activated and stays in a pending state, and that pending transaction must be closed.</p>
</li>
</ol>
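<p>The two-layer buffer and the <code>finishStmt</code> decision described above can be modeled together. The following Python sketch is a toy, not TiDB&rsquo;s implementation: the <code>Session</code> class, its fields, and the <code>fail</code> flag are all hypothetical, and plain dictionaries stand in for the buffers and the store.</p>

```python
class Session:
    """Toy session with a transaction buffer and a per-statement buffer."""

    def __init__(self, autocommit=True):
        self.autocommit = autocommit
        self.in_explicit_txn = False
        self.txn_buf = {}  # writes from statements that already succeeded
        self.store = {}    # committed data

    def begin(self):
        self.in_explicit_txn = True

    def run_stmt(self, writes, fail=False):
        stmt_buf = dict(writes)  # the statement writes into its own buffer
        if fail:
            stmt_buf.clear()  # statement rollback: discard only this statement
        else:
            self.txn_buf.update(stmt_buf)  # statement commit: merge into the txn buffer
        self.finish_stmt()
        return not fail

    def finish_stmt(self):
        # Commit right away only for autocommit statements outside an explicit txn.
        if self.autocommit and not self.in_explicit_txn:
            self.commit()

    def commit(self):
        self.store.update(self.txn_buf)
        self.txn_buf.clear()
        self.in_explicit_txn = False


if __name__ == "__main__":
    s = Session()
    s.begin()
    s.run_stmt({"a": 1})
    s.run_stmt({"b": 2}, fail=True)  # only this statement is rolled back
    s.commit()
    print(s.store)
```

<p>Because a failed statement never touches the transaction buffer, the earlier successful statement still commits with the transaction, which is the atomicity property the text describes.</p>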
<h2 id="logs">Logs</h2>
<p>Let&rsquo;s first look at a section of logs from TiDB at initial startup, divided into several parts:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>[2020/08/12 16:12:07.282 +08:00] [INFO] [printer.go:42] [&#34;Welcome to TiDB.&#34;] [&#34;Release Version&#34;=None] [Edition=None] [&#34;Git Commit Hash&#34;=None] [&#34;Git Branch&#34;=None] [&#34;UTC Build Time&#34;=None] [GoVersion=go1.15] [&#34;Race Enabled&#34;=false] [&#34;Check Table Before Drop&#34;=false] [&#34;TiKV Min Version&#34;=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.300 +08:00] [INFO] [printer.go:56] [&#34;loaded config&#34;] [config=&#34;{\&#34;host\&#34;:\&#34;0.0.0.0\&#34;,\&#34;advertise-address\&#34;:\&#34;0.0.0.0\&#34;,\&#34;port\&#34;:4000,\&#34;cors\&#34;:\&#34;\&#34;,\&#34;store\&#34;:\&#34;mocktikv\&#34;,\&#34;path\&#34;:\&#34;/tmp/tidb\&#34;,\&#34;socket\&#34;:\&#34;\&#34;,\&#34;lease\&#34;:\&#34;45s\&#34;,\&#34;run-ddl\&#34;:true,\&#34;split-table\&#34;:true,\&#34;token-limit\&#34;:1000,\&#34;oom-use-tmp-storage\&#34;:true,\&#34;tmp-storage-path\&#34;:\&#34;C:\\\\Users\\\\username\\\\AppData\\\\Local\\\\Temp\\\\tidb\\\\tmp-storage\&#34;,\&#34;oom-action\&#34;:\&#34;log\&#34;,\&#34;mem-quota-query\&#34;:1073741824,\&#34;tmp-storage-quota\&#34;:-1,\&#34;enable-streaming\&#34;:false,\&#34;enable-batch-dml\&#34;:false,\&#34;lower-case-table-names\&#34;:2,\&#34;server-version\&#34;:\&#34;\&#34;,\&#34;log\&#34;:{\&#34;level\&#34;:\&#34;info\&#34;,\&#34;format\&#34;:\&#34;text\&#34;,\&#34;disable-timestamp\&#34;:null,\&#34;enable-timestamp\&#34;:null,\&#34;disable-error-stack\&#34;:null,\&#34;enable-error-stack\&#34;:null,\&#34;file\&#34;:{\&#34;filename\&#34;:\&#34;\&#34;,\&#34;max-size\&#34;:300,\&#34;max-days\&#34;:0,\&#34;max-backups\&#34;:0},\&#34;enable-slow-log\&#34;:true,\&#34;slow-query-file\&#34;:\&#34;tidb-slow.log\&#34;,\&#34;slow-threshold\&#34;:300,\&#34;expensive-threshold\&#34;:10000,\&#34;query-log-max-len\&#34;:4096,\&#34;record-plan-in-slow-log\&#34;:1},\&#34;security\&#34;:{\&#34;skip-grant-table\&#34;:false,\&#34;ssl-ca\&#34;:\&#34;\&#34;,\&#34;ssl-cert\&#34;:\&#34;\&#34;,\&#34;ssl-key\&#34;:\&#34;\&#34;,\&#34;require-secure-transport\&#34;:false,\&#34;cluster-ssl-ca\&#34;:\&#34;\&#34;,\&#34;cluster-ssl-cert\&#34;:\&#34;\&#34;,\&#34;cluster-ssl-key\&#34;:\&#34;\&#34;,\&#34;cluster-verify-cn\&#34;:null},\&#34;status\&#34;:{\&#34;status-host\&#34;:\&#34;0.0.0.0\&#34;,\&#34;metrics-addr\&#34;:\&#34;\&#34;,\&#34;status-port\&#34;:10080,\&
#34;metrics-interval\&#34;:15,\&#34;report-status\&#34;:true,\&#34;record-db-qps\&#34;:false},\&#34;performance\&#34;:{\&#34;max-procs\&#34;:0,\&#34;max-memory\&#34;:0,\&#34;stats-lease\&#34;:\&#34;3s\&#34;,\&#34;stmt-count-limit\&#34;:5000,\&#34;feedback-probability\&#34;:0.05,\&#34;query-feedback-limit\&#34;:1024,\&#34;pseudo-estimate-ratio\&#34;:0.8,\&#34;force-priority\&#34;:\&#34;NO_PRIORITY\&#34;,\&#34;bind-info-lease\&#34;:\&#34;3s\&#34;,\&#34;txn-total-size-limit\&#34;:104857600,\&#34;tcp-keep-alive\&#34;:true,\&#34;cross-join\&#34;:true,\&#34;run-auto-analyze\&#34;:true,\&#34;agg-push-down-join\&#34;:false,\&#34;committer-concurrency\&#34;:16,\&#34;max-txn-ttl\&#34;:600000},\&#34;prepared-plan-cache\&#34;:{\&#34;enabled\&#34;:false,\&#34;capacity\&#34;:100,\&#34;memory-guard-ratio\&#34;:0.1},\&#34;opentracing\&#34;:{\&#34;enable\&#34;:false,\&#34;rpc-metrics\&#34;:false,\&#34;sampler\&#34;:{\&#34;type\&#34;:\&#34;const\&#34;,\&#34;param\&#34;:1,\&#34;sampling-server-url\&#34;:\&#34;\&#34;,\&#34;max-operations\&#34;:0,\&#34;sampling-refresh-interval\&#34;:0},\&#34;reporter\&#34;:{\&#34;queue-size\&#34;:0,\&#34;buffer-flush-interval\&#34;:0,\&#34;log-spans\&#34;:false,\&#34;local-agent-host-port\&#34;:\&#34;\&#34;}},\&#34;proxy-protocol\&#34;:{\&#34;networks\&#34;:\&#34;\&#34;,\&#34;header-timeout\&#34;:5},\&#34;tikv-client\&#34;:{\&#34;grpc-connection-count\&#34;:4,\&#34;grpc-keepalive-time\&#34;:10,\&#34;grpc-keepalive-timeout\&#34;:3,\&#34;commit-timeout\&#34;:\&#34;41s\&#34;,\&#34;max-batch-size\&#34;:128,\&#34;overload-threshold\&#34;:200,\&#34;max-batch-wait-time\&#34;:0,\&#34;batch-wait-size\&#34;:8,\&#34;enable-chunk-rpc\&#34;:true,\&#34;region-cache-ttl\&#34;:600,\&#34;store-limit\&#34;:0,\&#34;store-liveness-timeout\&#34;:\&#34;120s\&#34;,\&#34;copr-cache\&#34;:{\&#34;enable\&#34;:false,\&#34;capacity-mb\&#34;:1000,\&#34;admission-max-result-mb\&#34;:10,\&#34;admission-min-process-ms\&#34;:5}},\&#34;binlog\&#34;:{\&#34;enable\&#34;:false,\&#34;ignor
e-error\&#34;:false,\&#34;write-timeout\&#34;:\&#34;15s\&#34;,\&#34;binlog-socket\&#34;:\&#34;\&#34;,\&#34;strategy\&#34;:\&#34;range\&#34;},\&#34;compatible-kill-query\&#34;:false,\&#34;plugin\&#34;:{\&#34;dir\&#34;:\&#34;\&#34;,\&#34;load\&#34;:\&#34;\&#34;},\&#34;pessimistic-txn\&#34;:{\&#34;enable\&#34;:true,\&#34;max-retry-count\&#34;:256},\&#34;check-mb4-value-in-utf8\&#34;:true,\&#34;max-index-length\&#34;:3072,\&#34;alter-primary-key\&#34;:false,\&#34;treat-old-version-utf8-as-utf8mb4\&#34;:true,\&#34;enable-table-lock\&#34;:false,\&#34;delay-clean-table-lock\&#34;:0,\&#34;split-region-max-num\&#34;:1000,\&#34;stmt-summary\&#34;:{\&#34;enable\&#34;:true,\&#34;enable-internal-query\&#34;:false,\&#34;max-stmt-count\&#34;:200,\&#34;max-sql-length\&#34;:4096,\&#34;refresh-interval\&#34;:1800,\&#34;history-size\&#34;:24},\&#34;repair-mode\&#34;:false,\&#34;repair-table-list\&#34;:[],\&#34;isolation-read\&#34;:{\&#34;engines\&#34;:[\&#34;tikv\&#34;,\&#34;tiflash\&#34;,\&#34;tidb\&#34;]},\&#34;max-server-connections\&#34;:0,\&#34;new_collations_enabled_on_first_bootstrap\&#34;:false,\&#34;experimental\&#34;:{\&#34;allow-auto-random\&#34;:false,\&#34;allow-expression-index\&#34;:false}}&#34;]
</span></span></code></pre></div><ol>
<li>Mandatory startup outputs: &ldquo;Welcome to TiDB,&rdquo; git hash, Golang version, etc.</li>
<li>Actually loaded configuration (this section is somewhat difficult to read)</li>
</ol>
<p>The remaining lines are routine startup logs. They follow the main function flow introduced in the first article and mainly cover the creation of the initial system tables.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>[2020/08/12 16:12:07.300 +08:00] [INFO] [main.go:341] [&#34;disable Prometheus push client&#34;]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.300 +08:00] [INFO] [store.go:68] [&#34;new store&#34;] [path=mocktikv:///tmp/tidb]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.300 +08:00] [INFO] [systime_mon.go:25] [&#34;start system time monitor&#34;]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.310 +08:00] [INFO] [store.go:74] [&#34;new store with retry success&#34;]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.310 +08:00] [INFO] [tidb.go:71] [&#34;new domain&#34;] [store=8d19232e-a273-4e31-ba9b-a3467998345c] [&#34;ddl lease&#34;=45s] [&#34;stats lease&#34;=3s]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.315 +08:00] [INFO] [ddl.go:321] [&#34;[ddl] start DDL&#34;] [ID=0e1bd28e-03ed-4900-bf71-f58b3b9d954a] [runWorker=true]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.315 +08:00] [INFO] [ddl.go:309] [&#34;[ddl] start delRangeManager OK&#34;] [&#34;is a emulator&#34;=true]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.315 +08:00] [INFO] [ddl_worker.go:130] [&#34;[ddl] start DDL worker&#34;] [worker=&#34;worker 1, tp general&#34;]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.315 +08:00] [INFO] [ddl_worker.go:130] [&#34;[ddl] start DDL worker&#34;] [worker=&#34;worker 2, tp add index&#34;]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.315 +08:00] [INFO] [delete_range.go:133] [&#34;[ddl] start delRange emulator&#34;]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.317 +08:00] [INFO] [domain.go:144] [&#34;full load InfoSchema success&#34;] [usedSchemaVersion=0] [neededSchemaVersion=0] [&#34;start time&#34;=2.0015ms]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.317 +08:00] [INFO] [domain.go:368] [&#34;full load and reset schema validator&#34;]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.317 +08:00] [INFO] [tidb.go:199] [&#34;rollbackTxn for ddl/autocommit failed&#34;]
</span></span></code></pre></div><p>TiDB logs record essentially every step of DDL execution, so the DDL logs are very numerous and I&rsquo;ve truncated that part here. The basic outline is still easy to follow. First, DDL execution is initiated from ddl_api, which records <code>[&quot;CRUCIAL OPERATION&quot;]</code> style logs; DDL is a crucial operation, so it belongs to the CRUCIAL log type. Next comes a chain of logs carrying the ddl keyword, such as <code>[ddl] add DDL jobs</code>, <code>[ddl] start DDL job</code>, <code>[ddl] run DDL job</code>, <code>[ddl] finish DDL job</code>, and <code>[ddl] DDL job is finished</code>. These trace the process from the DDL owner acquiring a job to the job completing. Each job also has a unique job ID, so you can follow a single DDL through the log by searching for something like <code>jobs=&quot;ID:2</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>[2020/08/12 16:12:07.518 +08:00] [INFO] [server.go:235] [&#34;server is running MySQL protocol&#34;] [addr=0.0.0.0:4000]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.518 +08:00] [INFO] [http_status.go:80] [&#34;for status and metrics report&#34;] [&#34;listening on addr&#34;=0.0.0.0:10080]
</span></span><span style="display:flex;"><span>[2020/08/12 16:12:07.520 +08:00] [INFO] [domain.go:1015] [&#34;init stats info time&#34;] [&#34;take time&#34;=3.0126ms]
</span></span><span style="display:flex;"><span>[2020/08/12 16:15:41.482 +08:00] [INFO] [server.go:388] [&#34;new connection&#34;] [conn=1] [remoteAddr=127.0.0.1:64888]
</span></span><span style="display:flex;"><span>[2020/08/12 21:03:19.954 +08:00] [INFO] [server.go:391] [&#34;connection closed&#34;] [conn=1]
</span></span></code></pre></div><p>After that, the appearance of <code>server is running MySQL protocol</code> means that TiDB is ready to serve clients. Later logs record the creation and closing of each connection, namely <code>new connection</code> and <code>connection closed</code>. Each connection also has a connection ID that is unique within a TiDB instance, so you can search the log for a keyword such as <code>conn=1</code> to link its entries together.</p>
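<p>Linking entries by connection ID is easy to script. The Python sketch below groups log lines by their <code>conn=</code> field; the helper name <code>group_by_conn</code> is hypothetical, while the sample lines are copied from the startup log shown above.</p>

```python
import re
from collections import defaultdict

CONN_FIELD = re.compile(r"\[conn=(\d+)\]")


def group_by_conn(log_lines):
    """Map each connection ID to the log lines that mention it, in order."""
    grouped = defaultdict(list)
    for line in log_lines:
        match = CONN_FIELD.search(line)
        if match:
            grouped[int(match.group(1))].append(line)
    return dict(grouped)


LOG = [
    '[2020/08/12 16:12:07.520 +08:00] [INFO] [domain.go:1015] ["init stats info time"] ["take time"=3.0126ms]',
    '[2020/08/12 16:15:41.482 +08:00] [INFO] [server.go:388] ["new connection"] [conn=1] [remoteAddr=127.0.0.1:64888]',
    '[2020/08/12 21:03:19.954 +08:00] [INFO] [server.go:391] ["connection closed"] [conn=1]',
]

if __name__ == "__main__":
    for conn_id, lines in group_by_conn(LOG).items():
        print(conn_id, len(lines))
```

<p>The same approach works for DDL jobs by swapping the pattern for the job ID field.</p>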
<h3 id="stack-logs">Stack Logs</h3>
<p>Most of TiDB&rsquo;s SQL errors (except for duplicate entry and syntax errors) will output the complete stack information. Due to the requirements of unified log format, the stack now looks very unsightly&hellip;</p>
<p>For this stack trace, I believe no one really enjoys reading it. Therefore, we need to paste it into Vim and execute <code>%s/\\n/\r/g</code> and <code>%s/\\t/    /g</code> to turn it into a Golang-style stack.</p>
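<p>For anyone who prefers a script to Vim, the two substitutions above translate directly into Python. <code>unescape_stack</code> is a hypothetical helper name and the sample stack string is made up for illustration.</p>

```python
def unescape_stack(stack_field):
    """Replace literal backslash-n and backslash-t sequences with real whitespace."""
    # Same effect as the two Vim substitutions: newline, then four spaces for a tab.
    return stack_field.replace("\\n", "\n").replace("\\t", "    ")


# Made-up example of a stack as it appears inside a single log field.
SAMPLE_STACK = "main.foo\\n\\tmain.go:10\\nmain.main\\n\\tmain.go:20"

if __name__ == "__main__":
    print(unescape_stack(SAMPLE_STACK))
```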
<p>When you see which module it&rsquo;s stuck in, like the plan part here, you can find the corresponding colleague for support.</p>
<p>However, there is a more user-friendly tool for dealing with Golang’s lengthy stacks called <a href="https://github.com/maruel/panicparse">panicparse</a>. To install it, simply run
<code>go get github.com/maruel/panicparse/v2/cmd/pp</code> (on Go 1.17 or later, use <code>go install github.com/maruel/panicparse/v2/cmd/pp@latest</code> instead). The effect is as follows:</p>
<p><img alt="func" loading="lazy" src="/posts/images/20200813172149.webp"></p>
<p>Whether the input is a dump of TiDB&rsquo;s running goroutines or panic output, panicparse can parse it. It has several features:</p>
<ol>
<li>It can display active and inactive goroutines.</li>
<li>It can show the relationships between goroutines.</li>
<li>Keyword highlighting.</li>
<li>Supports Windows.</li>
</ol>
<p>The latest 2.0.0 version supports race detector and HTML formatted output.</p>
<p>This concludes the introduction to the analysis of key functions and logs (startup, DDL, connection, error stack).</p>
]]></content:encoded>
    </item>
    <item>
      <title>How to Read TiDB Source Code (Part 3)</title>
      <link>https://blog.minifish.org/posts/how-to-read-tidb-source-code-part-3/</link>
      <pubDate>Tue, 28 Jul 2020 11:47:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-read-tidb-source-code-part-3/</guid>
      <description>&lt;p&gt;In the previous article, we introduced methods for viewing syntax and configurations. In this article, we will discuss how to view system variables, including default values, scopes, and how to monitor metrics.&lt;/p&gt;
&lt;h2 id=&#34;system-variables&#34;&gt;System Variables&lt;/h2&gt;
&lt;p&gt;The system variable names in TiDB are defined in &lt;a href=&#34;https://github.com/pingcap/tidb/blob/db0310b17901b1a59f7f728294455ed9667f88ac/sessionctx/variable/tidb_vars.go&#34;&gt;tidb_vars.go&lt;/a&gt;. This file also includes some default values for variables, but the place where they are actually assembled is &lt;a href=&#34;https://github.com/pingcap/tidb/blob/12aac547a9068c404ad18093ae4d0ea4d060a465/sessionctx/variable/sysvar.go#L96&#34;&gt;defaultSysVars&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;defaultSysVars&#34; loading=&#34;lazy&#34; src=&#34;https://blog.minifish.org/posts/images/20200728151254.webp&#34;&gt;&lt;/p&gt;
&lt;p&gt;This large struct array defines the scope, variable names, and default values for all variables in TiDB. Besides TiDB&amp;rsquo;s own system variables, it also includes compatibility with MySQL&amp;rsquo;s system variables.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>In the previous article, we introduced methods for viewing syntax and configurations. In this article, we will discuss how to view system variables, including default values, scopes, and how to monitor metrics.</p>
<h2 id="system-variables">System Variables</h2>
<p>The system variable names in TiDB are defined in <a href="https://github.com/pingcap/tidb/blob/db0310b17901b1a59f7f728294455ed9667f88ac/sessionctx/variable/tidb_vars.go">tidb_vars.go</a>. This file also includes some default values for variables, but the place where they are actually assembled is <a href="https://github.com/pingcap/tidb/blob/12aac547a9068c404ad18093ae4d0ea4d060a465/sessionctx/variable/sysvar.go#L96">defaultSysVars</a>.</p>
<p><img alt="defaultSysVars" loading="lazy" src="/posts/images/20200728151254.webp"></p>
<p>This large struct array defines the scope, variable names, and default values for all variables in TiDB. Besides TiDB&rsquo;s own system variables, it also includes compatibility with MySQL&rsquo;s system variables.</p>
<h3 id="scope">Scope</h3>
<p>TiDB literally defines three variable scopes:</p>
<p><img alt="defaultSysVars" loading="lazy" src="/posts/images/20200728151833.webp"></p>
<p>They are ScopeNone, ScopeGlobal, and ScopeSession. They represent:</p>
<ul>
<li>ScopeNone: Read-only variables</li>
<li>ScopeGlobal: Global variables</li>
<li>ScopeSession: Session variables</li>
</ul>
<p>In practice, a scope determines which SQL syntax you must use to read or write the variable. If the statement fails, nothing is changed. If it succeeds, that only means the value was set; it does not guarantee that the variable takes effect in the way its scope suggests.</p>
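<p>The syntax rules can be summarized in a small Go sketch. It is not TiDB&rsquo;s implementation: the bitmask values, the <code>Modifier</code> type, the error values, and the <code>CheckSet</code>/<code>CheckGet</code> helpers are all assumptions made for illustration, modeled only on the behavior demonstrated in the sessions below.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// ScopeFlag models the three scopes as a bitmask (values assumed for this sketch).
type ScopeFlag uint8

const (
	ScopeNone    ScopeFlag = 0
	ScopeGlobal  ScopeFlag = 1
	ScopeSession ScopeFlag = 2
)

// Modifier is a hypothetical representation of the keyword used in the SQL statement.
type Modifier int

const (
	ModDefault Modifier = iota // SET x / SELECT @@x
	ModGlobal                  // SET GLOBAL x / SELECT @@global.x
	ModSession                 // SET SESSION x / SELECT @@session.x
)

// Hypothetical error values standing in for the server's error messages.
var (
	ErrReadOnly  = errors.New("read-only variable")
	ErrIsGlobal  = errors.New("GLOBAL variable")
	ErrIsSession = errors.New("SESSION variable")
)

// CheckSet decides whether a SET with the given modifier is allowed.
func CheckSet(scope ScopeFlag, m Modifier) error {
	switch {
	case scope == ScopeNone:
		return ErrReadOnly // read-only, whatever the modifier
	case m == ModGlobal && scope&ScopeGlobal == 0:
		return ErrIsSession // session-only variable used with SET GLOBAL
	case m != ModGlobal && scope&ScopeSession == 0:
		return ErrIsGlobal // global-only variable set without GLOBAL
	}
	return nil
}

// CheckGet decides whether a SELECT @@... with the given modifier is allowed.
func CheckGet(scope ScopeFlag, m Modifier) error {
	switch {
	case m == ModSession && scope&ScopeSession == 0:
		return ErrIsGlobal // ScopeNone and pure global variables
	case m == ModGlobal && scope == ScopeSession:
		return ErrIsSession // session-only variable
	}
	return nil
}

func main() {
	fmt.Println(CheckGet(ScopeNone, ModSession))
	fmt.Println(CheckSet(ScopeGlobal, ModDefault))
	fmt.Println(CheckSet(ScopeGlobal|ScopeSession, ModGlobal))
}
```

<p>Note that in this sketch a ScopeNone variable passes every read check except the explicit session one, which matches how it is read like a global variable in the demonstration.</p>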
<p>Let&rsquo;s use the method mentioned in the first article to start a single-node TiDB for demonstration:</p>
<h4 id="scopenone">ScopeNone</h4>
<p>Take <code>performance_schema_max_mutex_classes</code> as an example:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span>performance_schema_max_mutex_classes;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">----------------------------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#f92672">@@</span>performance_schema_max_mutex_classes <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">----------------------------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#ae81ff">200</span>                                    <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">----------------------------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> <span style="color:#66d9ef">in</span> <span style="color:#66d9ef">set</span> (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0002</span> sec)
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">global</span>.performance_schema_max_mutex_classes;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-----------------------------------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">global</span>.performance_schema_max_mutex_classes <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-----------------------------------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#ae81ff">200</span>                                           <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-----------------------------------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> <span style="color:#66d9ef">in</span> <span style="color:#66d9ef">set</span> (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0004</span> sec)
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">session</span>.performance_schema_max_mutex_classes;
</span></span><span style="display:flex;"><span>ERROR: <span style="color:#ae81ff">1238</span> (HY000): <span style="color:#66d9ef">Variable</span> <span style="color:#e6db74">&#39;performance_schema_max_mutex_classes&#39;</span> <span style="color:#66d9ef">is</span> a <span style="color:#66d9ef">GLOBAL</span> <span style="color:#66d9ef">variable</span>
</span></span></code></pre></div><p>As you can see, a ScopeNone variable can be read like a global variable.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">set</span> <span style="color:#66d9ef">global</span> performance_schema_max_mutex_classes <span style="color:#f92672">=</span> <span style="color:#ae81ff">1</span>;
</span></span><span style="display:flex;"><span>ERROR: <span style="color:#ae81ff">1105</span> (HY000): <span style="color:#66d9ef">Variable</span> <span style="color:#e6db74">&#39;performance_schema_max_mutex_classes&#39;</span> <span style="color:#66d9ef">is</span> a <span style="color:#66d9ef">read</span><span style="color:#f92672">-</span><span style="color:#66d9ef">only</span> <span style="color:#66d9ef">variable</span>
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">set</span> performance_schema_max_mutex_classes <span style="color:#f92672">=</span> <span style="color:#ae81ff">1</span>;
</span></span><span style="display:flex;"><span>ERROR: <span style="color:#ae81ff">1105</span> (HY000): <span style="color:#66d9ef">Variable</span> <span style="color:#e6db74">&#39;performance_schema_max_mutex_classes&#39;</span> <span style="color:#66d9ef">is</span> a <span style="color:#66d9ef">read</span><span style="color:#f92672">-</span><span style="color:#66d9ef">only</span> <span style="color:#66d9ef">variable</span>
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">set</span> <span style="color:#66d9ef">session</span> performance_schema_max_mutex_classes <span style="color:#f92672">=</span> <span style="color:#ae81ff">1</span>;
</span></span><span style="display:flex;"><span>ERROR: <span style="color:#ae81ff">1105</span> (HY000): <span style="color:#66d9ef">Variable</span> <span style="color:#e6db74">&#39;performance_schema_max_mutex_classes&#39;</span> <span style="color:#66d9ef">is</span> a <span style="color:#66d9ef">read</span><span style="color:#f92672">-</span><span style="color:#66d9ef">only</span> <span style="color:#66d9ef">variable</span>
</span></span></code></pre></div><p>But it cannot be set in any way.</p>
<p>Tracing the usage of ScopeNone in the code, you will see:</p>
<p><img alt="defaultSysVars" loading="lazy" src="/posts/images/20200728155134.webp"></p>
<p>In <code>setSysVariable</code>, a variable with this scope causes an error to be returned immediately.</p>
<p><img alt="defaultSysVars" loading="lazy" src="/posts/images/20200728155332.webp"></p>
<p>In <code>ValidateGetSystemVar</code>, it is handled as a global variable.
Conceptually, ScopeNone variables exist as a single copy in the code. Once TiDB starts, they live in memory as read-only values and are never stored in TiKV.</p>
<h4 id="scopeglobal">ScopeGlobal</h4>
<p>Using <code>gtid_mode</code> as an example:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span>gtid_mode;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#f92672">@@</span>gtid_mode <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#66d9ef">OFF</span>         <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-------------+
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> <span style="color:#66d9ef">in</span> <span style="color:#66d9ef">set</span> (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0003</span> sec)
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">global</span>.gtid_mode;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">global</span>.gtid_mode <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#66d9ef">OFF</span>                <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> <span style="color:#66d9ef">in</span> <span style="color:#66d9ef">set</span> (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0006</span> sec)
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">session</span>.gtid_mode;
</span></span><span style="display:flex;"><span>ERROR: <span style="color:#ae81ff">1238</span> (HY000): <span style="color:#66d9ef">Variable</span> <span style="color:#e6db74">&#39;gtid_mode&#39;</span> <span style="color:#66d9ef">is</span> a <span style="color:#66d9ef">GLOBAL</span> <span style="color:#66d9ef">variable</span>
</span></span></code></pre></div><p>Reading works the same way as for a MySQL global variable.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">set</span> gtid_mode<span style="color:#f92672">=</span><span style="color:#66d9ef">on</span>;
</span></span><span style="display:flex;"><span>ERROR: <span style="color:#ae81ff">1105</span> (HY000): <span style="color:#66d9ef">Variable</span> <span style="color:#e6db74">&#39;gtid_mode&#39;</span> <span style="color:#66d9ef">is</span> a <span style="color:#66d9ef">GLOBAL</span> <span style="color:#66d9ef">variable</span> <span style="color:#66d9ef">and</span> should be <span style="color:#66d9ef">set</span> <span style="color:#66d9ef">with</span> <span style="color:#66d9ef">SET</span> <span style="color:#66d9ef">GLOBAL</span>
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">set</span> <span style="color:#66d9ef">session</span> gtid_mode<span style="color:#f92672">=</span><span style="color:#66d9ef">on</span>;
</span></span><span style="display:flex;"><span>ERROR: <span style="color:#ae81ff">1105</span> (HY000): <span style="color:#66d9ef">Variable</span> <span style="color:#e6db74">&#39;gtid_mode&#39;</span> <span style="color:#66d9ef">is</span> a <span style="color:#66d9ef">GLOBAL</span> <span style="color:#66d9ef">variable</span> <span style="color:#66d9ef">and</span> should be <span style="color:#66d9ef">set</span> <span style="color:#66d9ef">with</span> <span style="color:#66d9ef">SET</span> <span style="color:#66d9ef">GLOBAL</span>
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">set</span> <span style="color:#66d9ef">global</span> gtid_mode<span style="color:#f92672">=</span><span style="color:#66d9ef">on</span>;
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">0</span> <span style="color:#66d9ef">rows</span> affected (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0029</span> sec)
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">global</span>.gtid_mode;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">global</span>.gtid_mode <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#66d9ef">ON</span>                 <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> <span style="color:#66d9ef">in</span> <span style="color:#66d9ef">set</span> (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0005</span> sec)
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span>gtid_mode;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#f92672">@@</span>gtid_mode <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#66d9ef">ON</span>          <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-------------+
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> <span style="color:#66d9ef">in</span> <span style="color:#66d9ef">set</span> (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0006</span> sec)
</span></span></code></pre></div><p>Setting is also compatible with MySQL. Now shut down the single-instance TiDB and restart it:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span>gtid_mode;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#f92672">@@</span>gtid_mode <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#66d9ef">ON</span>          <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">-------------+
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> <span style="color:#66d9ef">in</span> <span style="color:#66d9ef">set</span> (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0003</span> sec)
</span></span></code></pre></div><p>The new value is still there, which means the setting was persisted to the storage engine.
Looking closely at the code, you can see:</p>
<p><img alt="defaultSysVars" loading="lazy" src="/posts/images/20200728164505.webp"></p>
<p>The actual implementation executes an internal <code>REPLACE</code> statement to update the stored value. This is a complete transaction that acquires two TSOs and goes through the full commit process, which makes it slower than setting a session variable.</p>
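<p>As a rough illustration of what such an internal statement could look like, here is a minimal Go sketch. The schema and table names (<code>mysql.global_variables</code>), the <code>BuildSetGlobalSQL</code> helper, and the simple quoting are assumptions made for this example, not a copy of TiDB&rsquo;s code.</p>

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed location of the persisted global variables (hypothetical for this sketch).
const (
	systemDB             = "mysql"
	globalVariablesTable = "global_variables"
)

// escaper performs naive escaping of backslashes and single quotes.
var escaper = strings.NewReplacer(`\`, `\\`, `'`, `\'`)

// BuildSetGlobalSQL returns the internal statement that a SET GLOBAL
// could be translated into. Variable names are stored in lower case here.
func BuildSetGlobalSQL(name, value string) string {
	return fmt.Sprintf("REPLACE %s.%s VALUES ('%s', '%s');",
		systemDB, globalVariablesTable,
		escaper.Replace(strings.ToLower(name)), escaper.Replace(value))
}

func main() {
	fmt.Println(BuildSetGlobalSQL("GTID_MODE", "ON"))
}
```

<p>Because the statement is executed as an ordinary transaction against the storage engine, the new value survives a restart, as the demonstration above shows.</p>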
<h4 id="scopesession">ScopeSession</h4>
<p>Using <code>rand_seed2</code> as an example:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span>rand_seed2;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#f92672">@@</span>rand_seed2 <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span>              <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------+
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> <span style="color:#66d9ef">in</span> <span style="color:#66d9ef">set</span> (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0005</span> sec)
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">session</span>.rand_seed2;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">----------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">session</span>.rand_seed2 <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">----------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span>                      <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">----------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> <span style="color:#66d9ef">in</span> <span style="color:#66d9ef">set</span> (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0003</span> sec)
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span><span style="color:#66d9ef">global</span>.rand_seed2;
</span></span><span style="display:flex;"><span>ERROR: <span style="color:#ae81ff">1238</span> (HY000): <span style="color:#66d9ef">Variable</span> <span style="color:#e6db74">&#39;rand_seed2&#39;</span> <span style="color:#66d9ef">is</span> a <span style="color:#66d9ef">SESSION</span> <span style="color:#66d9ef">variable</span>
</span></span></code></pre></div><p>Reading is compatible with MySQL.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">set</span> rand_seed2<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;abc&#39;</span>;
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">0</span> <span style="color:#66d9ef">rows</span> affected (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0006</span> sec)
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">set</span> <span style="color:#66d9ef">session</span> rand_seed2<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;bcd&#39;</span>;
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">0</span> <span style="color:#66d9ef">rows</span> affected (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0004</span> sec)
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">set</span> <span style="color:#66d9ef">global</span> rand_seed2<span style="color:#f92672">=</span><span style="color:#e6db74">&#39;cde&#39;</span>;
</span></span><span style="display:flex;"><span>ERROR: <span style="color:#ae81ff">1105</span> (HY000): <span style="color:#66d9ef">Variable</span> <span style="color:#e6db74">&#39;rand_seed2&#39;</span> <span style="color:#66d9ef">is</span> a <span style="color:#66d9ef">SESSION</span> <span style="color:#66d9ef">variable</span> <span style="color:#66d9ef">and</span> can&#39;t be used <span style="color:#66d9ef">with</span> <span style="color:#66d9ef">SET</span> <span style="color:#66d9ef">GLOBAL</span>
</span></span><span style="display:flex;"><span>MySQL  <span style="color:#ae81ff">127</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">1</span>:<span style="color:#ae81ff">4000</span>  <span style="color:#66d9ef">SQL</span> <span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">select</span> <span style="color:#f92672">@@</span>rand_seed2;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#f92672">@@</span>rand_seed2 <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> bcd          <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------+
</span></span></span></code></pre></div><p>Setting is also compatible with MySQL. As is easy to see, this operation only changes the session&rsquo;s in-memory state.
The place where it actually takes effect is <a href="https://github.com/pingcap/tidb/blob/f360ad7a434e4edd4d7ebce5ed5dc2b9826b6ed0/sessionctx/variable/session.go#L998">SetSystemVar</a>.</p>
<p><img alt="defaultSysVars" loading="lazy" src="/posts/images/20200728171914.webp"></p>
<p>There are some subtleties here, which the next section discusses.</p>
<h3 id="actual-scope-of-variables">Actual Scope of Variables</h3>
<p>The previous section covered setting session variables. Following MySQL&rsquo;s variable rules, setting a global variable does not affect the current session; only newly created sessions load the global values to initialize their session variables. Ultimately, it is the session variable that takes effect. Global variables that have no session counterpart behave differently again. This section covers three aspects:</p>
<ol>
<li>Activation of session variables</li>
<li>Activation of pure global variables</li>
<li>Mechanism of global variable function</li>
</ol>
<h4 id="activation-of-session-variables">Activation of Session Variables</h4>
<p>Whether a session variable is also a global variable only determines whether its value must be loaded from the storage engine when the session starts. If no loading is required, the default value in the code is always the initial value.</p>
<p>Where a variable actually takes effect can only be seen in <a href="https://github.com/pingcap/tidb/blob/f360ad7a434e4edd4d7ebce5ed5dc2b9826b6ed0/sessionctx/variable/session.go#L998">SetSystemVar</a>.</p>
<p><img alt="defaultSysVars" loading="lazy" src="/posts/images/20200728173351.webp"></p>
<p>For example, in this part, <code>s.MemQuotaNestedLoopApply = tidbOptInt64(val, DefTiDBMemQuotaNestedLoopApply)</code> modifies the <code>s</code> structure, so it only changes the current session.</p>
<p>By contrast, <code>atomic.StoreUint32(&amp;ProcessGeneralLog, uint32(tidbOptPositiveInt32(val, DefTiDBGeneralLog)))</code> changes the process-wide variable <code>ProcessGeneralLog</code>, so executing <code>set tidb_general_log = 1</code> affects the entire TiDB instance.</p>
<h4 id="activation-of-pure-global-variables">Activation of Pure Global Variables</h4>
<p>Pure global variables in current TiDB are used for background threads like DDL, statistics, etc.</p>
<p><img alt="defaultSysVars" loading="lazy" src="/posts/images/20200728174207.webp">
<img alt="defaultSysVars" loading="lazy" src="/posts/images/20200728174243.webp"></p>
<p>Because only one TiDB server needs them, a session-level scope has no meaning for these variables.</p>
<h4 id="mechanism-of-global-variable-function">Mechanism of Global Variable Function</h4>
<p>Global variables in TiDB do not take effect immediately after being set. When a connection is first established, it fetches the latest global system variables from TiKV and assigns them to the current session. Creating many connections concurrently would therefore hit the TiKV node that holds these few global variables very frequently. To avoid this, TiDB caches global variables and refreshes the cache every two seconds, which significantly reduces TiKV load.
The drawback is that after setting a global variable, you need to wait briefly before creating a new connection to be sure it reads the latest value. This is one of the few places in TiDB with eventual consistency.</p>
<p>For specific details, see <a href="https://github.com/pingcap/tidb/blob/838b6a0cf2df2d1907508e56d9de9ba7fab502e5/session/session.go#L1990">this comment</a> in <code>loadCommonGlobalVariablesIfNeeded</code>.</p>
<p><img alt="defaultSysVars" loading="lazy" src="/posts/images/20200728191527.webp"></p>
<h2 id="metrics">Metrics</h2>
<p>Compared to system variables, Metrics in TiDB are simpler and more straightforward. The most common Metrics are Histogram and Counter: the former records the actual values of an operation, and the latter counts occurrences of a fixed event.
All Metrics in TiDB are defined in one place, <a href="https://github.com/pingcap/tidb/tree/cbc225fa17c93a3f58bef41b5accb57beb0d9586/metrics">here</a>, with the AlertManager and Grafana scripts kept separately under alertmanager and grafana.</p>
<p>There are many Metrics, and from a beginner&rsquo;s perspective, it&rsquo;s best to focus on a specific monitoring example. Let&rsquo;s take the TPS (transactions per second) panel as an example.</p>
<p><img alt="tps" loading="lazy" src="/posts/images/20200729205545.webp"></p>
<p>Click EDIT and you will see the monitoring formula is:</p>
<p><img alt="tps2" loading="lazy" src="/posts/images/20200729210124.webp"></p>
<p><code>tidb_session_transaction_duration_seconds</code> is the name of this specific metric. Since it is a histogram, it is actually exposed as three kinds of values: sum, count, and bucket, which represent the total sum of observed values, the number of observations (which works the same as a counter), and the distribution by bucket, respectively.</p>
<p>In this context, [1m] represents a time window of 1 minute, indicating the precision of the measurement. The rate function calculates the slope, essentially the rate of change, indicating how many times something occurs per second. The sum function is used for aggregation, and when combined with by (type, txn_mode), it represents aggregation by the dimensions of type and txn_mode.</p>
<p>The Legend below displays the dimensions above using {{type}}-{{txn_mode}}. A label name surrounded by {{}} is replaced with that label&rsquo;s actual value.</p>
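<p>To make the formula concrete, here is a rough Go approximation of what <code>rate</code> and <code>sum ... by (type, txn_mode)</code> compute. It is only a sketch: real PromQL also handles counter resets and extrapolation, and the <code>sample</code> type and the numbers are invented for the example.</p>

```go
package main

import "fmt"

// sample is one counter series observed at the start and end of a window.
type sample struct {
	typ, txnMode, instance string
	first, last            float64 // counter value at window start and end
}

// ratePerSecond approximates rate(): the increase of the counter over the
// window divided by the window length in seconds.
func ratePerSecond(s sample, windowSeconds float64) float64 {
	return (s.last - s.first) / windowSeconds
}

// sumBy approximates sum(...) by (type, txn_mode): series that differ only
// in other labels (here, instance) are added together.
func sumBy(samples []sample, windowSeconds float64) map[string]float64 {
	out := make(map[string]float64)
	for _, s := range samples {
		out[s.typ+"-"+s.txnMode] += ratePerSecond(s, windowSeconds)
	}
	return out
}

func main() {
	samples := []sample{
		{"commit", "optimistic", "tidb-0", 0, 600},
		{"commit", "optimistic", "tidb-1", 100, 700},
		{"rollback", "pessimistic", "tidb-0", 0, 60},
	}
	// A 60-second window corresponds to [1m] in the formula.
	for legend, perSecond := range sumBy(samples, 60) {
		fmt.Printf("%s: %.1f/s\n", legend, perSecond)
	}
}
```

<p>The map keys follow the same type-txn_mode pattern as the Legend, so each key corresponds to one line on the panel.</p>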
<p>The first label, type, records the final state of a transaction: commit, abort, or rollback. Commit indicates a user-initiated commit that succeeded, rollback indicates a user-initiated rollback (which cannot fail), and abort indicates a user-initiated commit that failed.</p>
<p>The second label, txn_mode, refers to two modes: optimistic and pessimistic transactions. There&rsquo;s nothing further to explain about these modes.</p>
<p>Corresponding to the code:</p>
<p><img alt="alt text" loading="lazy" src="/posts/images/20200729211352.webp"></p>
<p>This segment of code shows that <code>tidb_session_transaction_duration_seconds</code> is divided into several parts, including namespace and subsystem. Generally, to find a variable in a formula like <code>tidb_session_transaction_duration_seconds_count</code> within TiDB code, you need to remove the first two words and the last word.</p>
<p>From this code snippet, you can see it&rsquo;s a histogram, specifically a HistogramVec, which is a collection of histograms, one per combination of label values. LblTxnMode and LblType are the two labels discussed above.</p>
<p><img alt="alt text" loading="lazy" src="/posts/images/20200729211511.webp"></p>
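<p>The naming rule described above (drop the first two words and the last word) can be written down as a tiny helper. This is just an illustration with hypothetical function names, and it assumes that the namespace and subsystem are each a single word and that the series carries a one-word suffix such as <code>count</code>.</p>

```go
package main

import (
	"fmt"
	"strings"
)

// fullName joins namespace, subsystem, and name with underscores, which is
// how the exported metric name in the text is composed.
func fullName(namespace, subsystem, name string) string {
	return strings.Join([]string{namespace, subsystem, name}, "_")
}

// codeName recovers the name to search for in the code from a series name
// used in a formula: drop the first two words and the last word.
func codeName(series string) string {
	parts := strings.Split(series, "_")
	if len(parts) < 4 {
		return series // too short to carry namespace, subsystem, and suffix
	}
	return strings.Join(parts[2:len(parts)-1], "_")
}

func main() {
	fmt.Println(fullName("tidb", "session", "transaction_duration_seconds"))
	// tidb_session_transaction_duration_seconds
	fmt.Println(codeName("tidb_session_transaction_duration_seconds_count"))
	// transaction_duration_seconds
}
```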
<p>Checking the references, there is a place for registration, which is in the main function we discussed in the first article, where metrics are registered.</p>
<p><img alt="alt text" loading="lazy" src="/posts/images/20200729211725.webp"></p>
<p>Other references show how metrics are instantiated. Why instantiate them up front? Mainly because metric performance degrades as the number of labels grows, which is related to Prometheus&rsquo;s implementation. We had no choice but to create many pre-instantiated global variables.</p>
<p><img alt="alt text" loading="lazy" src="/posts/images/20200729211935.webp"></p>
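<p>The idea of pre-instantiating can be sketched without the Prometheus client. In the hypothetical types below, <code>WithLabelValues</code> and <code>Observe</code> are only modeled loosely on the client&rsquo;s method names; the point is that the label lookup (a lock plus a map access) is paid once at start-up rather than on every observation.</p>

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// histogram (hypothetical) keeps only a count and a sum for brevity.
type histogram struct {
	mu    sync.Mutex
	count uint64
	sum   float64
}

// Observe records one value.
func (h *histogram) Observe(v float64) {
	h.mu.Lock()
	h.count++
	h.sum += v
	h.mu.Unlock()
}

// histogramVec (hypothetical) maps a combination of label values to a histogram.
type histogramVec struct {
	mu       sync.Mutex
	children map[string]*histogram
}

// WithLabelValues looks up, or creates, the histogram for the given values.
// Every call takes a lock and builds a key, which is the per-call cost
// that pre-instantiation avoids.
func (v *histogramVec) WithLabelValues(lvs ...string) *histogram {
	key := strings.Join(lvs, "\x00")
	v.mu.Lock()
	defer v.mu.Unlock()
	h, ok := v.children[key]
	if !ok {
		h = &histogram{}
		v.children[key] = h
	}
	return h
}

var transactionDuration = &histogramVec{children: map[string]*histogram{}}

// Resolved once at start-up; the hot path then calls Observe directly.
var rollbackOptimistic = transactionDuration.WithLabelValues("optimistic", "rollback")

func main() {
	rollbackOptimistic.Observe(0.25)
	rollbackOptimistic.Observe(0.75)
	same := transactionDuration.WithLabelValues("optimistic", "rollback") == rollbackOptimistic
	fmt.Println(rollbackOptimistic.count, rollbackOptimistic.sum, same) // 2 1 true
}
```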
<p>Taking the implementation of Rollback as an example, its essence is to record the actual execution time of a transaction when Rollback is truly executed. Since it’s a histogram, it is also used as a counter in this instance.</p>
]]></content:encoded>
    </item>
    <item>
      <title>How to Read TiDB Source Code (Part 2)</title>
      <link>https://blog.minifish.org/posts/how-to-read-tidb-source-code-part-2/</link>
      <pubDate>Sun, 12 Jul 2020 12:09:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-read-tidb-source-code-part-2/</guid>
      <description>&lt;p&gt;Continuing from &lt;a href=&#34;https://blog.minifish.org/posts/tidb1&#34;&gt;the previous article&lt;/a&gt;, we learned how to set up the environment for reading code and where to start reading the code. In this part, we&amp;rsquo;ll introduce methods for viewing code based on some common needs.&lt;/p&gt;
&lt;h2 id=&#34;how-to-check-the-support-level-of-a-syntax&#34;&gt;How to Check the Support Level of a Syntax&lt;/h2&gt;
&lt;p&gt;There are usually two methods:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Check through the parser repo&lt;/li&gt;
&lt;li&gt;Directly check within the TiDB repo&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Both of these methods require the &lt;a href=&#34;https://blog.minifish.org/posts/tidb1#%E7%8E%AF%E5%A2%83%E6%90%AD%E5%BB%BA&#34;&gt;environment setup from the previous article&lt;/a&gt;. If you haven&amp;rsquo;t tried that yet, give it a go.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>In <a href="/posts/tidb1">the previous article</a>, we learned how to set up the environment for reading code and where to start reading the code. In this part, we&rsquo;ll introduce methods for viewing code based on some common needs.</p>
<h2 id="how-to-check-the-support-level-of-a-syntax">How to Check the Support Level of a Syntax</h2>
<p>There are usually two methods:</p>
<ol>
<li>Check through the parser repo</li>
<li>Directly check within the TiDB repo</li>
</ol>
<p>Both of these methods require the <a href="/posts/tidb1#%E7%8E%AF%E5%A2%83%E6%90%AD%E5%BB%BA">environment setup from the previous article</a>. If you haven&rsquo;t tried that yet, give it a go.</p>
<h3 id="preparation">Preparation</h3>
<ol>
<li>
<p>Install GoYacc Support</p>
<p><img alt="goyacc" loading="lazy" src="/posts/images/20200712124300.webp"></p>
<p>The GoYacc Support plugin was created by a colleague at our company. It is a third-party plugin officially accepted by JetBrains and is well regarded. It provides syntax highlighting and code intelligence, which is great!</p>
</li>
<li>
<p>Download the <a href="https://github.com/pingcap/parser">parser repo</a></p>
<p>If you&rsquo;re checking syntax directly from the parser, you need to download it manually. If you&rsquo;re navigating from TiDB, IDEA will automatically download the code, so no extra steps are needed.</p>
</li>
</ol>
<h3 id="check-via-parser-repo">Check via parser repo</h3>
<p>Open the parser using IDEA, switch to the branch you need, and locate the parser.y file. However, checking from within TiDB is the recommended approach.</p>
<h3 id="check-via-tidb-repo">Check via TiDB repo</h3>
<ol>
<li>
<p>Open the TiDB project with IDEA and switch to the required branch</p>
<p><img alt="co" loading="lazy" src="/posts/images/20200712183012.webp"></p>
</li>
<li>
<p>Find the parser.y file; make sure to select the broadest search scope</p>
<p><img alt="parser.y" loading="lazy" src="/posts/images/20200712183658.webp"></p>
<p>Alternatively, you can find it in the file list:</p>
<p><img alt="parser.y2" loading="lazy" src="/posts/images/20200712184101.webp"></p>
<p><img alt="parser.y3" loading="lazy" src="/posts/images/20200712184157.webp"></p>
</li>
</ol>
<p>Let&rsquo;s take checking the <code>SHOW ENGINES</code> SQL statement as an example.</p>
<p>The entry point for the entire statement parsing is <a href="https://github.com/pingcap/parser/blob/f56688124d8bbba98ca103dbcc667d0e3b9bef30/parser.y#L1309-L1308">Start</a>. Below it is the StatementList, followed by Statement. Under the large list of Statements, you can find ShowStmt.</p>
<p><img alt="parser.y4" loading="lazy" src="/posts/images/20200712184841.webp"></p>
<p>However, ShowStmt is actually quite complex. Another way is to search for <code>ShowEngines</code> directly within parser.y, since the naming follows Golang conventions: camel case, with a capitalized first letter for exported names. Naturally, if you are familiar with the code, you would know that <code>ShowEngines</code> is under <code>ShowTargetFilterable</code>, as its first branch.</p>
<p><img alt="parser.y5" loading="lazy" src="/posts/images/20200712185533.webp"></p>
<p><strong>What is the level of support for <code>SHOW ENGINES</code>?</strong></p>
<p>You can look at how <code>ast.ShowEngines</code> is processed. IDE navigation does not work here; you need to copy the name and search for it.</p>
<p><img alt="parser.y6" loading="lazy" src="/posts/images/20200712190242.webp"></p>
<p>You only need to see how it&rsquo;s processed under TiDB, and you can skip test files.</p>
<p><img alt="parser.y7" loading="lazy" src="/posts/images/20200712190752.webp"></p>
<p>One is the actual implementation:</p>
<p><img alt="parser.y7" loading="lazy" src="/posts/images/20200712190839.webp"></p>
<p>The other is the build schema, which you can ignore for now:</p>
<p><img alt="parser.y7" loading="lazy" src="/posts/images/20200712190956.webp"></p>
<p>Entering <code>fetchShowEngines</code>, you can see the specific implementation is simple, running an internal SQL to read a system table.</p>
<p><img alt="parser.y7" loading="lazy" src="/posts/images/20200712191054.webp"></p>
<p>Checking <code>SHOW ENGINES</code> ends here. You can see that it&rsquo;s fully supported.</p>
<p><strong>Which statements only have syntax support?</strong></p>
<p>Taking the temporary table creation syntax as an example, find its position in the parser.y file.</p>
<p><img alt="parser.y8" loading="lazy" src="/posts/images/20200712191711.webp"></p>
<p>It&rsquo;s an option.</p>
<p><img alt="parser.y9" loading="lazy" src="/posts/images/20200712191843.webp"></p>
<p>You can see that if the temporary table option is specified, it simply returns true with an attached warning, stating that the table is still treated as a regular table. Previously, the parser had a lot of operations that only returned without doing anything, not even a warning, but these are now rare.</p>
<h4 id="advantages-of-querying-via-tidb-repo">Advantages of Querying via TiDB repo</h4>
<p>Checking via the TiDB repo lets IDEA resolve the exact parser version (commit hash) that TiDB depends on. If you check directly via the parser repo, you first need to look up the parser’s hash in TiDB’s go.mod and then check out that hash in the parser. If you then need to look at a specific implementation, you have to go back to TiDB, and this back-and-forth is less convenient than working within a single project. The only advantage of the parser repo is that it is easier to blame the commit history.</p>
<h2 id="viewing-and-modifying-default-configuration">Viewing and Modifying Default Configuration</h2>
<p>The default configurations can be easily viewed in TiDB, specifically the variable <a href="https://github.com/pingcap/tidb/blob/72f6a0405837b92e40de979a4f3134d9aa19a5b3/config/config.go#L547">defaultConf</a>. The configurations listed here are the actual default settings.</p>
<p><img alt="conf1" loading="lazy" src="/posts/images/20200713172228.webp"></p>
<p>Taking the first Host configuration as an example, it has a mapping to toml and json files.</p>
<p><img alt="conf2" loading="lazy" src="/posts/images/20200713172535.webp"></p>
<p>This essentially shows how it&rsquo;s written in a toml file. The <code>DefHost</code> following it is the specific default value.</p>
<p><img alt="conf3" loading="lazy" src="/posts/images/20200713180137.webp"></p>
<p>Something important to note is that configurations have a hierarchical relationship. For example, the log-related configuration in the configuration file is:</p>
<p><img alt="conf4" loading="lazy" src="/posts/images/20200715164756.webp"></p>
<p>In the code, it is represented as:</p>
<p><img alt="conf5" loading="lazy" src="/posts/images/20200715164930.webp"></p>
<p>This denotes a configuration called &ldquo;level&rdquo; under the log configuration.</p>
<p>What if you want to add more levels? For instance, the most complex configuration for CopCache adds another layer under tikv-client called copr-cache.</p>
<p><img alt="conf6" loading="lazy" src="/posts/images/20200715165243.webp"></p>
<p>Since TOML expresses nesting with dotted table names rather than nested blocks, this leads to the most complex configuration syntax in TiDB.</p>
<p><img alt="conf6" loading="lazy" src="/posts/images/20200715165456.webp"></p>
<p>To run the TiDB instance started from IDEA (described in the previous article) with non-default settings, the simplest way is to modify <code>defaultConf</code> directly.</p>
<h2 id="summary">Summary</h2>
<p>From this, you can see that checking whether a statement is supported, and whether it’s just syntax support or has a specific implementation, can be achieved with the described methods. You also learned how to view and modify default configurations, allowing you to conduct some verifications yourself. In the next article, I plan to introduce TiDB’s system variables.</p>
]]></content:encoded>
    </item>
    <item>
      <title>How to Read TiDB Source Code (Part 1)</title>
      <link>https://blog.minifish.org/posts/how-to-read-tidb-source-code-part-1/</link>
      <pubDate>Mon, 06 Jul 2020 16:51:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-read-tidb-source-code-part-1/</guid>
      <description>&lt;h2 id=&#34;background&#34;&gt;Background&lt;/h2&gt;
&lt;p&gt;There are many articles on reading the source code of TiDB, often referred to as &lt;a href=&#34;https://pingcap.com/blog-cn/#TiDB-%E6%BA%90%E7%A0%81%E9%98%85%E8%AF%BB&#34;&gt;the &amp;ldquo;Twenty-Four Chapters Scriptures&amp;rdquo;&lt;/a&gt;. However, these introductions typically proceed from a macro to a micro perspective. This series attempts to introduce how to read TiDB&amp;rsquo;s source code from an easier angle. The goals we aim to achieve are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Enable readers to start reading TiDB&amp;rsquo;s code themselves, rather than understanding it passively through pre-written articles.&lt;/li&gt;
&lt;li&gt;Provide some common examples of looking into the details of the code, such as examining the scope of a variable.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;After all, teaching people to fish is better than giving them fish. While the code changes often, the methods remain mostly unchanged.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="background">Background</h2>
<p>There are many articles on reading the source code of TiDB, often referred to as <a href="https://pingcap.com/blog-cn/#TiDB-%E6%BA%90%E7%A0%81%E9%98%85%E8%AF%BB">the &ldquo;Twenty-Four Chapters Scriptures&rdquo;</a>. However, these introductions typically proceed from a macro to a micro perspective. This series attempts to introduce how to read TiDB&rsquo;s source code from an easier angle. The goals we aim to achieve are:</p>
<ol>
<li>Enable readers to start reading TiDB&rsquo;s code themselves, rather than understanding it passively through pre-written articles.</li>
<li>Provide some common examples of looking into the details of the code, such as examining the scope of a variable.</li>
</ol>
<p>After all, teaching people to fish is better than giving them fish. While the code changes often, the methods remain mostly unchanged.</p>
<p>Why choose TiDB to read?</p>
<ol>
<li>
<p>I am not familiar with TiKV or PD.</p>
</li>
<li>
<p>TiDB is the entry point that interacts directly with users, so it is also the component users most often ask about.</p>
</li>
<li>
<p>TiDB can run independently and be debugged. If you want to run some SQL after reading the code to verify your understanding, it can be easily done.</p>
</li>
</ol>
<h2 id="preparations">Preparations</h2>
<ol>
<li>
<p>A development machine</p>
<p>TiDB is a pure Golang project. It can be conveniently developed on Linux, macOS, and even Windows. My environment is Windows 10.</p>
</li>
<li>
<p>A copy of the TiDB source code, available for download at the <a href="https://github.com/pingcap/tidb">official repo</a>.</p>
</li>
<li>
<p>A <a href="https://golang.org/">Golang</a> environment; following the official guide is straightforward.</p>
</li>
<li>
<p>Goland or IntelliJ IDEA + Golang plugin</p>
<p>I personally feel there&rsquo;s no difference between the two. Why not recommend VSCode + Golang plugin? Mainly because I&rsquo;m used to the JetBrains suite, and indeed commercial software tends to be higher quality than community software. For long-term use, it&rsquo;s recommended to pay for it. Students can use it for free, but need to renew the license every year.</p>
</li>
</ol>
<h2 id="environment-setup">Environment Setup</h2>
<ol>
<li>
<p>After installing the Golang environment, remember to set the GOPATH.</p>
</li>
<li>
<p>The TiDB code doesn&rsquo;t need to be developed under the GOPATH, so you can place it anywhere. I usually create a directory called work and put all my code repositories in there.</p>
</li>
<li>
<p>Open Goland/IDEA. I use IDEA because I often look at code in other languages.</p>
</li>
<li>
<p>In IDEA, choose Open and select the tidb directory.</p>
<p><img alt="src" loading="lazy" src="/posts/images/20200706174108.webp"></p>
</li>
<li>
<p>At this point, IDEA typically prompts you to set up GOROOT and enable Go Modules. Follow the recommendations.</p>
</li>
</ol>
<p>The environment setup is now complete.</p>
<h2 id="entry-points">Entry Points</h2>
<p>At the beginning, someone advised me to start with the session package. However, after some experience, I personally feel there are two better entry points: the <code>main</code> function and the <code>dispatch</code> function.</p>
<h3 id="main-function">main Function</h3>
<p>The <code>main</code> function of TiDB can be seen at <a href="https://github.com/pingcap/tidb/blob/6b6096f1f18a03d655d04d67a2f21d7fbfca2e3f/tidb-server/main.go#L160">link</a>. You can roughly go through what happens when starting a tidb-server from top to bottom.</p>
<p><img alt="main" loading="lazy" src="/posts/images/20200706220211.webp"></p>
<p>From top to bottom:</p>
<ul>
<li>
<p>Parse flags</p>
</li>
<li>
<p>Output version information and exit</p>
</li>
<li>
<p>Register store and monitoring</p>
</li>
<li>
<p>Configuration file check</p>
</li>
<li>
<p>Initialize temporary folders, etc.</p>
</li>
<li>
<p>Set global variables, CPU affinity, log, trace, print server information, set binlog, set monitoring</p>
</li>
<li>
<p>Create store and domain</p>
<p>The <code>createStoreAndDomain</code> method is important, as critical background threads are created here.</p>
</li>
<li>
<p>Create server and register stop signal function</p>
</li>
<li>
<p>Start the server</p>
<p>Within <code>runServer</code>, the <code>srv.Run()</code> actually brings up the tidb-server.
<img alt="run" loading="lazy" src="/posts/images/20200706221611.webp">
In the <code>Run()</code> function here, the server continuously listens to network requests, creating a new connection for each new request and using a new goroutine to serve it continually.</p>
</li>
<li>
<p>After this, cleanup work is done when the server needs to stop, ultimately writing out the logs.</p>
</li>
</ul>
<p>Thus, the entire <code>main</code> function process ends. Through the <code>main</code> function, you can see the complete lifecycle of a server from creation to destruction.</p>
<p>Additionally, with IDEA, you can easily start and debug TiDB. Click on this triangle symbol as shown in the image below:</p>
<p><img alt="run1" loading="lazy" src="/posts/images/20200706222247.webp"></p>
<p><img alt="run2" loading="lazy" src="/posts/images/20200706222457.webp"></p>
<p>A pop-up with options to run and debug the <code>main</code> function will appear. Essentially, this starts a TiDB with default configurations. TiDB defaults to using mocktikv as the storage engine, so it can be started on a single machine for various testing and validation.</p>
<p>As for how to modify the configuration for starting and debugging, this will be introduced in subsequent articles in the series.</p>
<h3 id="dispatch-function">dispatch Function</h3>
<p>From here, we can proceed further to another suitable entry point function, <code>dispatch</code>.</p>
<p>The <code>dispatch</code> function has several characteristics:</p>
<ol>
<li>
<p>Only requests coming from clients enter the <code>dispatch</code> function, so everything executed from this point onward is a user request. If you set breakpoints here, you conveniently filter out SQL executed by internal threads.</p>
</li>
<li>
<p>From here, various requests are dispatched into different processing logic, ensuring you don’t miss any user requests. It avoids situations like spending significant time reading text protocol code only to find out the user is actually using a binary protocol.</p>
</li>
<li>
<p><code>dispatch</code> itself sits at a very early stage, so its arguments are mostly the raw data sent by the client. With the text protocol, you can read the SQL text directly from those arguments.</p>
</li>
</ol>
<p><img alt="dispatch1" loading="lazy" src="/posts/images/20200707150344.webp"></p>
<p>At the start, <code>dispatch</code> primarily focuses on obtaining tokens corresponding to the token-limit parameter. Requests that can&rsquo;t get a token won&rsquo;t execute, which explains why you can create many connections but only 1000 SQL executions are allowed simultaneously by default.</p>
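<p>A token limit of this kind is commonly built on a buffered channel. The sketch below shows the idea under that assumption; <code>tokenLimiter</code> and its methods are hypothetical names, and <code>tryGet</code> exists only so the demo does not block.</p>

```go
package main

import "fmt"

// tokenLimiter (hypothetical) holds a fixed number of tokens in a buffered channel.
type tokenLimiter struct {
	ch chan struct{}
}

// newTokenLimiter creates a limiter pre-filled with n tokens.
func newTokenLimiter(n int) *tokenLimiter {
	tl := &tokenLimiter{ch: make(chan struct{}, n)}
	for i := 0; i < n; i++ {
		tl.ch <- struct{}{}
	}
	return tl
}

// get blocks until a token is available, which is what makes a request wait.
func (tl *tokenLimiter) get() { <-tl.ch }

// put returns a token once the request has finished.
func (tl *tokenLimiter) put() { tl.ch <- struct{}{} }

// tryGet is a non-blocking variant used only by the demo below.
func (tl *tokenLimiter) tryGet() bool {
	select {
	case <-tl.ch:
		return true
	default:
		return false
	}
}

func main() {
	tl := newTokenLimiter(2) // the text mentions 1000 by default; 2 keeps the demo short
	tl.get()
	tl.get()
	fmt.Println(tl.tryGet()) // false: both tokens are in use
	tl.put()
	fmt.Println(tl.tryGet()) // true: a finished request freed a token
}
```

<p>Connections themselves are not limited by this mechanism; only the number of requests executing at the same time is.</p>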
<p>Next, we enter the most crucial switch case:</p>
<p><img alt="dispatch2" loading="lazy" src="/posts/images/20200707150736.webp"></p>
<p>These commands are MySQL protocol commands, so it&rsquo;s apparent from here exactly what TiDB implements. For comparison, you can refer to <a href="https://dev.mysql.com/doc/internals/en/text-protocol.html">this link</a> (this link is only for the text protocol). For full details, see the figure below:</p>
<p><img alt="dispatch3" loading="lazy" src="/posts/images/20200707151452.webp"></p>
<p>Within <code>dispatch</code>, the most important are <code>mysql.ComQuery</code>, as well as the trio <code>mysql.ComStmtPrepare</code>, <code>mysql.ComStmtExecute</code>, and <code>mysql.ComStmtClose</code>. The latter trio is more frequently used in actual production, hence even more important. In contrast, <code>mysql.ComQuery</code> is generally used only for some simple tests and validations.</p>
<p>Since <code>dispatch</code> is the entry point for interfacing with clients, it can conveniently tally how many requests the database has handled. The so-called QPS derived from monitoring statistics is essentially the number of times this function executes per second. This raises an issue: with multi-query requests such as <code>select 1; select 1; select 1;</code>, multiple statements sent together are counted as a single request by <code>dispatch</code> but as several by the client. Conversely, with the binary protocol, some clients prepare a statement, execute it, and then close it. From the client&rsquo;s perspective this looks like executing a single SQL statement, but the database actually handles three requests.</p>
<p>In summary, users’ perceived QPS may not necessarily align with the number of <code>dispatch</code> function calls. In later versions, the QPS panel in TiDB&rsquo;s monitoring was changed to CPS, which stands for Commands Per Second, representing the number of commands executed per second.</p>
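<p>The mismatch between statements and commands is easy to reproduce in a toy dispatcher. The sketch below is illustrative only: the <code>conn</code> type and the returned handler names are hypothetical, while the command bytes are the standard MySQL protocol values.</p>

```go
package main

import "fmt"

// MySQL protocol command bytes for the commands discussed in the text.
const (
	comQuery       byte = 0x03
	comStmtPrepare byte = 0x16
	comStmtExecute byte = 0x17
	comStmtClose   byte = 0x19
)

// conn (hypothetical) counts how many commands it has dispatched.
type conn struct {
	commands int
}

// dispatch routes one packet by its first byte. Each call counts as one
// command, no matter how many SQL statements the payload contains.
func (c *conn) dispatch(packet []byte) string {
	if len(packet) == 0 {
		return "malformed"
	}
	c.commands++
	switch packet[0] {
	case comQuery:
		return "handleQuery"
	case comStmtPrepare:
		return "handleStmtPrepare"
	case comStmtExecute:
		return "handleStmtExecute"
	case comStmtClose:
		return "handleStmtClose"
	default:
		return "unsupported"
	}
}

func main() {
	// Text protocol: three statements in a single packet.
	text := &conn{}
	text.dispatch(append([]byte{comQuery}, "select 1; select 1; select 1;"...))
	fmt.Println(text.commands) // 1

	// Binary protocol: one statement, three packets.
	binary := &conn{}
	binary.dispatch(append([]byte{comStmtPrepare}, "select ?"...))
	binary.dispatch([]byte{comStmtExecute})
	binary.dispatch([]byte{comStmtClose})
	fmt.Println(binary.commands) // 3
}
```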
<p>Looking at the callers of <code>dispatch</code> can also reveal information that helps explain some frequently asked questions:</p>
<p><img alt="dispatch4" loading="lazy" src="/posts/images/20200707154120.webp"></p>
<ol>
<li>
<p>An EOF error in <code>dispatch</code> typically means the client has actively disconnected, so there&rsquo;s no need to maintain the database connection, and it is severed.</p>
</li>
<li>
<p>An undetermined error means it is uncertain whether a transaction&rsquo;s commit succeeded or failed. It requires immediate manual verification, and the connection will be closed.</p>
</li>
<li>
<p>If writing binlog fails and <code>ignore-error = false</code>, previously the tidb-server process wouldn&rsquo;t exit but couldn&rsquo;t provide services. Now, the tidb-server will exit directly.</p>
</li>
<li>
<p>For all other <code>dispatch</code> errors, the connection will not be closed, allowing service to continue, but the failure information will be logged as &ldquo;command dispatched failed&rdquo;, which is arguably one of the most critical logs for TiDB.</p>
</li>
</ol>
<h2 id="conclusion">Conclusion</h2>
<p>This concludes the introduction from setting up the environment to finding a reasonable entry point to start reading code. Subsequent posts in the series will cover aspects such as configuration (adjustments, default values), variables (default values, scope, actual range, activation), supported syntax, etc. Stay tuned.</p>
]]></content:encoded>
    </item>
    <item>
      <title>How TiDB Implements the INSERT Statement</title>
      <link>https://blog.minifish.org/posts/how-tidb-implements-the-insert-statement/</link>
      <pubDate>Wed, 11 Jul 2018 14:18:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/how-tidb-implements-the-insert-statement/</guid>
      <description>&lt;p&gt;In a previous article &lt;a href=&#34;https://cn.pingcap.com/blog/tidb-source-code-reading-4&#34;&gt;“TiDB Source Code Reading Series (4) Overview of INSERT Statement”&lt;/a&gt;, we introduced the general process of the INSERT statement. Why write a separate article for INSERT? Because in TiDB, simply inserting a piece of data is the simplest and most common case. It becomes more complex when defining various behaviors within the INSERT statement, such as how to handle situations with Unique Key conflicts: Should we return an error? Ignore the current data insertion? Or overwrite existing data? Therefore, this article will continue to delve into the INSERT statement.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>In a previous article <a href="https://cn.pingcap.com/blog/tidb-source-code-reading-4">“TiDB Source Code Reading Series (4) Overview of INSERT Statement”</a>, we introduced the general process of the INSERT statement. Why write a separate article for INSERT? Because in TiDB, simply inserting a piece of data is the simplest and most common case. It becomes more complex when defining various behaviors within the INSERT statement, such as how to handle situations with Unique Key conflicts: Should we return an error? Ignore the current data insertion? Or overwrite existing data? Therefore, this article will continue to delve into the INSERT statement.</p>
<p>This article will first introduce the classification of INSERT statements in TiDB, along with the syntax and semantics of each statement, and then describe the source code implementation of the five types of INSERT statements.</p>
<h2 id="types-of-insert-statements">Types of INSERT Statements</h2>
<p>In broad terms, TiDB has the following six types of INSERT statements:</p>
<ul>
<li><code>Basic INSERT</code></li>
<li><code>INSERT IGNORE</code></li>
<li><code>INSERT ON DUPLICATE KEY UPDATE</code></li>
<li><code>INSERT IGNORE ON DUPLICATE KEY UPDATE</code></li>
<li><code>REPLACE</code></li>
<li><code>LOAD DATA</code></li>
</ul>
<p>In theory, all six statements belong to the category of INSERT statements.</p>
<p>The first one, <code>Basic INSERT</code>, is the most common INSERT statement, using the syntax <code>INSERT INTO VALUES ()</code>. It implies inserting a record, and if a unique constraint conflict occurs (such as primary key conflict, unique index conflict), it returns an execution failure.</p>
<p>The second, with the syntax <code>INSERT IGNORE INTO VALUES ()</code>, ignores the current INSERT row if a unique constraint conflict occurs and logs a warning. After the statement execution finishes, you can use <code>SHOW WARNINGS</code> to see which rows were not inserted.</p>
<p>The third one, with the syntax <code>INSERT INTO VALUES () ON DUPLICATE KEY UPDATE</code>, updates the conflicting row if a conflict occurs and otherwise inserts the data. If the updated row then conflicts with another row in the table, it returns an error.</p>
<p>The fourth one, <code>INSERT IGNORE ON DUPLICATE KEY UPDATE</code>, is similar to the previous case, except that if the updated row conflicts with another row, it does not insert the row and shows a warning instead of returning an error.</p>
<p>The fifth one, with the syntax <code>REPLACE INTO VALUES ()</code>, deletes the conflicting row in the table when a conflict occurs and then attempts the insertion again. If another conflict occurs, it keeps deleting conflicting rows until none are left in the table, and then inserts the data.</p>
<p>The last one, using the syntax <code>LOAD DATA INFILE INTO</code>, has semantics similar to <code>INSERT IGNORE</code>, both ignoring conflicts. The difference is that <code>LOAD DATA</code> imports data files into a table, meaning its data source is a CSV data file.</p>
<p>Since <code>INSERT IGNORE ON DUPLICATE KEY UPDATE</code> involves special processing on <code>INSERT ON DUPLICATE KEY UPDATE</code>, it won&rsquo;t be explained in detail separately but will be covered in the same section. Due to the unique nature of <code>LOAD DATA</code>, it will be discussed in other chapters.</p>
<h2 id="basic-insert-statement">Basic INSERT Statement</h2>
<p>The major differences among the several INSERT statements lie in the execution level. Continuing from the <a href="https://cn.pingcap.com/blog/tidb-source-code-reading-4">“TiDB Source Code Reading Series (4) Overview of INSERT Statement”</a>, here is the statement execution process. Those who do not remember the previous content can refer back to the original article.</p>
<p>INSERT&rsquo;s execution logic is located in <a href="https://github.com/pingcap/tidb/blob/ab332eba2a04bc0a996aa72e36190c779768d0f1/executor/insert.go">executor/insert.go</a>. In fact, the execution logic for the first four types of INSERT statements described above is in this file. We start with the simplest one, <code>Basic INSERT</code>.</p>
<p><code>InsertExec</code> is the implementation of the INSERT executor and conforms to the Executor interface. Its three most important methods are:</p>
<ul>
<li>Open: Performs some initialization</li>
<li>Next: Executes the write operation</li>
<li>Close: Performs some cleanup tasks</li>
</ul>
<p>Among them, the most important and complex is the Next method. Depending on whether a SELECT statement is used to retrieve data (<code>INSERT SELECT FROM</code>), the Next process is divided into two branches: <a href="https://github.com/pingcap/tidb/blob/ab332eba2a04bc0a996aa72e36190c779768d0f1/executor/insert_common.go#L180:24">insertRows</a> and <a href="https://github.com/pingcap/tidb/blob/ab332eba2a04bc0a996aa72e36190c779768d0f1/executor/insert_common.go#L277:24">insertRowsFromSelect</a>. Both processes eventually lead to the <code>exec</code> function to execute the INSERT.</p>
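<p>The shape of this executor can be outlined with a small Go sketch. It is a simplified illustration, not TiDB&rsquo;s code: the <code>executor</code> interface below drops the context and chunk parameters of the real interface, and <code>insertExec</code>, <code>selectRows</code>, <code>written</code>, and <code>run</code> are hypothetical names. It only shows that Next chooses between two branches and that both end in <code>exec</code>.</p>

```go
package main

import "fmt"

// executor is a simplified stand-in for TiDB's Executor interface.
type executor interface {
	Open() error
	Next() error
	Close() error
}

// insertExec is a toy INSERT executor. selectRows is non-nil only for
// INSERT ... SELECT; written collects the rows that exec has handled.
type insertExec struct {
	values     [][]int
	selectRows func() [][]int
	written    [][]int
}

func (e *insertExec) Open() error  { e.written = nil; return nil }
func (e *insertExec) Close() error { return nil }

// Next picks one of two branches; both end in exec.
func (e *insertExec) Next() error {
	if e.selectRows != nil {
		return e.exec(e.selectRows()) // corresponds to the insertRowsFromSelect branch
	}
	return e.exec(e.values) // corresponds to the insertRows branch
}

// exec performs the write; here it only records the rows.
func (e *insertExec) exec(rows [][]int) error {
	e.written = append(e.written, rows...)
	return nil
}

// run drives an executor through its Open, Next, Close life cycle.
func run(e executor) error {
	if err := e.Open(); err != nil {
		return err
	}
	defer e.Close()
	return e.Next()
}

func main() {
	e := &insertExec{values: [][]int{{1}, {2}}}
	fmt.Println(run(e), len(e.written))
}
```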
<p>In the <code>exec</code> function, the first four types of INSERT statements are processed together. The standard INSERT covered in this section directly enters <a href="https://github.com/pingcap/tidb/blob/5bdf34b9bba3fc4d3e50a773fa8e14d5fca166d5/executor/insert.go#L42:22">insertOneRow</a>.</p>
<p>Before discussing <a href="https://github.com/pingcap/tidb/blob/5bdf34b9bba3fc4d3e50a773fa8e14d5fca166d5/executor/insert.go#L42:22">insertOneRow</a>, let&rsquo;s look at a segment of SQL.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> t (i INT <span style="color:#66d9ef">UNIQUE</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span> (<span style="color:#ae81ff">1</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">BEGIN</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span> (<span style="color:#ae81ff">1</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">COMMIT</span>;
</span></span></code></pre></div><p>Paste these lines of SQL sequentially into MySQL and TiDB to see the results.</p>
<p>MySQL:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>mysql<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> t (i INT <span style="color:#66d9ef">UNIQUE</span>);
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">0</span> <span style="color:#66d9ef">rows</span> affected (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">15</span> sec)
</span></span><span style="display:flex;"><span>mysql<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span> (<span style="color:#ae81ff">1</span>);
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> affected (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">01</span> sec)
</span></span><span style="display:flex;"><span>mysql<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">BEGIN</span>;
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">0</span> <span style="color:#66d9ef">rows</span> affected (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">00</span> sec)
</span></span><span style="display:flex;"><span>mysql<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span> (<span style="color:#ae81ff">1</span>);
</span></span><span style="display:flex;"><span>ERROR <span style="color:#ae81ff">1062</span> (<span style="color:#ae81ff">23000</span>): Duplicate entry <span style="color:#e6db74">&#39;1&#39;</span> <span style="color:#66d9ef">for</span> <span style="color:#66d9ef">key</span> <span style="color:#e6db74">&#39;i&#39;</span>
</span></span><span style="display:flex;"><span>mysql<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">COMMIT</span>;
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">0</span> <span style="color:#66d9ef">rows</span> affected (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">11</span> sec)
</span></span></code></pre></div><p>TiDB:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>mysql<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> t (i INT <span style="color:#66d9ef">UNIQUE</span>);
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">0</span> <span style="color:#66d9ef">rows</span> affected (<span style="color:#ae81ff">1</span>.<span style="color:#ae81ff">04</span> sec)
</span></span><span style="display:flex;"><span>mysql<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span> (<span style="color:#ae81ff">1</span>);
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> affected (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">12</span> sec)
</span></span><span style="display:flex;"><span>mysql<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">BEGIN</span>;
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">0</span> <span style="color:#66d9ef">rows</span> affected (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">01</span> sec)
</span></span><span style="display:flex;"><span>mysql<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span> (<span style="color:#ae81ff">1</span>);
</span></span><span style="display:flex;"><span>Query OK, <span style="color:#ae81ff">1</span> <span style="color:#66d9ef">row</span> affected (<span style="color:#ae81ff">0</span>.<span style="color:#ae81ff">00</span> sec)
</span></span><span style="display:flex;"><span>mysql<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">COMMIT</span>;
</span></span><span style="display:flex;"><span>ERROR <span style="color:#ae81ff">1062</span> (<span style="color:#ae81ff">23000</span>): Duplicate entry <span style="color:#e6db74">&#39;1&#39;</span> <span style="color:#66d9ef">for</span> <span style="color:#66d9ef">key</span> <span style="color:#e6db74">&#39;i&#39;</span>
</span></span></code></pre></div><p>As you can see, for INSERT statements, TiDB performs conflict detection at the time of transaction commit, whereas MySQL does it when the statement is executed. The reason for this is that TiDB is designed with a layered structure with TiKV; to ensure efficient execution, only read operations within a transaction must retrieve data from the storage engine, while all write operations are initially placed within the transaction&rsquo;s own <a href="https://github.com/pingcap/tidb/blob/ab332eba2a04bc0a996aa72e36190c779768d0f1/kv/memdb_buffer.go#L31">memDbBuffer</a> in a single TiDB instance. The data is then written to TiKV as a batch during transaction commit. In the implementation, the <a href="https://github.com/pingcap/tidb/blob/e28a81813cfd290296df32056d437ccd17f321fe/kv/kv.go#L23">PresumeKeyNotExists</a> option is set within <a href="https://github.com/pingcap/tidb/blob/5bdf34b9bba3fc4d3e50a773fa8e14d5fca166d5/executor/insert.go#L42:22">insertOneRow</a>, assuming that insertions will not encounter conflicts if no conflicts are detected locally, without needing to check for conflicting data in TiKV. These data are marked as pending verification, and the <code>BatchGet</code> interface is used during the commit process to batch check the whole transaction&rsquo;s pending data.</p>
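<p>The difference in timing can be illustrated with a minimal Go sketch. It is not TiDB&rsquo;s code: <code>store</code>, <code>txn</code>, <code>Insert</code>, and <code>Commit</code> are hypothetical names, and a plain map stands in for TiKV. Writes only land in a transaction-local buffer, keys are presumed not to exist, and the conflict check runs as one batch at commit.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// store is a stand-in for TiKV: a plain map of committed unique keys.
type store map[string]string

// txn buffers writes locally, loosely mirroring the role of memDbBuffer.
type txn struct {
	committed store
	buffer    map[string]string // pending writes, not yet in the store
	presumed  []string          // keys presumed not to exist, verified at commit
}

var errDuplicate = errors.New("duplicate entry")

func begin(s store) *txn {
	return &txn{committed: s, buffer: map[string]string{}}
}

// Insert checks only the local buffer; the store is not consulted here.
func (t *txn) Insert(key, value string) error {
	if _, ok := t.buffer[key]; ok {
		return errDuplicate
	}
	t.buffer[key] = value
	t.presumed = append(t.presumed, key)
	return nil
}

// Commit verifies every presumed key in one pass, then flushes the buffer.
func (t *txn) Commit() error {
	for _, key := range t.presumed {
		if _, exists := t.committed[key]; exists {
			return fmt.Errorf("%w: %q", errDuplicate, key)
		}
	}
	for key, value := range t.buffer {
		t.committed[key] = value
	}
	return nil
}

func main() {
	s := store{"i=1": "row1"} // the row inserted before BEGIN
	tx := begin(s)
	fmt.Println("insert:", tx.Insert("i=1", "row2")) // no error yet
	fmt.Println("commit:", tx.Commit())              // the conflict surfaces here
}
```

<p>In this sketch the second insert succeeds and the duplicate is only reported by <code>Commit</code>, which matches the TiDB session shown above.</p>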
<p>After all the data goes through <a href="https://github.com/pingcap/tidb/blob/5bdf34b9bba3fc4d3e50a773fa8e14d5fca166d5/executor/insert.go#L42:22">insertOneRow</a> and completes the insertion, the INSERT statement essentially concludes. The remaining tasks involve setting the lastInsertID and other return information, and then returning the results to the client.</p>
<h2 id="insert-ignore-statement">INSERT IGNORE Statement</h2>
<p>The semantics of <code>INSERT IGNORE</code> were introduced earlier. As described above, a standard INSERT checks for conflicts at commit time. Can <code>INSERT IGNORE</code> do the same? The answer is no, because:</p>
<ol>
<li>If <code>INSERT IGNORE</code> were checked at commit time, the transaction module would need to know which rows should be ignored and which should immediately raise errors and roll back, which would undoubtedly increase module coupling.</li>
<li>Users want to know immediately which rows <code>INSERT IGNORE</code> did not insert. In other words, they expect <code>SHOW WARNINGS</code> to list the skipped rows right after the statement runs.</li>
</ol>
<p>This requires checking data conflicts promptly when executing <code>INSERT IGNORE</code>. One obvious approach is to try reading the data intended for insertion, logging a warning when finding a conflict, and proceeding to the next row. However, if the statement inserts multiple rows, it would require repetitive reads from TiKV for conflict detection, which would be inefficient. Therefore, TiDB implements a <a href="https://github.com/pingcap/tidb/blob/3c0bfc19b252c129f918ab645c5e7d34d0c3d154/executor/batch_checker.go#L43:6">batchChecker</a>, with the code located in <a href="https://github.com/pingcap/tidb/blob/ab332eba2a04bc0a996aa72e36190c779768d0f1/executor/batch_checker.go">executor/batch_checker.go</a>.</p>
<p>The <a href="https://github.com/pingcap/tidb/blob/3c0bfc19b252c129f918ab645c5e7d34d0c3d154/executor/batch_checker.go#L43:6">batchChecker</a> first prepares the data for insertion and, in <a href="https://github.com/pingcap/tidb/blob/3c0bfc19b252c129f918ab645c5e7d34d0c3d154/executor/batch_checker.go#L85:24">getKeysNeedCheck</a>, builds a key for every unique constraint that might conflict. TiDB implements unique constraints by constructing unique keys, as detailed in <a href="https://cn.pingcap.com/blog/tidb-internal-2/">“Three Articles to Understand TiDB&rsquo;s Technical Inside Story – On Computation”</a>.</p>
<p>Then, it passes the constructed keys to <a href="https://github.com/pingcap/tidb/blob/c84a71d666b8732593e7a1f0ec3d9b730e50d7bf/kv/txn.go#L97:6">BatchGetValues</a> to read them all at once. The result is a key-value map in which every entry that was read back represents conflicting data.</p>
<p>Finally, check the keys of the data intended for insertion against the results from <a href="https://github.com/pingcap/tidb/blob/c84a71d666b8732593e7a1f0ec3d9b730e50d7bf/kv/txn.go#L97:6">BatchGetValues</a>. If a conflicting row is found, prepare a warning message and proceed to the next row. If a conflicting row isn’t found, a safe INSERT can proceed. The implementation of this portion is found in <a href="https://github.com/pingcap/tidb/blob/ab332eba2a04bc0a996aa72e36190c779768d0f1/executor/insert_common.go#L490:24">batchCheckAndInsert</a>.</p>
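<p>The three steps above can be sketched in Go as follows. This is a simplified illustration under stated assumptions rather than the actual <code>batchCheckAndInsert</code>: <code>kvStore</code>, <code>batchGet</code>, <code>uniqueKey</code>, and <code>insertIgnore</code> are hypothetical names, the key format is invented, and a map stands in for TiKV.</p>

```go
package main

import "fmt"

// kvStore stands in for TiKV.
type kvStore map[string]string

// batchGet mimics reading many keys in a single round trip.
func (s kvStore) batchGet(keys []string) map[string]string {
	found := make(map[string]string)
	for _, k := range keys {
		if v, ok := s[k]; ok {
			found[k] = v
		}
	}
	return found
}

// uniqueKey builds a hypothetical key for a unique index on column i.
func uniqueKey(i int) string { return fmt.Sprintf("t/unique_i/%d", i) }

// insertIgnore writes the values that do not conflict and returns one
// warning per skipped row.
func insertIgnore(s kvStore, values []int) (warnings []string) {
	// Step 1: build every key that might conflict.
	keys := make([]string, 0, len(values))
	for _, v := range values {
		keys = append(keys, uniqueKey(v))
	}
	// Step 2: read them all at once; whatever comes back is a conflict.
	conflicts := s.batchGet(keys)
	// Step 3: check each row against the result.
	for _, v := range values {
		k := uniqueKey(v)
		if _, dup := conflicts[k]; dup {
			warnings = append(warnings, fmt.Sprintf("Duplicate entry '%d' for key 'i'", v))
			continue
		}
		s[k] = fmt.Sprint(v)
		// Remember the new key so a repeated value later in the same
		// statement is also reported as a conflict.
		conflicts[k] = s[k]
	}
	return warnings
}

func main() {
	s := kvStore{uniqueKey(1): "1"}
	for _, w := range insertIgnore(s, []int{1, 2, 3}) {
		fmt.Println("warning:", w)
	}
	fmt.Println("rows:", len(s))
}
```

<p>However many rows the statement carries, the store is read only once, which is the point of batching.</p>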
<p>Similarly, after executing the insertion for all data, return information is set, and the execution results are returned to the client.</p>
<h2 id="insert-on-duplicate-key-update-statement">INSERT ON DUPLICATE KEY UPDATE Statement</h2>
<p><code>INSERT ON DUPLICATE KEY UPDATE</code> is the most complex among the INSERT statements. Semantically, it contains both an INSERT and an UPDATE. The complexity arises because an UPDATE can change a row to any valid value.</p>
<p>In the previous section, it was discussed how TiDB uses batching to implement conflict checking for special INSERT statements. The same method is used for <code>INSERT ON DUPLICATE KEY UPDATE</code>, but the implementation process is somewhat more complex due to the semantic complexity.</p>
<p>Initially, similar to <code>INSERT IGNORE</code>, the keys constructed from the data to be inserted are read out at once using <a href="https://github.com/pingcap/tidb/blob/c84a71d666b8732593e7a1f0ec3d9b730e50d7bf/kv/txn.go#L97:6">BatchGetValues</a>, resulting in a key-value map. Then, all records corresponding to the read keys are again read using a batch <a href="https://github.com/pingcap/tidb/blob/c84a71d666b8732593e7a1f0ec3d9b730e50d7bf/kv/txn.go#L97:6">BatchGetValues</a>, prepared for possible future UPDATE operations. The specific implementation is in <a href="https://github.com/pingcap/tidb/blob/3c0bfc19b252c129f918ab645c5e7d34d0c3d154/executor/batch_checker.go#L225:24">initDupOldRowValue</a>.</p>
<p>Then, during conflict checking, if a conflict occurs, an UPDATE is performed first. As discussed in the Basic INSERT section earlier, TiDB executes INSERT in TiKV during commit. Similarly, UPDATE is also executed in TiKV during commit. This UPDATE might itself run into a unique constraint conflict, in which case an error is returned. If the statement is <code>INSERT IGNORE ON DUPLICATE KEY UPDATE</code>, this error is ignored and execution proceeds to the next row.</p>
<p>In the UPDATE from the previous step, another scenario can occur, as in the SQL below:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> t (i INT <span style="color:#66d9ef">UNIQUE</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span> (<span style="color:#ae81ff">1</span>), (<span style="color:#ae81ff">1</span>) <span style="color:#66d9ef">ON</span> DUPLICATE <span style="color:#66d9ef">KEY</span> <span style="color:#66d9ef">UPDATE</span> i <span style="color:#f92672">=</span> i;
</span></span></code></pre></div><p>Here, the table clearly contains no data at first, so the INSERT in the second line cannot read any conflicting data from it. However, the two rows to be inserted conflict with each other. Correct execution should insert the first 1 normally, while the second 1 runs into the conflict and updates the first 1. This requires the following handling: remove the key-value pair of the row updated in the previous step from the initial key-value map, rebuild the unique constraint keys and values for the updated row based on the table information, and add these key-value pairs back into the initial map for subsequent conflict checking. The detailed implementation is in <a href="https://github.com/pingcap/tidb/blob/2fba9931c7ffbb6dd939d5b890508eaa21281b4f/executor/batch_checker.go#L232">fillBackKeys</a>. This scenario also arises in other statements such as <code>INSERT IGNORE</code>, <code>REPLACE</code>, and <code>LOAD DATA</code>. It is introduced here because <code>INSERT ON DUPLICATE KEY UPDATE</code> showcases the full functionality of the <code>batchChecker</code>.</p>
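<p>A minimal Go sketch of this bookkeeping follows. It is an illustration, not the real <code>fillBackKeys</code>: <code>table</code>, <code>upsert</code>, and <code>owner</code> are hypothetical names, the table has a single unique INT column, and the initial map is built from an in-memory map instead of a batch read from TiKV.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// table models a table with one unique INT column.
// rows maps a row handle to the column value.
type table struct {
	rows       map[int]int
	nextHandle int
}

var errDup = errors.New("duplicate entry")

// upsert sketches INSERT ... ON DUPLICATE KEY UPDATE for a list of values.
// update computes the new column value from the old one.
func (t *table) upsert(values []int, update func(old int) int) error {
	// Initial key-value map: unique value -> handle of the row that owns it.
	owner := make(map[int]int, len(t.rows))
	for h, v := range t.rows {
		owner[v] = h
	}
	for _, v := range values {
		h, conflict := owner[v]
		if !conflict {
			// No conflict: insert, and record the key for later rows.
			h = t.nextHandle
			t.nextHandle++
			t.rows[h] = v
			owner[v] = h
			continue
		}
		// Conflict: update the existing row instead of inserting.
		newV := update(t.rows[h])
		delete(owner, v) // drop the stale key of the updated row
		if _, taken := owner[newV]; taken {
			// The updated value collides with another row.
			return fmt.Errorf("%w: %d", errDup, newV)
		}
		t.rows[h] = newV
		owner[newV] = h // fill the rebuilt key back into the map
	}
	return nil
}

func main() {
	tbl := &table{rows: map[int]int{}}
	// INSERT INTO t VALUES (1), (1) ON DUPLICATE KEY UPDATE i = i;
	err := tbl.upsert([]int{1, 1}, func(old int) int { return old })
	fmt.Println(err, len(tbl.rows))
}
```

<p>Without the fill-back step, the second 1 would not see the first one and the sketch would insert two rows with the same unique value.</p>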
<p>Finally, after all data completes insertion/update, return information is set, and results are returned to the client.</p>
<h2 id="replace-statement">REPLACE Statement</h2>
<p>Although the REPLACE statement appears to be a separate type of DML, its syntax merely replaces INSERT with REPLACE in a standard <code>Basic INSERT</code>. The difference is that REPLACE is a one-to-many statement. Briefly, when a typical INSERT statement needs to insert a row and encounters a unique constraint conflict, it can be handled in several ways:</p>
<ul>
<li>Abandon the insert and return an error: <code>Basic INSERT</code></li>
<li>Abandon the insert without error: <code>INSERT IGNORE</code></li>
<li>Abandon the insert and update the conflicting row instead. If the updated value conflicts again:
<ul>
<li>Return an error: <code>INSERT ON DUPLICATE KEY UPDATE</code></li>
<li>No error: <code>INSERT IGNORE ON DUPLICATE KEY UPDATE</code></li>
</ul>
</li>
</ul>
<p>All of these deal with the case where the row to be inserted conflicts with one row in the table; they differ only in how they handle it. The REPLACE statement is distinct: it deletes all conflicting rows it encounters until there are no more conflicts, and then inserts the data. If there are 5 unique indexes in the table, there could be 5 rows conflicting with the row waiting to be inserted. The REPLACE statement will delete these 5 rows all at once and then insert its own data. See the SQL below:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> t (
</span></span><span style="display:flex;"><span>i int <span style="color:#66d9ef">unique</span>,
</span></span><span style="display:flex;"><span>j int <span style="color:#66d9ef">unique</span>,
</span></span><span style="display:flex;"><span>k int <span style="color:#66d9ef">unique</span>,
</span></span><span style="display:flex;"><span>l int <span style="color:#66d9ef">unique</span>,
</span></span><span style="display:flex;"><span>m int <span style="color:#66d9ef">unique</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span>
</span></span><span style="display:flex;"><span>(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">1</span>),
</span></span><span style="display:flex;"><span>(<span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">2</span>),
</span></span><span style="display:flex;"><span>(<span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">3</span>),
</span></span><span style="display:flex;"><span>(<span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">4</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">REPLACE</span> <span style="color:#66d9ef">INTO</span> t <span style="color:#66d9ef">VALUES</span> (<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">5</span>);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">SELECT</span> <span style="color:#f92672">*</span> <span style="color:#66d9ef">FROM</span> t;
</span></span><span style="display:flex;"><span>i j k l m
</span></span><span style="display:flex;"><span><span style="color:#ae81ff">1</span> <span style="color:#ae81ff">2</span> <span style="color:#ae81ff">3</span> <span style="color:#ae81ff">4</span> <span style="color:#ae81ff">5</span>
</span></span></code></pre></div><p>After execution, it actually affects 5 rows of data: the four existing rows are deleted and one new row is inserted.</p>
<p>Once we understand what makes the REPLACE statement special, its implementation is easier to follow.</p>
<p>Similar to the INSERT statement, the main execution part of the REPLACE statement is also in its Next method. Unlike INSERT, it passes its own <a href="https://github.com/pingcap/tidb/blob/f6dbad0f5c3cc42cafdfa00275abbd2197b8376b/executor/replace.go#L160">exec</a> method through <a href="https://github.com/pingcap/tidb/blob/ab332eba2a04bc0a996aa72e36190c779768d0f1/executor/insert_common.go#L277:24">insertRowsFromSelect</a> and <a href="https://github.com/pingcap/tidb/blob/ab332eba2a04bc0a996aa72e36190c779768d0f1/executor/insert_common.go#L180:24">insertRows</a>. In <a href="https://github.com/pingcap/tidb/blob/f6dbad0f5c3cc42cafdfa00275abbd2197b8376b/executor/replace.go#L160">exec</a>, it calls <a href="https://github.com/pingcap/tidb/blob/f6dbad0f5c3cc42cafdfa00275abbd2197b8376b/executor/replace.go#L95">replaceRow</a>, which also uses batch conflict detection in <a href="https://github.com/pingcap/tidb/blob/3c0bfc19b252c129f918ab645c5e7d34d0c3d154/executor/batch_checker.go#L43:6">batchChecker</a>. The difference from INSERT is that all detected conflicts are deleted here, and finally, the row to be inserted is written in.</p>
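<p>The delete-then-insert behavior can be sketched in Go using the five-column table from the example above. This is a toy model and not the real <code>replaceRow</code>: <code>row</code> and the function body are hypothetical, and a linear scan stands in for the batch conflict detection.</p>

```go
package main

import "fmt"

// row holds the five unique columns i, j, k, l, m of the example table.
type row [5]int

// replaceRow deletes every row that shares a value with newRow in any
// unique column, appends newRow, and returns the updated rows together
// with the affected-row count (deleted rows plus the inserted one).
func replaceRow(rows []row, newRow row) ([]row, int) {
	kept := make([]row, 0, len(rows)+1)
	deleted := 0
	for _, r := range rows {
		conflict := false
		for col := range r {
			if r[col] == newRow[col] {
				conflict = true
				break
			}
		}
		if conflict {
			deleted++
			continue
		}
		kept = append(kept, r)
	}
	return append(kept, newRow), deleted + 1
}

func main() {
	rows := []row{{1, 1, 1, 1, 1}, {2, 2, 2, 2, 2}, {3, 3, 3, 3, 3}, {4, 4, 4, 4, 4}}
	rows, affected := replaceRow(rows, row{1, 2, 3, 4, 5})
	fmt.Println(rows, affected) // prints [[1 2 3 4 5]] 5
}
```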
<h2 id="in-conclusion">In Conclusion</h2>
<p>The INSERT statement is among the most complex, versatile, and powerful of all DML statements. It includes statements like <code>INSERT ON DUPLICATE KEY UPDATE</code>, which can perform both INSERT and UPDATE operations, and REPLACE, where a single row of data can impact many rows. The INSERT statement itself can be connected to a SELECT statement as input for the data to be inserted, so its implementation is influenced by the planner (for more on the planner, see related source code reading articles: <a href="https://cn.pingcap.com/blog/tidb-source-code-reading-7/">Part 7: Rule-Based Optimization</a> and <a href="https://cn.pingcap.com/blog/tidb-source-code-reading-8/">Part 8: Cost-Based Optimization</a>). Familiarity with the implementation of various INSERT-related statements in TiDB can help readers use these statements more appropriately and efficiently in the future. Additionally, readers interested in contributing code to TiDB can also gain a quicker understanding of this part of the implementation through this article.</p>
]]></content:encoded>
    </item>
    <item>
      <title>How to Test CockroachDB Performance Using Benchmarksql</title>
      <link>https://blog.minifish.org/posts/how-to-test-cockroachdb-performance-using-benchmarksql/</link>
      <pubDate>Fri, 06 Jul 2018 21:21:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-test-cockroachdb-performance-using-benchmarksql/</guid>
      <description>&lt;h2 id=&#34;why-test-tpc-c&#34;&gt;Why Test TPC-C&lt;/h2&gt;
&lt;p&gt;First of all, TPC-C is the de facto OLTP benchmark standard. It is a set of specifications, and any database can publish its test results under this standard, so there&amp;rsquo;s no room for disputes over which testing tool was used.&lt;/p&gt;
&lt;p&gt;Secondly, TPC-C is closer to real-world scenarios because it includes a transaction model. This model mixes high-frequency simple transaction statements with low-frequency inventory query statements. Therefore, it tests the database more comprehensively and practically.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="why-test-tpc-c">Why Test TPC-C</h2>
<p>First of all, TPC-C is the de facto OLTP benchmark standard. It is a set of specifications, and any database can publish its test results under this standard, so there&rsquo;s no room for disputes over which testing tool was used.</p>
<p>Secondly, TPC-C is closer to real-world scenarios because it includes a transaction model. This model mixes high-frequency simple transaction statements with low-frequency inventory query statements. Therefore, it tests the database more comprehensively and practically.</p>
<h2 id="testing-tpc-c-on-cockroachdb">Testing TPC-C on CockroachDB</h2>
<p>This year, CockroachDB released its TPC-C performance results. Unfortunately, they did not test with a tool that the database industry recognizes as implementing the TPC-C standard. Instead, they used their own implementation of a TPC-C tool, whose compliance level has not been recognized. Their official white paper also mentions that this TPC-C cannot be compared with the TPC-C standard.</p>
<p>Therefore, I decided to test with a tool that is widely recognized in the industry, and chose Benchmarksql version 5.0.</p>
<p>Benchmarksql 5.0 supports the PostgreSQL protocol, Oracle protocol, and MySQL protocol (the MySQL protocol is supported in the code, but the author hasn&rsquo;t fully tested it, so the official documentation doesn&rsquo;t mention MySQL). Among these, the PostgreSQL protocol is supported by CockroachDB.</p>
<h3 id="test-preparation">Test Preparation</h3>
<p>After preparing the Benchmarksql code, don&rsquo;t rush into testing. There are three main pitfalls here that need to be addressed first.</p>
<ol>
<li>
<p><strong>CockroachDB does not support adding a primary key after table creation.</strong> Therefore, you need to include the primary key when creating the table. Specifically, in the <code>run</code> folder under the root directory of the Benchmarksql code, create a <code>sql.cdb</code> folder. Copy <code>tableCreates.sql</code> and <code>indexCreates.sql</code> from the <code>sql.common</code> folder at the same level into <code>sql.cdb</code>. Then move the primary keys in <code>indexCreates.sql</code> into the table creation statements in <code>tableCreates.sql</code>. For the syntax of defining indexes while creating tables, please refer to the database documentation.</p>
</li>
<li>
<p><strong>CockroachDB is a &ldquo;strongly typed&rdquo; database.</strong> This is my own way of describing it. It has a rather peculiar behavior: when you add different data types (e.g., int + float), it will report an error saying, &ldquo;InternalError: unsupported binary operator: &lt;int&gt; + &lt;float&gt;&rdquo;. Generally, databases don&rsquo;t behave like this; most would perform some implicit conversions, or in other words, they are very tolerant of SQL writers. But CockroachDB is unique in that if you don&rsquo;t specify the type, it reports an error. This greatly reduces the burden of type inference in its internal implementation.</p>
<p>This behavior causes Benchmarksql to fail to run the tests properly. The solution is to add the required type at the position where the error occurs. For example, change <code>update t set i = i + ?;</code> (the <code>?</code> is generally filled in using <code>prepare/execute</code>) to <code>update t set i = i + ?::DECIMAL;</code>. Yes, CockroachDB specifies types explicitly by adding <code>::&lt;type_name&gt;</code> at the end. But strangely, not all additions require type specification.</p>
</li>
<li>
<p><strong>CockroachDB does not support <code>SELECT FOR UPDATE</code>.</strong> This is the easiest to solve: comment out all <code>FOR UPDATE</code> clauses in Benchmarksql. CockroachDB itself supports the serializable isolation level; lacking <code>FOR UPDATE</code> doesn&rsquo;t affect consistency.</p>
</li>
</ol>
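<p>The second and third changes are plain text edits to the SQL statements. The Go sketch below only illustrates what those edits look like; it is not part of Benchmarksql (which is written in Java), <code>stripForUpdate</code> and <code>castAddend</code> are hypothetical helpers, and the <code>SELECT</code> statement is an invented example.</p>

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// forUpdate matches a FOR UPDATE clause, case-insensitively.
var forUpdate = regexp.MustCompile(`(?i)\s+FOR\s+UPDATE\b`)

// stripForUpdate removes FOR UPDATE clauses from a statement.
func stripForUpdate(sql string) string {
	return forUpdate.ReplaceAllString(sql, "")
}

// castAddend adds an explicit ::DECIMAL cast to a placeholder that is used
// as an addend. It is not idempotent, so apply it once per statement.
func castAddend(sql string) string {
	return strings.ReplaceAll(sql, "+ ?", "+ ?::DECIMAL")
}

func main() {
	fmt.Println(stripForUpdate("SELECT i FROM t WHERE i = ? FOR UPDATE"))
	fmt.Println(castAddend("update t set i = i + ?;"))
}
```

<p>Since not every addition needs a cast, it is safer to apply the cast only at the positions where CockroachDB actually reports the error.</p>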
<h3 id="starting-the-test">Starting the Test</h3>
<p>After overcoming the pitfalls mentioned above, you can proceed with the normal testing process: creating the database, creating tables and indexes, importing data, and testing. You can refer to Benchmarksql&rsquo;s <code>HOW-TO-RUN.txt</code>.</p>
<h3 id="test-results">Test Results</h3>
<p>On my test machine with 40 cores, 128 GB of memory, and SSD, under 100 warehouses, the tpmC is approximately 5,000. This is about one-tenth of PostgreSQL 10 on the same machine. PostgreSQL can reach around 500,000 tpmC.</p>
]]></content:encoded>
    </item>
    <item>
      <title>How to Test CockroachDB Performance Using Sysbench</title>
      <link>https://blog.minifish.org/posts/how-to-test-cockroachdb-performance-using-sysbench/</link>
      <pubDate>Mon, 11 Jun 2018 13:50:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-test-cockroachdb-performance-using-sysbench/</guid>
      <description>&lt;h2 id=&#34;compiling-sysbench-with-pgsql-support&#34;&gt;Compiling Sysbench with pgsql Support&lt;/h2&gt;
&lt;p&gt;CockroachDB uses the PostgreSQL protocol. If you want to use Sysbench for testing, you need to enable pg protocol support in Sysbench. Sysbench already supports the pg protocol, but it is not enabled by default during compilation. You can configure it with the following command:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;./configure --with-pgsql
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Of course, you first need to download the Sysbench source code and install the PostgreSQL header files required for compilation (you can install them with a package manager such as &lt;code&gt;yum&lt;/code&gt;, using &lt;code&gt;sudo&lt;/code&gt; if needed).&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="compiling-sysbench-with-pgsql-support">Compiling Sysbench with pgsql Support</h2>
<p>CockroachDB uses the PostgreSQL protocol. If you want to use Sysbench for testing, you need to enable pg protocol support in Sysbench. Sysbench already supports the pg protocol, but it is not enabled by default during compilation. You can configure it with the following command:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-shell" data-lang="shell"><span style="display:flex;"><span>./configure --with-pgsql
</span></span></code></pre></div><p>Of course, you first need to download the Sysbench source code and install the PostgreSQL header files required for compilation (you can install them with a package manager such as <code>yum</code>, using <code>sudo</code> if needed).</p>
<h2 id="testing">Testing</h2>
<p>The testing method is no different from testing MySQL or PostgreSQL; you can test any of the create, read, update, delete (CRUD) operations you like. The only thing to note is to set <code>auto_inc</code> to <code>off</code>.</p>
<p>This is because CockroachDB&rsquo;s auto-increment behavior is different from PostgreSQL&rsquo;s. It generates a unique <code>id</code>, but it does not guarantee that the <code>id</code>s are sequential or incremental. This is fine when inserting data. However, during delete, update, or query operations, since all SQL statements use <code>id</code> as the condition for these operations, you may encounter situations where data cannot be found.</p>
<p>That is:</p>
<p>When <code>auto_inc = on</code> (which is the default value in Sysbench)</p>
<h3 id="table-structure">Table Structure</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> sbtest1 (
</span></span><span style="display:flex;"><span>   id INT <span style="color:#66d9ef">NOT</span> <span style="color:#66d9ef">NULL</span> <span style="color:#66d9ef">DEFAULT</span> unique_rowid(),
</span></span><span style="display:flex;"><span>   k INTEGER <span style="color:#66d9ef">NOT</span> <span style="color:#66d9ef">NULL</span> <span style="color:#66d9ef">DEFAULT</span> <span style="color:#ae81ff">0</span>:::INT,
</span></span><span style="display:flex;"><span>   <span style="color:#66d9ef">c</span> STRING(<span style="color:#ae81ff">120</span>) <span style="color:#66d9ef">NOT</span> <span style="color:#66d9ef">NULL</span> <span style="color:#66d9ef">DEFAULT</span> <span style="color:#e6db74">&#39;&#39;</span>:::STRING,
</span></span><span style="display:flex;"><span>   <span style="color:#66d9ef">pad</span> STRING(<span style="color:#ae81ff">60</span>) <span style="color:#66d9ef">NOT</span> <span style="color:#66d9ef">NULL</span> <span style="color:#66d9ef">DEFAULT</span> <span style="color:#e6db74">&#39;&#39;</span>:::STRING,
</span></span><span style="display:flex;"><span>   <span style="color:#66d9ef">CONSTRAINT</span> <span style="color:#e6db74">&#34;</span><span style="color:#66d9ef">primary</span><span style="color:#e6db74">&#34;</span> <span style="color:#66d9ef">PRIMARY</span> <span style="color:#66d9ef">KEY</span> (id <span style="color:#66d9ef">ASC</span>),
</span></span><span style="display:flex;"><span>   <span style="color:#66d9ef">INDEX</span> k_1 (k <span style="color:#66d9ef">ASC</span>),
</span></span><span style="display:flex;"><span>   FAMILY <span style="color:#e6db74">&#34;</span><span style="color:#66d9ef">primary</span><span style="color:#e6db74">&#34;</span> (id, k, <span style="color:#66d9ef">c</span>, <span style="color:#66d9ef">pad</span>)
</span></span><span style="display:flex;"><span>)
</span></span></code></pre></div><h3 id="data">Data</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>root<span style="color:#f92672">@</span>:<span style="color:#ae81ff">26257</span><span style="color:#f92672">/</span>sbtest<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">SELECT</span> id <span style="color:#66d9ef">FROM</span> sbtest1 <span style="color:#66d9ef">ORDER</span> <span style="color:#66d9ef">BY</span> id <span style="color:#66d9ef">LIMIT</span> <span style="color:#ae81ff">1</span>;
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span>         id         <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------------+
</span></span></span><span style="display:flex;"><span><span style="color:#f92672">|</span> <span style="color:#ae81ff">354033003848892419</span> <span style="color:#f92672">|</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">+</span><span style="color:#75715e">--------------------+
</span></span></span></code></pre></div><p>As you can see, the data does not start from <code>1</code>, nor is it sequential. Normally, the <code>id</code> in a Sysbench table should be within the range <code>[1, table_size]</code>.</p>
<h3 id="sql">SQL</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">UPDATE</span> sbtest<span style="color:#f92672">%</span>u <span style="color:#66d9ef">SET</span> k <span style="color:#f92672">=</span> k <span style="color:#f92672">+</span> <span style="color:#ae81ff">1</span> <span style="color:#66d9ef">WHERE</span> id <span style="color:#f92672">=</span> <span style="color:#f92672">?</span>
</span></span></code></pre></div><p>Taking the <code>UPDATE</code> statement as an example, <code>id</code> is used as the query condition. Sysbench draws this <code>id</code> from the range <code>[1, table_size]</code>, but the <code>id</code>s generated by <code>unique_rowid()</code> fall far outside that range, so the statement matches no rows.</p>
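<p>To make the mismatch concrete, here is a minimal Python sketch (it is not Sysbench code). It assumes lookups draw <code>id</code> uniformly from <code>[1, table_size]</code>, as described above. The <code>rowid_like_ids</code> list is a hypothetical stand-in for values produced by <code>unique_rowid()</code>: it starts from the value shown in the query output, and the spacing between values is invented purely for illustration.</p>

```python
import random


def hit_rate(ids, table_size, probes=10_000, seed=0):
    """Fraction of lookups with id drawn from [1, table_size] that find a row."""
    rng = random.Random(seed)
    present = set(ids)
    hits = sum(1 for _ in range(probes) if rng.randint(1, table_size) in present)
    return hits / probes


table_size = 1000

# auto_inc = off: the benchmark supplies ids 1..table_size itself.
sequential_ids = range(1, table_size + 1)

# auto_inc = on with unique_rowid(): large, non-contiguous 64-bit values.
# The base is the id from the output above; the step is a made-up assumption.
base = 354033003848892419
rowid_like_ids = [base + i * 32768 for i in range(table_size)]

print(hit_rate(sequential_ids, table_size))  # every lookup finds a row
print(hit_rate(rowid_like_ids, table_size))  # no lookup finds a row
```

<p>In this sketch, lookups never match when the stored <code>id</code>s lie outside <code>[1, table_size]</code>. Statements that match no rows do less work than real updates, which is why throughput measured this way would not be meaningful.</p>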
<h3 id="example-of-correct-testing-command-line">Example of Correct Testing Command Line</h3>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-shell" data-lang="shell"><span style="display:flex;"><span>sysbench --db-driver<span style="color:#f92672">=</span>pgsql --pgsql-host<span style="color:#f92672">=</span>127.0.0.1 --pgsql-port<span style="color:#f92672">=</span><span style="color:#ae81ff">26257</span> --pgsql-user<span style="color:#f92672">=</span>root --pgsql-db<span style="color:#f92672">=</span>sbtest <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span>        --time<span style="color:#f92672">=</span><span style="color:#ae81ff">180</span> --threads<span style="color:#f92672">=</span><span style="color:#ae81ff">50</span> --report-interval<span style="color:#f92672">=</span><span style="color:#ae81ff">10</span> --tables<span style="color:#f92672">=</span><span style="color:#ae81ff">32</span> --table-size<span style="color:#f92672">=</span><span style="color:#ae81ff">10000000</span> <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span>        oltp_update_index <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span>        --sum_ranges<span style="color:#f92672">=</span><span style="color:#ae81ff">50</span> --distinct_ranges<span style="color:#f92672">=</span><span style="color:#ae81ff">50</span> --range_size<span style="color:#f92672">=</span><span style="color:#ae81ff">100</span> --simple_ranges<span style="color:#f92672">=</span><span style="color:#ae81ff">100</span> --order_ranges<span style="color:#f92672">=</span><span style="color:#ae81ff">100</span> <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span>        --index_updates<span style="color:#f92672">=</span><span style="color:#ae81ff">100</span> --non_index_updates<span style="color:#f92672">=</span><span style="color:#ae81ff">10</span> --auto_inc<span style="color:#f92672">=</span>off prepare/run/cleanup
</span></span></code></pre></div><h3 id="insert-testing">INSERT Testing</h3>
<p>The INSERT test deserves a separate discussion. It refers to Sysbench&rsquo;s <code>oltp_insert</code>. Its distinguishing behavior is that when <code>auto_inc</code> is <code>on</code>, data is inserted during the prepare phase; otherwise, only the table is created and no data is inserted. The reason is that when <code>auto_inc</code> is <code>on</code>, rows inserted during the run phase cannot conflict with the prepared data, because the auto-increment column guarantees distinct <code>id</code>s. When <code>auto_inc</code> is <code>off</code>, the <code>id</code> of each row inserted during the run phase is randomly assigned, which matches some real testing scenarios.</p>
<p>For CockroachDB, when testing INSERT operations with <code>auto_inc</code> set to <code>off</code>, after the prepare phase, during the run phase of data insertion, you can observe the monitoring metrics (by connecting to CockroachDB&rsquo;s HTTP port) under the &ldquo;Distribution&rdquo; section in &ldquo;KV Transactions&rdquo;. You&rsquo;ll notice a large number of &ldquo;Fast-path Committed&rdquo; transactions. This indicates that transactions are committed using one-phase commit (1PC). That is, the data involved in the transaction does not span across CockroachDB nodes, so there&rsquo;s no need to ensure consistency through two-phase commit transactions. This is an optimization in CockroachDB, which is very effective in INSERT tests and can deliver excellent performance.</p>
<p>With <code>auto_inc</code> set to <code>on</code>, results for the other tests, which need to read rows before writing them, may be inflated on CockroachDB, but the INSERT test remains fair. If time permits, you can run the tests with both settings to see the differences.</p>
]]></content:encoded>
    </item>
    <item>
      <title>How to View CMU DB Group&#39;s OLTP-Bench</title>
      <link>https://blog.minifish.org/posts/how-to-view-cmu-db-groups-oltp-bench/</link>
      <pubDate>Fri, 23 Feb 2018 00:00:00 +0000</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-view-cmu-db-groups-oltp-bench/</guid>
      <description>&lt;h2 id=&#34;introduction-to-oltp-bench&#34;&gt;Introduction to OLTP-Bench&lt;/h2&gt;
&lt;p&gt;OLTP-Bench is an open-source benchmarking platform for OLTP scenarios from CMU&amp;rsquo;s DB Group. It was designed to provide a simple, easy-to-use, and extensible testing platform.&lt;/p&gt;
&lt;p&gt;It connects to databases via the JDBC interface, supporting the following test suites:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;TPC-C&lt;/li&gt;
&lt;li&gt;Wikipedia&lt;/li&gt;
&lt;li&gt;Synthetic Resource Stresser&lt;/li&gt;
&lt;li&gt;Twitter&lt;/li&gt;
&lt;li&gt;Epinions.com&lt;/li&gt;
&lt;li&gt;TATP&lt;/li&gt;
&lt;li&gt;AuctionMark&lt;/li&gt;
&lt;li&gt;SEATS&lt;/li&gt;
&lt;li&gt;YCSB&lt;/li&gt;
&lt;li&gt;JPAB (Hibernate)&lt;/li&gt;
&lt;li&gt;CH-benCHmark&lt;/li&gt;
&lt;li&gt;Voter (Japanese &amp;ldquo;American Idol&amp;rdquo;)&lt;/li&gt;
&lt;li&gt;SIBench (Snapshot Isolation)&lt;/li&gt;
&lt;li&gt;SmallBank&lt;/li&gt;
&lt;li&gt;LinkBench&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Detailed project information can be found &lt;a href=&#34;http://db.cs.cmu.edu/projects/oltp-bench/&#34;&gt;here&lt;/a&gt;, and the GitHub page is &lt;a href=&#34;https://github.com/oltpbenchmark/oltpbench&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="introduction-to-oltp-bench">Introduction to OLTP-Bench</h2>
<p>OLTP-Bench is an open-source benchmarking platform for OLTP scenarios from CMU&rsquo;s DB Group. It was designed to provide a simple, easy-to-use, and extensible testing platform.</p>
<p>It connects to databases via the JDBC interface, supporting the following test suites:</p>
<ul>
<li>TPC-C</li>
<li>Wikipedia</li>
<li>Synthetic Resource Stresser</li>
<li>Twitter</li>
<li>Epinions.com</li>
<li>TATP</li>
<li>AuctionMark</li>
<li>SEATS</li>
<li>YCSB</li>
<li>JPAB (Hibernate)</li>
<li>CH-benCHmark</li>
<li>Voter (Japanese &ldquo;American Idol&rdquo;)</li>
<li>SIBench (Snapshot Isolation)</li>
<li>SmallBank</li>
<li>LinkBench</li>
</ul>
<p>Detailed project information can be found <a href="http://db.cs.cmu.edu/projects/oltp-bench/">here</a>, and the GitHub page is <a href="https://github.com/oltpbenchmark/oltpbench">here</a>.</p>
<p>The project introduction page includes three papers published by the authors, with the one from 2013 being the most important, also linked on the GitHub page.</p>
<p>Judging from the GitHub page, the project does not seem to attract much attention and has not been very active recently. Most issues and pull requests come from within CMU.</p>
<h2 id="oltp-bench-an-extensible-testbed-for-benchmarking-relational-databases">OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases</h2>
<p>The paper &ldquo;OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases&rdquo; can be regarded as the most detailed introduction to this project.</p>
<p>In the first and second chapters, the authors introduce the motivation for creating this framework: to integrate multiple test suites and provide features that simple benchmarking tools lack, while offering enough extensibility to attract developers to add support for more databases.</p>
<p>Judging from the activity on GitHub, this extensibility has been used more for adding database support than for adding test suites. That said, the set of supported test suites is already extensive.</p>
<p>Chapter three introduces the architectural design, with a focus on test suite management, load generators, SQL syntax conversion, multi-client scenarios (similar to multiple sysbench instances stressing a single MySQL), and result collection.</p>
<p>Chapter four discusses the supported test suites. I&rsquo;m only familiar with TPC-C and YCSB. The authors classify them from three perspectives:</p>
<ol>
<li>Transaction-focused, such as TPC-C and SmallBank</li>
<li>Internet applications, like LinkBench and Wikipedia</li>
<li>Specialized tests, such as YCSB and SIBench</li>
</ol>
<p>Further details can be seen in the table:
[table]</p>
<p>Chapter five describes the demo deployment environment, with subsequent sections introducing the demo&rsquo;s features.</p>
<p>Chapter six uses the demo from the previous chapter to introduce features, analyzed as follows:</p>
<ol>
<li>
<p>Rate control. This may seem odd for a benchmarking tool, since the conventional approach is to push load as high as possible to find the system&rsquo;s limits. The paper gives an example using the Wikipedia test suite, raising the target rate by 25 TPS every 10 seconds to observe how database latency changes.</p>
</li>
<li>
<p>Tagging different transactions in the same test suite so that statistics are collected separately. The paper uses TPC-C as an example, reporting statistics separately for transactions from different stages.</p>
</li>
<li>
<p>Modifying load content, like switching from read-only to write-only loads.</p>
</li>
<li>
<p>Changing the method for load randomness.</p>
</li>
<li>
<p>Monitoring server status alongside database monitoring by deploying an OLTP-Bench monitor on the server.</p>
</li>
<li>
<p>Running multiple test suites simultaneously, such as running TPC-C and YCSB concurrently.</p>
</li>
<li>
<p>Multi-client usage, mentioned in chapter three.</p>
</li>
<li>
<p>Repeatability. To show that OLTP-Bench results are genuine and reliable, the authors used the tool&rsquo;s SIBench to test the performance of PostgreSQL&rsquo;s SSI on similarly configured machines, and obtained results consistent with those in the PostgreSQL SSI paper.</p>
</li>
</ol>
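<p>The rate-control example in item 1 can be sketched as a stepwise ramp. The following Python is a minimal illustration, not OLTP-Bench&rsquo;s implementation: the function names <code>ramp_schedule</code> and <code>send_times</code> are hypothetical, and the starting rate of 25 TPS is an assumption (the text above only states the increment of 25 TPS every 10 seconds).</p>

```python
def ramp_schedule(start_tps, step_tps, step_seconds, total_seconds):
    """Return (start_time_s, target_tps) pairs for a stepwise ramp."""
    return [
        (t, start_tps + step_tps * (t // step_seconds))
        for t in range(0, total_seconds, step_seconds)
    ]


def send_times(schedule, step_seconds):
    """Evenly spaced submission times (in seconds) that realize the schedule."""
    times = []
    for start, tps in schedule:
        interval = 1.0 / tps  # gap between consecutive requests in this step
        times.extend(start + i * interval for i in range(tps * step_seconds))
    return times


# Assumed starting rate of 25 TPS, raised by 25 TPS every 10 seconds.
schedule = ramp_schedule(start_tps=25, step_tps=25, step_seconds=10, total_seconds=30)
print(schedule)  # [(0, 25), (10, 50), (20, 75)]

times = send_times(schedule, step_seconds=10)
print(len(times))  # 1500 requests over 30 seconds
```

<p>A load generator built this way submits requests at the scheduled times instead of as fast as possible, so the latency recorded at each step can be compared against a known offered load.</p>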
<p>In summary, rate control and transaction tagging stand out as novel features, while the rest are not particularly special.</p>
<p>Chapter seven is arguably the most valuable part of the paper. It discusses cloud environments, where users may only have database access and no control over the server. Because charges can cover CPU, storage, network, and, in some architectures, asynchronous sync, users may struggle to assess the cost-effectiveness of different cloud database services or configurations. Using a benchmarking tool to measure performance and then calculate cost-effectiveness is therefore particularly worthwhile. The chapter makes comparisons from several perspectives (different service providers, different configurations, and different databases on the same configuration) and presents the resulting cost-effectiveness figures.</p>
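<p>As a rough illustration of the cost-effectiveness calculation, the Python sketch below converts a measured throughput and an hourly price into transactions per dollar. All figures and instance names are hypothetical and are not taken from the paper; the metric itself is just one simple way to express cost-effectiveness.</p>

```python
def transactions_per_dollar(throughput_tps, hourly_cost_usd):
    """Transactions completed per dollar at a sustained throughput."""
    return throughput_tps * 3600 / hourly_cost_usd


# Hypothetical measurements: name -> (throughput in TPS, price in USD per hour)
offers = {
    "small-instance": (400.0, 0.25),
    "large-instance": (1200.0, 1.0),
}

ranked = sorted(
    offers,
    key=lambda name: transactions_per_dollar(*offers[name]),
    reverse=True,
)
for name in ranked:
    print(name, transactions_per_dollar(*offers[name]))
```

<p>With these made-up numbers, the smaller instance delivers more transactions per dollar even though its raw throughput is lower, which is the kind of conclusion that raw throughput alone would hide.</p>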
<p>In chapter eight, the authors compare OLTP-Bench with other similar tools, providing a favorable self-assessment.</p>
<p>Chapter nine outlines the authors’ future plans, including support for pure NoSQL, additional databases&rsquo; proprietary SQL syntax, generating real-world load distributions from production data, and support for stored procedures.</p>
<p>In conclusion, as the authors mentioned, this is an integrative framework where ease of use and extensibility are key.</p>
<h2 id="usage-summary">Usage Summary</h2>
<p>OLTP-Bench is relatively simple to install and use, especially to deploy. Being cross-platform, it offers a better user experience than the traditional tpcc and sysbench tools. Usage is straightforward thanks to the many test configuration templates provided: a test can be started after a few simple changes to a configuration file. The test results are stable, although certain features mentioned in the paper, such as server status monitoring, still need exploration.</p>
<p>I tested all 15 test suites on MySQL 5.7 and TiDB, obtaining the following results:
[table]</p>
<p>Its usability is evident. Extending it should also be relatively simple, since the whole OLTP-Bench project is not particularly large, at around 40,000 lines of code.</p>
<h2 id="other">Other</h2>
<ul>
<li>tpch: Although the framework&rsquo;s code appears to support tpch, it proved unusable in practice, likely because the implementation is incomplete, which would explain why it is not listed in the README.</li>
<li>Of the future work mentioned in chapter nine of the paper, &ldquo;generating load to match production data distribution&rdquo; in particular remains unimplemented, judging from the codebase.</li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>How to Implement MySQL X Protocol on TiDB</title>
      <link>https://blog.minifish.org/posts/how-to-implement-mysql-x-protocol-on-tidb/</link>
      <pubDate>Wed, 16 Aug 2017 00:00:00 +0000</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-implement-mysql-x-protocol-on-tidb/</guid>
      <description>&lt;h2 id=&#34;some-documents-on-mysql&#34;&gt;Some Documents on MySQL&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Client Usage Guide &lt;a href=&#34;https://dev.mysql.com/doc/refman/5.7/en/mysql-shell.html&#34;&gt;MySQL Shell User Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Server Configuration Guide &lt;a href=&#34;https://dev.mysql.com/doc/refman/5.7/en/document-store.html&#34;&gt;Using MySQL as a Document Store&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Application Development API Guide &lt;a href=&#34;https://dev.mysql.com/doc/x-devapi-userguide/en/&#34;&gt;X DevAPI User Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Introduction to Server Internal Implementation &lt;a href=&#34;https://dev.mysql.com/doc/internals/en/x-protocol.html&#34;&gt;X Protocol&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;implementation-principle&#34;&gt;Implementation Principle&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Communication between client and server is over TCP and the protocol uses protobuf.&lt;/li&gt;
&lt;li&gt;After the server receives a message, it decodes and analyzes it. The protocol includes a concept called namespace. If the namespace is empty or &amp;ldquo;sql&amp;rdquo;, the message content is executed as a SQL statement; if it is &amp;ldquo;xplugin&amp;rdquo; or &amp;ldquo;mysqlx&amp;rdquo;, the message is handled in another way. These other messages fall into two groups:
&lt;ul&gt;
&lt;li&gt;Administrative commands&lt;/li&gt;
&lt;li&gt;CRUD operations&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&amp;ldquo;xplugin&amp;rdquo; and &amp;ldquo;mysqlx&amp;rdquo; have the same function. &amp;ldquo;mysqlx&amp;rdquo; is the new name, and &amp;ldquo;xplugin&amp;rdquo; is retained temporarily for compatibility.&lt;/li&gt;
&lt;li&gt;Apart from explicit commands such as kill_client, most &amp;ldquo;mysqlx&amp;rdquo; messages are transformed into SQL statements that the server then processes, which effectively turns them into messages whose namespace is &amp;ldquo;sql&amp;rdquo;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;implementation-steps&#34;&gt;Implementation Steps&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Start a new server for TiDB. The relevant configuration parameters such as IP, port, and socket need to be set.&lt;/li&gt;
&lt;li&gt;Implement the reading and writing functionality for message communication.&lt;/li&gt;
&lt;li&gt;Implement the connection-establishment flow for this new server, including authentication, according to the protocol. Use tcpdump to capture the messages exchanged between MySQL and the client to derive the protocol content, and read the MySQL source code to guide the implementation.&lt;/li&gt;
&lt;li&gt;The new server should include components such as the Query Context from the original TiDB server, because most messages are ultimately translated into SQL for execution.&lt;/li&gt;
&lt;li&gt;Implement the decoding and handling of messages. Although this step takes only one sentence to describe, the work involved is substantial.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In &lt;code&gt;mysqlx_all_msgs.h&lt;/code&gt;, all messages are initialized&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="some-documents-on-mysql">Some Documents on MySQL</h2>
<ul>
<li>Client Usage Guide <a href="https://dev.mysql.com/doc/refman/5.7/en/mysql-shell.html">MySQL Shell User Guide</a></li>
<li>Server Configuration Guide <a href="https://dev.mysql.com/doc/refman/5.7/en/document-store.html">Using MySQL as a Document Store</a></li>
<li>Application Development API Guide <a href="https://dev.mysql.com/doc/x-devapi-userguide/en/">X DevAPI User Guide</a></li>
<li>Introduction to Server Internal Implementation <a href="https://dev.mysql.com/doc/internals/en/x-protocol.html">X Protocol</a>.</li>
</ul>
<h2 id="implementation-principle">Implementation Principle</h2>
<ul>
<li>Communication between client and server is over TCP and the protocol uses protobuf.</li>
<li>After the server receives a message, it decodes and analyzes it. The protocol includes a concept called namespace. If the namespace is empty or &ldquo;sql&rdquo;, the message content is executed as a SQL statement; if it is &ldquo;xplugin&rdquo; or &ldquo;mysqlx&rdquo;, the message is handled in another way. These other messages fall into two groups:
<ul>
<li>Administrative commands</li>
<li>CRUD operations</li>
</ul>
</li>
<li>&ldquo;xplugin&rdquo; and &ldquo;mysqlx&rdquo; have the same function. &ldquo;mysqlx&rdquo; is the new name, and &ldquo;xplugin&rdquo; is retained temporarily for compatibility.</li>
<li>Apart from explicit commands such as kill_client, most &ldquo;mysqlx&rdquo; messages are transformed into SQL statements that the server then processes, which effectively turns them into messages whose namespace is &ldquo;sql&rdquo;.</li>
</ul>
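<p>The namespace rules above can be sketched as a small dispatch function. This Python is only an illustration of the routing logic described in the list, not code from MySQL or TiDB: <code>dispatch</code>, the handler labels, and the <code>DIRECT_COMMANDS</code> set are hypothetical names, and <code>create_collection</code> is used merely as an example of a command that is translated into SQL.</p>

```python
# Commands handled directly instead of being rewritten as SQL.
# Only kill_client is named in the text; the real set may be larger.
DIRECT_COMMANDS = {"kill_client"}


def dispatch(namespace, stmt):
    """Decide how a statement is handled based on its namespace."""
    if namespace in ("", "sql"):
        # Empty or "sql": run the content as a plain SQL statement.
        return ("execute_sql", stmt)
    if namespace in ("xplugin", "mysqlx"):
        # "xplugin" is the old name kept for compatibility.
        if stmt in DIRECT_COMMANDS:
            return ("run_command", stmt)
        # Everything else is assumed to be rewritten into SQL first.
        return ("translate_to_sql", stmt)
    raise ValueError(f"unknown namespace: {namespace!r}")


print(dispatch("", "SELECT 1"))
print(dispatch("mysqlx", "kill_client"))
print(dispatch("xplugin", "create_collection"))
```

<p>Keeping the routing decision in one place like this makes it clear that the new server mostly needs a translation layer in front of the existing SQL execution path.</p>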
<h2 id="implementation-steps">Implementation Steps</h2>
<ol>
<li>Start a new server for TiDB. The relevant configuration parameters such as IP, port, and socket need to be set.</li>
<li>Implement the reading and writing functionality for message communication.</li>
<li>Implement the connection-establishment flow for this new server, including authentication, according to the protocol. Use tcpdump to capture the messages exchanged between MySQL and the client to derive the protocol content, and read the MySQL source code to guide the implementation.</li>
<li>The new server should include components such as the Query Context from the original TiDB server, because most messages are ultimately translated into SQL for execution.</li>
<li>Implement the decoding and handling of messages. Although this step takes only one sentence to describe, the work involved is substantial.</li>
</ol>
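<p>Step 2, reading and writing messages, can be sketched as follows. This Python is a minimal illustration rather than the TiDB implementation. It assumes the X Protocol framing of a 4-byte little-endian length, a 1-byte message type, and then the protobuf payload, with the length counting the type byte plus the payload; verify this against the X Protocol documentation before relying on it. The function names are hypothetical, and the payload is treated as opaque bytes instead of a real protobuf message.</p>

```python
import io
import struct


def write_frame(stream, msg_type, payload):
    """Write one frame: 4-byte little-endian length, 1-byte type, payload."""
    # Assumption: the length field counts the type byte plus the payload.
    stream.write(struct.pack("<IB", len(payload) + 1, msg_type))
    stream.write(payload)


def read_exact(stream, n):
    """Read exactly n bytes, or raise if the stream ends early."""
    data = b""
    while len(data) < n:
        chunk = stream.read(n - len(data))
        if not chunk:
            raise EOFError("stream closed in the middle of a frame")
        data += chunk
    return data


def read_frame(stream):
    """Read one frame and return (msg_type, payload)."""
    (length,) = struct.unpack("<I", read_exact(stream, 4))
    if length < 1:
        raise ValueError("frame too short to hold a message type")
    body = read_exact(stream, length)
    return body[0], body[1:]


buf = io.BytesIO()
write_frame(buf, 1, b"abc")  # type id 1 and the payload are arbitrary demo values
buf.seek(0)
print(read_frame(buf))
```

<p>On a real connection the stream would be a socket file object, and the payload returned by <code>read_frame</code> would be handed to the protobuf decoder selected by the message type.</p>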
<p>In <code>mysqlx_all_msgs.h</code>, all messages are registered in <code>init_message_factory()</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>  init_message_factory()
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    server_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Connection<span style="color:#f92672">::</span>Capabilities<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ServerMessages<span style="color:#f92672">::</span>CONN_CAPABILITIES, <span style="color:#e6db74">&#34;CONN_CAPABILITIES&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Connection.Capabilities&#34;</span>);
</span></span><span style="display:flex;"><span>    server_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Error<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ServerMessages<span style="color:#f92672">::</span>ERROR, <span style="color:#e6db74">&#34;ERROR&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Error&#34;</span>);
</span></span><span style="display:flex;"><span>    server_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Notice<span style="color:#f92672">::</span>Frame<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ServerMessages<span style="color:#f92672">::</span>NOTICE, <span style="color:#e6db74">&#34;NOTICE&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Notice.Frame&#34;</span>);
</span></span><span style="display:flex;"><span>    server_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Ok<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ServerMessages<span style="color:#f92672">::</span>OK, <span style="color:#e6db74">&#34;OK&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Ok&#34;</span>);
</span></span><span style="display:flex;"><span>    server_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Resultset<span style="color:#f92672">::</span>ColumnMetaData<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ServerMessages<span style="color:#f92672">::</span>RESULTSET_COLUMN_META_DATA, <span style="color:#e6db74">&#34;RESULTSET_COLUMN_META_DATA&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Resultset.ColumnMetaData&#34;</span>);
</span></span><span style="display:flex;"><span>    server_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Resultset<span style="color:#f92672">::</span>FetchDone<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ServerMessages<span style="color:#f92672">::</span>RESULTSET_FETCH_DONE, <span style="color:#e6db74">&#34;RESULTSET_FETCH_DONE&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Resultset.FetchDone&#34;</span>);
</span></span><span style="display:flex;"><span>    server_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Resultset<span style="color:#f92672">::</span>FetchDoneMoreResultsets<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ServerMessages<span style="color:#f92672">::</span>RESULTSET_FETCH_DONE_MORE_RESULTSETS, <span style="color:#e6db74">&#34;RESULTSET_FETCH_DONE_MORE_RESULTSETS&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Resultset.FetchDoneMoreResultsets&#34;</span>);
</span></span><span style="display:flex;"><span>    server_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Resultset<span style="color:#f92672">::</span>Row<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ServerMessages<span style="color:#f92672">::</span>RESULTSET_ROW, <span style="color:#e6db74">&#34;RESULTSET_ROW&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Resultset.Row&#34;</span>);
</span></span><span style="display:flex;"><span>    server_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Session<span style="color:#f92672">::</span>AuthenticateOk<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ServerMessages<span style="color:#f92672">::</span>SESS_AUTHENTICATE_OK, <span style="color:#e6db74">&#34;SESS_AUTHENTICATE_OK&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Session.AuthenticateOk&#34;</span>);
</span></span><span style="display:flex;"><span>    server_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Sql<span style="color:#f92672">::</span>StmtExecuteOk<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ServerMessages<span style="color:#f92672">::</span>SQL_STMT_EXECUTE_OK, <span style="color:#e6db74">&#34;SQL_STMT_EXECUTE_OK&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Sql.StmtExecuteOk&#34;</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Connection<span style="color:#f92672">::</span>CapabilitiesGet<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CON_CAPABILITIES_GET, <span style="color:#e6db74">&#34;CON_CAPABILITIES_GET&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Connection.CapabilitiesGet&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Connection<span style="color:#f92672">::</span>CapabilitiesSet<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CON_CAPABILITIES_SET, <span style="color:#e6db74">&#34;CON_CAPABILITIES_SET&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Connection.CapabilitiesSet&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Connection<span style="color:#f92672">::</span>Close<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CON_CLOSE, <span style="color:#e6db74">&#34;CON_CLOSE&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Connection.Close&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>Delete<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_DELETE, <span style="color:#e6db74">&#34;CRUD_DELETE&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Crud.Delete&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>Find<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_FIND, <span style="color:#e6db74">&#34;CRUD_FIND&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Crud.Find&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>Insert<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_INSERT, <span style="color:#e6db74">&#34;CRUD_INSERT&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Crud.Insert&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>Update<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_UPDATE, <span style="color:#e6db74">&#34;CRUD_UPDATE&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Crud.Update&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>CreateView<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_CREATE_VIEW, <span style="color:#e6db74">&#34;CRUD_CREATE_VIEW&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Crud.CreateView&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>ModifyView<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_MODIFY_VIEW, <span style="color:#e6db74">&#34;CRUD_MODIFY_VIEW&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Crud.ModifyView&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>DropView<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_DROP_VIEW, <span style="color:#e6db74">&#34;CRUD_DROP_VIEW&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Crud.DropView&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Expect<span style="color:#f92672">::</span>Close<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>EXPECT_CLOSE, <span style="color:#e6db74">&#34;EXPECT_CLOSE&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Expect.Close&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Expect<span style="color:#f92672">::</span>Open<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>EXPECT_OPEN, <span style="color:#e6db74">&#34;EXPECT_OPEN&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Expect.Open&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Session<span style="color:#f92672">::</span>AuthenticateContinue<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>SESS_AUTHENTICATE_CONTINUE, <span style="color:#e6db74">&#34;SESS_AUTHENTICATE_CONTINUE&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Session.AuthenticateContinue&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Session<span style="color:#f92672">::</span>AuthenticateStart<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>SESS_AUTHENTICATE_START, <span style="color:#e6db74">&#34;SESS_AUTHENTICATE_START&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Session.AuthenticateStart&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Session<span style="color:#f92672">::</span>Close<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>SESS_CLOSE, <span style="color:#e6db74">&#34;SESS_CLOSE&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Session.Close&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Session<span style="color:#f92672">::</span>Reset<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>SESS_RESET, <span style="color:#e6db74">&#34;SESS_RESET&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Session.Reset&#34;</span>);
</span></span><span style="display:flex;"><span>    client_message<span style="color:#f92672">&lt;</span>Mysqlx<span style="color:#f92672">::</span>Sql<span style="color:#f92672">::</span>StmtExecute<span style="color:#f92672">&gt;</span>(Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>SQL_STMT_EXECUTE, <span style="color:#e6db74">&#34;SQL_STMT_EXECUTE&#34;</span>, <span style="color:#e6db74">&#34;Mysqlx.Sql.StmtExecute&#34;</span>);
</span></span><span style="display:flex;"><span>  }
</span></span></code></pre></div><p>That covers all the server and client messages. Client messages are dispatched in <code>xpl_dispatcher.cc</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>ngs<span style="color:#f92672">::</span>Error_code do_dispatch_command(xpl<span style="color:#f92672">::</span>Session <span style="color:#f92672">&amp;</span>session, xpl<span style="color:#f92672">::</span>Crud_command_handler <span style="color:#f92672">&amp;</span>crudh,
</span></span><span style="display:flex;"><span>                                    xpl<span style="color:#f92672">::</span>Expectation_stack <span style="color:#f92672">&amp;</span>expect, ngs<span style="color:#f92672">::</span>Request <span style="color:#f92672">&amp;</span>command)
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">switch</span> (command.get_type())
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">case</span> Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>SQL_STMT_EXECUTE:
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> on_stmt_execute(session, <span style="color:#66d9ef">static_cast</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">const</span> Mysqlx<span style="color:#f92672">::</span>Sql<span style="color:#f92672">::</span>StmtExecute<span style="color:#f92672">&amp;&gt;</span>(<span style="color:#f92672">*</span>command.message()));
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">case</span> Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_FIND:
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> crudh.execute_crud_find(session, <span style="color:#66d9ef">static_cast</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">const</span> Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>Find<span style="color:#f92672">&amp;&gt;</span>(<span style="color:#f92672">*</span>command.message()));
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">case</span> Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_INSERT:
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> crudh.execute_crud_insert(session, <span style="color:#66d9ef">static_cast</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">const</span> Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>Insert<span style="color:#f92672">&amp;&gt;</span>(<span style="color:#f92672">*</span>command.message()));
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">case</span> Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_UPDATE:
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> crudh.execute_crud_update(session, <span style="color:#66d9ef">static_cast</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">const</span> Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>Update<span style="color:#f92672">&amp;&gt;</span>(<span style="color:#f92672">*</span>command.message()));
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">case</span> Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_DELETE:
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> crudh.execute_crud_delete(session, <span style="color:#66d9ef">static_cast</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">const</span> Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>Delete<span style="color:#f92672">&amp;&gt;</span>(<span style="color:#f92672">*</span>command.message()));
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">case</span> Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_CREATE_VIEW:
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> crudh.execute_create_view(session, <span style="color:#66d9ef">static_cast</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">const</span> Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>CreateView<span style="color:#f92672">&amp;&gt;</span>(<span style="color:#f92672">*</span>command.message()));
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">case</span> Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_MODIFY_VIEW:
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> crudh.execute_modify_view(session, <span style="color:#66d9ef">static_cast</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">const</span> Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>ModifyView<span style="color:#f92672">&amp;&gt;</span>(<span style="color:#f92672">*</span>command.message()));
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">case</span> Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>CRUD_DROP_VIEW:
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> crudh.execute_drop_view(session, <span style="color:#66d9ef">static_cast</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">const</span> Mysqlx<span style="color:#f92672">::</span>Crud<span style="color:#f92672">::</span>DropView<span style="color:#f92672">&amp;&gt;</span>(<span style="color:#f92672">*</span>command.message()));
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">case</span> Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>EXPECT_OPEN:
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> on_expect_open(session, expect, <span style="color:#66d9ef">static_cast</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">const</span> Mysqlx<span style="color:#f92672">::</span>Expect<span style="color:#f92672">::</span>Open<span style="color:#f92672">&amp;&gt;</span>(<span style="color:#f92672">*</span>command.message()));
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">case</span> Mysqlx<span style="color:#f92672">::</span>ClientMessages<span style="color:#f92672">::</span>EXPECT_CLOSE:
</span></span><span style="display:flex;"><span>      <span style="color:#66d9ef">return</span> on_expect_close(session, expect, <span style="color:#66d9ef">static_cast</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">const</span> Mysqlx<span style="color:#f92672">::</span>Expect<span style="color:#f92672">::</span>Close<span style="color:#f92672">&amp;&gt;</span>(<span style="color:#f92672">*</span>command.message()));
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  session.proto().get_protocol_monitor().on_error_unknown_msg_type();
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> ngs<span style="color:#f92672">::</span>Error(ER_UNKNOWN_COM_ERROR, <span style="color:#e6db74">&#34;Unexpected message received&#34;</span>);
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The rest of the call chain fills in the gaps:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>Client::run =&gt; Client::handle_message =&gt; Session::handle_message =&gt; Session::handle_auth_message =&gt; some auth handlers
</span></span><span style="display:flex;"><span>                                                                 =&gt; Session::handle_ready_message =&gt; xpl::dispatcher::dispatch_command =&gt; ngs::Error_code do_dispatch_command =&gt; some crud handlers
</span></span></code></pre></div><p>Mapping between MySQL types and X Protocol types:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>//     ================= ============ ======= ========== ====== ========
</span></span><span style="display:flex;"><span>//     SQL Type          .type        .length .frac_dig  .flags .charset
</span></span><span style="display:flex;"><span>//     ================= ============ ======= ========== ====== ========
</span></span><span style="display:flex;"><span>//     TINY              SINT         x
</span></span><span style="display:flex;"><span>//     TINY UNSIGNED     UINT         x                  x
</span></span><span style="display:flex;"><span>//     SHORT             SINT         x
</span></span><span style="display:flex;"><span>//     SHORT UNSIGNED    UINT         x                  x
</span></span><span style="display:flex;"><span>//     INT24             SINT         x
</span></span><span style="display:flex;"><span>//     INT24 UNSIGNED    UINT         x                  x
</span></span><span style="display:flex;"><span>//     INT               SINT         x
</span></span><span style="display:flex;"><span>//     INT UNSIGNED      UINT         x                  x
</span></span><span style="display:flex;"><span>//     LONGLONG          SINT         x
</span></span><span style="display:flex;"><span>//     LONGLONG UNSIGNED UINT         x                  x
</span></span><span style="display:flex;"><span>//     DOUBLE            DOUBLE       x       x          x
</span></span><span style="display:flex;"><span>//     FLOAT             FLOAT        x       x          x
</span></span><span style="display:flex;"><span>//     DECIMAL           DECIMAL      x       x          x
</span></span><span style="display:flex;"><span>//     VARCHAR,CHAR,...  BYTES        x                  x      x
</span></span><span style="display:flex;"><span>//     GEOMETRY          BYTES
</span></span><span style="display:flex;"><span>//     TIME              TIME         x
</span></span><span style="display:flex;"><span>//     DATE              DATETIME     x
</span></span><span style="display:flex;"><span>//     DATETIME          DATETIME     x
</span></span><span style="display:flex;"><span>//     YEAR              UINT         x                  x
</span></span><span style="display:flex;"><span>//     TIMESTAMP         DATETIME     x
</span></span><span style="display:flex;"><span>//     SET               SET                                    x
</span></span><span style="display:flex;"><span>//     ENUM              ENUM                                   x
</span></span><span style="display:flex;"><span>//     NULL              BYTES
</span></span><span style="display:flex;"><span>//     BIT               BIT          x
</span></span><span style="display:flex;"><span>//     ================= ============ ======= ========== ====== ========
</span></span></code></pre></div><p>Field information that MySQL returns for the first SQL statement:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>Field   1:  `@@lower_case_table_names`
</span></span><span style="display:flex;"><span>Catalog:    `def`
</span></span><span style="display:flex;"><span>Database:   ``
</span></span><span style="display:flex;"><span>Table:      ``
</span></span><span style="display:flex;"><span>Org_table:  ``
</span></span><span style="display:flex;"><span>Type:       LONGLONG
</span></span><span style="display:flex;"><span>Collation:  binary (63)
</span></span><span style="display:flex;"><span>Length:     21
</span></span><span style="display:flex;"><span>Max_length: 1
</span></span><span style="display:flex;"><span>Decimals:   0
</span></span><span style="display:flex;"><span>Flags:      UNSIGNED BINARY NUM 
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>Field   2:  `connection_id()`
</span></span><span style="display:flex;"><span>Catalog:    `def`
</span></span><span style="display:flex;"><span>Database:   ``
</span></span><span style="display:flex;"><span>Table:      ``
</span></span><span style="display:flex;"><span>Org_table:  ``
</span></span><span style="display:flex;"><span>Type:       LONGLONG
</span></span><span style="display:flex;"><span>Collation:  binary (63)
</span></span><span style="display:flex;"><span>Length:     21
</span></span><span style="display:flex;"><span>Max_length: 1
</span></span><span style="display:flex;"><span>Decimals:   0
</span></span><span style="display:flex;"><span>Flags:      NOT_NULL UNSIGNED BINARY NUM 
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>Field   3:  `variable_value`
</span></span><span style="display:flex;"><span>Catalog:    `def`
</span></span><span style="display:flex;"><span>Database:   `performance_schema`
</span></span><span style="display:flex;"><span>Table:      `session_status`
</span></span><span style="display:flex;"><span>Org_table:  `session_status`
</span></span><span style="display:flex;"><span>Type:       VAR_STRING
</span></span><span style="display:flex;"><span>Collation:  utf8_general_ci (33)
</span></span><span style="display:flex;"><span>Length:     3072
</span></span><span style="display:flex;"><span>Max_length: 0
</span></span><span style="display:flex;"><span>Decimals:   0
</span></span><span style="display:flex;"><span>Flags:      
</span></span></code></pre></div><p>For TiDB:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>Field   1:  `@@lower_case_table_names`
</span></span><span style="display:flex;"><span>Catalog:    `def`
</span></span><span style="display:flex;"><span>Database:   ``
</span></span><span style="display:flex;"><span>Table:      ``
</span></span><span style="display:flex;"><span>Org_table:  ``
</span></span><span style="display:flex;"><span>Type:       STRING
</span></span><span style="display:flex;"><span>Collation:  ? (0)
</span></span><span style="display:flex;"><span>Length:     0
</span></span><span style="display:flex;"><span>Max_length: 1
</span></span><span style="display:flex;"><span>Decimals:   31
</span></span><span style="display:flex;"><span>Flags:      
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>Field   2:  `connection_id()`
</span></span><span style="display:flex;"><span>Catalog:    `def`
</span></span><span style="display:flex;"><span>Database:   ``
</span></span><span style="display:flex;"><span>Table:      ``
</span></span><span style="display:flex;"><span>Org_table:  ``
</span></span><span style="display:flex;"><span>Type:       LONGLONG
</span></span><span style="display:flex;"><span>Collation:  binary (63)
</span></span><span style="display:flex;"><span>Length:     20
</span></span><span style="display:flex;"><span>Max_length: 1
</span></span><span style="display:flex;"><span>Decimals:   0
</span></span><span style="display:flex;"><span>Flags:      UNSIGNED BINARY NUM 
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>Field   3:  `variable_value`
</span></span><span style="display:flex;"><span>Catalog:    `def`
</span></span><span style="display:flex;"><span>Database:   ``
</span></span><span style="display:flex;"><span>Table:      ``
</span></span><span style="display:flex;"><span>Org_table:  ``
</span></span><span style="display:flex;"><span>Type:       STRING
</span></span><span style="display:flex;"><span>Collation:  utf8_general_ci (33)
</span></span><span style="display:flex;"><span>Length:     1024
</span></span><span style="display:flex;"><span>Max_length: 0
</span></span><span style="display:flex;"><span>Decimals:   0
</span></span><span style="display:flex;"><span>Flags:      
</span></span></code></pre></div>]]></content:encoded>
    </item>
    <item>
      <title>How to Practice Using SQL</title>
      <link>https://blog.minifish.org/posts/how-to-practice-using-sql/</link>
      <pubDate>Thu, 05 Jun 2014 00:00:00 +0000</pubDate>
      <guid>https://blog.minifish.org/posts/how-to-practice-using-sql/</guid>
      <description>&lt;p&gt;The text provided is a detailed set of instructions and queries for practicing SQL using PostgreSQL 9.4 BETA 2, focusing on creating and querying tables related to students, courses, scores, and teachers. Here&amp;rsquo;s a summary:&lt;/p&gt;
&lt;h2 id=&#34;database-structure&#34;&gt;Database Structure&lt;/h2&gt;
&lt;p&gt;The database consists of four tables:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;STUDENT&lt;/strong&gt;: Contains student number (SNO), name (SNAME), gender (SSEX), birthday (SBIRTHDAY), and class (CLASS).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;COURSE&lt;/strong&gt;: Includes course number (CNO), name (CNAME), and teacher number (TNO).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SCORE&lt;/strong&gt;: Records student number (SNO), course number (CNO), and degree (DEGREE).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;TEACHER&lt;/strong&gt;: Holds teacher number (TNO), name (TNAME), gender (TSEX), birthday (TBIRTHDAY), professional title (PROF), and department (DEPART).&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;sample-data&#34;&gt;Sample Data&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Students such as Zeng Hua, Kang Ming, and Wang Fang are stored with specific details, including their class and gender.&lt;/li&gt;
&lt;li&gt;Courses like &amp;ldquo;Introduction to Computers&amp;rdquo; and &amp;ldquo;Operating Systems&amp;rdquo; are associated with teacher numbers.&lt;/li&gt;
&lt;li&gt;Scores are recorded for students across various courses.&lt;/li&gt;
&lt;li&gt;Teachers are described with their professional roles and departments.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;query-problems&#34;&gt;Query Problems&lt;/h2&gt;
&lt;p&gt;Several SQL queries are suggested for practice, such as:&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>The text provided is a detailed set of instructions and queries for practicing SQL using PostgreSQL 9.4 BETA 2, focusing on creating and querying tables related to students, courses, scores, and teachers. Here&rsquo;s a summary:</p>
<h2 id="database-structure">Database Structure</h2>
<p>The database consists of four tables:</p>
<ol>
<li><strong>STUDENT</strong>: Contains student number (SNO), name (SNAME), gender (SSEX), birthday (SBIRTHDAY), and class (CLASS).</li>
<li><strong>COURSE</strong>: Includes course number (CNO), name (CNAME), and teacher number (TNO).</li>
<li><strong>SCORE</strong>: Records student number (SNO), course number (CNO), and degree (DEGREE).</li>
<li><strong>TEACHER</strong>: Holds teacher number (TNO), name (TNAME), gender (TSEX), birthday (TBIRTHDAY), professional title (PROF), and department (DEPART).</li>
</ol>
<h2 id="sample-data">Sample Data</h2>
<ul>
<li>Students such as Zeng Hua, Kang Ming, and Wang Fang are stored with specific details, including their class and gender.</li>
<li>Courses like &ldquo;Introduction to Computers&rdquo; and &ldquo;Operating Systems&rdquo; are associated with teacher numbers.</li>
<li>Scores are recorded for students across various courses.</li>
<li>Teachers are described with their professional roles and departments.</li>
</ul>
<h2 id="query-problems">Query Problems</h2>
<p>Several SQL queries are suggested for practice, such as:</p>
<ul>
<li>Extracting specific columns like SNAME, SSEX, and CLASS from the STUDENT table.</li>
<li>Listing distinct departments for teachers.</li>
<li>Calculating and sorting grades within the SCORE table.</li>
<li>Performing database operations to find student averages, count of students per class, and comparing scores.</li>
</ul>
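<p>As a rough sketch of these problems, the queries below use only the table and column names listed above. The aliases <code>AVG_DEGREE</code> and <code>STUDENT_COUNT</code> and the chosen sort order are illustrative assumptions, not the original answers.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>-- Extract specific columns from STUDENT.
</span></span><span style="display:flex;"><span>SELECT SNAME, SSEX, CLASS FROM STUDENT;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>-- List each teacher department once.
</span></span><span style="display:flex;"><span>SELECT DISTINCT DEPART FROM TEACHER;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>-- Sort scores by course, highest grade first (assumed ordering).
</span></span><span style="display:flex;"><span>SELECT * FROM SCORE ORDER BY CNO ASC, DEGREE DESC;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>-- Average grade per student.
</span></span><span style="display:flex;"><span>SELECT SNO, AVG(DEGREE) AS AVG_DEGREE FROM SCORE GROUP BY SNO;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>-- Number of students per class.
</span></span><span style="display:flex;"><span>SELECT CLASS, COUNT(*) AS STUDENT_COUNT FROM STUDENT GROUP BY CLASS;
</span></span></code></pre></div>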
<h2 id="advanced-query-exercises">Advanced Query Exercises</h2>
<ul>
<li>Performing set operations and conditional joins to answer complex questions like finding students who scored more than others or comparing teachers&rsquo; scores.</li>
<li>Use of SQL functions like <code>DATE_PART</code>, subqueries, and unions to gather specific data.</li>
</ul>
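<p>A minimal sketch of these techniques, assuming <code>SBIRTHDAY</code> and <code>TBIRTHDAY</code> are date or timestamp columns and that the name and gender columns of the two tables have compatible types. The aliases are hypothetical and the queries are examples, not the original exercise answers.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>-- DATE_PART: birth year of every student.
</span></span><span style="display:flex;"><span>SELECT SNAME, DATE_PART('year', SBIRTHDAY) AS BIRTH_YEAR FROM STUDENT;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>-- Subquery: scores above the overall average grade.
</span></span><span style="display:flex;"><span>SELECT * FROM SCORE WHERE DEGREE &gt; (SELECT AVG(DEGREE) FROM SCORE);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>-- UNION: name, gender, and birthday of all teachers and students.
</span></span><span style="display:flex;"><span>SELECT TNAME AS NAME, TSEX AS SEX, TBIRTHDAY AS BIRTHDAY FROM TEACHER
</span></span><span style="display:flex;"><span>UNION
</span></span><span style="display:flex;"><span>SELECT SNAME, SSEX, SBIRTHDAY FROM STUDENT;
</span></span></code></pre></div>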
<h2 id="additional-queries">Additional Queries</h2>
<ul>
<li>Techniques to refine queries for performance, like avoiding the <code>NOT IN</code> method.</li>
<li>Handling conditions like age calculations using <code>AGE(SBIRTHDAY)</code> and filtering by name patterns.</li>
</ul>
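<p>The following sketch illustrates these points under the same schema. It assumes <code>SBIRTHDAY</code> is a date or timestamp column, and the surname in the <code>LIKE</code> pattern is a made-up example. Besides performance, one reason to prefer <code>NOT EXISTS</code> is that a <code>NOT IN</code> subquery matches no rows at all if it returns a NULL.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>-- Students without any recorded score, using NOT EXISTS instead of NOT IN.
</span></span><span style="display:flex;"><span>SELECT A.SNAME FROM STUDENT A
</span></span><span style="display:flex;"><span>WHERE NOT EXISTS (SELECT 1 FROM SCORE B WHERE B.SNO = A.SNO);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>-- AGE returns an interval; DATE_PART extracts the whole years from it.
</span></span><span style="display:flex;"><span>SELECT SNAME, DATE_PART('year', AGE(SBIRTHDAY)) AS AGE_YEARS FROM STUDENT;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>-- Filter by name pattern: students whose name starts with a given surname.
</span></span><span style="display:flex;"><span>SELECT * FROM STUDENT WHERE SNAME LIKE '王%';
</span></span></code></pre></div>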
<p>Overall, these exercises provide a robust framework for practicing SQL skills on a structured set of sample data, focusing on various database manipulation and retrieval techniques.</p>
<ol start="7">
<li>
<p><strong>Query</strong>:</p>
<ul>
<li><code>SELECT A.TNAME, B.CNAME FROM TEACHER A JOIN COURSE B ON A.TNO = B.TNO WHERE A.TSEX='男';</code></li>
<li>Explanation: Joins teacher and course tables to select male teachers and their course names.</li>
</ul>
</li>
<li>
<p><strong>Query</strong>:</p>
<ul>
<li><code>SELECT A.* FROM SCORE A WHERE DEGREE=(SELECT MAX(DEGREE) FROM SCORE B);</code></li>
<li>Explanation: Selects every row of the score table whose DEGREE equals the highest DEGREE in that table.</li>
</ul>
</li>
<li>
<p><strong>Query</strong>:</p>
<ul>
<li><code>SELECT SNAME FROM STUDENT A WHERE SSEX=(SELECT SSEX FROM STUDENT B WHERE B.SNAME='李军');</code></li>
<li>Explanation: Selects the names of students who have the same gender as the student named &lsquo;Li Jun&rsquo;.</li>
</ul>
</li>
<li>
<p><strong>Query</strong>:</p>
<ul>
<li><code>SELECT SNAME FROM STUDENT A WHERE SSEX=(SELECT SSEX FROM STUDENT B WHERE B.SNAME='李军') AND CLASS=(SELECT CLASS FROM STUDENT C WHERE C.SNAME='李军');</code></li>
<li>Explanation: Selects the names of students who have the same gender and class as the student named &lsquo;Li Jun&rsquo;.</li>
</ul>
</li>
<li>
<p><strong>Two Answers:</strong></p>
<ul>
<li><code>SELECT A.* FROM SCORE A JOIN STUDENT B ON A.SNO = B.SNO JOIN COURSE C ON A.CNO = C.CNO WHERE B.SSEX='男' AND C.CNAME='计算机导论';</code></li>
<li><code>SELECT * FROM SCORE WHERE SNO IN(SELECT SNO FROM STUDENT WHERE SSEX='男') AND CNO=(SELECT CNO FROM COURSE WHERE CNAME='计算机导论');</code></li>
<li>Explanation: Both queries select the scores of male students in the course &lsquo;Introduction to Computers&rsquo;.</li>
</ul>
</li>
</ol>
]]></content:encoded>
    </item>
  </channel>
</rss>
