[Verilog] Asynchronous Signal Processing: CDC and Metastability

In the last post, while designing the UART Rx module, I mentioned that care must be taken when handling asynchronous signals (Rx) coming from outside.

When working on RTL designs, you'll often hear the term "Clock Domain Crossing" (CDC). This article will cover the metastability issues that arise when using different clock domains, and the most basic way to mitigate them: the 2-FF Synchronizer.

What is CDC (Clock Domain Crossing)?


CDC literally means that a signal crosses from one clock domain into another.

It would be great if all the logic inside the chips we build operated on a single clock, but that's not the case with the chips we actually design.

  • External inputs: Button inputs, peripheral communication signals such as UART/SPI/I2C, etc. arrive at timings completely unrelated to our chip's system clock. (Asynchronous)
  • Multiple clocks: The top module may have a basic clock of 100MHz, while specific sub modules may use clocks of different speeds, such as 50MHz.

When a signal is passed from a source (transmitting) side to a destination (receiving) side that runs on a different clock, that transfer is called a CDC.

Why is it a problem? Metastability

You may think, “Why don’t we just connect the signals?” But let’s think about the Setup Time and Hold Time conditions of a Flip-Flop.

  • Setup Time: Before the rising edge, the data must remain stable for a certain period of time.
  • Hold Time: After the rising edge, the data must not change for a certain amount of time.

What if an external signal changes from 0 to 1 right around the active edge of our system clock (inside the Setup/Hold window)?

The flip-flop cannot resolve cleanly to 0 or 1, and its output may hover at an intermediate voltage level for an unpredictable time before settling (metastable state). This is called metastability.

These unstable values can propagate through the circuit, and different downstream gates may interpret them as different logic levels, causing the entire system to malfunction (system failure). This is why RTL engineers must be careful when handling asynchronous signals.

Solution: 2-FF Synchronizer (Double Flopping)

The simplest and most common way to mitigate this situation is to pass the signal through two flip-flops in series, both clocked by the destination clock. This is called a 2-FF synchronizer.

2-FF timing diagram

How it works

  1. Let's assume the first flip-flop (FF1) samples the asynchronous input and enters metastability.
  2. The second flip-flop (FF2) does not sample FF1's output until the next clock rising edge.
  3. During that one clock cycle, the unstable voltage in FF1 almost always settles to a stable state of either 0 or 1. (The time it is given to do so is called the settling time, or resolution time.)
  4. FF2 samples the now-stable value and passes it to the internal logic.

Note that the 2-FF synchronizer does not completely solve the Metastability problem. It only reduces the probability that a metastable value reaches the downstream logic.
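
To put a rough shape on that probability, a first-order model that commonly appears in vendor application notes expresses the mean time between synchronizer failures as shown below. The symbol names are illustrative labels, and the device constants have to come from the FPGA or cell-library vendor, so treat this as a sketch of the trend and not as a sign-off calculation.

```latex
\mathrm{MTBF} = \frac{e^{\,t_r/\tau}}{T_W \cdot f_{clk} \cdot f_{data}}
% t_r    : time FF1 is given to resolve before FF2 samples it
%          (roughly one clock period minus FF2's setup time)
% \tau   : resolution time constant of the flip-flop (device dependent)
% T_W    : width of the metastability window (device dependent)
% f_clk  : frequency of the destination (sampling) clock
% f_data : rate at which the asynchronous input changes
```

Because the resolution time sits in the exponent, giving FF1 one more clock cycle to settle improves the MTBF exponentially, which is why some designs add a third flip-flop for very fast clocks.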

Verilog code

module synchronizer (
     input  wire clk
    ,input  wire resetn
    ,input  wire async_in // External asynchronous input
    ,output wire sync_out // Synchronized safe output
);

    reg [1:0] shift_reg;

    always @(posedge clk or negedge resetn) begin
        if (!resetn) begin
            shift_reg <= 2'b00;
        end
        else begin
            // Sample async_in into bit [0] (FF1) and shift the old bit [0] into bit [1] (FF2)
            shift_reg <= {shift_reg[0], async_in};
        end
    end

    // Use the output of the second flip-flop
    assign sync_out = shift_reg[1];

endmodule
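
As a usage-style sketch, the same shift register can be extended by one bit so that the synchronized signal also yields edge pulses, which is what a button or a UART Rx start-bit detector usually needs. The module name, the port names, and the IDLE parameter below are hypothetical and not part of the original design.

```verilog
// Hypothetical example: 2-FF synchronizer plus edge detection.
// IDLE is an assumed parameter: the resting level of the input
// (1'b1 for a UART Rx line), used as the reset value so that no
// false edge is reported right after reset.
module sync_edge_detect #(
    parameter IDLE = 1'b0
) (
     input  wire clk
    ,input  wire resetn
    ,input  wire async_in // External asynchronous input
    ,output wire level    // Synchronized level
    ,output wire rise     // 1-cycle pulse on a 0 -> 1 change
    ,output wire fall     // 1-cycle pulse on a 1 -> 0 change
);

    // [0] = FF1 (may go metastable), [1] = FF2 (synchronized),
    // [2] = previous synchronized value, used only for edge detection
    reg [2:0] sync_reg;

    always @(posedge clk or negedge resetn) begin
        if (!resetn) begin
            sync_reg <= {3{IDLE}};
        end
        else begin
            sync_reg <= {sync_reg[1:0], async_in};
        end
    end

    assign level = sync_reg[1];
    assign rise  =  sync_reg[1] & ~sync_reg[2];
    assign fall  = ~sync_reg[1] &  sync_reg[2];

endmodule
```

Note that only sync_reg[1] and sync_reg[2] feed the outputs. sync_reg[0] is never used by any logic, because it is the bit that may still be metastable.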

Pulse Synchronizer (Fast to Slow)

The basic 2-FF synchronizer has one fatal weakness: it can miss a short, one-cycle pulse sent from a fast clock domain to a slow one.

  • Source: 100MHz (10ns period)
  • Destination: 50MHz (20ns period)

What if the source sends a 10ns pulse that rises and falls entirely between two rising edges of the destination clock? The destination never samples it, so the receiver won't even know the signal arrived.

Solution: Use the Toggle method

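In this case, the pulse is converted into a toggle (level change) signal before it is sent. A level change persists until the next pulse, so the slow clock will eventually sample the new level. The one condition is that input pulses must be spaced far enough apart (a few slow-clock cycles) so that each level change is sampled before the next one arrives.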

Verilog code

module pulse_synchronizer (
    input wire clk_fast,   // Transmitting side (fast clock)
    input wire resetn,
    input wire pulse_in,   // 1-cycle pulse to transmit
    input wire clk_slow,   // Receive side (slow clock)
    output wire pulse_out  // Received 1-cycle pulse
);

    // 1. [Fast Domain] Converting Pulse to Toggle Signal
    reg toggle_reg;
    always @(posedge clk_fast or negedge resetn) begin
        if (!resetn) toggle_reg <= 1'b0;
        else if (pulse_in) toggle_reg <= ~toggle_reg; // Flip every time a pulse comes
    end

    // 2. [Slow Domain] Toggle signal to 2-FF Synchronization
    reg [2:0] sync_reg; 
    always @(posedge clk_slow or negedge resetn) begin
        if (!resetn) begin
            sync_reg <= 3'b000;
        end
        else begin
            // sync_reg[2] is for edge detection, [1] is the value after synchronization is completed
            sync_reg <= {sync_reg[1:0], toggle_reg};
        end
    end

    // 3. [Slow Domain] Edge Detection (Detects when Toggle has changed and restores it back to Pulse)
    // Generate a pulse if the current value (sync_reg[1]) and the past value (sync_reg[2]) are different
    assign pulse_out = sync_reg[1] ^ sync_reg[2]; 

endmodule
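
A minimal testbench sketch for trying this out in a simulator is shown below. It assumes the pulse_synchronizer module above is compiled together with it. The testbench name, the clock periods (10ns fast, 34ns slow), and the spacing between pulses are all assumptions chosen for illustration.

```verilog
`timescale 1ns/1ps

// Hypothetical testbench: sends three 1-cycle pulses from the fast
// domain and counts how many 1-cycle pulses appear in the slow domain.
module tb_pulse_synchronizer;

    reg  clk_fast = 1'b0;
    reg  clk_slow = 1'b0;
    reg  resetn   = 1'b0;
    reg  pulse_in = 1'b0;
    wire pulse_out;

    integer pulses_seen = 0;

    pulse_synchronizer dut (
        .clk_fast  (clk_fast),
        .resetn    (resetn),
        .pulse_in  (pulse_in),
        .clk_slow  (clk_slow),
        .pulse_out (pulse_out)
    );

    always #5  clk_fast = ~clk_fast; // 10ns period (100MHz)
    always #17 clk_slow = ~clk_slow; // 34ns period (assumed slow clock)

    // Count received pulses in the slow domain
    always @(posedge clk_slow) begin
        if (pulse_out) pulses_seen = pulses_seen + 1;
    end

    // Drive pulse_in high for exactly one fast clock cycle
    task send_pulse;
        begin
            @(posedge clk_fast); pulse_in <= 1'b1;
            @(posedge clk_fast); pulse_in <= 1'b0;
        end
    endtask

    initial begin
        #42 resetn = 1'b1; // release reset away from any clock edge
        repeat (3) begin
            send_pulse;
            repeat (5) @(posedge clk_slow); // keep pulses well apart
        end
        $display("pulses sent = 3, pulses seen = %0d", pulses_seen);
        $finish;
    end

endmodule
```

If the wait between pulses is shortened to less than about one slow-clock period, two toggles can cancel out before the slow domain samples either of them, and the received count will fall short. That is the spacing limit of the toggle method.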

Caution

The 2-FF Synchronizer is not a panacea. It should only be used for single-bit control signals (e.g., Enable, Start, etc.).

Do not use it on a multi-bit data bus. For example, if you pass 1-byte data through per-bit 2-FF synchronizers, a delay difference (skew) between the bits may cause an incorrect value (glitch) to be captured.

  • Single Bit: Use a 2-FF Synchronizer
  • Multi Bit (Data Bus): Use FIFO (Async FIFO) or Handshaking protocol
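
To make the handshaking option a little more concrete, here is a minimal four-phase REQ/ACK sketch. The idea is that only the single-bit req and ack signals go through 2-FF synchronizers, while the data bus is simply held stable until it has been captured. The module name, ports, and separate per-domain resets are hypothetical, and this is an illustration under those assumptions, not verified IP.

```verilog
// Hypothetical example: four-phase handshake for a multi-bit transfer.
module handshake_cdc #(
    parameter WIDTH = 8
) (
    input  wire             clk_src,   // Source clock
    input  wire             rstn_src,
    input  wire             send,      // 1-cycle request, ignored while busy
    input  wire [WIDTH-1:0] data_in,
    output wire             busy,      // A transfer is still in flight
    input  wire             clk_dst,   // Destination clock
    input  wire             rstn_dst,
    output reg  [WIDTH-1:0] data_out,
    output reg              valid_out  // 1-cycle pulse when data_out updates
);

    reg             req;       // Source -> destination (level)
    reg             ack;       // Destination -> source (level)
    reg [WIDTH-1:0] data_hold; // Held stable while req is high
    reg [1:0]       ack_sync;  // ack synchronized into clk_src
    reg [1:0]       req_sync;  // req synchronized into clk_dst

    assign busy = req | ack_sync[1];

    // [Source Domain] Raise req with the data, drop it once ack is seen
    always @(posedge clk_src or negedge rstn_src) begin
        if (!rstn_src) begin
            req       <= 1'b0;
            data_hold <= {WIDTH{1'b0}};
            ack_sync  <= 2'b00;
        end
        else begin
            ack_sync <= {ack_sync[0], ack};
            if (send && !busy) begin
                data_hold <= data_in;
                req       <= 1'b1;
            end
            else if (ack_sync[1]) begin
                req <= 1'b0;
            end
        end
    end

    // [Destination Domain] Capture data once req is seen, then answer with ack
    always @(posedge clk_dst or negedge rstn_dst) begin
        if (!rstn_dst) begin
            req_sync  <= 2'b00;
            ack       <= 1'b0;
            data_out  <= {WIDTH{1'b0}};
            valid_out <= 1'b0;
        end
        else begin
            req_sync  <= {req_sync[0], req};
            valid_out <= 1'b0;
            if (req_sync[1] && !ack) begin
                data_out  <= data_hold; // Stable: has not changed since req rose
                valid_out <= 1'b1;
                ack       <= 1'b1;
            end
            else if (!req_sync[1]) begin
                ack <= 1'b0;
            end
        end
    end

endmodule
```

Every word costs several cycles of both clocks for the req/ack round trip, so this suits data that changes infrequently.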

In-depth: Simple 2-FF is not enough

The 2-FF Synchronizer discussed above is the most effective method for processing single-bit signals. However, real-world SoC designs often require handling multi-bit signals, such as data buses, and there are corner cases that cannot be identified through simulation alone.

The Pitfalls of Multi-bit Signals: Data Incoherency

"Why can't we just add 2-FFs to each bit on the data bus?" This is the most dangerous idea, because each bit is synchronized independently and the bits arrive with slightly different delays (skew), so they may not all be captured on the same destination clock edge.

Problem Situation: Bit Skew

For example, let's assume a 2-bit value in the Source domain changes from b00 to b11. Due to differences in wiring length or process micro-variations, the two bits may not arrive at the Destination domain at the same time.

  • Case 1: The lower bit arrives first → mistaken for b01
  • Case 2: The higher bit arrives first → mistaken for b10

In the end, the source only ever went from 00 to 11, but the destination sampled an intermediate value (glitch) such as 01 or 10 that was never sent, causing the system to malfunction. This is called Data Incoherency.

Solution: Context-dependent synchronization techniques

  1. Gray Code
    • Usage: When passing counters or pointers (such as FIFO Pointer).
    • Principle: Encodes adjacent values so that only one bit changes between them (e.g. 00 -> 01 -> 11 -> 10).
    • Merit: Since only 1 bit changes at a time, the destination samples either the old value or the new value, never an invalid intermediate value caused by skew.
  2. Handshake Protocol
    • Usage: When data does not change frequently.
    • Principle: Send a REQ (request) signal and keep the data until the receiving end sends an ACK (response) signal indicating that the data was successfully received.
    • Disadvantage: It is slow, because each transfer has to wait for the REQ/ACK round trip (increased latency).
  3. Async FIFO
    • Usage: When transferring large data streams at high speed.
    • Principle: Internally, Gray Code pointers and dual-port memory are used to transfer data safely. Using a verified IP is recommended.
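
The Gray Code conversion mentioned in item 1 is small enough to sketch directly. The module name and ports below are hypothetical. The binary-to-Gray direction is a single XOR with a shifted copy, and the Gray-to-binary direction XORs together all Gray bits at or above each position.

```verilog
// Hypothetical example: binary <-> Gray code conversion (combinational).
module gray_codec #(
    parameter WIDTH = 4
) (
    input  wire [WIDTH-1:0] bin_in,
    output wire [WIDTH-1:0] gray_out,
    input  wire [WIDTH-1:0] gray_in,
    output reg  [WIDTH-1:0] bin_out
);

    // Binary -> Gray: each bit is XORed with the next higher bit
    assign gray_out = bin_in ^ (bin_in >> 1);

    // Gray -> Binary: bit i is the XOR of gray_in[WIDTH-1:i]
    integer i;
    always @(*) begin
        for (i = 0; i < WIDTH; i = i + 1) begin
            bin_out[i] = ^(gray_in >> i);
        end
    end

endmodule
```

When this is used for a pointer that crosses clock domains, the Gray value should be registered in the source domain before it enters the 2-FF synchronizers. Combinational logic feeding a synchronizer directly can glitch and reintroduce multi-bit changes.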

References: Nandland
