In the last RTL CDC article, we learned about the 2-FF Synchronizer that synchronizes single-bit signals and the dangers of multi-bit signals.
Now, it's time to design an Asynchronous FIFO, the "ultimate" building block of CDC theory and one of the most widely used blocks in SoC and FPGA design. It's a crucial gateway for safely transferring large amounts of data between different clock domains.
Beyond simply copying and using code, we will delve into the internal principles of “why on earth do we need to convert pointers to Gray Code” and “what are the criteria for creating Full/Empty flags”.
1. RTL design challenges of asynchronous FIFOs
The principle of FIFO (First-In-First-Out) is simple.
- Write Pointer (wptr): Address to write data to. Increments by 1 when data arrives.
- Read Pointer (rptr): Address from which data will be read. Incremented by 1 when data is output.
- Empty: `wptr == rptr`
- Full: When `wptr` has completed one lap and caught up with `rptr`.
In a synchronous FIFO, both pointers share the same clock, making comparison easy. However, in an asynchronous FIFO, the Write Clock domain and Read Clock domain are different.
Each side has to bring the other domain's pointer into its own clock domain before comparing, and this causes two fatal problems.
- Metastability: The other domain's pointer may be changing at the very moment it is sampled, so the sampling flip-flop can go metastable.
- Data Incoherency (Multi-bit Skew): Pointers are composed of multiple bits. For example, when changing from binary 0111 to 1000, all four bits change simultaneously. Because each bit has a different wiring delay (skew), the receiver may briefly sample an incorrect value such as 1111 or 0000.
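To make the skew problem concrete, here is a small simulation-only sketch. It assumes that each of the four bits can arrive independently, and it lists every value the receiver could sample while a binary pointer moves from 0111 to 1000. The module name `tb_binary_skew` and its signals are hypothetical and exist only for this illustration.

```verilog
// Hypothetical illustration (not part of the FIFO itself).
// Enumerates what a receiver could sample while 0111 -> 1000 is in flight.
module tb_binary_skew;
    reg [3:0] old_val, new_val, seen;
    integer   mask;

    initial begin
        old_val = 4'b0111;
        new_val = 4'b1000;
        // A 1 in mask means that bit has already switched to its new value;
        // a 0 means the receiver still sees the old value of that bit.
        for (mask = 0; mask < 16; mask = mask + 1) begin
            seen = (old_val & ~mask[3:0]) | (new_val & mask[3:0]);
            $display("arrived bits %b -> receiver samples %b", mask[3:0], seen);
        end
        $finish;
    end
endmodule
```

Because all four bits differ between the old and new values, every one of the 16 possible 4-bit values shows up in the list, including 1111 and 0000.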
The key to solving this problem is Gray Code.
2. Solution: Gray Code
Gray code has the property that only one bit changes when changing to an adjacent number.
- Binary: `0111 (7)` -> `1000 (8)` (4 bits change -> Danger!)
- Gray: `0100 (7)` -> `1100 (8)` (1 bit changes -> Safe!)
If Gray code is used, even if metastability or skew occurs, the receiver will only perceive it as either "before the value changed (7)" or "after the value changed (8)". This prevents pointer jumps because erroneous values in the middle cannot occur.
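The one-bit-per-step property can be checked exhaustively in simulation. The following is a minimal self-checking sketch, assuming a 5-bit pointer (ASIZE = 4 plus one extra bit) and the usual `(bin >> 1) ^ bin` conversion. The names `tb_gray_onebit`, `bin2gray` and `popcount` are made up for this example.

```verilog
// Hypothetical self-checking testbench (illustration only).
module tb_gray_onebit;
    localparam W = 5;                 // pointer width for ASIZE = 4 (ASIZE + 1)

    // Binary to Gray conversion: gray = (bin >> 1) ^ bin
    function [W-1:0] bin2gray(input [W-1:0] b);
        bin2gray = (b >> 1) ^ b;
    endfunction

    // Count how many bits are set in a vector
    function integer popcount(input [W-1:0] v);
        integer k;
        begin
            popcount = 0;
            for (k = 0; k < W; k = k + 1)
                popcount = popcount + v[k];
        end
    endfunction

    integer     i, errors;
    reg [W-1:0] cur, nxt;

    initial begin
        errors = 0;
        for (i = 0; i < (1 << W); i = i + 1) begin
            cur = i;
            nxt = i + 1;              // truncates to W bits, so 31 wraps to 0
            if (popcount(bin2gray(cur) ^ bin2gray(nxt)) != 1) begin
                errors = errors + 1;
                $display("FAIL: %0d -> %0d changes more than one Gray bit", cur, nxt);
            end
        end
        // The transition used in the text: 7 -> 8
        $display("binary 7->8 flips %0d bits, Gray 7->8 flips %0d bit(s)",
                 popcount(5'd7 ^ 5'd8),
                 popcount(bin2gray(5'd7) ^ bin2gray(5'd8)));
        if (errors == 0)
            $display("PASS: every step changes exactly one Gray bit");
        $finish;
    end
endmodule
```

The loop also covers the wrap-around from the largest value back to 0, which is the step a free-running pointer takes when it completes a lap.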
So the basic principles of Asynchronous FIFO design are:
“Convert the pointer of my own domain into Gray code before passing it to the other domain.”
3. The Secret to Creating Full/Empty: Pessimistic Design
The most important concept in CDC design is pessimistic (conservative) design.
There is a delay of at least 2-3 clocks while the other party's pointer passes through the 2-FF Synchronizer. In other words, the other party's pointer that I am looking at now is "not the current value, but the past value from a little while ago."
(1) Empty Generation (Read Domain)
Empty is created in the Read domain.
- Condition: `rptr == synchronized_wptr`
- The Write Pointer keeps increasing, but it may not appear to be increasing in the Read domain due to synchronization delays.
- Result: Even after 1 or 2 pieces of data have actually been written and the FIFO is no longer empty, the Read domain can still judge it as Empty.
- Is it okay? Yes! It's safe because stopping a read when it determines there's no data is only a performance degradation (throughput), not a functional failure (underflow).
(2) Full Generation (Write Domain)
Full is created in the Write domain.
- The Full condition in Gray Code is a bit more complicated:
  - The MSB must be different (one lap of difference)
  - The 2nd MSB must also be different (due to the symmetry of Gray Code)
  - The remaining bits must be the same
- The Read Pointer continues to read data and free up space, but the Write domain may still appear full due to synchronization delays.
- Result: In reality, there is space, but it is judged to be full and stops writing.
- Is it okay? Yes! It's safe to report Full conservatively, as the goal is to prevent overflow.
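The three Gray-code Full conditions can be cross-checked against the more intuitive binary rule (same address bits, different lap bit). The sketch below is a self-checking testbench that assumes ASIZE = 4 and compares both rules for every pair of pointer values. The names `tb_gray_full`, `full_bin` and `full_gray` are hypothetical.

```verilog
// Hypothetical self-checking testbench (illustration only).
module tb_gray_full;
    localparam ASIZE = 4;
    localparam W     = ASIZE + 1;     // pointers carry one extra "lap" bit

    function [W-1:0] bin2gray(input [W-1:0] b);
        bin2gray = (b >> 1) ^ b;
    endfunction

    integer     w, r, mismatches;
    reg [W-1:0] wbin, rbin, wgray, rgray;
    reg         full_bin, full_gray;

    initial begin
        mismatches = 0;
        for (w = 0; w < (1 << W); w = w + 1)
            for (r = 0; r < (1 << W); r = r + 1) begin
                wbin  = w;
                rbin  = r;
                wgray = bin2gray(wbin);
                rgray = bin2gray(rbin);
                // Binary view: same address, different lap bit
                full_bin  = (wbin[W-1] != rbin[W-1]) &&
                            (wbin[W-2:0] == rbin[W-2:0]);
                // Gray view: top two bits inverted, remaining bits equal
                full_gray = (wgray == {~rgray[W-1:W-2], rgray[W-3:0]});
                if (full_bin !== full_gray) begin
                    mismatches = mismatches + 1;
                    $display("MISMATCH: wbin=%0d rbin=%0d", wbin, rbin);
                end
            end
        if (mismatches == 0)
            $display("PASS: Gray full test matches the binary one for all pointer pairs");
        $finish;
    end
endmodule
```

The two rules agree because flipping the MSB of a binary value flips exactly the top two bits of its Gray code: the MSB itself, and the 2nd MSB through the `bin >> 1` term.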
4. Verilog Implementation (Full Source Code)
Here is the complete code, split into five parts, that you can use right away in practice.
(1) Asynchronous FIFO Top Module
module async_fifo #(
    parameter DSIZE = 8, // Data Size
    parameter ASIZE = 4  // Address Size (Depth = 2^4 = 16)
)(
    input  wire             wclk, winc, wrst_n,
    input  wire             rclk, rinc, rrst_n,
    input  wire [DSIZE-1:0] wdata,
    output wire [DSIZE-1:0] rdata,
    output wire             wfull,
    output wire             rempty
);
    wire [ASIZE-1:0] waddr, raddr;
    wire [ASIZE:0]   wptr, rptr, wq2_rptr, rq2_wptr;
    // 1. Dual Port RAM
    fifomem #(DSIZE, ASIZE) u_mem (
        .wdata(wdata), .waddr(waddr), .wclk(wclk), .wclken(winc & ~wfull),
        .rdata(rdata), .raddr(raddr), .rclk(rclk), .rclken(rinc & ~rempty)
    );
    // 2. Synchronizers (2-FF)
    // ASIZE is passed down so the pointer width follows the top-level parameter
    sync_r2w #(ASIZE) u_sync_r2w (.wclk(wclk), .wrst_n(wrst_n), .rptr(rptr), .wq2_rptr(wq2_rptr));
    sync_w2r #(ASIZE) u_sync_w2r (.rclk(rclk), .rrst_n(rrst_n), .wptr(wptr), .rq2_wptr(rq2_wptr));
    // 3. Write Control Logic (Binary -> Gray, Full Logic)
    wptr_full #(ASIZE) u_wptr_full (
        .wclk(wclk), .wrst_n(wrst_n), .winc(winc),
        .wq2_rptr(wq2_rptr),
        .wfull(wfull), .waddr(waddr), .wptr(wptr)
    );
    // 4. Read Control Logic (Binary -> Gray, Empty Logic)
    rptr_empty #(ASIZE) u_rptr_empty (
        .rclk(rclk), .rrst_n(rrst_n), .rinc(rinc),
        .rq2_wptr(rq2_wptr),
        .rempty(rempty), .raddr(raddr), .rptr(rptr)
    );
endmodule
(2) FIFO Memory (Dual Port RAM)
module fifomem #(
    parameter DSIZE = 8, // Data Width
    parameter ASIZE = 4  // Address Width
)(
    input  wire             wclk, wclken,
    input  wire [ASIZE-1:0] waddr,
    input  wire [DSIZE-1:0] wdata,
    input  wire             rclk, rclken,
    input  wire [ASIZE-1:0] raddr,
    output wire [DSIZE-1:0] rdata
);
    // 2^ASIZE depth memory array
    reg [DSIZE-1:0] mem [0:(1<<ASIZE)-1];
    // Write Logic
    always @(posedge wclk) begin
        if (wclken) mem[waddr] <= wdata;
    end
    // Read Logic (No Reset needed for memory cells usually)
    // The read is asynchronous (combinational); rclk and rclken are not used here
    assign rdata = mem[raddr];
endmodule
(3) Synchronizers (2-FF)
A 2-stage flip-flop for safe pointer passing.
// Read Pointer to Write Domain Synchronizer
module sync_r2w #(
    parameter ASIZE = 4
)(
    input  wire           wclk, wrst_n,
    input  wire [ASIZE:0] rptr,
    output reg  [ASIZE:0] wq2_rptr
);
    reg [ASIZE:0] wq1_rptr;
    always @(posedge wclk or negedge wrst_n) begin
        if (!wrst_n) begin
            wq1_rptr <= 0;
            wq2_rptr <= 0;
        end else begin
            wq1_rptr <= rptr;
            wq2_rptr <= wq1_rptr;
        end
    end
endmodule
// Write Pointer to Read Domain Synchronizer
module sync_w2r #(
    parameter ASIZE = 4
)(
    input  wire           rclk, rrst_n,
    input  wire [ASIZE:0] wptr,
    output reg  [ASIZE:0] rq2_wptr
);
    reg [ASIZE:0] rq1_wptr;
    always @(posedge rclk or negedge rrst_n) begin
        if (!rrst_n) begin
            rq1_wptr <= 0;
            rq2_wptr <= 0;
        end else begin
            rq1_wptr <= wptr;
            rq2_wptr <= rq1_wptr;
        end
    end
endmodule
(4) Read Control Logic (Empty Generation)
This is the logic for converting Binary to Gray and generating the Empty flag.
module rptr_empty #(
    parameter ASIZE = 4
)(
    input  wire             rclk, rrst_n, rinc,
    input  wire [ASIZE:0]   rq2_wptr, // Synced Write Pointer (Gray)
    output reg              rempty,
    output wire [ASIZE-1:0] raddr,
    output reg  [ASIZE:0]   rptr      // Read Pointer (Gray)
);
    reg  [ASIZE:0] rbin;
    wire [ASIZE:0] rbin_next, rgray_next;
    // 1. Binary Counter Update
    // Accept rinc only if it is not empty
    assign rbin_next = rbin + (rinc & ~rempty);
    assign raddr = rbin[ASIZE-1:0];
    // 2. Binary to Gray Conversion
    assign rgray_next = (rbin_next >> 1) ^ rbin_next;
    // 3. Empty Flag Generation
    // Condition: Read Pointer(Gray) == Synced Write Pointer(Gray)
    wire rempty_val = (rgray_next == rq2_wptr);
    always @(posedge rclk or negedge rrst_n) begin
        if (!rrst_n) begin
            rbin   <= 0;
            rptr   <= 0;
            rempty <= 1; // It should be Empty in Reset state
        end else begin
            rbin   <= rbin_next;
            rptr   <= rgray_next;
            rempty <= rempty_val;
        end
    end
endmodule
(5) Write Control Logic (Full Generation)
This is the logic for converting Binary to Gray and generating the Full flag.
module wptr_full #(
    parameter ASIZE = 4
)(
    input  wire             wclk, wrst_n, winc,
    input  wire [ASIZE:0]   wq2_rptr, // Synced Read Pointer (Gray)
    output reg              wfull,
    output wire [ASIZE-1:0] waddr,
    output reg  [ASIZE:0]   wptr      // Write Pointer (Gray)
);
    reg  [ASIZE:0] wbin;
    wire [ASIZE:0] wbin_next, wgray_next;
    // 1. Binary Counter Update
    // Accept winc only when not full
    assign wbin_next = wbin + (winc & ~wfull);
    assign waddr = wbin[ASIZE-1:0];
    // 2. Binary to Gray Conversion
    assign wgray_next = (wbin_next >> 1) ^ wbin_next;
    // 3. Full Flag Generation
    // Condition (Gray Code): MSB and 2nd MSB must be different, and the rest must be the same
    wire wfull_val = (wgray_next == {~wq2_rptr[ASIZE:ASIZE-1], wq2_rptr[ASIZE-2:0]});
    always @(posedge wclk or negedge wrst_n) begin
        if (!wrst_n) begin
            wbin  <= 0;
            wptr  <= 0;
            wfull <= 0;
        end else begin
            wbin  <= wbin_next;
            wptr  <= wgray_next;
            wfull <= wfull_val;
        end
    end
endmodule
5. Conclusion and Summary
Asynchronous FIFO may look like a data storage container on the surface, but it hides a sophisticated strategy to solve the CDC problem.
- Using Gray Code: Only one bit changes per step, so a multi-bit pointer is never sampled as an invalid intermediate value when it crosses domains.
- Pessimistic Design: Because of synchronization delays, we allow Full/Empty to be reported conservatively (while some space or data actually exists), but we never allow them to be reported too late. (Safety First)
- 2^n Depth: To simplify the Full/Empty logic by taking advantage of the symmetry of Gray Code, the depth of the FIFO is usually set to a power of 2 (16, 32, 1024, etc.).
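One way to see why a power-of-2 range matters is to look at the wrap-around step of a Gray-coded counter. The sketch below is a simplified illustration using a plain 3-bit counter rather than the FIFO pointers themselves. It compares a counter that wraps at 8 with one that wraps at 6. The module name `tb_gray_wrap` and its helper functions are hypothetical.

```verilog
// Hypothetical illustration (simplified 3-bit counter, not the FIFO pointer).
module tb_gray_wrap;
    localparam W = 3;

    function [W-1:0] bin2gray(input [W-1:0] b);
        bin2gray = (b >> 1) ^ b;
    endfunction

    // Count how many bits are set in a vector
    function integer popcount(input [W-1:0] v);
        integer k;
        begin
            popcount = 0;
            for (k = 0; k < W; k = k + 1)
                popcount = popcount + v[k];
        end
    endfunction

    initial begin
        // Counter that wraps at 8 (power of two): 7 -> 0
        $display("wrap at 8: %0d Gray bit(s) change",
                 popcount(bin2gray(3'd7) ^ bin2gray(3'd0)));
        // Counter that wraps at 6 (not a power of two): 5 -> 0
        $display("wrap at 6: %0d Gray bit(s) change",
                 popcount(bin2gray(3'd5) ^ bin2gray(3'd0)));
        $finish;
    end
endmodule
```

With a wrap at 8 only one bit changes (100 -> 000), while a wrap at 6 changes three bits at once (111 -> 000), which brings back the multi-bit skew problem that Gray code was meant to remove.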
References: VLSI verify