[Verilog] UART RTL design 2

Picking up from the previous post, let's continue with the UART RTL design.

UART RTL design

CLK gen RTL design

Next, let's design the clk gen that sets the baud rate. In the block diagram there is a separate clk gen for Tx and for Rx, but since they work in much the same way, let's design the Tx clk gen first.

module uart_tx_clkgen (
     input  wire        presetn
    ,input  wire        pclk

    ,input  wire [ 8:0] bit_div
    ,input  wire [ 2:0] pre_scale
    ,input  wire [ 4:0] bit_mult

    ,output wire        tx_uclk
);

    reg         r_uclk;
    reg  [ 8:0] r_prescale;
    reg  [13:0] r_divisor_1;
    reg  [22:0] r_divisor_2;
    reg  [21:0] r_dividercnt;

    always @(negedge presetn or posedge pclk) begin
        if (!presetn)    r_prescale  <= 9'd0;
        else begin
            case (pre_scale)
                3'd0   : r_prescale <= 9'd2  ;
                3'd1   : r_prescale <= 9'd4  ;
                3'd2   : r_prescale <= 9'd8  ;
                3'd3   : r_prescale <= 9'd16 ;
                3'd4   : r_prescale <= 9'd32 ;
                3'd5   : r_prescale <= 9'd64 ;
                3'd6   : r_prescale <= 9'd128;
                3'd7   : r_prescale <= 9'd256;
                default: r_prescale <= r_prescale;
            endcase
        end
    end

    //r_divisor_1  =  2^(pre_scale+1) x (bit_mult + 1)
    always @(negedge presetn or posedge pclk) begin
        if (!presetn) r_divisor_1 <= 14'd0; 
        else          r_divisor_1 <= r_prescale * (bit_mult + 'b1);
    end

    //r_divisor_2  =  bit_div x { 2^(pre_scale+1) x (bit_mult + 1) }
    always @(negedge presetn or posedge pclk) begin
        if (!presetn) r_divisor_2 <= 23'd0; 
        else          r_divisor_2 <= r_divisor_1 * bit_div;
    end

    //divider counter: counts 0 .. r_divisor_2-1, so one serial clock period = r_divisor_2 pclk cycles
    always @(negedge presetn or posedge pclk) begin
        if (!presetn)                                 r_dividercnt <= 22'd0;
        else if ((r_dividercnt + 1'b1) < r_divisor_2) r_dividercnt <= (r_dividercnt + 1'b1);
        else                                          r_dividercnt <= 22'd0;
    end

    wire [22:0] r_divisor_2_half;
    assign      r_divisor_2_half = r_divisor_2 >> 1;

    //serial clock generation
    always @(negedge presetn or posedge pclk) begin
        if (!presetn)                             r_uclk <= 'b0;
        else if (r_dividercnt < r_divisor_2_half) r_uclk <= 'b0;
        else                                      r_uclk <= 'b1;
    end

    assign tx_uclk = r_uclk;

endmodule

The formula for setting the baud rate is as follows:

Fbaud = Fpclk / [ 2 ^ ( pre_scale + 1 ) x { bit_div x ( bit_mult + 1 ) } ]

The baud rate is determined by the pclk frequency supplied to the module and by the register settings (pre_scale, bit_div, bit_mult).
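
To make the formula concrete, here is a small simulation-only sketch that evaluates it for one set of values. The 50 MHz pclk and the register values are assumptions chosen for illustration; they do not come from the design above.

```verilog
// Worked example of the baud-rate formula (simulation only).
// Assumed values: pclk = 50 MHz, pre_scale = 0, bit_mult = 0, bit_div = 217.
module baud_calc_example;
    localparam integer F_PCLK    = 50_000_000; // assumed pclk frequency [Hz]
    localparam integer PRE_SCALE = 0;          // 2^(0+1) = 2
    localparam integer BIT_MULT  = 0;          // (0+1)   = 1
    localparam integer BIT_DIV   = 217;

    // Same arithmetic as r_divisor_2 in uart_tx_clkgen
    localparam integer DIVISOR = (1 << (PRE_SCALE + 1)) * (BIT_MULT + 1) * BIT_DIV;

    initial begin
        // prints "divisor = 434, baud = 115207"
        $display("divisor = %0d, baud = %0d", DIVISOR, F_PCLK / DIVISOR);
        $finish;
    end
endmodule
```

With these assumed settings the divisor is 434, which gives roughly 115207 baud, close to the common 115200.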

Figure: Baud rate setting

The clock (uclk) generated in this way is used by the Tx controller.

    always @(negedge presetn or posedge uclk) begin
        if   ~~~~
        else ~~~~
    end

However, this design splits a single IP into two clock domains (pclk and uclk), which complicates the Synopsys Design Constraints (SDC). Synthesis requires every clock to be described in the constraint file, and a divided clock such as uclk needs additional constraints of its own. In the end, it is simplest to run each IP on a single clock.

Figure: Clock domains when using uclk

uclk_en

So, instead of clocking logic with uclk, we create a uclk enable signal (a pulse at each rising edge of uclk) and keep everything on pclk.

    wire        tx_uclk;
    wire        tx_uclk_en;
    
    reg         r_tx_uclk;
    always @(negedge presetn or posedge pclk) begin
        if (!presetn) begin 
            r_tx_uclk <= 1'b0;
        end
        else begin
            r_tx_uclk <= tx_uclk;
        end
    end

    assign tx_uclk_en =  tx_uclk & ~r_tx_uclk;

With this logic, we get a uclk_en pulse that is one pclk cycle wide and can use it in the Tx controller.
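
As a rough sketch of how such an enable is consumed, the register below is clocked by pclk and only updates when uclk_en is high. The module name and the r_bit_periods counter are hypothetical and exist only to show the pattern.

```verilog
// Hypothetical consumer of uclk_en: everything is clocked by pclk,
// and uclk_en only decides *when* the register updates.
module uclk_en_user_example (
     input  wire        pclk
    ,input  wire        presetn
    ,input  wire        uclk_en        // one-pclk-wide pulse per uclk period
    ,output reg  [ 3:0] r_bit_periods  // illustrative: counts elapsed uclk periods
);
    always @(posedge pclk or negedge presetn) begin
        if (!presetn)     r_bit_periods <= 4'd0;
        else if (uclk_en) r_bit_periods <= r_bit_periods + 4'd1;
        // no else branch: the register holds its value between enable pulses
    end
endmodule
```

Because the only clock is pclk, the constraint file needs just one clock definition for this logic.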

Figure: uclk_en signal

Tx controller RTL design

Before designing the Tx controller, let's first learn about FSM (finite-state machine).

A communication protocol is a "promise." Because data is exchanged in a mutually agreed-upon order, communication is reliable. Therefore, the controller must communicate according to the protocol.

An FSM is a model with a finite number of states. For each state it defines the outputs and the conditions, based on the inputs, for moving to the next state. Let's take a look at the protocol's states.

Figure: Tx controller FSM

When not communicating, the controller is in the IDLE state. When a transfer starts, it enters the START state and, one uclk period later, sends the 8 data bits. If parity is enabled, one parity bit follows; otherwise it goes straight to STOP_1. If two stop bits are configured, it goes from STOP_1 to STOP_2 and then back to IDLE; otherwise it returns to IDLE directly after STOP_1.

So, let's design a Tx controller based on this. First, let's select the input and output ports.

module uart_tx_ctrl (
     input  wire        pclk
    ,input  wire        presetn

    ,input  wire        uclk_en

    ,input  wire [ 7:0] uart_data
    ,input  wire        uart_en
    ,input  wire        complete_clr
    ,input  wire        parity_en
    ,input  wire        stop_en

    ,output wire        complete

    ,output wire        uart_out
);

Here, complete is the signal that notifies the CPU, as an interrupt, when the 1-byte transfer has finished. Next come the local parameter and signal declarations.

    //===================================================================
    // Local Parameters
    //===================================================================
    localparam IDLE        = 3'h0,
               START       = 3'h1,
               DATA        = 3'h2,
               PARITY      = 3'h3,
               STOP        = 3'h4,
               TRANSFINISH = 3'h5;

    reg  [ 2:0] r_cur_st;
    reg  [ 2:0] r_nxt_st;
    reg  [ 7:0] r_bitcnt;

    reg         r_uart;
    reg         r_complete;
    reg  [ 7:0] r_shift;    //shift register

    wire        data_end;   //DATA state end
    assign data_end = (r_bitcnt == 8'h8);

Each signal is explained below as it comes up. Next, let's look at the FSM design.

    //cur_st
    always @(posedge pclk or negedge presetn) begin
        if (!presetn)          r_cur_st <= IDLE;
        else if (complete_clr) r_cur_st <= IDLE;
        else                   r_cur_st <= r_nxt_st;
    end

    //FSM (combinational next-state logic, so blocking assignments are used)
    always @(*) begin
        case (r_cur_st)
            IDLE        : begin
                if (uart_en & uclk_en) begin
                     r_nxt_st = START;
                end
                else r_nxt_st = IDLE;
            end
            START       : begin
                if (uclk_en) begin
                     r_nxt_st = DATA;
                end
                else r_nxt_st = START;
            end
            DATA        : begin
                if (data_end & parity_en) begin
                     r_nxt_st = PARITY;
                end
                else if (data_end & !parity_en & stop_en) begin
                     r_nxt_st = STOP;
                end
                else if (data_end & !parity_en & !stop_en) begin
                     r_nxt_st = TRANSFINISH;
                end
                else r_nxt_st = DATA;
            end
            PARITY      : begin
                if (stop_en & uclk_en) begin
                     r_nxt_st = STOP;
                end
                else if (uclk_en) begin
                     r_nxt_st = TRANSFINISH;
                end
                else r_nxt_st = PARITY;
            end
            STOP        : begin
                if (uclk_en) begin
                     r_nxt_st = TRANSFINISH;
                end
                else r_nxt_st = STOP;
            end
            TRANSFINISH : begin
                if (complete_clr) begin
                     r_nxt_st = IDLE;
                end
                else r_nxt_st = TRANSFINISH;
            end
            default     : begin
                r_nxt_st = IDLE;
            end
        endcase
    end

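Here, TRANSFINISH corresponds to the stop_1 state (the stop bit that is always sent) and STOP to the stop_2 state (the optional extra stop bit). Note that in this implementation the optional STOP state, when enabled, is visited before TRANSFINISH.

I don't know why, but most IPs keep a separate nxt_st (next state) and register it into cur_st (current state) one clock later. If anyone knows why, please let me know by email.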

Next is the bitcnt (bit counter) design.

    //BITCNT
    wire data_st;
    assign data_st = (r_cur_st == DATA);
    always @(posedge pclk or negedge presetn) begin
        if (!presetn)               r_bitcnt <= 8'h0;
        else if (complete_clr)      r_bitcnt <= 8'h0;
        else if (data_st & uclk_en) r_bitcnt <= (r_bitcnt + 1);
        else                        r_bitcnt <= r_bitcnt;
    end

bitcnt counts the bits sent in the DATA state, so that the 8 data bits can be shifted out one at a time. So, let's take a look at the shift register design.

    //shift reg
    always @(posedge pclk or negedge presetn) begin
        if (!presetn)               r_shift <= 8'h0;
        else if (data_st) begin
            if (uclk_en)            r_shift <= {1'b0,r_shift[7:1]};
            else                    r_shift <= r_shift;
        end
        else                        r_shift <= uart_data;
    end

With this design, the data coming from the APB register is loaded into r_shift and then shifted right by 1 bit on every uclk_en while in the DATA state. Then how does r_shift go out as an output?

    //UART tx
    always @(posedge pclk or negedge presetn) begin
        if (!presetn) r_uart <= 1'b1;
        else begin
            case (r_cur_st)
                IDLE        : begin
                    if (uart_en & uclk_en) begin
                         r_uart <= 1'b0;
                    end
                    else r_uart <= 1'b1;
                end
                START       : begin
                    if (uclk_en) begin
                         r_uart <= r_uart;
                    end
                    else r_uart <= r_uart;
                end
                DATA        : begin
                    r_uart <= r_shift[0]; //shift register
                end
                PARITY      : begin
                    r_uart <= 1'b0;       //TODO: needs fixing (parity)
                end
                STOP        : begin
                    r_uart <= 1'b1;
                end
                TRANSFINISH : begin
                    r_uart <= 1'b1;
                end
                default     : begin
                    r_uart <= 1'b1;
                end
            endcase
        end
    end

There's one thing that still needs to be fixed: when parity is enabled, the PARITY state should output the calculated parity bit instead of a constant 0. It might be a good idea to try calculating the parity yourself!
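
If you want a starting point for that exercise, one possible sketch is a reduction XOR over the data byte. The module name, the parity_bit output and the odd_parity select are hypothetical and are not part of the controller above; the sketch also assumes uart_data is held stable for the whole frame.

```verilog
// Sketch only: parity bit for an 8-bit UART frame.
// parity_bit and odd_parity are hypothetical names, not from the original module.
module uart_parity_example (
     input  wire [ 7:0] uart_data
    ,input  wire        odd_parity   // assumed select: 0 = even parity, 1 = odd parity
    ,output wire        parity_bit
);
    // ^uart_data is 1 when uart_data holds an odd number of 1s, which is
    // exactly the bit that makes the total number of 1s even (even parity).
    // XOR with odd_parity inverts it for odd parity.
    assign parity_bit = (^uart_data) ^ odd_parity;
endmodule
```

In the PARITY branch of the r_uart logic, the constant 1'b0 would then be replaced by parity_bit. For example, 0x0A contains two 1 bits, so its even-parity bit is 0. If uart_data can change during a transfer, the parity would need to be latched when the frame starts.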

Finally, here are the complete interrupt and the output port assignments.

    //complete intr
    always @(posedge pclk or negedge presetn) begin
        if (!presetn)               r_complete <= 1'b0;
        else if (complete_clr)      r_complete <= 1'b0;
        else if ((r_cur_st == TRANSFINISH) & uclk_en) begin
                                    r_complete <= 1'b1;
        end
        else                        r_complete <= r_complete;
    end

    assign complete = r_complete;
    assign uart_out = r_uart;

Now let's look at the whole module.

module uart_tx_ctrl (
     input  wire        pclk
    ,input  wire        presetn

    ,input  wire        uclk_en

    ,input  wire [ 7:0] uart_data
    ,input  wire        uart_en
    ,input  wire        complete_clr
    ,input  wire        parity_en
    ,input  wire        stop_en

    ,output wire        complete

    ,output wire        uart_out
);

    //===================================================================
    // Local Parameters
    //===================================================================
    localparam IDLE        = 3'h0,
               START       = 3'h1,
               DATA        = 3'h2,
               PARITY      = 3'h3,
               STOP        = 3'h4,
               TRANSFINISH = 3'h5;

    reg  [ 2:0] r_cur_st;
    reg  [ 2:0] r_nxt_st;
    reg  [ 7:0] r_bitcnt;

    reg         r_uart;
    reg         r_complete;
    reg  [ 7:0] r_shift;    //shift register

    wire        data_end;   //DATA state end
	
    //cur_st
    always @(posedge pclk or negedge presetn) begin
        if (!presetn)          r_cur_st <= IDLE;
        else if (complete_clr) r_cur_st <= IDLE;
        else                   r_cur_st <= r_nxt_st;
    end

    //FSM (combinational next-state logic, so blocking assignments are used)
    always @(*) begin
        case (r_cur_st)
            IDLE        : begin
                if (uart_en & uclk_en) begin
                     r_nxt_st = START;
                end
                else r_nxt_st = IDLE;
            end
            START       : begin
                if (uclk_en) begin
                     r_nxt_st = DATA;
                end
                else r_nxt_st = START;
            end
            DATA        : begin
                if (data_end & parity_en) begin
                     r_nxt_st = PARITY;
                end
                else if (data_end & !parity_en & stop_en) begin
                     r_nxt_st = STOP;
                end
                else if (data_end & !parity_en & !stop_en) begin
                     r_nxt_st = TRANSFINISH;
                end
                else r_nxt_st = DATA;
            end
            PARITY      : begin
                if (stop_en & uclk_en) begin
                     r_nxt_st = STOP;
                end
                else if (uclk_en) begin
                     r_nxt_st = TRANSFINISH;
                end
                else r_nxt_st = PARITY;
            end
            STOP        : begin
                if (uclk_en) begin
                     r_nxt_st = TRANSFINISH;
                end
                else r_nxt_st = STOP;
            end
            TRANSFINISH : begin
                if (complete_clr) begin
                     r_nxt_st = IDLE;
                end
                else r_nxt_st = TRANSFINISH;
            end
            default     : begin
                r_nxt_st = IDLE;
            end
        endcase
    end

    //BITCNT
    wire data_st;
    assign data_st = (r_cur_st == DATA);
    always @(posedge pclk or negedge presetn) begin
        if (!presetn)               r_bitcnt <= 8'h0;
        else if (complete_clr)      r_bitcnt <= 8'h0;
        else if (data_st & uclk_en) r_bitcnt <= (r_bitcnt + 1);
        else                        r_bitcnt <= r_bitcnt;
    end

    //shift reg
    always @(posedge pclk or negedge presetn) begin
        if (!presetn)               r_shift <= 8'h0;
        else if (data_st) begin
            if (uclk_en)            r_shift <= {1'b0,r_shift[7:1]};
            else                    r_shift <= r_shift;
        end
        else                        r_shift <= uart_data;
    end

    //UART tx
    always @(posedge pclk or negedge presetn) begin
        if (!presetn) r_uart <= 1'b1;
        else begin
            case (r_cur_st)
                IDLE        : begin
                    if (uart_en & uclk_en) begin
                         r_uart <= 1'b0;
                    end
                    else r_uart <= 1'b1;
                end
                START       : begin
                    if (uclk_en) begin
                         r_uart <= r_uart;
                    end
                    else r_uart <= r_uart;
                end
                DATA        : begin
                    r_uart <= r_shift[0]; //shift register
                end
                PARITY      : begin
                    r_uart <= 1'b0;       //TODO: needs fixing (parity)
                end
                STOP        : begin
                    r_uart <= 1'b1;
                end
                TRANSFINISH : begin
                    r_uart <= 1'b1;
                end
                default     : begin
                    r_uart <= 1'b1;
                end
            endcase
        end
    end

    //complete intr
    always @(posedge pclk or negedge presetn) begin
        if (!presetn)               r_complete <= 1'b0;
        else if (complete_clr)      r_complete <= 1'b0;
        else if ((r_cur_st == TRANSFINISH) & uclk_en) begin
                                    r_complete <= 1'b1;
        end
        else                        r_complete <= r_complete;
    end

    assign data_end = (r_bitcnt == 8'h8);
    assign complete = r_complete;
    assign uart_out = r_uart;

endmodule

Figure: Simulation result

The simulation result above is for Tx data 0xA with neither stop_en nor parity_en set. You can see the state advancing, paced by uclk_en.

0x0(IDLE) -> 0x1(START) -> 0x2(DATA) -> 0x5(TRANSFINISH)

In the DATA state, you can see the data being shifted by 1 bit at a time in r_shift, with the tx signal changing accordingly. Finally, when the transfer is complete, the complete interrupt is asserted.
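
For anyone who wants to reproduce a similar run, a minimal testbench sketch could look like the following. The 100 MHz pclk, the 16-pclk bit period and the testbench name are assumptions made for illustration; a real setup would drive uclk_en from the clk gen and the edge detector shown earlier.

```verilog
`timescale 1ns/1ps
// Testbench sketch for uart_tx_ctrl: data 0x0A, parity_en = 0, stop_en = 0.
module tb_uart_tx_ctrl;
    reg     pclk         = 1'b0;
    reg     presetn      = 1'b0;
    reg     uclk_en      = 1'b0;
    reg     uart_en      = 1'b0;
    reg     complete_clr = 1'b0;
    wire    complete;
    wire    uart_out;
    integer cnt          = 0;

    always #5 pclk = ~pclk;              // assumed 100 MHz pclk

    // Assumed bit period of 16 pclk cycles: one-cycle uclk_en pulse per period
    always @(posedge pclk) begin
        cnt     <= (cnt == 15) ? 0 : cnt + 1;
        uclk_en <= (cnt == 15);
    end

    uart_tx_ctrl dut (
         .pclk         (pclk)
        ,.presetn      (presetn)
        ,.uclk_en      (uclk_en)
        ,.uart_data    (8'h0A)
        ,.uart_en      (uart_en)
        ,.complete_clr (complete_clr)
        ,.parity_en    (1'b0)
        ,.stop_en      (1'b0)
        ,.complete     (complete)
        ,.uart_out     (uart_out)
    );

    initial begin
        repeat (4) @(posedge pclk);
        presetn <= 1'b1;                 // release reset
        @(posedge pclk) uart_en <= 1'b1; // request a transfer
        wait (complete);                 // set by the controller in TRANSFINISH
        $display("transfer complete at %0t", $time);
        @(posedge pclk) complete_clr <= 1'b1;
        @(posedge pclk) complete_clr <= 1'b0;
        $finish;
    end
endmodule
```

Dumping r_cur_st, r_shift and uart_out from this run should show the state sequence listed above.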

Problem

Of course, this controller isn't a full IP, because it raises an interrupt for every byte it sends. Typical modules have a FIFO (First In, First Out) memory, so once enabled, data keeps being sent until the FIFO is empty unless the user intentionally disables it. Once the FIFO is empty, the module requests more data from the CPU via an interrupt or through DMA (Direct Memory Access).
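
To give an idea of what such a FIFO could look like, here is a minimal synchronous FIFO sketch. It is not taken from any particular IP; the module name, parameters and port names are all hypothetical, and connecting it to the Tx controller is left out.

```verilog
// Minimal synchronous FIFO sketch (hypothetical, for illustration only).
// Depth is 2^AW entries; the pointers carry one extra bit so that
// "full" and "empty" can be told apart.
module tx_fifo_example #(
     parameter DW = 8
    ,parameter AW = 4
) (
     input  wire          pclk
    ,input  wire          presetn
    ,input  wire          wr_en     // push, ignored when full
    ,input  wire [DW-1:0] wr_data
    ,input  wire          rd_en     // pop, ignored when empty
    ,output wire [DW-1:0] rd_data   // oldest entry, meaningful when !empty
    ,output wire          empty
    ,output wire          full
);
    reg [DW-1:0] mem [0:(1<<AW)-1];
    reg [AW:0]   r_wptr;
    reg [AW:0]   r_rptr;

    assign empty   = (r_wptr == r_rptr);
    assign full    = (r_wptr[AW] != r_rptr[AW]) &&
                     (r_wptr[AW-1:0] == r_rptr[AW-1:0]);
    assign rd_data = mem[r_rptr[AW-1:0]];

    always @(posedge pclk or negedge presetn) begin
        if (!presetn) begin
            r_wptr <= {(AW+1){1'b0}};
            r_rptr <= {(AW+1){1'b0}};
        end
        else begin
            if (wr_en & !full) begin
                mem[r_wptr[AW-1:0]] <= wr_data;
                r_wptr              <= r_wptr + 1'b1;
            end
            if (rd_en & !empty) r_rptr <= r_rptr + 1'b1;
        end
    end
endmodule
```

In a fuller design, the Tx controller would take its next byte from rd_data whenever it returns to IDLE and the FIFO is not empty, and the empty flag would drive the interrupt or DMA request.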

References: Realtek UART
