Patiently Waiting for the PCBs
In these high-inflation times I try to be frugal where I can. I've ordered boards from China in the past, and while the PCB prices themselves are very affordable, the $20 or so for express shipping isn't necessarily the cheapest option. This time around, rather than ordering the boards in the conventional manner, I opted to go through AliExpress. This resulted in substantial savings: the total cost for 5 boards including shipping was $18.34 CAD.
The tradeoff of course is that you have to wait much longer to receive the boards. Each time I checked in on the tracking the delivery date seemed so far away.
What to do during this down time?
Adding a 6502 Core
One of my goals continues to be having a RISC-V processor at the heart of the system. However, after a cursory look at the available open source options, it occurred to me that incorporating a RISC-V SoC with bootloader capabilities might pose some challenges. So in the initial stages I will use the venerable 6502, which should be sufficient for testing the basic graphics functions. The state of the FPGA design in these early stages can be seen in the following block diagram. In addition to interfacing with the graphics system, the 6502 will also have access to ROM and RAM.
The core I settled on was BC6502 which was developed by Rob Finch from Bird Computer.
The GitHub repository for this core is: https://github.com/robfinch/Cores/tree/master/bc6502
To verify the 6502 core's functionality I put together some code to blink an LED. The code was assembled and converted to a .hex file. If everything goes to plan, on power up the ROM will be initialized with our simple test program. The 6502 assembly for the test program is as follows:
    ;VARIABLES
    TICK_COUNTER EQU $0;
    LED_COPY EQU $1;
    LED EQU $80;

    ;CONSTANTS
    ONE_SECOND EQU $3C;

    .org $FC00
    RESET:
        SEI ; disable IRQs
        CLD ; disable decimal mode
        LDX #$FF ;
        TXS ; This sets stack pointer to $1FF ($100 + $FF)
        LDA #$1;
        STA LED_COPY;
        STA LED;
    Forever: ;jump back to Forever, infinite loop
        JMP Forever;

    NMI:
        LDX TICK_COUNTER;
        INX;
        STX TICK_COUNTER;
        CPX #ONE_SECOND;
        BNE EXIT_INTERRUPT;
        LDA #$0;
        STA TICK_COUNTER ; Reset TICK_COUNTER
        LDA LED_COPY;
        EOR #$1 ; Toggle LED
        STA LED_COPY;
        STA LED;
    EXIT_INTERRUPT
        RTI;

    .org $FFFA ;first of the three vectors starts here
    .dw NMI    ;when an NMI happens (once per frame if enabled) the
               ;processor will jump to the label NMI:
    .dw RESET  ;when the processor first turns on or is reset, it will jump
               ;to the label RESET:
    .dw 0      ;external interrupt IRQ is not used
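For anyone who doesn't read 6502 assembly, here is a rough Python model of what the NMI handler does. It is only a sketch: the function name simulate_led is made up for illustration, and it assumes the NMI fires 60 times per second (the "once per frame" in the comments), which is what would make $3C ticks add up to one second.

```python
ONE_SECOND = 0x3C  # 60 ticks, same constant as in the assembly listing


def simulate_led(n_ticks):
    """Return the LED state after each of n_ticks NMI interrupts.

    Hypothetical model of the NMI handler: count ticks in an 8-bit
    counter and toggle the LED each time ONE_SECOND is reached.
    """
    tick_counter = 0
    led = 1  # RESET stores 1 in LED_COPY and LED before looping forever
    history = []
    for _ in range(n_ticks):
        tick_counter = (tick_counter + 1) & 0xFF  # INX wraps at 8 bits
        if tick_counter == ONE_SECOND:
            tick_counter = 0   # Reset TICK_COUNTER
            led ^= 1           # EOR #$1 toggles bit 0
        history.append(led)
    return history


if __name__ == "__main__":
    states = simulate_led(120)
    # Index 59 is the 60th NMI, index 119 is the 120th.
    print(states[58], states[59], states[119])
```

Under the 60 Hz assumption the LED changes state once per second, so a full on/off cycle takes two seconds.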
Losing my Open Source Cred
In a perfect world I would have stayed within the confines of the open source tool set. However, old habits die hard and I must admit that I found myself making use of the Intel / ALTERA tool chain for its debug features. The screen shot below shows how the design was debugged using the Signal Tap II embedded logic analyzer.
I should take a moment now to mention that you can't use the Intel / ALTERA tools to directly debug the OrangeCrab. Stepping out of the open source tool chain required me to retarget the design to the QMTECH Cyclone IV development board.
Don't Use Tri-States for Multiplexors
After ironing out most of my silly design errors I turned my attention to the OrangeCrab once again. Unfortunately the first compile was unsuccessful.
The error indicated that read_address[6] was driven by multiple cells. The read_address signals are in my top level but they originate from the ma_nxt address bus produced inside of bc6502.v.
When I took a closer look I could see that ma_nxt was created using a bunch of tri-state nodes. Here is the original code:
    // ma = vector
    assign ma_nxt = (reset | s_reset | s_reset1 | s_nmi3) ? vec : 16'bz;
    // ma = ma + 1
    assign ma_nxt = (s_reset2 | s_nmi4 | s_jmpi1 | s_ix1 | s_iy1) ? ma + 1 : 16'bz;
    // ma = pc + 1
    assign ma_nxt = ((s_sync & ~any_int) |
                     (s_exec & (absxy | jsr | branch)) |
                     s_rts3) ? pc_reg + 1 : 16'bz;
    // ma = tmp
    // abs,y must take precedence over abs,x
    assign ma_nxt = (s_reset3 | s_nmi5 | s_rti3 | s_rts2 | s_ld_pch |
                     s_iy2 | s_ix2 | s_abs1 ) ? {{di,tmp}+xyz_reg} : 16'bz; // abs : abs,x : abs,y : (zp),y
    // zero page modes
    assign ma_nxt = (s_exec & (zpxy | ix | iy)) ? {8'h00,di + xyz_reg} : 16'bz; // zp : zp,x : zp,y : (zp,x) : (zp),y // all zp modes
    // - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    // all stack modes
    assign ma_nxt[ABW-1:8] = ((s_exec & (brk|psh|pul|rts|rti)) |
                              s_rts1 | s_rti1 | s_rti2 |
                              s_nmi1 | s_nmi2 | s_jsr1 | s_jsr2 |
                              (s_sync & any_int) ) ? 8'h1 : 8'bz;
    // ma = sp
    assign ma_nxt[7:0] = ( s_nmi1 | s_nmi2 | s_jsr1 | s_jsr2 |
                           (s_sync & any_int) |
                           (s_exec & (brk|psh)) ) ? sp_reg : 8'bz;
    assign ma_nxt[7:0] = ((s_exec & (pul|rts|rti)) |
                          s_rts1 | s_rti1 | s_rti2) ? sp_addo : 8'bz;
    // branch instr.
    // ma = pc + (sign extend)disp
    assign ma_nxt = (s_branch & taken) ? {pc_reg + {{8{tmp[DBW-1]}},tmp}} : 16'bz;
    // ma = ma
    assign ma_nxt = ((s_dataFetch|s_update) & grp2m) ? ma : 16'bz;
    // ma = pc
    assign ma_nxt = ((s_branch & ~taken) | s_pul | s_afterWrite |
                     (s_exec & (imm | mop)) |
                     ((s_dataFetch|s_update) & (grp0 | grp1 | grp2x)) ) ? pc_reg : 16'bz;
I eliminated this tristate approach by creating ma_select[10:0] and assigning the conditions for setting each of its bits.
Then ma_select was used in a case statement to select the value of ma_nxt. The revised code is shown below:
    assign ma_select[0] = (reset | s_reset | s_reset1 | s_nmi3);
    // ma = ma + 1
    assign ma_select[1] = (s_reset2 | s_nmi4 | s_jmpi1 | s_ix1 | s_iy1);
    // ma = pc + 1
    assign ma_select[2] = ((s_sync & ~any_int) |
                           (s_exec & (absxy | jsr | branch)) |
                           s_rts3);
    // ma = tmp
    // abs,y must take precedence over abs,x
    assign ma_select[3] = (s_reset3 | s_nmi5 | s_rti3 | s_rts2 | s_ld_pch |
                           s_iy2 | s_ix2 | s_abs1 );
    // zero page modes
    assign ma_select[4] = (s_exec & (zpxy | ix | iy));
    // - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    // all stack modes
    assign ma_select[10] = ((s_exec & (brk|psh|pul|rts|rti)) |
                            s_rts1 | s_rti1 | s_rti2 |
                            s_nmi1 | s_nmi2 | s_jsr1 | s_jsr2 |
                            (s_sync & any_int));
    // ma = sp
    assign ma_select[8] = (s_nmi1 | s_nmi2 | s_jsr1 | s_jsr2 |
                           (s_sync & any_int) |
                           (s_exec & (brk|psh)));
    assign ma_select[9] = ((s_exec & (pul|rts|rti)) |
                           s_rts1 | s_rti1 | s_rti2);
    // branch instr.
    // ma = pc + (sign extend)disp
    assign ma_select[5] = (s_branch & taken);
    // ma = ma
    assign ma_select[6] = ((s_dataFetch|s_update) & grp2m);
    // ma = pc
    assign ma_select[7] = ((s_branch & ~taken) | s_pul | s_afterWrite |
                           (s_exec & (imm | mop)) |
                           ((s_dataFetch|s_update) & (grp0 | grp1 | grp2x)) );

    always @ (*)
    begin
        case (ma_select)
            16'b00000000001: ma_nxt <= vec;
            16'b00000000010: ma_nxt <= ma + 1;
            16'b00000000100: ma_nxt <= pc_reg + 1;
            16'b00000001000: ma_nxt <= {{di,tmp}+xyz_reg};
            16'b00000010000: ma_nxt <= {8'h00,di + xyz_reg};
            16'b00000100000: ma_nxt <= {pc_reg + {{8{tmp[DBW-1]}},tmp}};
            16'b00001000000: ma_nxt <= ma;
            16'b00010000000: ma_nxt <= pc_reg;
            16'b10100000000: ma_nxt <= {8'h1, sp_reg};
            16'b11000000000: ma_nxt <= {8'h1, sp_addo};
            default:         ma_nxt <= ma;
        endcase
    end
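To convince myself that the two styles really are interchangeable, it helps to model them in software. The Python below is a simplified sketch, not a model of the actual core: resolve_tristate and mux_by_select are hypothetical names, each driver is just an (enable, value) pair, and the split high-byte/low-byte stack case is left out. It shows that when exactly one enable is active both approaches pick the same value, and that they differ only in the corner cases (no driver, or several drivers at once).

```python
def resolve_tristate(drivers):
    """Model a shared tri-state net. drivers is a list of (enable, value)."""
    active = [value for enable, value in drivers if enable]
    if not active:
        return None  # nothing drives the net: high impedance
    if len(active) > 1:
        raise ValueError("bus contention: more than one driver enabled")
    return active[0]


def mux_by_select(drivers, default):
    """Select among the same sources with a one-hot vector, like ma_select."""
    select = 0
    for bit, (enable, _) in enumerate(drivers):
        if enable:
            select |= 1 << bit
    for bit, (_, value) in enumerate(drivers):
        if select == 1 << bit:  # exactly this one bit is set
            return value
    return default  # zero bits or several bits set


if __name__ == "__main__":
    # Made-up addresses standing in for vec, ma + 1 and pc_reg.
    drivers = [(False, 0xFFFC), (True, 0x1235), (False, 0x0200)]
    print(hex(resolve_tristate(drivers)), hex(mux_by_select(drivers, 0)))
```

The practical difference is that the multiplexer always produces a defined value through its default branch, whereas the tri-state version leaves the net floating or contended.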
Combinatorial Loops?
After addressing the tri-state issue I was greeted with another compiler error. This one was related to combinatorial loops.
After some googling I discovered that this could have something to do with always blocks set up to construct combinatorial logic. Typically always blocks are used for registers and are triggered on a clock edge. Sometimes though they are used for combinatorial logic. Using a case statement within an always block to create a multiplexor is an example of this.
I began to change the sensitivity lists of all combinatorial always blocks to *.
always @(*)
Even so the compiler error persisted. Fortunately after this change the number of warnings had reduced somewhat.
One of the warnings that remained referred to a signal called 'o'. I traced this to a case statement within the bc6502.v file. I could see that in addition to 'o' the case statement was also used to select the value of 'sc'. The original code appears below.
    always @(ir or d or ci) begin
        case(ir[7:5])
            `ASL: begin
                    o <= {d[DBW-2:0],1'b0};
                    sc <= d[DBW-1];
                  end
            `ROL: begin
                    o <= {d[DBW-2:0],ci};
                    sc <= d[DBW-1];
                  end
            `LSR: begin
                    o <= {1'b0,d[DBW-1:1]};
                    sc <= d[0];
                  end
            `ROR: begin
                    o <= {ci,d[DBW-1:1]};
                    sc <= d[0];
                  end
            // A store does not affect the flags so the output can
            // be set to anything.
            `STX: o <= d;
            // Output needs to be set on a load so the flags can
            // be set.
            `LDX: o <= d;
            `DEC: o <= d - 8'd1;
            `INC: o <= d + 8'd1;
        endcase
    end
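At first glance there was nothing controversial about the code, but 'sc' is only assigned in four of the eight branches. In a combinatorial always block that forces the synthesizer to infer a latch to hold the previous value, which is a likely source of the loop complaint. It also occurred to me that the code could be simplified further.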
Instead of one case statement we could have two; one for each signal. The resulting code is as follows:
    always @(*) begin
        case(ir[7:5])
            `ASL: o <= {d[DBW-2:0],1'b0};
            `ROL: o <= {d[DBW-2:0],ci};
            `LSR: o <= {1'b0,d[DBW-1:1]};
            `ROR: o <= {ci,d[DBW-1:1]};
            `STX: o <= d;
            `LDX: o <= d;
            `DEC: o <= d - 8'd1;
            `INC: o <= d + 8'd1;
            default: o <= d;
        endcase
    end

    always @(*) begin
        case(ir[7:5])
            `ASL: sc <= d[DBW-1];
            `ROL: sc <= d[DBW-1];
            `LSR: sc <= d[0];
            `ROR: sc <= d[0];
            default: sc <= d[0];
        endcase
    end
And what do you know, after this change I was blessed with an error-free compile.
But, did the LED flash when the board was programmed?
The short answer is no.
Initializing On-chip Memory
Thinking back to the OrangeCrab tutorials I seemed to recall that none of the examples covered the use of on-chip memory.
Is it possible that the ROM isn't being initialized with my compiled 6502 code?
When the design was compiled for the ALTERA platform the assembled 6502 code was stored in an Intel format hex file.
The following format was used in verilog to ensure that the ROM was initialized with the .hex file.
(* ram_init_file = "rom_init.hex" *) reg [DATA_WIDTH-1:0] ram[2**ADDRESS_WIDTH-1:0];
It turns out that this attribute appears to be specific to the Intel / ALTERA tools.
The open source tool set instead relies on the $readmemh system task to initialize memory.
Once I figured this out I changed the rom_memory.v file to the following:
    module rom_memory
    #(
        parameter ADDRESS_WIDTH=7,
        parameter DATA_WIDTH=8
    )
    (
        input clk,
        input [ADDRESS_WIDTH-1:0] address,
        output reg [DATA_WIDTH-1:0] q
    );

        // Declare the RAM variable
        reg [DATA_WIDTH-1:0] ram [0:2**ADDRESS_WIDTH-1];

        initial
        begin
            $readmemh ("rom_init.hex", ram);
        end

        always @ (posedge clk)
            q <= ram[address];

    endmodule
But there was still one more piece of the puzzle remaining. The $readmemh system task does not understand Intel format hex records.
Instead it expects a plain text file of whitespace-separated hexadecimal values, so I had to monkey around a bit to convert the file.
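For anyone facing the same conversion, here is a minimal Python sketch of one way to do it. This is not the exact method I used; the function names parse_intel_hex and to_readmemh are made up for illustration, and it assumes an 8-bit wide, byte-addressed ROM. The base and depth arguments are also assumptions, describing where the ROM starts in the hex file's address space and how many entries it holds.

```python
def parse_intel_hex(text):
    """Return {address: byte} for every data record in an Intel HEX string."""
    memory = {}
    upper = 0  # upper address bits set by extended-address records
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"line {line_no}: missing ':' start code")
        record = bytes.fromhex(line[1:])
        if len(record) < 5 or len(record) != record[0] + 5:
            raise ValueError(f"line {line_no}: bad record length")
        if sum(record) & 0xFF != 0:
            raise ValueError(f"line {line_no}: bad checksum")
        count = record[0]
        offset = (record[1] << 8) | record[2]
        rtype = record[3]
        data = record[4:4 + count]
        if rtype == 0x00:    # data record
            for i, value in enumerate(data):
                memory[upper + offset + i] = value
        elif rtype == 0x01:  # end-of-file record
            break
        elif rtype == 0x02:  # extended segment address
            upper = ((data[0] << 8) | data[1]) << 4
        elif rtype == 0x04:  # extended linear address
            upper = ((data[0] << 8) | data[1]) << 16
        # start-address records (types 03 and 05) hold no memory contents
    return memory


def to_readmemh(memory, base, depth, fill=0x00):
    """Render depth bytes starting at base, one two-digit hex value per line."""
    values = [f"{memory.get(base + i, fill):02X}" for i in range(depth)]
    return "\n".join(values) + "\n"


if __name__ == "__main__":
    # Two bytes (SEI, CLD) at address 0, followed by the end-of-file record.
    sample = ":0200000078D8AE\n:00000001FF\n"
    print(to_readmemh(parse_intel_hex(sample), base=0, depth=4), end="")
```

Locations that the hex file never mentions are padded with the fill value, so the output always has exactly depth entries and lines up with the ram array from address 0.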
All the events described thus far may only take a moment to read but in actuality they transpired over the course of a number of days.
Needless to say at this point I could hardly bear any more disappointment so I crossed my fingers as I programmed the latest compiled design.
Lo and behold, we finally got our blinking LED.
What joy. What relief.
Although there were a number of challenges, this experience helped me to learn about the quirks of the open source tool chain.
Not only that, I persevered and solved each problem that emerged.
Would you believe that the boards arrived on the very day that I finally got the blinking LED design to work?
Sometimes life is funny like that.
We'll take a closer look at the board in the next blog.