Gradient Filter implementation on FPGA : Part 3 Debugging in HDL

16 Jun 2015

This blog is part 3 of a 4 part series on implementing a gradient filter on an FPGA. If you have not already read the earlier parts, see the links below to get up to speed before reading this one. You can also catch up on some of our previous blog posts, linked below.

Part 1 and 2 of this blog series

Gradient Filter implementation on an FPGA - Part 1 Interfacing an FPGA with a camera
Gradient Filter implementation on FPGA : Part 2 Implementing gradient Filter

Other FPGA blogs by ValentF(x)

In the previous two parts, we designed modules to interface a camera and then created a gradient filter on the FPGA. One key aspect of using an FPGA is that designs need to be valid by construction. When writing software, it's fairly easy to write a buggy first version of an application and then debug it with a step-by-step debugger or IO (prints on serial, or LEDs) until it works. On hardware/FPGA you can easily write a hardware description that compiles/synthesizes well but does not work. When this happens you are left with two options:

Use a logic analyzer, either physical (a costly piece of equipment) or soft (a logic analyzer you add to your design in the FPGA) and debug your design outputs.
Re-write everything hoping for the best

The best approach when writing HDL is to design a test for every component you create (even if your component is only a structure of already tested components, you should still write a test for it). This test is implemented as a test-bench. A test-bench is a specific HDL component that cannot be synthesized but that can be executed in a simulated environment. The test-bench generates input signals (test vectors) for the device to be tested (Unit Under Test, UUT) and gathers the outputs.

Fig 1 : A test-bench used to consist of the device physically connected to test equipment. In HDL all of this is simulated on the designer’s computer.

The test-bench can be instrumented to automatically validate the device under test by comparing the outputs for a given set of inputs to a reference (Unit Testing). Test-benches can also be used to test the device during its lifetime to make sure it still complies with its initial specification when the designer makes changes to it or to one of its sub-components (Regression Testing). Because it is impossible to generate all combinations of test inputs, it is very important to make sure that the chosen set covers most of the cases (test-coverage).

Fig 2 : Minimal HDL development flow

A test-bench is an independent design, and writing a test can sometimes take more time than writing the component itself. A well-designed test will save you a lot of time when it comes to loading your design onto the device and will help you better understand your component's behavior.

In the test-bench the input signal can be generated using the usual VHDL syntax plus an extra set of non-synthesizable functions, mainly for handling timing aspects and IOs. The TextIO package provides an interesting set of functions for handling file inputs/outputs to allow reading/writing values from/to files.
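As a rough illustration of TextIO, the sketch below reads two integers per line from a text file and prints their product to the simulator console. The file name vectors.txt and the entity name textio_demo_tb are assumptions made for this example and are not part of the gradient filter project. The file must exist in the simulator's working directory, otherwise the simulation will stop with a file-open error.

```vhdl
use std.textio.all;

entity textio_demo_tb is
end entity;

architecture sim of textio_demo_tb is
begin
  read_vectors : process
    -- "vectors.txt" is a hypothetical file holding two integers per line,
    -- separated by whitespace (for example: 224 3967)
    file vec_file    : text open read_mode is "vectors.txt";
    variable in_row  : line;
    variable out_row : line;
    variable a_val   : integer;
    variable b_val   : integer;
  begin
    while not endfile(vec_file) loop
      readline(vec_file, in_row);   -- fetch one line of text
      read(in_row, a_val);          -- parse the first integer
      read(in_row, b_val);          -- parse the second integer
      -- echo the product to the simulator console
      write(out_row, string'("a*b = "));
      write(out_row, a_val * b_val);
      writeline(output, out_row);
    end loop;
    wait;                           -- suspend forever once the file is consumed
  end process;
end architecture;
```

In a real test-bench the parsed values would be converted with to_signed and driven onto the UUT inputs instead of being printed.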

The test-bench can then be executed by a simulator (ModelSim, ISim - Xilinx’s free simulator, GHDL, etc). The simulator interprets your VHDL and simulates the behavior of the FPGA. This simulation can be either functional or timed. A timed simulation takes the propagation time in the logic into account while a functional simulation does not. Because the simulator has to emulate the logic you've written, the simulation can take very long. For example, in the next blog post we will write a test-bench for the gradient filter that processes a QVGA image (320x240 pixels); this simulation takes ~30 min to complete. On bigger systems, the simulation time can be well into the range of hours (for regression testing and unit testing, you'd better run these at night). The simulation process is part of what makes HDL development time very long compared to software. For example, when you have an error in your design, it usually takes a minute to fix in the HDL but minutes/hours to validate the fix. If you compare this with the usual software development techniques, you'll understand why it is so important to think your design through before implementing it.

In the following we will design a test for the gradient filter component we designed in Part 2 of this blog series. This test-bench will be implemented in VHDL and simulated using ISE’s integrated simulator, ISim (comes for free with the web edition).

Basic testing : Testing the arithmetic part of the Sobel filter for X gradient values

In this first part of the testing we will consider the arithmetic part of the Sobel filter, which convolves the pixel window with the Sobel kernel (the generic convolution, before optimization using DSP blocks). At the heart of the convolution is a Multiply And Accumulate (MAC) operation that multiplies two 16-bit inputs and adds the product to the previous output to generate a 32-bit result. In the following we will test this simple component. The test will simply stimulate the design with static values so that we can look for potential bugs in the calculated values.

Generating the test-bench skeleton for the unit under test

ISE comes with a nice feature to auto-generate a test-bench template for a specific component. This frees the designer from the hassle of writing the signal declarations and the component instantiation, so they can concentrate on the test behavior. To do so, right click in the file navigator and select “New Source”. In the wizard, select “VHDL Test Bench”, fill in the filename and location, then click “Next”. In the next window select the component to test (the component must be part of your project) and click “Finish”. Beware that if your component has syntax errors, the generated file won’t be valid. To check the syntax, select your component file in the project navigator and click on “Check Syntax” in the process panel.

Once generated, the test-bench is composed of four parts :

1. Signal, constant and component declarations
2. Component instantiations and wiring
3. Clock generation
4. Stimuli generation

Parts 1, 2, 3 are auto-generated. ISE auto-detects the system clocks (based on the signal names) and by default generates each clock in a separate process. The clock frequency can be tweaked by setting the constant <clock name>_period. The process looks like this :

clk_process : process
begin
    clk <= '0';
    wait for clk_period/2;
    clk <= '1';
    wait for clk_period/2;
end process;

This process runs endlessly and does the following :

Sets the clock signal to low
Waits for half the clock period. Note that this wait statement is one of the non-synthesizable statements of VHDL
Sets the clock signal to high
Waits for half the clock period

This process generates a square wave of the configured frequency on the clock signal.

Part 4 is partially generated with comments to help you understand where to write your test code.

stim_proc: process
begin
    -- hold reset state for 100 ns.
    wait for 100 ns;
    wait for clk_period*10;
    -- insert stimulus here
    wait;
end process;

The first part deals with the system reset. Drive the reset signal of your UUT active at the start of the process to force the system into reset, then set it inactive just after the “wait for 100 ns ;”. The test then does nothing for 10 clock cycles, and the fun part starts at “-- insert stimulus here”.

Your stimulus is the sequence of inputs that test the unit. The inputs are generated using the usual HDL assignment operators, and they are sequenced using the wait statement. The wait statement can be used either with a duration expressed in time units (picoseconds, nanoseconds, ...) or with a boolean condition using the until clause :

wait for 10 ns ;
wait until clk = '1' ;
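The following self-contained sketch exercises the common forms of the wait statement side by side. The entity name wait_forms_tb, the ready signal and the 72 ns delay are made up for this illustration. The clock here runs forever, so the sketch assumes the simulator is run for a fixed duration (1 us or more is enough).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity wait_forms_tb is
end entity;

architecture sim of wait_forms_tb is
  constant clk_period : time := 10 ns;
  signal clk   : std_logic := '0';
  signal ready : std_logic := '0';   -- hypothetical handshake signal
begin
  -- free-running clock and a flag that rises once during the run
  clk   <= not clk after clk_period/2;
  ready <= '1' after 72 ns;

  demo : process
  begin
    wait for 3 * clk_period;           -- fixed delay
    wait until rising_edge(clk);       -- resume on the next rising edge
    wait until ready = '1' for 1 us;   -- condition, with a 1 us timeout
    assert ready = '1'
      report "timed out waiting for ready" severity error;
    wait on clk;                       -- resume on any event on clk
    report "all wait forms exercised";
    wait;                              -- suspend this process forever
  end process;
end architecture;
```

Note that "wait until" only re-evaluates its condition when one of the signals in the condition changes, so a condition that is already true when the statement is reached does not let the process continue immediately. The timeout form guards against a test-bench that would otherwise hang.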

Testing MAC16

We have generated the test-bench template for MAC16, now let’s write the test process. We will first write a simple test that will stimulate the MAC16 with two simple values.

stim_proc: process
begin
    -- hold reset state for 100 ns.
    reset <= '1';
    wait for 100 ns;
    reset <= '0';
    wait for clk_period*10;
    -- insert stimulus here
    A <= to_signed(224,16);
    B <= to_signed(3967,16);
    add_subb <= '1' ;
    wait;
end process;

After writing the test process, click the “Simulation” check-box in the project navigator window, then select the test bench file and click “Simulate Behavioral” in the process window.

If your test-bench contains no errors, this will launch the ISim tool. After a bit of time you should end up with the following window.

Use the zoom-out button and the horizontal scroll-bar to get to the beginning of the simulation with an appropriate scale (you should see the clock edges).

To set the signals display format, right click on the “a[15:0]” signal, select “Radix” and “Signed decimal”. Do the same for “b[15:0]” and “res[31:0]”. You should now have the following trace.

If you zoom in on the “res[31:0]” signal between 200ns and 250ns you get the following sequence of results.

888608, 1777216, 2665824, 3554432

As we know the expected behavior of the MAC we can check the result validity :

224*3967 = 888608 -> 888608 + (224*3967) = 1777216 -> 1777216 + (224*3967) = 2665824 …
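The same hand check can be written as a tiny stand-alone simulation. This is only a sketch to confirm the arithmetic: it does not instantiate MAC16, and the entity name mac_sequence_check and the observed constant (copied from the trace values listed above) are introduced here for illustration.

```vhdl
entity mac_sequence_check is
end entity;

architecture sim of mac_sequence_check is
  type int_vector is array (natural range <>) of integer;
  -- values read from the res trace between 200 ns and 250 ns
  constant observed : int_vector(1 to 4) :=
    (888608, 1777216, 2665824, 3554432);
begin
  check : process
  begin
    -- with constant inputs, the k-th accumulated result is k * A * B
    for k in observed'range loop
      assert observed(k) = k * 224 * 3967
        report "unexpected value at step " & integer'image(k)
        severity failure;
    end loop;
    report "sequence matches k * 224 * 3967";
    wait;
  end process;
end architecture;
```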

At this point if something fails in your design, you can go back to ISE, edit your file and then in ISim press the relaunch button to restart the simulation as in the following image.

Reporting errors

Now that we know that the design works, we can improve the test to automatically report errors. The “assert” statement allows us to report warnings/errors/failures to the designer from the simulation. This report will then help the designer to spot exactly where the problem occurs. In our case we will report a failure if the result of the first MAC cycle differs from what is expected.

stim_proc: process
begin
    -- hold reset state for 100 ns.
    reset <= '1';
    wait for 100 ns;
    reset <= '0';
    wait for clk_period*10;
    -- insert stimulus here
    A <= to_signed(224,16);
    B <= to_signed(3967,16);
    add_subb <= '1' ;
    -- MAC16 has a latency of two clock cycles, so wait two periods
    -- before checking the first result
    wait for clk_period*2 ;
    ASSERT res = (224*3967) REPORT "Result does not match what is expected" SEVERITY FAILURE;
    wait;
end process;

In this process if the result is different from the expected result, the simulation will stop. A less critical report would be ERROR or WARNING (won’t stop the simulation) and NOTE would just inform the user. The report message will be printed in the simulator console window.
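To see the four severity levels next to each other, a minimal sketch could look like the one below. The entity name severity_demo_tb and the two values being compared are invented for the example. Which level actually halts the run is a simulator setting; stopping on FAILURE is the usual default, but this is worth checking in your own tool.

```vhdl
entity severity_demo_tb is
end entity;

architecture sim of severity_demo_tb is
begin
  demo : process
    constant expected : integer := 42;
    variable measured : integer := 41;  -- deliberately wrong value
  begin
    -- every assert below fires because measured /= expected
    assert measured = expected
      report "informational message only" severity note;
    assert measured = expected
      report "suspicious value " & integer'image(measured) severity warning;
    assert measured = expected
      report "wrong value, simulation usually continues" severity error;
    assert measured = expected
      report "wrong value, simulation usually stops here" severity failure;
    -- only reached if the simulator is configured not to stop on failure
    report "end of severity demo";
    wait;
  end process;
end architecture;
```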

Test-vectors

So far in our test we have only tested the behavior of the MAC16 component for a single pair of values, and we validated the sequence of results by hand. To create a better test that covers more cases, we need to create an input test vector, that is a sequence of inputs to apply to the module, and an output test vector that holds the expected results for that input sequence. These vectors can either be created as a file to be read by the simulation using the TextIO package or directly coded in the test-bench. For the purposes of this blog post we will implement the second method (the first method is better for large tests).

First we need to declare the array vector types for our inputs and outputs:

type input_vector_operand_type is array(natural range <>) of signed(15 downto 0);
type output_vector_res_type is array(natural range <>) of integer;

Then we need to create the input vectors and expected outputs as follows:

-- test vectors
constant a_vector : input_vector_operand_type(0 to 5) := (
    to_signed(0, 16),
    to_signed(256, 16),
    to_signed(-64, 16),
    to_signed(16, 16),
    to_signed(0, 16),
    to_signed(0, 16)
);

constant b_vector : input_vector_operand_type(0 to 5) := (
    to_signed(1034, 16),
    to_signed(-1, 16),
    to_signed(-89, 16),
    to_signed(32000, 16),
    to_signed(0, 16),
    to_signed(0, 16)
);

-- six expected values, one per loop iteration
constant res_vector : output_vector_res_type(0 to 5) := (
    0,                                 -- pipeline latency, cycle 1
    0,                                 -- pipeline latency, cycle 2
    0,                                 -- 0*1034
    -256,                              -- + 256*(-1)
    ((-89)*(-64))+(-256),
    (16*32000)+((-89)*(-64))+(-256)
);

For the results, the two initial 0 values account for the pipeline of the MAC16 component: it has a latency of two clock cycles before a change on the inputs impacts the output.

Then we have to write the process that scans those vectors and reports the errors/failures using assert.

stim_proc: process
begin
    -- hold reset state for 100 ns.
    reset <= '1';
    wait for 100 ns;
    reset <= '0';
    wait for clk_period*10;
    -- insert stimulus here
    for i in 0 to 5 loop
        A <= a_vector(i);
        B <= b_vector(i);
        add_subb <= '1' ;
        ASSERT res = res_vector(i) REPORT "Result does not match what is expected "&integer'IMAGE(res_vector(i))&" != "&integer'IMAGE(to_integer(res)) SEVERITY FAILURE;
        wait until falling_edge(clk) ;
    end loop ;
    wait;
end process;

The for loop iterates over the range of the test vectors and for each set of inputs, the result of the MAC16 is tested. If the result does not match the assert condition, the simulation will fail and indicate what went wrong.
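One way to avoid hand-computing the expected values is to let the test-bench derive them from the inputs and the known latency. The sketch below does this against mac16_model, a hypothetical behavioral stand-in written for this example (a multiply register followed by an accumulate register). It is not the MAC16 from Part 2, and its port names are only assumed to match. The LATENCY constant, the done flag and the integer vectors are likewise assumptions of this sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical stand-in for MAC16: two registers, so two cycles of latency
entity mac16_model is
  port (
    clk, reset : in  std_logic;
    A, B       : in  signed(15 downto 0);
    add_subb   : in  std_logic;
    res        : out signed(31 downto 0)
  );
end entity;

architecture behavioral of mac16_model is
  signal product : signed(31 downto 0) := (others => '0');
  signal acc     : signed(31 downto 0) := (others => '0');
begin
  process (clk, reset)
  begin
    if reset = '1' then
      product <= (others => '0');
      acc     <= (others => '0');
    elsif rising_edge(clk) then
      product <= A * B;                 -- stage 1: multiply
      if add_subb = '1' then
        acc <= acc + product;           -- stage 2: accumulate
      else
        acc <= acc - product;
      end if;
    end if;
  end process;
  res <= acc;
end architecture;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mac16_model_tb is
end entity;

architecture sim of mac16_model_tb is
  constant clk_period : time    := 10 ns;
  constant LATENCY    : natural := 2;   -- cycles from input to output
  type int_vector is array (natural range <>) of integer;
  constant a_vals : int_vector(0 to 5) := (0, 256, -64, 16, 0, 0);
  constant b_vals : int_vector(0 to 5) := (1034, -1, -89, 32000, 0, 0);
  signal clk      : std_logic := '0';
  signal reset    : std_logic := '1';
  signal A, B     : signed(15 downto 0) := (others => '0');
  signal add_subb : std_logic := '1';
  signal res      : signed(31 downto 0);
  signal done     : boolean := false;
begin
  uut : entity work.mac16_model
    port map (clk => clk, reset => reset, A => A, B => B,
              add_subb => add_subb, res => res);

  -- the clock stops toggling once the stimulus process sets done
  clk <= not clk after clk_period/2 when not done else '0';

  stim : process
    variable expected : integer;
  begin
    wait for 100 ns;
    reset <= '0';
    wait until falling_edge(clk);
    for i in 0 to a_vals'high + LATENCY loop
      -- drive the next input pair, then zeros while the pipeline drains
      if i <= a_vals'high then
        A <= to_signed(a_vals(i), 16);
        B <= to_signed(b_vals(i), 16);
      else
        A <= (others => '0');
        B <= (others => '0');
      end if;
      -- res currently holds every product applied at least LATENCY cycles ago
      expected := 0;
      for j in 0 to i - LATENCY loop
        expected := expected + a_vals(j) * b_vals(j);
      end loop;
      assert res = expected
        report "Expected " & integer'image(expected) &
               ", got " & integer'image(to_integer(res))
        severity failure;
      wait until falling_edge(clk);
    end loop;
    report "all checks passed";
    done <= true;
    wait;
  end process;
end architecture;
```

With this structure, adding a test case only means appending to a_vals and b_vals. The trade-off is that the expected-value logic duplicates the behavior under test, so a misunderstanding of the specification can end up in both places.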

Now that the base module of our convolution filter has been proven to work, the other components of the Sobel filter must be tested as well, after which we can plan to test the full gradient filter. Testing the filter using hand-designed test vectors can be very painful considering the amount of data that must be generated to test a whole image. In this case debugging at a higher level is a better solution and also allows us to evaluate the quality of the filter.

Testing the Sobel filter using images will be the topic of the next blog post.

This work is licensed to ValentF(x) under a Creative Commons Attribution 4.0 International License.

peepo over 9 years ago

This is a really excellent tutorial as I worked my way through,

but had trouble right at the end: http://peepo.com/media/xilinx_test_bed.png

the simulator finished, but I cannot immediately, clearly and simply reconcile the result with the inputs.

perhaps I missed something?

or perhaps you could add a description?

or maybe I will understand further in the morning?

~:"