Question about setup and hold times on iCE40 FPGA

To all my FPGA folks out there :slight_smile:

I’m working on understanding metastability, and I know that it comes down to setup and hold time violations. I’ve been using the iCEstick, which has an iCE40HX1K FPGA (here is the datasheet: https://www.latticesemi.com/view_document?document_id=49312).

I’m a little confused about the setup/hold times listed in the datasheet:

I have a few questions:

  1. I only see these times listed for the iCE40HX1K (as well as setup/hold times for the SPI block, but I’m not worried about that). From what I can tell, these list the required setup/hold times when sampling an external signal coming in from a pin. How can I find the setup/hold times for internally routed signals that don’t go through a pin? Or do I assume that the setup/hold times are negligible for internally routed signals?
  2. Why are there negative setup times with no PLL and negative hold times with PLL? Does a -0.23 ns setup time mean that the input signal can transition after the sampling point on the D flip-flop (e.g. the clock edge)?

1: Internal signals from one flip-flop to another inside the chip don’t have datasheet numbers on the iCE40, but setup and hold should be accounted for by the software timing analysis (either Lattice iCEcube2 or the open source nextpnr), assuming both flip-flops share the same clock source. Lattice documentation is minimal.

2: Without the PLL, there is a significant delay through the global buffer so the clock edge arrives at the input flipflops a nanosecond or so after arriving at the clock input pin. The data only needs to be valid when that clock pulse arrives hence the negative setup implying the data doesn’t have to be valid until after the clock edge. The situation is similar on Xilinx, etc.

With the PLL, I’m not so sure given the lack of datasheet detail. It should be aligning the output of the GCLK buffer with the input clock at the pin. Of course, you could use the phase adjust feature of the PLL to move the sampling point and shift setup/hold.
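
As a rough illustration of the arithmetic behind those signs, here is a minimal Python sketch. It refers a flip-flop's own setup/hold window back to the package pins by adding the data-path delay and subtracting the clock-path delay. Every number below is a made-up placeholder rather than a datasheet value, the function name is hypothetical, and a real analysis would use separate min/max delays for each corner.

```python
def pin_setup_hold(t_su_ff, t_h_ff, data_delay, clk_delay):
    """Refer a flip-flop's setup/hold requirement to the package pins.

    All times are in ns. data_delay is pin-to-D, clk_delay is pin-to-CLK.
    A later-arriving clock reduces the setup requirement seen at the pins
    and increases the hold requirement by the same amount.
    """
    setup_at_pin = t_su_ff + data_delay - clk_delay
    hold_at_pin = t_h_ff + clk_delay - data_delay
    return setup_at_pin, hold_at_pin


# Placeholder values only (NOT taken from the iCE40 datasheet).
# Without a PLL: the global buffer delays the clock more than the data path.
no_pll = pin_setup_hold(t_su_ff=0.2, t_h_ff=0.1, data_delay=1.0, clk_delay=1.5)

# With a PLL assumed to cancel the clock insertion delay completely.
with_pll = pin_setup_hold(t_su_ff=0.2, t_h_ff=0.1, data_delay=1.0, clk_delay=0.0)

for name, (setup, hold) in (("no PLL", no_pll), ("with PLL", with_pll)):
    print(f"{name}: setup {setup:+.2f} ns, hold {hold:+.2f} ns")
```

With these placeholder delays the no-PLL case comes out with a negative setup time (the clock reaches the flop later than the data does), and the compensated case flips to a negative hold time, which is the pattern asked about in question 2.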

That helps, thank you @dlharmon! I’ll just have to mention that the iCE40 datasheet is not real clear on the setup and hold times for internal signals. I guess the user can somewhat trust the PNR tool to route things effectively to avoid violating setup/hold times for internal signals(?)

Separate but related question: is there a simple way to demonstrate metastability?

I’ve created a basic design that samples a pin, registers the value, and outputs the same value to another pin. The D flip-flop is clocked on an internally-generated 1 kHz clock. I then generate a 200.5 Hz waveform (with obvious rise and fall times) that’s fed to the input pin. I trigger my scope on the input signal (blue). The output signal (yellow) shifts all around and changes its duty cycle. I’m guessing that this is mostly just drift, but would metastability be visible on the output signal? Can you view it with an output pin, or does connecting a pin to the output of a flip-flop alleviate metastability issues?
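
One thing worth separating from metastability here is plain sampling quantization: a flip-flop clocked at 1 kHz can only move its output on a 1 ms grid, while a 200.5 Hz input has a period of about 4.99 ms. The Python sketch below is an idealized model (perfect 50% square wave, no rise time, no metastability, hypothetical names) that counts how many clock ticks the registered output would stay high or low.

```python
from itertools import groupby

F_CLK_HZ = 1000              # sampling clock from the post
F_IN_NUM, F_IN_DEN = 401, 2  # 200.5 Hz written as 401/2 to keep the math exact


def sampled_levels(n_samples):
    """Level of an ideal 50% square wave as seen at each clock edge."""
    denom = F_CLK_HZ * F_IN_DEN  # input phase is counted in 1/2000 of a cycle
    return [1 if (n * F_IN_NUM) % denom < denom // 2 else 0
            for n in range(n_samples)]


# The sampled pattern repeats exactly every 2000 clock ticks (2 seconds).
levels = sampled_levels(2000)
runs = [(level, len(list(group))) for level, group in groupby(levels)]

high_runs = sorted({length for level, length in runs if level == 1})
low_runs = sorted({length for level, length in runs if level == 0})

print("output high for (clock ticks):", high_runs)
print("output low for (clock ticks): ", low_runs)
```

In this model both the high and low times come out as either 2 or 3 clock ticks, so the output edge wanders by up to a full clock period relative to the input and the duty cycle changes, even with a flip-flop that never goes metastable.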

For modern devices metastability results in longer-than-spec’d settling times (you can’t see the “analog” metastable output easily), but it’s only just longer than spec’d. The problem is this will cause a design to screw up when it “should” work of course… but if you’re talking kHz you might not see anything at all.

I had an article/blog post where I demo’d this on a Spartan 6, see https://colinoflynn.com/2020/12/experimenting-with-metastability-and-multiple-clocks-on-fpgas/ . The short version is this figure:

Top = Normal
Middle = Metastable, but lower VCC-INT (exaggerates effect)
Bottom = metastable, normal VCC-INT
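
To put a rough number on why kHz-rate signals may show nothing: for two unrelated signals, the rate at which a data edge lands inside a small window around a clock edge is roughly window × clock frequency × data edge rate. The Python sketch below applies that estimate. The 100 ps window is an assumed placeholder (not an iCE40 or Spartan 6 figure), the function name and the "fast" case are hypothetical, and only a fraction of such near-misses would produce a visibly slow output.

```python
def window_hits_per_second(window_s, f_clk_hz, data_edges_per_s):
    """Approximate rate of data edges landing within window_s of a clock edge.

    Valid for unrelated (asynchronous) signals and a window much shorter
    than the clock period: each data edge hits with probability
    window_s * f_clk_hz.
    """
    return window_s * f_clk_hz * data_edges_per_s


ASSUMED_WINDOW_S = 100e-12  # placeholder vulnerable window, not a datasheet value

# The slow experiment from the thread: 1 kHz clock, 200.5 Hz input (2 edges/cycle).
slow = window_hits_per_second(ASSUMED_WINDOW_S, 1e3, 2 * 200.5)

# A hypothetical faster pair of unrelated signals for comparison.
fast = window_hits_per_second(ASSUMED_WINDOW_S, 100e6, 10e6)

print(f"slow case: {slow * 3600:.3f} hits per hour")
print(f"fast case: {fast:.0f} hits per second")
```

Under these assumptions the 1 kHz setup would put an edge inside the window only about once every several hours, which is consistent with seeing nothing on a scope at those rates.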

@coflynn Wow! That has to be the best article I’ve read on the subject, thank you for the excellent writing and real-life demos. I’ll see if I can get my iCE40 to do something similar. I don’t think I’ve got those nifty delay blocks, but I might be able to use an external waveform generator to accomplish the same thing.

I have written on the subject of metastability here

My article on metastables

With regard to your original question, the general rule of thumb for designers of FPGAs (at Xilinx, and maybe at other FPGA vendors) is that if the source and destination flip flops use the same clock signal, on the dedicated clock nets, you will never violate the hold time requirement of the destination flip flop. In other words, the design of the FPGA guarantees this, provided you follow the global clock net rule.

Violating setup time is a function of the path delay that starts at the source flip flop (clock to output), followed by any non-clocked logic and routing delay, and finally the setup time for the destination flip flop. It is the responsibility of the place and route software’s timing analyzer to find all such paths and report the worst. This leads to the resultant max clock rate for the placed and routed design. Most FPGA P&R software lets you specify the desired clock frequency, thus letting the P&R software know how hard it needs to work to meet the requirements. When there are multiple clock domains, they need to be analyzed independently (except when the different clocks are directly related). It is the FPGA user’s responsibility to deal with signals that connect the domains, and Colin’s article covers this very well. It is worth noting that the magic FIFO that has an input clock and output clock must internally deal with clock domain crossing for the flag logic and control signals. So that means that even FIFO control logic can go metastable, and so MTBF at the system level depends on how well the FIFO logic was designed. At least the problem has been encapsulated.

So for FPGA users, the best practice is to use the global clock nets for all logic, and if at all possible, have everything clocked off one master clock. This should deal with hold time. Setup time depends on how much logic and routing you have on the slowest path between any two flip flops. The P&R timing analyzer should report this.
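
The setup-time budget described above can be sketched in a few lines of Python. This is only an illustration of the sum a timing analyzer evaluates per path; the delay values are placeholders rather than iCE40 figures, and the function names are hypothetical.

```python
def path_delay_ns(t_clk_to_q_ns, t_logic_routing_ns, t_setup_ns):
    """Register-to-register delay: clock-to-out + logic/routing + setup."""
    return t_clk_to_q_ns + t_logic_routing_ns + t_setup_ns


def setup_slack_ns(clock_mhz, delay_ns):
    """Positive slack means the path meets timing at this clock rate."""
    return 1000.0 / clock_mhz - delay_ns


def fmax_mhz(delay_ns):
    """Highest clock rate the slowest path allows."""
    return 1000.0 / delay_ns


# Placeholder delays for the slowest path (NOT datasheet values).
worst_path_ns = path_delay_ns(t_clk_to_q_ns=0.5,
                              t_logic_routing_ns=6.0,
                              t_setup_ns=0.5)
slack_at_12mhz_ns = setup_slack_ns(clock_mhz=12.0, delay_ns=worst_path_ns)

print(f"worst path: {worst_path_ns:.2f} ns")
print(f"slack at 12 MHz: {slack_at_12mhz_ns:.2f} ns")
print(f"Fmax: {fmax_mhz(worst_path_ns):.1f} MHz")
```

The real tools do this for every register-to-register path and report the smallest slack, so the sketch is only meant to show where the reported maximum frequency comes from.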

@coflynn I tried to recreate your test. I have limited equipment: Analog Discovery 2 and a 200 MHz BW scope. I used the AD2 to output 2 squarewaves: 1 MHz (sig_in) and 2 MHz (clock). I can phase shift either to violate the setup and hold times of a flip flop on my FPGA.

Here’s what I saw (yellow: clock, green: sig_in, blue: sig_out):

When setup and hold times are violated, the sig_out (blue) rising edge is delayed and has a lot more randomness to the transition, which tracks with your experiment.

I then implemented the best-practice solution, which is to double-register the input:

module metastability_test (

    // Inputs
    input               clk,
    input               sig_in,

    // Outputs
    output  reg         sig_out = 0
);

    // First synchronizer stage: the only flop that samples the asynchronous input
    reg     pipe_0 = 0;

    // Two-stage synchronizer: pipe_0 may go metastable, sig_out samples it one clock later
    always @ (posedge clk) begin
        pipe_0 <= sig_in;
        sig_out <= pipe_0;
    end

endmodule

I re-ran the test, and I saw the same results. The sig_out line still showed the same amount of delay and randomness. Is that to be expected? I would have thought that double-registering the input would have reduced the amount of randomness produced by metastability. Is there something I’m missing here?

There is a lot going on in that trace. Using the eyeball-fudge method, that trace is really noisy looking for such a low speed. Just a few off the cuff thoughts:

Given the visual appearance of the clock, this is perhaps just (or also) showing clock-noise-induced jitter and not illustrating the intrinsic metastability of a FF in this configuration. There are at least two effects involved: actual clock jitter, and time spent at an indeterminate logic voltage level, which brings in noise from other sources (e.g. supply noise).

You have an external clock operating the logic. I know this is part of the “illustrate metastability” test, but depending on who you ask, this is not a valid logic clock. The jitter you see likely has multiple conspiring culprits. Many “long distance” (that is, wires or cables are involved) source-synchronous designs would sample both the external clock and data lines and would likely also filter them. The following article is about button debouncing, but it actually covers the general issue at hand quite well: https://zipcpu.com/blog/2017/08/04/debouncing.html (“debouncing”, despite its name, usually has more to do with time spent noisily floating through an indeterminate logic level than it does with “physical contacts bouncing”).

Because you are using an external clock, the implementation likely generated from the HDL you posted would create an output phase jitter that is twice the clock jitter.

Thanks for the heads up. I may not have the equipment to demonstrate metastability, as the clock signals are really not clean. I might have to point to Colin’s page and call it good enough.

I’d also guess it’s maybe some trigger jitter issues there. Before totally giving up you could try seeing if you could hack some better probes onto the signals? Perhaps even some BNC cables could give you better traces.

But the blue trace looks pretty good - so I’m guessing you can get a good signal out of the ice40 board? If so maybe try also routing the “input clock” to an output pin. That might give you a better trace? And in theory then you are seeing what the board is seeing, so perhaps it’s a better test!

Unrelated to this I was just doing some stuff with an ice40 board, so I can check to see if I’ve got enough “setup” around tomorrow that I could see if I can confirm what the measurements look like. At least to confirm if you’d see something with your scope or not! I’m not sure my signal gen can do what is needed (before I used all on-board blocks) though so no promises I’ll be useful.

Seconding @coflynn here. Your equipment should be able to show this effect. Delivering the signals via coax and getting some shorter probing lengths (shorter ground loop length) could clean the input signals up in the trace enough to illustrate the effect better.

Also +1 to routing the input clock to an output pin. This should allow you to differentiate (to some degree) noisy probing from the metastability of the logic.

OK so I ran a quick test on my iCE40 board (iCE40 Ultra Breakout Board) and you should be able to get it working. My setup for ref:

The trick is two things:

  1. Setting VCC down to the minimum you can.
  2. You can quickly find an area where the output doesn’t transition at all. This area seems to be pretty large, from there you can tune to the metastable demo.

The ‘normal’ looks like this (trace 2 is clock @ 2MHz, trace 4 is data @ 1MHz, with adjustable phase shift between them from signal gen):

The ‘dead output’ looks roughly like this (notice a handful of transitions on the lower blue output trace 1):

Finally if you get the phase just right (I had to go in ~0.01 degree increments):

That should show up on a 200 MHz scope OK. But set VCC lower! All of the above was with VCC=0.711V.

I basically found you had to program it at a higher (normal, ~1.2V) VCC, then drop the voltage until you see the output stop transitioning. Then go just high enough that it still works; that is where you run everything. If you go too low it will dump the config RAM, but I found that happened at a lower voltage than where the outputs stop working.

Once you get it working, sweep the phase around more coarsely until you see the output stop. That was pretty easy to find, as it had like a 1 degree range over which it happened. Then you can fine-tune the phase until you see the metastable effects.
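
For a sense of how small those phase steps are in time, here is a minimal Python conversion. Which channel the generator's phase offset is referenced to isn't stated above, so the sketch simply evaluates both the 1 MHz and 2 MHz cases; the function name is hypothetical.

```python
def phase_step_to_seconds(phase_deg, freq_hz):
    """Time shift produced by a phase offset on a signal of the given frequency."""
    return (phase_deg / 360.0) / freq_hz


for freq_hz in (1e6, 2e6):        # data and clock frequencies from the test
    for step_deg in (1.0, 0.01):  # coarse range and fine step mentioned above
        dt = phase_step_to_seconds(step_deg, freq_hz)
        print(f"{step_deg:g} deg at {freq_hz / 1e6:g} MHz = {dt * 1e12:.1f} ps")
```

So the roughly 1 degree dead zone corresponds to a window of a few nanoseconds at most, and the 0.01 degree steps are time shifts in the tens of picoseconds.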

(Unrelated to this I’m working on something running the ice40 at different voltages to characterize propagation delays, so basically I had the setup “almost ready” to do this experiment by chance today, so I was much more helpful than usual…)

@coflynn This is fantastic, thank you! I’ve replicated your setup, but I ran into a wall: I can’t provide a variable VCC power source to the IceStick.

It seems that the IceStick relies on an FTDI FT2232H to provide 3.3V and 1.8V, and to create the 12 MHz clock signal from a 12 MHz crystal. From what I can tell, that FTDI part needs 5V from the USB lines to operate, and I have no way to isolate its 3.3V output from the rest of the circuit.

I might just have to point people to your post and explain why I can’t replicate the setup.

@ShawnHymel Ah shoot was so close! Also my VCC comment was just on setting what I would call the VCC-INT (normally 1.2V supply on the ice40 ultra chip), I left the I/O voltages as normal. But on my board it’s very easy to isolate (it has a resistor you remove) and then feed in an external supply… which is what I was using the board for already (changing that core voltage). So if the IceStick is instead sharing that net it does sound like a hard block (or much larger hassle at least). Unfortunately the PCB designer routing nice clean power nets means they are very hard to hack onto (internal layers or planes)!

It may work at the regular voltage but it’s probably very very finicky by then… and it’s pretty finicky even at the optimal (low) voltage, so I’d imagine frustrating. If useful feel free to steal the above images if you need them as they are at least ice40 (DM me if you need an actual release to use them).

@coflynn I tried at regular voltages, but it was too finicky–it would sometimes transition and sometimes not. Even with fine-grain phase control, I wasn’t able to get it to sit in a metastable state. I could cut the 1.8 and 3.3V traces from the FTDI chip and solder in my own supply lines…but I think that’s way too risky on my only IceStick (getting a new one is a crap shoot).

I’m changing the video to do a screenshot of your blog post, and I’ll mention why it doesn’t work on the IceStick. I definitely appreciate the help!

Hi @ShawnHymel - I made a breakout for the ICE5LP2K-SG48ITR50, which I used on the Joulescope JS110. I designed it specifically to test voltages and power-up sequencing, which was causing some issues with my design. You could use it to test metastability at different supply voltages using a bench supply.

I know that I built a couple, and they have to be somewhere around here. If you are interested and can wait until the new year, DM me. I can dig it up and send it to you.

That’s super cool! Unfortunately, I need to have the videos recorded, edited, and sent to Digi-Key by the end of the year. However, if I return to this issue in the future (e.g. in another follow-on video or blog post), I might take you up on that offer.

To sidetrack this thread - how bad / what were the issues with the sequencing? I noticed the ice40 ultra board doesn’t worry about sequencing and has this comment in the manual:

The power supplies on the iCE40 Ultra Breakout Board are simplified and suitable for booting from the external SPI flash. The power supply sequencing does not conform to the NVCM boot requirements as specified in iCE40 Ultra Family Data Sheet (FPGA-DS-02028). You may encounter intermittent boot success and/or higher than specified startup currents when attempting to boot from NVCM

Is it just NVCM issues (fine for me as wasn’t planning on using that)?

I investigated this in 2017, so it’s been a while. I never used NVCM either, only RAM programming over SPI. I flipped through my old notebook from the time. I see that I decided to build a breakout board on 9/18/2017 to figure out why programming over SPI was not reliable.

The datasheet says this:


and this:

However, if I remember right, the Lattice reference boards ignored this advice and just powered everything up together from LDOs with no ramp rate control.

My notes show me playing with ramp rates and timing, but I didn’t write down the conclusion :frowning:

Although I couldn’t find the solution in my notes, I did find it in the Joulescope schematic. Before the investigation, I used independent filters on VCC and VCCPLL. After the investigation, I used a 100 Ohm resistor from VCCPLL to VCC with a 100 nF cap on VCCPLL, which I think is the recommendation. If I remember right, having VCCPLL ramp up simultaneously to VCC was a problem.
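
For a sense of scale, the 100 Ohm / 100 nF network described above forms an RC low-pass on VCCPLL. A minimal Python sketch of its time constant and corner frequency follows; it assumes an ideal RC with negligible PLL supply current, which is a simplification.

```python
import math

R_OHMS = 100.0      # series resistor from VCC to VCCPLL (from the post)
C_FARADS = 100e-9   # capacitor on VCCPLL (from the post)

tau_s = R_OHMS * C_FARADS                   # RC time constant
f_corner_hz = 1.0 / (2.0 * math.pi * tau_s) # -3 dB corner of the filter
settle_s = 5.0 * tau_s                      # ~99% settled after five time constants

print(f"time constant: {tau_s * 1e6:.1f} us")
print(f"corner frequency: {f_corner_hz / 1e3:.1f} kHz")
print(f"approx. settling time: {settle_s * 1e6:.0f} us")
```

That works out to a time constant of about 10 µs, so under these assumptions VCCPLL would trail a fast VCC ramp by some tens of microseconds rather than rising simultaneously with it.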

I also had a combination of LDOs and buck converters, and I went all-in with the same LDO family to get more consistent ramp timing.

Interesting! So Lattice just does this on their board:

I had found a detailed post explaining issues with this board, including that the PLL wouldn’t work. But if it also has issues with configuration, that would be fairly amazing to me (that they would release the board and just say screw it). So thanks for the insight on your end. It sounds like I should do a better job than Lattice, at least, if I want to avoid future pain!
