One additional thing. You can see numerous glitches on the commutation output channel (Probe D - Green). These are a real issue with my custom PCB and I think I have the issue figured out.
The issue arises because the encoder signal (asynchronous) is sampled into a clocked circuit. The very first thing that happens to the signals when they enter my board is that they are "synchronized" by a pair of flip-flops. A flip-flop is an edge-triggered memory element where the value at the input "D" is latched to the output "Q" when the clock "CLK" rises from low to high.
There is a minimum amount of time that the input "D" must be stable both before and after the rising edge of the clock. These are called the setup time and the hold time, respectively. They are specified in the chip datasheet, for example this Texas Instruments CD74AC175 (Quad D-Type Flip-Flop):
If the input signal violates the setup or hold time, metastability can occur. Metastability is a state in which the flip-flop can get temporarily stuck between states or rapidly oscillate until it resolves to either high or low.
The issue is that the value it resolves to is completely random and nondeterministic.
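The setup and hold requirement can be sketched as a simple timing check. This is only an illustration: the function name is hypothetical, and the setup and hold values are placeholders, not figures from the CD74AC175 datasheet. Times are integers in picoseconds, with rising clock edges assumed at multiples of the clock period.

```python
def violates_setup_hold(t_transition_ps, clk_period_ps, t_setup_ps, t_hold_ps):
    """Return True if an input transition lands inside the forbidden window.

    Rising clock edges are assumed to occur at 0, clk_period_ps, 2*clk_period_ps, ...
    """
    # Time elapsed since the most recent rising edge.
    since_last_edge = t_transition_ps % clk_period_ps
    # Time remaining until the next rising edge.
    until_next_edge = clk_period_ps - since_last_edge
    # Too soon after an edge breaks hold; too close before one breaks setup.
    return since_last_edge < t_hold_ps or until_next_edge < t_setup_ps


# 8 MHz clock -> 125 ns period. Setup/hold values below are placeholders.
PERIOD_PS = 125_000
SETUP_PS = 2_000
HOLD_PS = 1_000

print(violates_setup_hold(60_000, PERIOD_PS, SETUP_PS, HOLD_PS))   # mid-cycle change
print(violates_setup_hold(124_000, PERIOD_PS, SETUP_PS, HOLD_PS))  # 1 ns before an edge
```

A transition in the middle of the clock period is safe, while one that lands within the setup time before an edge (or the hold time after it) is a candidate for metastability.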
The graphic below shows many samples of the green signal becoming metastable. They can take many paths and settle either high or low (before eventually being corrected near the far right edge on the next clock pulse).
The clock that sequences the logic oscillates at a very stable 8.0000MHz. Since there is no way for me to guarantee when the input signal will change, there is a purely probabilistic chance that any given input transition will coincide with a clock edge and become metastable. The rate depends on the circuit clock frequency, the signal frequency, and the proportion of time in which a setup or hold time violation could occur. This works out to *on average* 240,000 metastable events per second when the motor is running at max speed. But not to worry! We contain this unstable signal between a pair of flip-flops called a 2FF synchronizer. This gives the metastable signal 1 clock cycle to settle and makes it very, very likely that there will be no metastability after the synchronizer. Since the value randomly lands high or low, there is a 50% chance that the signal change enters the circuit 1 clock cycle late. This is perfectly acceptable.
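The dependence on clock frequency, signal frequency, and vulnerable window can be written as a rough rate estimate. This is a sketch under a simplifying assumption (transitions arrive uniformly at random relative to the clock), and the function name and example numbers are hypothetical. The window width and transition rate for this board are not given here, so the values below are illustrative and are not meant to reproduce the 240,000 figure.

```python
def metastable_events_per_second(f_clk_hz, f_signal_hz, window_s):
    """Estimate how often an input transition lands in the setup/hold window.

    f_clk_hz    : sampling clock frequency
    f_signal_hz : input transitions per second
    window_s    : width of the vulnerable window around each clock edge
    """
    # Fraction of every clock period during which a transition is dangerous.
    vulnerable_fraction = window_s * f_clk_hz
    # Each transition independently falls in that fraction of time.
    return f_signal_hz * vulnerable_fraction


# Illustrative values only: 8 MHz clock, 1e6 transitions/s, 2 ns window.
rate = metastable_events_per_second(8e6, 1e6, 2e-9)
print(f"{rate:.0f} events per second")
```

The estimate scales linearly with each of the three factors, which is why a fast clock combined with a fast encoder produces so many events.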
In the graphic below, "Din" represents the input to my circuit. CLK-A is the clock in the encoder (over which I have no control) and CLK-B is the 8.0000MHz clock in my circuit. Each blue box represents the flip-flop circuit shown above. You can see the "Din" signal rises at the same time as CLK-B. This causes the output "Ds" to become metastable; however, this value settles well before the next clock pulse on "CLK-B". Notice the output "Dout" transitions very cleanly because the metastability has settled, at the expense of 2 clock cycles of delay from the input signal "Din".
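The behavior of the two flip-flops can be sketched as a small behavioral model. This is not a circuit simulation: the class name is hypothetical, and metastability is represented only by letting the caller choose which value the first flip-flop settled to on a marginal edge.

```python
class TwoFlopSynchronizer:
    """Behavioral model of a 2FF synchronizer (Din -> Ds -> Dout)."""

    def __init__(self, initial=0):
        self.ds = initial    # output of the first flip-flop
        self.dout = initial  # output of the second flip-flop

    def clock(self, captured):
        """Apply one rising edge of CLK-B.

        `captured` is the value the first flip-flop ends up holding. On a
        marginal edge it may be either the old or the new input value.
        """
        self.dout = self.ds  # second flop takes the settled first-stage value
        self.ds = captured   # first flop samples the input
        return self.dout


def run(captured_values):
    sync = TwoFlopSynchronizer()
    return [sync.clock(v) for v in captured_values]


# Input rises around the second edge. Two possible resolutions of that edge:
print(run([0, 1, 1, 1]))  # first flop resolved to the new value
print(run([0, 0, 1, 1]))  # first flop resolved to the old value
```

In both cases the output changes cleanly exactly once. The only difference is that the second case arrives one clock cycle later.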
This synchronization is only needed when a signal is brought in from another clock source or the outside world. Once you are inside the circuit, there is no risk of metastability unless the circuit is poorly designed.
But hey! I have 2FF synchronizers on all my inputs... Why do I still have glitches?
Well, the key is that each signal randomly settles to a value when metastability occurs. In my signal, 2 inputs can change at the same time (big problem). There is some finite probability that both inputs will go metastable at the same time (actually pretty likely, since they transition at the same time and all flip-flops see the same clock). If both do go metastable, there is a 25% chance they resolve to the correct values, a 25% chance they resolve to old (but valid) values, and a 50% chance that they resolve to one of two illegal value combinations. This 50% chance of an illegal input combination after metastability is what causes the glitches.
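The 25% / 25% / 50% split can be checked by enumerating the four ways two metastable bits can resolve. This sketch assumes each bit independently settles to its old or new value with equal probability, and the function name and example bit values are hypothetical.

```python
from itertools import product


def classify_outcomes(old, new):
    """Enumerate how two simultaneously changing bits can resolve.

    Each bit independently settles to its old or its new value.
    Returns the fraction of outcomes in each category.
    """
    counts = {"correct": 0, "stale": 0, "illegal": 0}
    for picks in product((False, True), repeat=len(old)):
        # True means that bit settled to its new value.
        resolved = tuple(n if p else o for o, n, p in zip(old, new, picks))
        if resolved == new:
            counts["correct"] += 1
        elif resolved == old:
            counts["stale"] += 1
        else:
            counts["illegal"] += 1  # mixture of old and new bits
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


# Two inputs that swap at the same instant: (0, 1) -> (1, 0).
print(classify_outcomes(old=(0, 1), new=(1, 0)))
```

The mixed results here are (0, 0) and (1, 1), neither of which was ever presented at the inputs.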
If the glitch registers on the first sample of a new input combination, then the error is corrected on the next sample. However, if the error occurs on the last sample before an input transition, then the glitch gets latched into memory until the encoder moves 3 more counts in the forward direction so the input sample can be corrected. These are the larger glitches shown in some of my earlier scope traces.
There is no good solution that completely preserves data integrity without additional handshaking signals. However, I realized I have an option that isn't available to most people designing circuits: it is acceptable for me to discard input samples. Since it is impossible to know whether a flip-flop is metastable, I assume that the first sample after any input transition is bad and I discard it. If the next sample matches the previous one, then we know the sample was good and it is OK to use the data. If the next sample doesn't match the previous one, then metastability and data corruption occurred, so we throw away a second sample and check the 3rd. In this way we trade a small amount of additional delay (100-200ns) for guaranteed signal integrity.
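The discard-and-confirm rule can be sketched in software as a filter that only passes a sample once it has been seen twice in a row. This is a behavioral approximation, not the actual gate-level design: the function name and the sample values are hypothetical, and the three-channel tuples are just an example input width.

```python
def confirm_filter(samples, initial):
    """Pass a sample through only when it equals the sample before it.

    samples : sequence of synchronized input tuples, one per clock
    initial : the known-good starting value
    Returns the filtered value seen by the rest of the circuit each clock.
    """
    accepted = initial   # value currently presented downstream
    previous = initial   # sample taken on the prior clock
    outputs = []
    for sample in samples:
        if sample == previous:
            accepted = sample   # two matching samples in a row: trust it
        # Otherwise hold the last accepted value and wait for confirmation.
        previous = sample
        outputs.append(accepted)
    return outputs


# (1, 1, 1) is a corrupted sample caught during a transition.
stream = [(1, 0, 0), (1, 0, 0), (1, 1, 1), (0, 1, 0), (0, 1, 0)]
for out in confirm_filter(stream, initial=(1, 0, 0)):
    print(out)
```

The corrupted sample never reaches the output. The new value appears one sample after it first shows up cleanly, which is the extra delay being traded for integrity.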
Here is my modification to the input section of my circuit to accomplish this. It adds 3 additional chips (3 channel 2-input XNOR gate, 3 channel 3-input AND gate, and 3 channel 2:1 multiplexer). Unfortunately I'll have to re-order PCBs, but I am going to try to rework my test board first to prove the concept before spending the money on another potentially faulty PCB.