Improving Quality

As you might have noticed, I have now merged Gavin’s MSF code into my development branch on GitHub. So far I have not put it into the master branch as part of a release, because I have not tested it yet. The same holds true for Aido’s contributions.

Before I merge this stuff into master I want to be sure it works. Hence I ask everyone who wants to contribute to please give me some feedback on one or both of the following questions:

  1. Does the MSF code work for you?
  2. Does the (DCF77) code work with an Arduino Leonardo?

I also improved the documentation quite a lot.

But most of the work went into setting up unit tests for my library. As it turns out, I introduced some subtle issues into my code while refactoring. Now I am uncovering and fixing them one by one. Continue reading here to learn about the details.


26 Responses to Improving Quality

  1. paxyinrs says:

    Keep up the great job!
    The project works great with the crystal Mini Pro but does not work on the ceramic-resonator Nano because of drift.
    I have just combined the Swiss Army Debug Helper with the MB Emulator, so with option “Dn” I can switch to NTP mode (MB Emulator). That way I first use Scope and Debug to find the best position with the lowest noise level, and then I switch to MB emulation. In the future I will probably add a hardware switch for moving from one mode to the other.

  2. Emmanouil A. Neonakis says:

    Great Work.
    I am thinking of using the once per second output as a time base for measuring network frequency.
    Is it in your opinion accurate and stable enough for such a use? In particular how much is the period variation?
    Thank you.

    • In principle yes. The details are somewhat involved though. Once the signal is locked to DCF77 it will stay “exactly” in sync with DCF77. However, there is a quantization error of +/- 1 ms. You also have a crystal frequency deviation. Let's say for the sake of example your crystal is running at 16 MHz -50 ppm. Then your second ticks will be ~1.00005 s apart, and every 20 seconds you will get a second tick that is only 0.99905 s long (equivalent to a phase glitch of pi/10 with regard to a 50 Hz mains frequency).

      Once the clock has been running for several days it will autotune to this deviation and persist this to flash memory. Afterwards all manufacturing related deviations will be gone. Thus the crystal deviation will be solely due to aging and temperature deviations. If you are monitoring mains frequency the device will probably be running all the time. Thus, due to automatic retuning, you will be left with the deviations due to temperature, or in other words maybe +/-2 ppm. At 2 ppm you will get the 1 ms jump roughly every 500 seconds. It will never be more than 2 ms out of phase. That is, the deviations do NOT sum; they will always even out to keep in phase with DCF77.

      Since it phase locks to DCF77 (like swissgrid does) this should be sufficient if you know what you are doing.

      In other words: if your algorithms account for this behaviour then it is OK, otherwise not. If you use it for a project I would be happy to learn about the details 🙂 In particular I would like to learn about the results.
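
The drift figures in this reply follow from simple arithmetic. The sketch below is a first-order approximation, not code from the library: the function names are hypothetical, and the 1 ms step size and the mains frequency are passed in as parameters rather than assumed.

```cpp
#include <cassert>
#include <cmath>

// Seconds until the accumulated crystal error reaches one correction step.
// deviation_ppm: crystal error in parts per million (sign is ignored here).
// step_ms: size of one phase correction (the 1 ms quantization step).
double seconds_between_corrections(double deviation_ppm, double step_ms = 1.0) {
    const double drift_ms_per_second = std::fabs(deviation_ppm) * 1e-6 * 1000.0;
    return step_ms / drift_ms_per_second;
}

// First-order length in seconds of an uncorrected second tick.
// A negative deviation means a slow crystal and therefore a longer tick.
double uncorrected_tick_seconds(double deviation_ppm) {
    return 1.0 - deviation_ppm * 1e-6;
}

// Length of the single tick that absorbs one correction step.
double corrected_tick_seconds(double deviation_ppm, double step_ms = 1.0) {
    const double sign = deviation_ppm < 0 ? -1.0 : 1.0;
    return uncorrected_tick_seconds(deviation_ppm) + sign * step_ms / 1000.0;
}

// Phase jump in radians that a time step of step_ms represents
// relative to a mains signal of mains_hz.
double phase_step_radians(double step_ms, double mains_hz) {
    const double pi = 3.14159265358979323846;
    return 2.0 * pi * (step_ms / 1000.0) * mains_hz;
}
```

For a crystal at -50 ppm, `uncorrected_tick_seconds(-50.0)` is about 1.00005 and `seconds_between_corrections(-50.0)` is about 20, so one tick in twenty carries the 1 ms correction.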

      • Emmanouil A. Neonakis says:

        Thank you.
        Does the quantization error of +/- 1 ms apply to the Arduino Uno or to the Due?
        Does the tuning algorithm eliminate the quantization error or only the crystal deviation?
        I will certainly inform you about the details if I finally proceed with this project.
        But, as I lack the necessary instruments, I will not be able to assess the accuracy of the measurement.

        • It can only compensate for crystal deviation. The +/- 1 ms applies to the Due. For the “Uno” it is +/- 10 ms. Please note that the Uno is NOT crystal based. You will have to find a clone that actually features a crystal.

  3. Emmanuel A. Neonakis says:

    That unfortunately rules out the UNO. And it is not realistic to replace the DUE crystal with a TCXO.
    By the way do you know if the library has been used on teensy 3.2?
    I have seen that you have also conducted a grid frequency monitoring experiment.

    • Well, there are several things you might want to consider.

      1. If you connect a TCXO to an Uno it will of course benefit from it a lot. Then my library will work.
      2. The sampling hook (e.g. “void sample_input_pin()”) will be called once per millisecond. So if you increment a counter in this hook you can use this for monitoring mains frequency. The 1 ms ticks are not phase locked but benefit from the autotuning. Thus you get 1 ppm precision if you let it run for some days in an environment with stable temperature.
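
The counter idea in point 2 can be sketched as follows. This is only an illustration under assumptions: `sample_input_pin()` here is a stand-in with the signature quoted above, not the library's actual hook, and `ms_ticks`, `simulate_ticks()` and `mains_frequency_hz()` are hypothetical names. How the mains cycles themselves are counted (for example with a zero-crossing detector) is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tick counter, incremented once per millisecond.
volatile uint32_t ms_ticks = 0;

// Stand-in for the 1 ms sampling hook; a real hook would also
// read the receiver pin. Here it only advances the time base.
void sample_input_pin() {
    ms_ticks = ms_ticks + 1;
}

// Test helper: pretend that n milliseconds have passed.
void simulate_ticks(uint32_t n) {
    for (uint32_t i = 0; i < n; ++i) {
        sample_input_pin();
    }
}

// Average mains frequency in Hz, given the number of full mains
// cycles counted while elapsed_ticks milliseconds went by.
double mains_frequency_hz(uint32_t cycles, uint32_t elapsed_ticks) {
    if (elapsed_ticks == 0) {
        return 0.0;  // no time base yet
    }
    return cycles * 1000.0 / elapsed_ticks;
}
```

Counting 500 mains cycles over 10000 ticks yields 50.0 Hz. Longer measurement windows reduce the effect of the 1 ms tick granularity.
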
  4. Emmanuel A. Neonakis says:

    Regarding the Leonardo, I have tried to compile under arduino-1.8.2. None of the examples compiled.
    Have you ever successfully compiled for Leonardo? If yes, under what version?
    Actually, even for Uno, mb_emulator and super filter failed to compile. What arduino IDE version are you using?
    Thank you.

    • Leonardo is not supported. As you reported even for Uno mb_emulator and super_filter fail to compile. Obviously these examples are broken and need to be fixed. I created a github issue for it. You can track the status here: Thanks for pointing this out.

      • Emmanouil A. Neonakis says:

        Thank you again.
        ‘broken’ is too strong, I am sure they compile properly under a previous compiler version, just telling which one is an appropriate fix for me.
        You state that leonardo is not supported. Do you think it is fundamentally incompatible or just that you do not find it interesting enough to do it?
        I have looked up dcf77_generator.ino. Compilation fails because leonardo lacks timer_2. But after blindly changing all timer_2 references to timer_3, compilation succeeds.
        It also seems that if one is only interested in the baseband signal, the timer_2 is not needed.
        Your advice on porting the generator (and perhaps the library) to leonardo would be welcome.

        • I never had a Leonardo for testing. Hence I can not really support it. I could put your patch into a dedicated branch. If you would be willing to provide test run logs to me which confirm that it actually works, then I would put it into the main line.

        • I added a branch for Leonardo: would you please verify if it works? I do not want to know if it compiles. I checked this. I want to know if it actually decodes. Preferably I would like to have a log of the swiss army debug helper in mode Dm for at least 15 minutes. Even better would be a log in mode Dm for at least 3 days. Reason: this would also verify that the EEPROM functionality works as it should.

          If you or someone else provides the logs I will put it into the main line. Then you would get automatic updates for Leonardo as well.

  5. Emmanouil A. Neonakis says:

    You are very fast, I was still trying to figure out the timers’ differences between Uno and Leonardo.
    I will order a receiver and will provide the requested information, but I do not expect to receive it earlier than April 17. Crete is a good test place as it is outside the range. (My Eurochron clock, which is more than 17 years old, manages to decode but only during the night. At times more than 50 hours elapsed between synchronizations.)
    I was actually looking into the dcf generator. Have you figured out if atmega32u4’s timer 3 supports independent determination of pwm’s frequency and duty cycle (via OCR2A and OCR2B) as atmega328 timer 2 does?
    Thank you.

    • You mean Crete, Greece? Whoa, I definitely want to have a debug log of 48 hours if possible. This would give me a lot of insight into how the signal behaves under really poor reception conditions. Of course I know that PTB has lots of theory about this. But what I need is real measurements with typical receivers. With regard to the timer –> the authoritative source is always the datasheet: Section 14 “16-bit Timers/Counters (Timer/Counter1 and Timer/Counter3)” says it is possible.

      • Emmanouil A. Neonakis says:

        Yes that Crete. You will get the debug log for as many days as you want. But first I must have a receiver. HKW seems not to accept orders from Greece, I may have to settle for the Conrad module.
        Thanks for the pointer. In the dcf77_generator you change the carrier frequency between two values in order to have an average equal to 77.5 kHz. That is not recommended for timer 3 channel A, and only output A is brought to a pin. So timers 1 and 3 will have to be interchanged, pwm on timer1 and modulation on timer 3.

  6. Emmanouil A. Neonakis says:

    While waiting for HKW to respond, I borrowed a UNO and programmed DCF77-generator on it (and modified it to output the time signal on pin 7). Then I programmed the Leonardo with the Swiss_Army_Debug_Helper and connected UNO D7 to Leonardo A5. In scope I see the Leonardo receiving the signal pulses with apparently correct durations but they continuously shift to the right.
    The clock stays in the useless state. Is this a manifestation of the UNO resonator instability? Do you have any suggestion or is the situation hopeless? A log is posted in
    Thank you

    • This is the result of the resonator's imprecision. The instability shows in the fact that the drift varies in speed. In other words: the UNO sucks at timekeeping. The easiest way is to replace the UNO with something that has a crystal. If you want to do it the hard way: obviously the emitted frequency is still OK for the receiver, so you only need to get the drift under control. Since you only want to test, you do not need absolutely correct time, so you might want to phase lock the modulator to the receiver. That is: have the receiver generate a square wave with a period of 200 ms (toggle a pin every 100 ms). This can be done by counting to 100 in the function which samples the receiver input each millisecond. After counting to 100, reset your counter and toggle the sync pin.
      Connect some UNO pin to your sync pin on the receiver side.
      Then on the generator side trigger the modulate function by an interrupt on a level change of the sync pin instead of timer1. Alternatively you could use a level interrupt and generate a signal with a period of 100 ms.

      So yes, this is possible but might not be worth the effort.
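
The receiver-side half of this suggestion (count to 100 in the 1 ms sampling function, then toggle a sync pin) might look like the sketch below. It is hardware independent and purely illustrative: `SyncGenerator` and `run_for()` are hypothetical names, and the pin is modelled as a boolean instead of a real output.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the sync output. On real hardware,
// pin_level would be written to a digital output pin.
struct SyncGenerator {
    uint8_t counter = 0;      // milliseconds since the last toggle
    bool pin_level = false;   // current level of the sync pin
    uint32_t toggles = 0;     // number of level changes so far

    // Call once per millisecond from the receiver's sampling function.
    void on_millisecond() {
        if (++counter >= 100) {
            counter = 0;             // restart the 100 ms interval
            pin_level = !pin_level;  // 100 ms high, 100 ms low: 200 ms period
            ++toggles;
        }
    }
};

// Test helper: run the generator for the given number of milliseconds.
SyncGenerator run_for(uint32_t milliseconds) {
    SyncGenerator generator;
    for (uint32_t i = 0; i < milliseconds; ++i) {
        generator.on_millisecond();
    }
    return generator;
}
```

Each level change of the modelled pin marks one 100 ms step of the receiver's time base, which is the event the generator side would react to.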

      • Emmanouil A. Neonakis says:

        I have implemented your suggestion and it works. Absolutely no drift any more.
        Code and logfile in
        I have noticed that you have deleted the dcf77-leonardo_support branch and some files in the master repository are marked ‘Arduino Leonardo support’. Have you already confirmed the code works on Leonardo?
        Thank you again!

        • Yes and no. It seems that the code is good. However I am still missing logfiles. I thought that I would get my feedback earlier if it is included in the master branch. That is: I take the word of Aido that the code is OK and see if I get any negative feedback. Your logfile is highly appreciated. However it looks like synthesized data. No noise and absolutely no drift. Will you be able to provide a real world log for me?

  7. Emmanouil A. Neonakis says:

    I thought it was clear that this log was produced by Swiss-Army-Debugger running on Leonardo fed by a UNO running dcf77-generator. In order to overcome the UNO resonator instability, I have modified both programs as you suggested in your April 13, 2017 at 18:54 response above. So yes in a way the log contains synthesized data.
    As HKW does not ship to Greece, I have ordered the Conrad module. They informed me they have shipped it. I hope it arrives next week and works properly. As soon as I receive it, I will let you know and post real world logs.

  8. Emmanouil A. Neonakis says:

    The Conrad module arrived today. Very small 60 mm antenna. Anyway, I connected it to the Leonardo (the builtin pullup was not enough, I had to add a 10 kΩ external one). After an hour it is still in state:useless. The Eurochron clock has also failed to synchronize in the last 12 hours.
    Let’s hope it will eventually synchronize. leonardo runs Swiss_Army_Debug_Helper.
    The log file is in

    • Thanks for providing this log. The signal would be good enough for my library. It could eventually decode it. However according to the log the crystal frequency is adjusted by -400 ppm.

      EE Precision: 0.0625 ppm
      EE Freq. Adjust: -400.0000 ppm
      Freq. Adjust: -400.0000 ppm

      And in the log I can see that the signal drifts way too much (could be something around 400 ppm). This indicates that the frequency adjustment in EEPROM is wrong and is preventing the clock from locking. Maybe this is because you were feeding it with synthesized signals from a Leonardo which was drifting a lot. Erase the EEPROM and capture a new log. Your signal quality is not very good though. It may take several hours or maybe even a day to decode it even if the EEPROM setup is correct. My local simulation shows that it can capture the phase, thus it will eventually decode properly.

      One additional technical detail: you prepend the log output with additional information. This makes it harder for me to analyze it. I have written a helper for me that can parse my logs in their original format. If you could send the next log in my standard format I can answer somewhat faster.

      • Emmanouil A. Neonakis says:

        Did as suggested. Clock achieved synced state at 22:00 local time. Log posted in github.


        • Excellent. And thanks a lot for the logs. They exhibit some flaw in the state engine of the clock. This in turn degrades decoder performance slightly. I will investigate this and fix it.

          • Hi Emmanouil. I looked through the logs and found the following issue.

            36514, +-17762-XXX52XXX5---+-68873-197XXXXX5---+--1X82---XXX7XX5174+5X9X93---2X8XX91---+----53---+-XXXX4--1

            Decoded time: 17-04-30 7 06:19:47 CEST .L
            17.04.30(7,0)06:20:46 MEZ 1,1 2 p(9738-525:255) s(156-43:15) m(240-32:29) h(243-19:31) wd(134-97:5) D(246-199:6) M(246-82:23) Y(244-231:1) 126,4,2,3
            Clock state: synced
            Tick: 10

            36515, X31X81----+1474X4-252XXXXX4---+--5XX4---+XXXXX2---+-XXXX3---+------17879XXXX4---+---------+---3XX81-

            Decoded time: 17-04-30 7 06:19:48 CEST ..
            ??.??.??(?,?)??:??:255 MEZ 1,1 0 p(9722-2790:255) s(0-0:0) m(0-0:0) h(0-0:0) wd(0-0:0) D(0-0:0) M(0-0:0) Y(0-0:0) 126,4,2,255
            Clock state: unlocked
            Tick: 156

            36516, 2XX72-2X3-2XX4X6393-16XXXXXX3-29XXXX8X3-+--588X3--+8X9X8651-59XX7XXX--+--------8XXX4----1669X3------

            Decoded time: 17-04-30 7 06:19:49 CEST ..
            ??.??.??(?,?)??:??:255 MEZ 1,1 0 p(9716-502:255) s(6-0:2) m(0-0:0) h(0-0:0) wd(0-0:0) D(0-0:0) M(0-0:0) Y(0-0:0) 126,4,2,255
            Clock state: locked
            Tick: 7

            36517, 28XXX2---21636X1----+-491---19X9XXX1---3X39XX1----48553-46XXX7XXX1-715X7X35---5XXXX7X-----+--155XXX7

            This should not have happened. In order to reproduce and better analyze the issue I ran it in my simulator. However there I get

            30706, +-62X98---+---78XX7-+--3181X6-+-----941-+-69X9XX6-+--884XX6-+---93X-3-96XXX7XX6-26-XX78-5-+-XXXXX75-
            Decoded time: 17-04-30 7 06:19:47 CEST ..
            17.04.30(7,0)06:20:47 MEZ 0,0 6 p(2256-200:128) s(195-107:12) m(246-78:23) h(243-24:30) wd(136-97:5) D(241-184:8) M(247-87:22) Y(238-200:5) 121,2,2,9
            output triggered
            Clock state: synced
            Tick: 950
            confirmed_precision ?? adjustment, deviation, elapsed
            0.4375 ppm, @+ , 14.3750 ppm, 31 ticks, 170 min + 34326 ticks mod 60000

            30707, +------6-XXXX2----5XXX25---78XX3X31--8352X56------+-9461XXX5XXX4------+77----818XXX6----X9X5X5---97X
            Decoded time: 17-04-30 7 06:19:48 CEST ..
            17.04.30(7,0)06:20:48 MEZ 0,0 6 p(2246-191:128) s(195-107:12) m(246-78:23) h(243-24:30) wd(136-97:5) D(241-184:8) M(247-87:22) Y(238-200:5) 121,2,2,9
            output triggered
            Clock state: synced
            Tick: 950
            confirmed_precision ?? adjustment, deviation, elapsed
            0.4375 ppm, @+ , 14.3750 ppm, 31 ticks, 170 min + 34426 ticks mod 60000

            30708, 9X7--8XXX6+---8XXX9-+---9X5XX5+--77X3X85+-1X-83X45+-7385XX2-+----1X1--+---15----+----3X8X5+-X45-----
            Decoded time: 17-04-30 7 06:19:49 CEST ..
            17.04.30(7,0)06:20:49 MEZ 0,0 6 p(2244-190:128) s(195-107:12) m(246-78:23) h(243-24:30) wd(136-97:5) D(241-184:8) M(247-87:22) Y(238-200:5) 121,2,2,9
            output triggered
            Clock state: synced
            Tick: 950
            confirmed_precision ?? adjustment, deviation, elapsed
            0.4375 ppm, @+ , 14.3750 ppm, 31 ticks, 170 min + 34526 ticks mod 60000

            30709, +-99X2-XXX6XXX75----+95XX4XX7XXX517-1XX9X67-5-----+1XX5-----9X6----3X8XX94X3----+--------2XXXXX4--74

            So it stays in sync. However I am pretty sure we have the same library. This raises the question why the results differ. My assumption is that this is due to the fact that you sent me only part of the log. Thus my simulator picked up the signal starting from a different point in time. This in turn led to a different (much better) internal state. Unfortunately this makes analysis pretty hard for me as the simulation shows everything OK.

            Could you capture another signal log for maybe 1-2 days including the start sequence (boilerplate)? Hopefully this would make analysis of the issue somewhat easier.

          • What I already know:

            1) State transition is controlled by dcf77.h template DCF77_Local_Clock::process_1_Hz_tick
            2) We are looking into transition synced –> unlocked
            3) The sequence

            if (quality_factor > Clock_Controller::Configuration::quality_factor_sync_threshold) {
            if (clock_state != Clock::synced) {
            clock_state = Clock::synced;
            } else if (clock_state == Clock::synced) {
            clock_state = Clock::locked;

            must have found that quality_factor dropped below the threshold. Probably the Year decoder (“Y”) caused this. In my opinion I am too defensive here. Obviously the year decoder would have been sufficiently good. I could fix this.

            But this is not the primary issue anyway. This should have put the clock into “locked” state not “unlocked”.
            It also triggers the sync lost event handler which executes

            static void sync_lost_event_handler() {

            bool reset_successors = (Demodulator.get_quality_factor() == 0);
            if (reset_successors) {

            reset_successors |= (Second_Decoder.get_quality_factor() == 0);
            if (reset_successors) {

            reset_successors |= (Minute_Decoder.get_quality_factor() == 0);
            if (reset_successors) {

            reset_successors |= (Hour_Decoder.get_quality_factor() == 0);
            if (reset_successors) {

            reset_successors |= (Day_Decoder.get_quality_factor() == 0);
            if (reset_successors) {

            reset_successors |= (Month_Decoder.get_quality_factor() == 0);
            if (reset_successors) {

            Since the demodulator quality factor is obviously 255 and thus != 0 it should not have started to reset all succeeding decoders but it did. So this looks as if it actually called
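
The cascade pattern visible in the excerpt can be modelled in isolation. The bodies of the `if` blocks are elided above, so the sketch below assumes that each block resets the next stage in the chain. That assumption, the function name `count_reset_stages()` and the flat list of quality factors are illustrative and not taken from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>

// Counts how many downstream stages would be reset. The list holds the
// quality factors in chain order (demodulator first). Assumption: a stage
// is reset as soon as any earlier stage reports a quality factor of 0.
std::size_t count_reset_stages(std::initializer_list<uint8_t> quality) {
    std::size_t resets = 0;
    std::size_t index = 0;
    bool reset_successors = false;
    for (uint8_t q : quality) {
        const bool has_successor = (index + 1 < quality.size());
        reset_successors |= (q == 0);  // sticky once any stage reports 0
        if (reset_successors && has_successor) {
            ++resets;  // the stage after this one would be reset
        }
        ++index;
    }
    return resets;
}
```

Under this model a demodulator quality factor of 255 followed by non-zero decoder quality factors resets nothing, which matches the expectation stated above that the handler should not have cleared the decoders.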

            static void phase_lost_event_handler() {
            // do not reset frequency control as a reset would also reset
            // the current value for the measurement period length

            Since the clock was still in “locked” mode this would be caused by

            case Clock::locked: {
            if (Clock_Controller::get_demodulator_quality_factor() > unacceptable_demodulator_quality) {
            // If we are not sure about leap seconds we will skip
            // them. Worst case is that we miss a leap second due
            // to noisy reception. This may happen at most once a
            // year.
            local_clock_time.leap_second_scheduled = false;

            // autoset_control_bits is not required because
            // advance_second will call this internally anyway
            tick = 0;
            second_toggle = !second_toggle;
            } else {
            clock_state = Clock::unlocked;
            unlocked_seconds = 0;

            However this implies that Clock_Controller::get_demodulator_quality_factor() > unacceptable_demodulator_quality would have been false. But the QF is 255 and the threshold is 10.

            So given this log file I can not explain what is going on. I am puzzled. I would need a suitably long log that reproduces the issue AND contains everything since the start of the debug helper including the boilerplate.
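
The expected behaviour described in this comment can be written down as a small transition function. This is a simplified model, not the library's code: it is reduced to the three states named in the excerpts, it folds the two separate code paths into one function, and `next_state()` and the value of `quality_factor_sync_threshold` are assumptions. Only the threshold of 10 for `unacceptable_demodulator_quality` is quoted from the comment.

```cpp
#include <cassert>
#include <cstdint>

enum class ClockState { unlocked, locked, synced };

// Quoted in the comment above.
constexpr uint8_t unacceptable_demodulator_quality = 10;
// Placeholder value, assumed for this sketch only.
constexpr uint8_t quality_factor_sync_threshold = 1;

// One evaluation step of the simplified state model.
ClockState next_state(ClockState state,
                      uint8_t quality_factor,
                      uint8_t demodulator_quality) {
    if (quality_factor > quality_factor_sync_threshold) {
        return ClockState::synced;    // decoders agree well enough
    }
    if (state == ClockState::synced) {
        return ClockState::locked;    // sync lost, phase still trusted
    }
    if (state == ClockState::locked &&
        demodulator_quality <= unacceptable_demodulator_quality) {
        return ClockState::unlocked;  // phase no longer trusted either
    }
    return state;
}
```

In this model a synced clock whose overall quality factor drops while the demodulator quality is 255 can only fall back to locked, and a locked clock stays locked. The jump to unlocked seen in the log is therefore not reproducible with these rules, which is the puzzle described above.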
