NTP settings on 64-bit Debian with VMWare revisited

March 2009


The NTP problem described earlier seemed to be related to VMWare. (Small recap: the host that VMWare runs on loses interrupt ticks, and at the same time, perhaps by coincidence, NTP keeps picking the local hardware clock as reference source, ignoring the better, lower-stratum servers.) This time, Sarge has changed into Lenny, VMWare 1.x has changed into 2.x, and we retry allowing the hardware clock as a fallback at stratum 10 to see how the NTP daemon behaves.

  1. Uninstall VMWare

    sudo vmware-uninstall.pl

  2. Check ntpq -p output

  3. Edit /etc/ntp.conf

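    # Where ntpd stores the measured frequency error of the local clock
    driftfile /var/lib/ntp/ntp.drift
    # Directory for the statistics files generated below
    statsdir /var/log/ntpstats/
    
    # Record loop filter, peer and clock driver statistics, one file per day
    statistics loopstats peerstats clockstats
    filegen loopstats file loopstats type day enable
    filegen peerstats file peerstats type day enable
    filegen clockstats file clockstats type day enable
    
    # Upstream servers; "prefer" marks ntp1 as the preferred source
    server ntp1.rug.nl burst iburst prefer
    server ntp2.rug.nl burst iburst
    
    # Undisciplined local clock as a last-resort source, fudged to stratum 10
    server 127.127.1.0
    fudge 127.127.1.0 stratum 10
    
    # No restriction flags: full access from localhost (IPv4 and IPv6)
    restrict 127.0.0.1
    restrict ::1
    
    # Let ntpd discipline the clock, using the kernel time discipline if available
    enable ntp
    enable kernel
    
    # Also listen for multicast NTP server announcements
    multicastclient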
    

  4. Restart the NTP daemon

    sudo /etc/init.d/ntp restart

  5. Watch the changes

    At first, ntpq -p gives:

    After a while, this becomes:

    Note

    The local hardware clock is not the reference source, as it shouldn't be. But mind you, this is with no VMWare running, so if the above hypothesis is correct, we don't expect NTP to pick the local clock. Of course there could be other changes causing the improvement...

  6. Install the new VMWare

    Fetch the new VMWare from the vmware site (you have to register first and are then sent some codes), then untar the thing, cd into it and run sudo ./vmware-install.pl. I prefer to install to /usr/local, heeding the FHS section of the Debian Policy.

  7. Configure VMWare

    sudo vmware-config.pl

    Note

    You need kernel headers/sources and a C compiler for that. Since I'm using Debian, I also need to patch the installer with the following, obtained from Ubuntu:

    --- /usr/bin/vmware-config.pl.orig      2008-11-28 12:06:35.641054086 +0100
    +++ /usr/bin/vmware-config.pl   2008-11-28 12:30:38.593304082 +0100
    @@ -4121,6 +4121,11 @@
      return 'no';
    }
    
    +  if ($name eq 'vsock') {
    +    print wrap("VMWare config patch VSOCK!\n");
    +    system(shell_string($gHelper{'mv'}) . ' -vi ' . shell_string($build_dir . '/../Module.symvers') . ' ' . shell_string($build_dir . '/vsock-only/' ));
    +  }
    +
    print wrap('Building the ' . $name . ' module.' . "\n\n", 0);
    if (system(shell_string($gHelper{'make'}) . ' -C '
               . shell_string($build_dir . '/' . $name . '-only')
    @@ -4143,6 +4148,10 @@
      if (try_module($name, $build_dir . '/' . $name . '.o', 0, 1)) {
        print wrap('The ' . $name . ' module loads perfectly into the running kernel.'
                   . "\n\n", 0);
    +      if ($name eq 'vmci') {
    +       print wrap("VMWare config patch VMCI!\n");
    +       system(shell_string($gHelper{'cp'}) . ' -vi ' . shell_string($build_dir.'/vmci-only/Module.symvers') . ' ' . shell_string($build_dir . '/../'));
    +      }
        remove_tmp_dir($build_dir);
        return 'yes';
      }

    The paths in the patch header assume that VMWare is already installed. You might want to change them to ./bin/vmware-config.pl[.orig] instead if you prefer to patch in the installer directory ahead of installation rather than afterwards.

  8. Starting a Virtual Machine (or two)

    Note

    At this point, the VMWare services are running, and I see no sign in the logs of lost interrupts yet, and NTP is doing well. With virtual machines running, there is no timer problem any more according to the logs, but the hardware clock loses 5 minutes in 10. After a couple of minutes running, ntpq -p says:

         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
     RN2-R6509-RP.ne 192.36.143.150   2 u   35   64  377    0.001  541650. 427245.
     129.125.3.251   192.36.143.150   2 u   39   64  377    0.001  526264. 423816.
    *LOCAL(0)        .LOCL.          10 l   30   64  377    0.000    0.000   0.001

    The local clock has precedence once more.
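
To make the table above easier to read at a glance, here is a minimal Python sketch that picks out the system peer (the line marked with `*`) and converts the offsets, which ntpq reports in milliseconds, into minutes. The function names `parse_peers` and `system_peer` are made up for this illustration, and the sample text is simply the output quoted above; on a live host you would feed it the output of ntpq -p instead.

```python
# Sample copied from the ntpq -p output quoted above.
SAMPLE = """\
remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
RN2-R6509-RP.ne 192.36.143.150   2 u   35   64  377    0.001  541650. 427245.
129.125.3.251   192.36.143.150   2 u   39   64  377    0.001  526264. 423816.
*LOCAL(0)        .LOCL.          10 l   30   64  377    0.000    0.000   0.001
"""

# Tally codes recognised by this sketch. ntpq also uses the letters "x" and
# "o" and a ".", but those are ambiguous once the fixed column layout is
# lost, so they are deliberately ignored here (an assumption of this sketch).
TALLY_CODES = "*+-#"


def parse_peers(text):
    """Return one dict per peer row of `ntpq -p` output (hypothetical helper)."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the ruler and anything that is not a 10-column row.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        name = fields[0]
        tally = ""
        if name[0] in TALLY_CODES:
            tally, name = name[0], name[1:]
        peers.append({
            "tally": tally,
            "remote": name,
            "refid": fields[1],
            "stratum": int(fields[2]),
            "offset_ms": float(fields[8]),  # ntpq prints offsets in milliseconds
        })
    return peers


def system_peer(peers):
    """Return the peer marked '*' (the current synchronisation source), or None."""
    for peer in peers:
        if peer["tally"] == "*":
            return peer
    return None


if __name__ == "__main__":
    peers = parse_peers(SAMPLE)
    chosen = system_peer(peers)
    print("system peer:", chosen["remote"] if chosen else "none")
    for peer in peers:
        minutes = peer["offset_ms"] / 60000
        print(f'{peer["remote"]:16} stratum {peer["stratum"]:2}  offset {minutes:.1f} min')
```

Run on the sample, this names LOCAL(0) as the system peer and shows the two stratum 2 servers at offsets of roughly nine minutes, which fits a hardware clock that has been losing time for a while.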

Warning

And the VMWare server 2.0 web interface is so bloody slow that we can't interrupt a VM during boot any more :(