I’ve always enjoyed using *nix tools, even when working on a Windows machine. In the past I had always used Cygwin, but recently I have been playing around with the MSYS2 project, which is a down stream fork of the Cygwin project. It has a few interesting features, one of the most impressive being a real package manager.

Recently I got a new development machine, so I decided to try a pure MSYS2 setup to see how well it would work. After installing MSYS2 I noticed two particularly strange behaviors:

  • The first issue was that it took an excessive amount of time to start a MSYS2 shell, on the order of 30 seconds. However once the shell was started everything was very snappy.

  • I often put the MSYS2/Cygwin tools in my normal windows console path. They are normally quite useful during my normal development. When executing any of the MSYS2 tools which perform file access (like ls, grep, etc) there was a significant delay during execution. If you let the program continue to execute it would eventually finish, but it was obviously an unproductive work flow…

Initially I didn’t connect the two issues together, as I was seeing them in two entirely separate usage patterns. After digging around in the documentation for a while and coming up with nothing, I finally thought hey lets just use strace! Strace immediately revealed the issue, there were thousands of references to a single function when ever a program which dealt with file access was executing.

$ strace ls
  ... snip ...
  497 5276872 [main] ls 9004 pwdgrp::fetch_account_from_windows: line: <xxxxxxxxxxxxxx1:>
  492 5277364 [main] ls 9004 pwdgrp::fetch_account_from_windows: line: <xxxxxxxxxxxxxx2:>
  520 5277884 [main] ls 9004 pwdgrp::fetch_account_from_windows: line: <xxxxxxxxxxxxxx3:>
  521 5278405 [main] ls 9004 pwdgrp::fetch_account_from_windows: line: <xxxxxxxxxxxxxx4:>
  495 5278900 [main] ls 9004 pwdgrp::fetch_account_from_windows: line: <xxxxxxxxxxxxxx5:>
  496 5279396 [main] ls 9004 pwdgrp::fetch_account_from_windows: line: <xxxxxxxxxxxxxx6:>
  493 5279889 [main] ls 9004 pwdgrp::fetch_account_from_windows: line: <xxxxxxxxxxxxxx7:>
  493 5280382 [main] ls 9004 pwdgrp::fetch_account_from_windows: line: <xxxxxxxxxxxxxx8:>
  497 5280879 [main] ls 9004 pwdgrp::fetch_account_from_windows: line: <xxxxxxxxxxxxxx9:>
  ... snip ...

Catching A Lead

Armed with the actual function call which was wreaking havoc, I hit up Google. I eventually ran across this news group post, where Michael Klemm seemed to be having similar problems as I was. Reading through the mail, Andrey Repin points to a FAQ entry which describes in detail the exact issue I was hitting:

Starting a new terminal window is slow. What’s going on?

There are many possible causes for this.

If your terminal windows suddenly began starting slowly after a Cygwin upgrade, it may indicate issues in the authentication setup.

For almost all its lifetime, Cygwin has used Unix-like /etc/passwd and /etc/group files to mirror the contents of the Windows SAM and AD databases. Although these files can still be used, since Cygwin 1.7.34, new installations now use the SAM/AD databases directly.

To switch to the new method, move these two files out of the way and restart the Cygwin terminal. That runs Cygwin in its new default mode. If you are on a system that isn’t using AD domain logins, this makes Cygwin use the native Windows SAM database directly, which may be faster than the old method involving /etc/passwd and such. At worst, it will only be a bit slower. (The speed difference you see depends on which benchmark you run.) For the AD case, it can be slower than the old method, since it is trading a local file read for a network request. Version 1.7.35 will reduce the number of AD server requests the DLL makes relative to 1.7.34, with the consequence that you will now have to alter /etc/nsswitch.conf in order to change your Cygwin home directory, instead of being able to change it from the AD configuration.

This explains the issue I was experiencing, my machine was joined to a very large AD deployment. The Cygwin/MSYS2 runtime was attempting to enumerate the thousands of users and groups as part of the process of loading the shell process initially, resulting in slow terminal start up.

The Cygwin DLL queries information about every group you’re in to populate the local cache on startup.

The fact that this is an in memory cache, populated on the load of runtime dll explains the second symptom. When not running under the Cygwin/MSYS2 shell, the runtime dll isn’t going to be already loaded and used by the utilities so every invocation of tools like ls or grep results in a separate load of the runtime and as a result an enumeration of the AD groups.

Solution

If you continue reading the FAQ, the steps necessary to cache your users and group memberships is listed. However given my previously discussed situation of being part of a very large AD deployment I only wanted to include my current user info as well as my current users groups to avoid a huge cache (see a related mail thread). To achieve this I did the following:

$ mkpasswd -l -c > /etc/passwd
$ mkgroup -l -c > /etc/group

Once we have the local cache populated, we can configure Cygwin/MSYS2 to read from the files, instead of querying AD. This is achieved by editing /etc/nsswitch.conf and modifying passwd and group sections to read from “files” instead of “files db”, on lines 2 and 3 below:

1
2
3
4
5
6
7
8
# Begin /etc/nsswitch.conf
passwd: files
group: files
db_enum: cache builtin
db_home: cygwin desc
db_shell: cygwin desc
db_gecos: cygwin desc
# End /etc/nsswitch.conf

Once you restart you Cygwin/MSYS2 console, the change should take effect immediately.