We are deploying Windows agents using the script provided in the KB article:
However, in our case, the agents are being deployed on machines that have never had the agent installed. After deployment, approximately 30-40% of agents report back with invalid/unhealthy tokens; the other 60-70% are fine. All of them are deployed with the same script, and many of the failing machines are on the same subnet as machines that work without issue. Because we receive an “Unhealthy” error message, communication is clearly taking place, and we have ruled out firewall issues. If we do an initial revocation of the token, some machines then report back in with valid tokens, but this works for less than 1% of the affected machines. For the others, uninstalling and reinstalling the client multiple times, manually deleting files, and restarting the machines resolves the issue for about 80% of the affected machines. The remaining machines still fail to acknowledge the tokens.
What is confounding is that, as stated above, reinstalling the agents multiple times and manually deleting the installation folders sometimes works right away and other times works only after a number of attempts. We cannot find a pattern to the problem or to its seemingly random resolution. We wanted to see if anyone else has encountered this issue.