There are a few resources online explaining how to reboot a machine using Ansible, but none of them worked for me. My task would always time out and I had no idea why. Finally I figured it out.
The tasks I was using looked roughly like these:
The problem here is the use of `inventory_hostname`. In my inventory, I was referring to my machines by the names they had in my `.ssh/config`. This works well when invoking Ansible, whose CLI integrates well with OpenSSH. However, it doesn't work for modules, or at least it doesn't for `wait_for`, which I was using in those tasks.
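To make the failure concrete, here is a minimal sketch of a `wait_for` task that would hit this problem. It is an illustration rather than the original tasks: the inventory entry `myserver` is hypothetical and is assumed to exist only as a `Host` alias in `.ssh/config`, and the port is assumed to be the default SSH port. The sketch also assumes that `wait_for`, when run locally, resolves the name the way any ordinary program would instead of going through OpenSSH.

```yaml
# Hypothetical illustration: "myserver" is only an alias in ~/.ssh/config,
# so nothing outside OpenSSH knows how to resolve it.
- name: Waiting for server to come back
  local_action:
    module: wait_for
    host: '{{ inventory_hostname }}'  # expands to "myserver", the inventory alias
    port: 22                          # assumed default SSH port
    state: started
    timeout: 60
  become: false
```

Under those assumptions the connection attempts never succeed, so the task keeps retrying until the timeout expires.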
After trying some alternatives, I eventually settled on keeping all the network information in my inventory. That is, declaring `ansible_host` (and possibly `ansible_port`) for each entry, instead of relying on `.ssh/config`. Then I use `ansible_host` in the `wait_for` task to indicate the host.
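As a rough sketch, an inventory along these lines carries the network details itself. The host name `myserver`, the address, and the port are made up for illustration (the address comes from a range reserved for documentation). The YAML inventory format is also an assumption; the same variables can be set in an INI inventory.

```yaml
# Hypothetical inventory; name, address and port are placeholders.
all:
  hosts:
    myserver:
      ansible_host: 192.0.2.10  # real address or resolvable DNS name of the machine
      ansible_port: 22          # declare it whenever a task references ansible_port
```

With entries like this, `{{ ansible_host }}` expands to something any module can connect to, while `inventory_hostname` remains the short alias.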
After some additional tweaking, currently I have a `reboot` role whose main task looks like this:
```yaml
---
- name: Restart machine
  shell: sleep 2 && shutdown -r now "Maintenance restart"
  async: 1
  poll: 0
  ignore_errors: true

- pause:
    seconds: 5

- name: Waiting for server to come back
  local_action:
    module: wait_for
    host: '{{ ansible_host }}'
    port: '{{ ansible_port }}'
    state: started
    delay: 10
    timeout: 60
  become: false  # as this is a local operation
```
Why `sleep 2`, `async: 1` and `poll: 0`? I have no idea. I have tried a few things and this is the combination that appears to work reliably for me. For now, I’m sticking with it, until I understand all this a bit better.