I'm using Bedrock with Capistrano deploys.
When I run bundle exec cap staging deploy:check, I get an authentication error:

...
D, [2015-05-09T15:39:53.878464 #15636] DEBUG -- net.ssh.authentication.session[1e34a58]: trying publickey
D, [2015-05-09T15:39:53.878464 #15636] DEBUG -- net.ssh.authentication.agent[1e30d2c]: connecting to ssh-agent
E, [2015-05-09T15:39:53.879447 #15636] ERROR -- net.ssh.authentication.agent[1e30d2c]: could not connect to ssh-agent
E, [2015-05-09T15:39:53.879447 #15636] ERROR -- net.ssh.authentication.session[1e34a58]: all authorization methods failed (tried publickey)
cap aborted!
SSHKit::Runner::ExecuteError: Exception while executing as deploy@SERVER_IP: Authentication failed for user deploy@SERVER_IP
Tasks: TOP => git:check => git:wrapper

Capistrano could not connect to ssh-agent on my server.

But I can log in to my server via SSH without a password, like this: ssh deploy@SERVER_IP. I followed all the instructions on the Capistrano Authentication & Authorisation docs page, so I can run commands like me@localhost $ ssh deploy@one-of-my-servers.com 'hostname; uptime'.

If I run ssh -A deploy@SERVER_IP 'env | grep SSH_AUTH_SOCK', I get this result:

SSH_AUTH_SOCK=/tmp/ssh-UweQkw7578/agent.7578

Here is my deploy.rb file:

set :application, 'dyxovka-special'
set :repo_url, 'git@github.com:tanzoor/dyxovka-wp-theme.git'
set :branch, :master
set :tmp_dir, '~/tmp'
set :log_level, :info
set :linked_files, fetch(:linked_files, []).push('.env')
set :linked_dirs, fetch(:linked_dirs, []).push('web/app/uploads')

Here is my staging.rb file:

set :stage, :staging
set :deploy_to, -> { "/var/www/vhosts/project/dev" }
server 'SERVER_IP', user: 'deploy', roles: %w{web app}
set :ssh_options, {
  user: 'deploy',
  keys: %w('/c/Users/alexander/.ssh/id_rsa'),
  forward_agent: true,
  auth_methods: %w(publickey),
  verbose: :debug
}
fetch(:default_env).merge!(wp_env: :staging)

Agent forwarding is enabled in the server's sshd_config file: AllowAgentForwarding yes

What should I change in my config files to make my deploy work?

Windows 8.1
Ruby 2.2.0
Capistrano 3.2.1
Git Bash

apavliukov

1 Answer

OK, so I had the same issue, and I spent way too long working out exactly what is happening here. The upshot is:

  • For Ruby on Windows, you must run Pageant, not ssh-agent, for Capistrano and agent forwarding to work. In fact, the same applies to pretty much any tool that uses the Ruby net-ssh library on Windows.

And I don't think that will change, at least not for a while.
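As a rough preflight check, something like the following could confirm that Pageant is running before a deploy starts. This is only a sketch: it assumes the Windows tasklist command is on the PATH, and the helper names pageant_listed? and pageant_running? are made up for illustration.

```ruby
# Hypothetical helper: true if the given tasklist output mentions pageant.exe.
def pageant_listed?(tasklist_output)
  tasklist_output.downcase.include?('pageant.exe')
end

# Assumption: this runs on Windows, where `tasklist` is available.
# /FI filters by image name, /NH suppresses the column header.
def pageant_running?
  pageant_listed?(`tasklist /FI "IMAGENAME eq pageant.exe" /NH`)
end
```

Calling pageant_running? at the top of a deploy script and aborting with a clear message when it returns false would surface the problem earlier than the net-ssh debug log does.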

Agent Forwarding

See An Illustrated Guide to SSH Agent Forwarding for more about agent forwarding, and how the key challenge ends back up on our workstation.

Terminology

  • workstation - the machine (Windows server/desktop/laptop) our SSH client software is running on and, most importantly, the one our private key is stored on (with or without a passphrase)

  • deployment node - the target of our Capistrano deployment task, most likely defined in the 'server' key in our config/deploy.rb, or config/deploy/.rb file

  • git repo - where we will pull the code from, first queried via "git ls-remote" - we will access this git repo via SSH, and the deployment node will use agent forwarding to pass the key challenge back to the workstation

  • SSH client software - how we reach out to sshd on remote servers, and which has access to our private key. It might be PuTTY, an OpenSSH ssh client, or the net-ssh library in Ruby.

Setup

I have a Windows 7 workstation box, with Git-Bash and its OpenSSH ssh client, plus the script from Joe Reagle that sets up the environment variables that say which socket and PID the ssh-agent is using.

I also have Putty and Pageant, but I focussed, initially, on just the OpenSSH/Git-Bash tools.

I have set up passwordless ssh from the workstation to the deployment node, I have the ssh-agent running, I have my key added through ssh-add, and I have my public key registered as a read-only access key to the git repo.

Basics

So we are trying to use SSH agent forwarding to have Capistrano pull from our Git repo onto our deployment node.

Now we can test this all ourselves by setting up our public SSH key on the deployment node and using, say, the OpenSSH ssh client to confirm we have passwordless ssh working. Then we can set up ssh-agent by

  1. starting ssh-agent and setting SSH_AUTH_SOCK and SSH_AGENT_PID as required
  2. adding our private key to the ssh-agent via ssh-add
  3. adding our public key as an authorised key to the git repo
  4. connecting via ssh to the deployment node, and from there running "git ls-remote git@" (or "ssh -T git@")
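Steps 1 and 2 can be sanity-checked from Ruby by looking at the variables the agent exports. The following is a minimal sketch; missing_agent_vars is a hypothetical helper name, and it only reports what the OpenSSH tools would see in the current shell.

```ruby
# Variables that ssh-agent exports for its clients (see step 1 above).
AGENT_VARS = %w[SSH_AUTH_SOCK SSH_AGENT_PID].freeze

# Hypothetical helper: returns the agent variables that are unset or empty.
def missing_agent_vars(env = ENV)
  AGENT_VARS.select { |name| env[name].to_s.empty? }
end

missing = missing_agent_vars
if missing.empty?
  puts "ssh-agent socket: #{ENV['SSH_AUTH_SOCK']}"
else
  puts "ssh-agent not configured, missing: #{missing.join(', ')}"
end
```

A clean result here only says the OpenSSH side is configured; it says nothing about what the Ruby ssh library will do with those variables.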

If everything is set up correctly, this will all work, and so we will think "ok I can do a 'cap deploy:check'" - and it will fail.

What Went Wrong

We will get an error

"Error reading response length from authentication socket"

Who is telling us this? It isn't immediately clear, but it

  • isn't the git repo

  • isn't the git client on the deployment node

  • isn't the sshd daemon on the deployment node, which wants to pass the key challenge back to the workstation.

It's the Ruby ssh client library on the workstation.

How do we know this

In the ssh_options hash in the deploy.rb file, we add the following: verbose: :debug

When we do this we see this message

  • Pageant not running.

Why is Capistrano trying to use Pageant instead of ssh-agent

When running via Capistrano, the ssh client is different to the one you used when verifying things by hand.

When verifying by hand, it was an OpenSSH ssh client. Now it is the net-ssh library in Ruby.
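To repeat the manual check with the same client Capistrano uses, a small net-ssh script can stand in for the OpenSSH command line. This is a sketch, assuming the net-ssh gem is installed; the host and user are placeholders, the helper names are made up, and the options simply mirror the ssh_options hash from the question.

```ruby
# Options mirroring the ssh_options hash from the question's staging.rb.
def debug_ssh_options
  { forward_agent: true, auth_methods: %w[publickey], verbose: :debug }
end

# Hypothetical helper. Assumption: the net-ssh gem is installed.
# Returns the output of `ssh-add -l` as run on the remote host.
def list_forwarded_keys(host, user)
  require 'net/ssh'
  output = nil
  Net::SSH.start(host, user, debug_ssh_options) do |ssh|
    # ssh-add -l lists the identities visible through the forwarded agent
    output = ssh.exec!('ssh-add -l')
  end
  output
end
```

Running puts list_forwarded_keys('SERVER_IP', 'deploy') on the workstation should show the same debug output as the Capistrano run, without the rest of the deploy getting in the way.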

And on Windows, net-ssh has these lines

if Net::SSH::Authentication::PLATFORM == :win32
  require 'net/ssh/authentication/pageant'
end

or

case Net::SSH::Authentication::PLATFORM
when :java_win32
  require 'net/ssh/authentication/agent/java_pageant'
else
  require 'net/ssh/authentication/agent/socket'
end

So loading Pageant is hard-coded into net-ssh. It doesn't even try to see whether you are running under a unix-like shell (like git-bash or cygwin) and then use the unix-domain ssh-agent socket from SSH_AUTH_SOCK.
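For illustration, the platform decision can be sketched as a pure function. The expression below is modelled on how net-ssh derived its PLATFORM constant at the time, as far as I recall (from File::ALT_SEPARATOR and RUBY_PLATFORM); treat it as an approximation, not a copy of the library source. Both function names are hypothetical.

```ruby
# Approximation of the platform check: File::ALT_SEPARATOR is "\\" on native
# Windows builds of Ruby and nil elsewhere, whatever shell Ruby was started from.
def ssh_platform(alt_separator = File::ALT_SEPARATOR, ruby_platform = RUBY_PLATFORM)
  java = ruby_platform.include?('java')
  if alt_separator
    java ? :java_win32 : :win32
  else
    java ? :java : :unix
  end
end

# Hypothetical summary of which agent each platform ends up talking to.
def agent_backend(platform)
  case platform
  when :win32, :java_win32 then :pageant
  else :ssh_agent_socket
  end
end

puts agent_backend(ssh_platform)
```

Nothing in this decision looks at SSH_AUTH_SOCK or at the shell, which is why starting ssh-agent in Git Bash makes no difference to a native Windows Ruby.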

At present, net-ssh doesn't try to open a unix-domain named socket on Windows. In theory I think it could, through the UNIXSocket class in the stdlib, but I haven't experimented with that on a Windows machine yet.

Leif
    It is 2019, and this appears to still be the case with net-ssh. Thank you. Your thorough explanation saved my day. – Yoopergeek Oct 08 '19 at 22:52