Importing files directly from remote SFTP servers can be a repetitive task if handled manually. Instead of running one‑off scripts, you can automate this process by continuously polling a remote directory and downloading new files as they appear. In this post, we’ll walk through building a Ruby script that leverages the Net::SFTP gem for secure, automated file imports.

Prerequisites and setup

To follow along with this guide, ensure that you have the following:

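  • Ruby 2.6 or later (the script relies on required keyword arguments, and current net-ssh releases expect a modern Ruby)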
  • Net::SFTP gem (version 4.0.0 or later): gem install net-sftp -v '~> 4.0.0'
  • A valid SSH key for passwordless authentication
  • Basic familiarity with Ruby and command‑line operations

Also, make sure that your SFTP server has been configured to accept your SSH key and that you have the necessary permissions to access the target directory.
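
Before starting the poller, it can help to check these prerequisites programmatically. The following is a minimal sketch rather than part of the script itself: preflight_problems and REQUIRED_SETTINGS are hypothetical names, the environment variable names are the ones the script in this post reads, and the permission check assumes a POSIX system where file modes are meaningful.

```ruby
# Hypothetical preflight helper (illustrative only).
REQUIRED_SETTINGS = %w[SFTP_HOST SFTP_USER REMOTE_DIR SSH_KEY_PATH].freeze

# Returns a list of human-readable problems; an empty list means the
# configuration looks usable. `env` can be ENV or a plain Hash.
def preflight_problems(env)
  problems = REQUIRED_SETTINGS
             .select { |name| env[name].to_s.strip.empty? }
             .map { |name| "#{name} is not set" }

  key_path = env['SSH_KEY_PATH'].to_s
  unless key_path.empty?
    if !File.file?(key_path)
      problems << "SSH key not found at #{key_path}"
    elsif (File.stat(key_path).mode & 0o077) != 0
      # Private keys should not be accessible by group or others.
      problems << "SSH key #{key_path} is accessible by group or others"
    end
  end

  problems
end

# Example usage:
#   problems = preflight_problems(ENV)
#   abort(problems.join("\n")) unless problems.empty?
```

Failing early with a clear message is usually easier to debug than an authentication error raised in the middle of the first polling cycle.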

Building the file poller

The goal is to create a Ruby script that connects to an SFTP server, scans a designated directory for files, downloads them to a local folder, removes them from the server, and then waits before polling again. Along the way it adds retries with exponential backoff, structured JSON logging, and signal handling for graceful shutdown.

Here's a complete example of the Ruby script with modern best practices:

require 'net/sftp'
require 'logger'
require 'json'
require 'time' # provides Time#iso8601, used by the log formatter
require 'tempfile'
require 'fileutils'

# Configuration constants
SFTP_HOST   = ENV.fetch('SFTP_HOST')
SFTP_USER   = ENV.fetch('SFTP_USER')
SFTP_PORT   = ENV.fetch('SFTP_PORT', 22).to_i
REMOTE_DIR  = ENV.fetch('REMOTE_DIR')
LOCAL_DIR   = ENV.fetch('LOCAL_DIR', './downloads')
SSH_KEY     = ENV.fetch('SSH_KEY_PATH')

# Initialize structured logger
logger = Logger.new(STDOUT)
logger.formatter = proc do |severity, datetime, progname, msg|
  JSON.dump(
    timestamp: datetime.iso8601,
    severity: severity,
    message: msg,
    service: 'sftp-poller'
  ) + "\n"
end

def download_file(sftp, remote_file, final_path, logger)
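  # Create the temp file next to the destination so the final move is a
  # same-filesystem rename rather than a cross-device copy.
  temp_file = Tempfile.new('.sftp-download', File.dirname(final_path))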
  begin
    sftp.download!(remote_file, temp_file.path)
    FileUtils.mv(temp_file.path, final_path)
    logger.info({ action: 'download_complete', file: remote_file, destination: final_path }.to_json)
    true
  rescue Net::SFTP::StatusException => e
    logger.error({ action: 'download_failed', file: remote_file, error: e.message, code: e.code }.to_json)
    false
  ensure
    temp_file.close
    temp_file.unlink
  end
end

def with_retries(max_attempts: 3, base_delay: 1, logger:)
  attempt = 0
  begin
    attempt += 1
    yield
  rescue Net::SSH::AuthenticationFailed => e
    logger.error({ action: 'authentication_failed', error: e.message }.to_json)
    raise
  rescue Errno::ECONNREFUSED, Net::SSH::ConnectionTimeout => e
    if attempt < max_attempts
      delay = base_delay * (2 ** (attempt - 1))
      logger.warn({ action: 'retry_attempt', attempt: attempt, delay: delay, error: e.message }.to_json)
      sleep delay
      retry
    end
    logger.error({ action: 'max_retries_reached', error: e.message }.to_json)
    raise
  end
end

def poll_sftp(logger)
  with_retries(logger: logger) do
    Net::SFTP.start(SFTP_HOST, SFTP_USER, port: SFTP_PORT, keys: [SSH_KEY]) do |sftp|
      logger.info({ action: 'connection_established', host: SFTP_HOST, directory: REMOTE_DIR }.to_json)

      sftp.dir.foreach(REMOTE_DIR) do |entry|
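        # Skip '.', '..', and dotfiles, then anything that is not a regular file
        # (download! without the recursive option cannot fetch a directory).
        next if entry.name.start_with?('.')
        next unless entry.file?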

        remote_file = File.join(REMOTE_DIR, entry.name)
        local_file  = File.join(LOCAL_DIR, entry.name)

        if download_file(sftp, remote_file, local_file, logger)
          begin
            sftp.remove!(remote_file)
            logger.info({ action: 'remote_file_removed', file: remote_file }.to_json)
          rescue Net::SFTP::StatusException => e
            logger.error({ action: 'remove_failed', file: remote_file, error: e.message }.to_json)
          end
        end
      end
    end
  end
end

# Ensure the local download directory exists
FileUtils.mkdir_p(LOCAL_DIR)

# Set up signal handling for graceful shutdown
@shutdown = false
Signal.trap('TERM') { @shutdown = true }
Signal.trap('INT') { @shutdown = true }

# Main polling loop with graceful shutdown
until @shutdown
  poll_sftp(logger)
  logger.info({ action: 'polling_wait', delay: 60 }.to_json)
  sleep 60
end

logger.info({ action: 'shutdown_complete' }.to_json)
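
The loop above checks the shutdown flag only between cycles, so a signal that arrives during sleep 60 may not take effect until the sleep ends. One way to shorten that delay, sketched here under assumptions, is the self-pipe pattern: the trap handler writes a byte to a pipe, and the loop waits on that pipe with a timeout instead of sleeping. ShutdownSignal is a hypothetical helper, not part of Net::SFTP or the script above.

```ruby
# Hypothetical helper (illustrative only): an interruptible wait built on a pipe.
class ShutdownSignal
  def initialize
    @reader, @writer = IO.pipe
    @requested = false
  end

  # Intended to be called from a Signal.trap block: sets a flag and performs
  # a single non-blocking write so a pending wait wakes up.
  def request!
    @requested = true
    @writer.write_nonblock('.')
  rescue IO::WaitWritable, Errno::EPIPE
    nil # pipe is full or closed; the flag is already set
  end

  def requested?
    @requested
  end

  # Waits up to `seconds`, returning early once shutdown has been requested.
  # Returns true if shutdown was requested, false if the wait timed out.
  def wait(seconds)
    return true if @requested

    IO.select([@reader], nil, nil, seconds)
    @requested
  end
end

# Example usage with the names from the script above:
#   shutdown = ShutdownSignal.new
#   %w[TERM INT].each { |sig| Signal.trap(sig) { shutdown.request! } }
#   until shutdown.requested?
#     poll_sftp(logger)
#     shutdown.wait(60)
#   end
```

A download that is already in progress still runs to completion; only the idle wait between cycles is cut short.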

Understanding the script

This production-ready script implements several important features:

  1. Environment-based Configuration: Uses environment variables for sensitive configuration, following security best practices.

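  2. Structured Logging: Emits one JSON object per line for easier integration with log aggregation systems. Each event hash is serialized with to_json before it reaches the formatter, so it appears as an escaped JSON string in the message field; pass the hash itself to the logger if your aggregator should index its keys.

  3. Atomic File Operations: Downloads each file to a temporary path and moves it into place only after the transfer succeeds, so a partial download never appears under its final name. The move is a true atomic rename only when the temporary file is on the same filesystem as the destination.

  4. Robust Error Handling: Rescues specific exception classes, fails fast on authentication errors, and retries refused or timed-out connections with exponential backoff.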

  5. Graceful Shutdown: Properly handles termination signals for clean process management.

  6. File Cleanup: Automatically removes successfully downloaded files from the remote server to prevent duplicate processing.
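
To make the backoff in item 4 concrete, the delays that with_retries sleeps between attempts can be computed on their own. This is an illustrative sketch: backoff_delays is a hypothetical helper that mirrors the formula base_delay * (2 ** (attempt - 1)) from the script and is not used by it.

```ruby
# Hypothetical helper (illustrative only): lists the sleep durations between
# attempts. No delay follows the final attempt, because the error is re-raised.
def backoff_delays(max_attempts: 3, base_delay: 1)
  (1...max_attempts).map { |attempt| base_delay * (2**(attempt - 1)) }
end

p backoff_delays                                    # => [1, 2]
p backoff_delays(max_attempts: 5, base_delay: 0.5)  # => [0.5, 1.0, 2.0, 4.0]
```

With the defaults, a connection that keeps failing is attempted three times over roughly three seconds before the error propagates to the caller.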

Production deployment

For production environments, consider these deployment options:

Using systemd

Create a systemd service file for reliable process management:

[Unit]
Description=SFTP File Poller
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=sftp-user
Environment=SFTP_HOST=example.com
Environment=SFTP_USER=username
Environment=REMOTE_DIR=/remote/path
Environment=LOCAL_DIR=/local/path
Environment=SSH_KEY_PATH=/path/to/key
ExecStart=/usr/bin/ruby /path/to/sftp_poller.rb
Restart=always
RestartSec=60

[Install]
WantedBy=multi-user.target

Using Docker

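Create a Dockerfile for containerized deployment. Pass the same environment variables at runtime and mount the SSH key into the container rather than baking it into the image: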

FROM ruby:3.2-slim
RUN gem install net-sftp -v '~> 4.0.0'
WORKDIR /app
COPY sftp_poller.rb .
CMD ["ruby", "sftp_poller.rb"]

Security best practices

  1. SSH Key Management:

    • Rotate SSH keys regularly
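    • Use ed25519 keys for better security (net-ssh needs the ed25519 and bcrypt_pbkdf gems installed to load them)
    • Keep private keys out of source control; store them in a secrets manager or in a file readable only by the service user, and pass only the path through environment variables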
  2. Network Security:

    • Restrict SFTP access to specific IP ranges
    • Use strong ciphers and key exchange algorithms
    • Implement connection timeouts (Net::SFTP.start passes options such as timeout: through to Net::SSH)
  3. File Access:

    • Use minimal permissions for both local and remote files
    • Implement file integrity checks
    • Clean up temporary files properly
  4. Monitoring:

    • Set up alerts for failed downloads and connection issues
    • Monitor disk space usage
    • Track processing metrics
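
The file integrity check mentioned under File Access could look roughly like the sketch below. It assumes the sender publishes a SHA-256 hex digest for each file by some means (for example a sidecar file), which this post's script does not do; checksum_matches? is a hypothetical helper name.

```ruby
require 'digest'

# Hypothetical helper (illustrative only): compares a downloaded file against
# an expected SHA-256 hex digest, ignoring case and surrounding whitespace.
def checksum_matches?(path, expected_sha256)
  actual = Digest::SHA256.file(path).hexdigest
  actual.casecmp?(expected_sha256.strip)
end

# Example usage:
#   unless checksum_matches?(local_file, expected_digest)
#     File.delete(local_file) # discard the corrupt download and leave the remote copy in place
#   end
```

Running the check before removing the remote file means a corrupted transfer can simply be retried on the next polling cycle.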

Conclusion

This Ruby script provides a robust foundation for automated SFTP file imports, incorporating modern best practices for security, reliability, and maintainability. The implementation handles common edge cases and provides proper logging for monitoring and debugging.

For those seeking a managed solution, Transloadit offers an SFTP Import Robot that handles these complexities automatically; see the Transloadit documentation for details.