Deploy Hermes AI Agent on a VPS: Step-by-Step Guide for 2026

Written by Alex I. | May 26, 2026 11:27:22 AM

Your own AI assistant in Telegram, running 24/7 on a VPS using free models — $0 in API fees. The only thing you pay for is the VPS. Here's how to set it up from scratch, even if you've never touched a terminal.

Hermes Agent is an open-source AI agent by Nous Research. It connects to Telegram (and Discord, Slack, WhatsApp — but we'll focus on Telegram), runs on any Linux server, and works with dozens of LLM providers, including free ones (we'll use OpenRouter). Think of it as your personal ChatGPT bot, but self-hosted and fully under your control.

We tested the entire setup on an is*hosting VPS. This guide documents every step, including the errors we hit and how we fixed them.

What You'll Need

An is*hosting VPS (Start plan or higher)
A Mac or Windows computer
A Telegram account
An OpenRouter account (free)
About 20 minutes

Step 1: Order a VPS

Recommended minimum specs are:

CPU: Xeon 2x2.20 GHz, Xeon 3x2.60 GHz, or higher
RAM: 2-4 GB
Storage: 30-40 GB SSD
OS: Ubuntu 24
Location: Whatever's closest to you

On is*hosting you can start with:

Start VPS: $11.99/mo (or $10.19/mo on an annual plan).
Medium VPS: $24.99/mo (or $21.24/mo on an annual plan).

The Lite plan (1 GB RAM) is too tight. The Start, Medium, and higher plans leave a comfortable margin. We tested Hermes on a Medium VPS for this guide.

After payment, you'll receive an IP address and root password via email. Save both.

Step 2: Connect to Your Server via SSH

Mac: Open Terminal (Cmd+Space → type "Terminal" → Enter).

Windows: Download Tabby or use PowerShell.

Type:

ssh root@YOUR_VPS_IP_ADDRESS

Replace YOUR_VPS_IP_ADDRESS with the actual IP from your email. Example:

ssh root@37.252.10.8

The first connection will ask: "Are you sure you want to continue connecting?" — type yes and press Enter.

Then paste your password. The password won't show on screen — that's normal, it's a security feature. Just paste and hit Enter.

When you see root@server:~# — you're in.

Step 3: Install Hermes Agent

One command does everything — installs Python, Node.js, Git, and Hermes itself:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

Wait for it to finish. The installer will show progress as it sets up the dependencies.

Important: The official install script lives on GitHub. Some older guides list hermes-agent.org/install.sh; that URL is broken, and piping it into bash fails with a "syntax error near unexpected token" error. Use the GitHub URL above.
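If you'd rather not pipe a download straight into bash, one option is to save the script first and syntax-check it. The sketch below is a suggestion rather than part of the official install flow, and the helper name parses_as_shell is made up for illustration. A broken download (for example, an HTML error page) will usually fail this check with the same kind of "unexpected token" error.

```shell
# Hypothetical helper: succeed only if the file parses as shell.
# `bash -n` reads the script and checks syntax without executing anything.
parses_as_shell() {
    bash -n "$1" 2>/dev/null
}

# Intended use (left commented out so nothing runs by accident):
#   url=https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh
#   curl -fsSL "$url" -o install.sh
#   parses_as_shell install.sh && bash install.sh
```

A passing check only means the file is valid shell. It says nothing about what the script does, so skim install.sh before running it if you want to know.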

After installation completes, Hermes will automatically launch the Setup Wizard.

Step 4: Run Through the Setup Wizard

The wizard walks you through the configuration. Here's what to pick at each screen:

Setup Type

Select Quick setup — provider, model & messaging and press Enter.

Inference Provider

Choose OpenRouter (100+ models, pay-per-use) from the list.

OpenRouter API Key

The wizard will ask for your OpenRouter API key.

Go to openrouter.ai/keys.
Sign up (Google login works).
Click Create Key.
Copy the key (starts with sk-or-...).

Paste it into the terminal and press Enter.
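With several API keys on the clipboard, it's easy to paste the wrong one. A minimal sketch of a prefix check, based only on the sk-or- prefix mentioned above; the function name is hypothetical, and the check says nothing about whether the key is actually valid.

```shell
# Hypothetical helper: true if the argument starts with the sk-or- prefix.
# It checks the shape only; it cannot tell whether the key is valid.
has_openrouter_prefix() {
    case "$1" in
        sk-or-?*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Intended use:
#   has_openrouter_prefix "$KEY" || echo "this does not look like an OpenRouter key"
```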

Model Selection

The wizard shows a list of models with pricing. Pick a free model. Scroll down to find the options marked free. As of May 2026, working free models include:

nvidia/nemotron-3-super-120b-a12b:free — recommended, tested and working
z-ai/glm-5.1
openrouter/owl-alpha

You can also choose "Enter custom model name" at the bottom of the list and type the model identifier directly.

Heads up: Free models on OpenRouter come and go. During our testing, deepseek/deepseek-chat-v3.1:free returned a 404 error (removed from OpenRouter), and google/gemini-2.0-flash-exp:free also failed. The Nvidia Nemotron model worked on the first try. If your chosen model fails, just switch it later with hermes config set model.default MODEL_NAME
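Because free models come and go, it can save a round of 404s to check that a model ID is still listed before you set it. The sketch below assumes OpenRouter's public model list at https://openrouter.ai/api/v1/models returns JSON containing each model ID as a quoted string. The helper name is hypothetical, and the match is plain text rather than a real JSON parse.

```shell
# Hypothetical helper: true if the model ID appears as a quoted string
# in a saved copy of the model list. A plain text match, not a JSON parse.
# Usage: model_is_listed MODEL_ID MODELS_FILE
model_is_listed() {
    grep -Fq "\"$1\"" "$2"
}

# Intended use (commented out; needs network access):
#   curl -fsS https://openrouter.ai/api/v1/models -o models.json
#   model_is_listed "nvidia/nemotron-3-super-120b-a12b:free" models.json \
#       || echo "model not listed, pick another"
```

A listed model can still be rate-limited or temporarily unavailable, so treat this as a quick first filter.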

Terminal Backend

Select Keep current (local). Your VPS runs commands directly, no containers needed.

Messaging Platform

Select Set up messaging now (recommended).

The wizard can skip the Telegram setup entirely and jump straight to the finish screen; that happened to us. If it happens to you, finish the wizard anyway. Steps 5 and 6 cover configuring Telegram manually.

Step 5: Create a Telegram Bot

Before (or during) the setup wizard, you need a Telegram bot token:

Open Telegram, search for @BotFather.
Tap Create a New Bot (or send /newbot).
Choose a display name (anything you want).
Choose a username (must end in bot, e.g., myhermes_bot).
BotFather gives you a token. Copy it.

Keep this token safe. Anyone who has it can control your bot.
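Before handing the token to Hermes, you can sanity-check it yourself. This sketch assumes the usual token shape (digits, a colon, then a secret made of letters, digits, underscores, and hyphens) and uses the Telegram Bot API's getMe method, which returns the bot's own profile when the token is valid. Both function names are hypothetical.

```shell
# Hypothetical helper: true if the token has the usual "<digits>:<secret>" shape.
looks_like_bot_token() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+:[A-Za-z0-9_-]+$'
}

# Build the Bot API getMe URL for a token.
getme_url() {
    printf 'https://api.telegram.org/bot%s/getMe\n' "$1"
}

# Intended use (commented out; needs network access):
#   looks_like_bot_token "$TOKEN" && curl -fsS "$(getme_url "$TOKEN")"
```

Keep the token in a variable rather than typing it inline, so it doesn't end up in your shell history.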

Step 6: Configure Telegram Gateway

If the wizard asked for your Telegram token — great, you already did this part. If you skipped the step, run:

hermes gateway setup

This reopens the messaging configuration. Choose Telegram, and the wizard will:

Ask for your Telegram bot token — paste the token from BotFather.
Ask for allowed user IDs — this restricts who can use your bot.

To find your Telegram user ID: message @userinfobot in Telegram; it replies with your numeric ID.

Enter your ID to lock the bot to yourself, or leave it empty for open access (you can restrict it later).

Ask about home channel — type Y to use your own chat as the notification channel.
Show platform list — select Done (no need to add Discord, Slack, etc.).

Step 7: Install and Start the Gateway Service

The wizard will ask:

"Start the gateway now?" — answer Y

"Start automatically on login/boot as a systemd service?" — answer Y

"Choose how the gateway should run in the background" — select System service. On a VPS, this is the right choice: the bot starts automatically when the server reboots.

"Run the system gateway service as which user?" — type root

Done. The gateway is now running.

Step 8: Test Your Bot

Open Telegram. Find your bot by the username you created. Send "hi."

If everything works, the bot responds.

Troubleshooting

Bad config, wrong model, garbage characters in YAML — none of these throw clear errors. The fix is always the same loop: check logs, verify config, fix, and restart.

"The Model Provider Failed After Retries"

The most common error. Your bot is running, but the LLM can't respond.

Check the logs first:

journalctl -u hermes-gateway -n 30 --no-pager

What the error codes mean:

HTTP 404: No endpoints found for MODEL_NAME — the model was removed from OpenRouter. Switch models.
HTTP 400 — bad request. It usually means the model field is empty or the provider config is broken. See "Config Corruption" below.
HTTP 429 — rate limit. Wait a few minutes, or switch to a different free model.
HTTP 401 — bad API key. Re-check your OpenRouter key.
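When the log is long, a tally of status codes shows which of these problems dominates. The sketch below assumes log lines contain the literal text "HTTP" followed by a three-digit code, as in the 404 message quoted above. Both helper names are hypothetical, and the hints simply restate the list.

```shell
# Hypothetical helper: map an HTTP status code to the fix described above.
explain_http_code() {
    case "$1" in
        404) echo "model removed from OpenRouter: switch models" ;;
        400) echo "bad request: check for an empty model field or broken provider config" ;;
        429) echo "rate limit: wait a few minutes or switch to another free model" ;;
        401) echo "bad API key: re-check your OpenRouter key" ;;
        *)   echo "not covered here: read the full log line" ;;
    esac
}

# Count each "HTTP <code>" occurrence in log text read from stdin,
# most frequent first.
count_http_codes() {
    grep -oE 'HTTP [0-9]{3}' | sort | uniq -c | sort -rn
}

# Intended use:
#   journalctl -u hermes-gateway -n 200 --no-pager | count_http_codes
#   explain_http_code 429
```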

Switch to a different model:

hermes config set model.default nvidia/nemotron-3-super-120b-a12b:free
sudo systemctl restart hermes-gateway

During our testing, we went through three models before finding one that worked:

deepseek/deepseek-chat-v3.1:free — 404, model removed from OpenRouter
google/gemini-2.0-flash-exp:free — failed after retries
nvidia/nemotron-3-super-120b-a12b:free — worked immediately

Free models rotate. If yours stops working, check openrouter.ai/models and filter by "free."

Config Corruption: Garbage Characters in config.yaml

This one is sneaky. If you paste text into nano from certain sources, invisible UTF-8 characters can end up at the start of the file. Hermes tries to parse the YAML, fails silently on the corrupted section, and loads without a main model. Everything downstream breaks.

How to detect it:

head -c 20 /root/.hermes/config.yaml | xxd

If the first bytes aren't 6d 6f 64 65 6c (the word model), you have garbage.

How to fix it:

sed -i '1s/^[^a-zA-Z]*//' /root/.hermes/config.yaml

Then verify with hermes config show — the model: block should be visible.

General rule: After any manual edit to config.yaml, always run hermes config show to confirm Hermes actually parsed what you wrote. Visual inspection in nano isn't enough.
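The detect-and-fix steps above can be wrapped into two small helpers that also keep a backup. This is a sketch under the same assumption the xxd check makes, namely that a healthy config.yaml starts with the word model. The function names are hypothetical, and sed -i is used as on GNU/Linux (Ubuntu).

```shell
# Hypothetical helper: true if the file's first five bytes spell "model".
config_starts_clean() {
    [ "$(head -c 5 "$1")" = "model" ]
}

# Hypothetical helper: back up the file, then strip leading non-letters
# from line 1 with the same sed expression used above.
strip_leading_garbage() {
    cp "$1" "$1.bak" && sed -i '1s/^[^a-zA-Z]*//' "$1"
}

# Intended use:
#   config_starts_clean /root/.hermes/config.yaml \
#       || strip_leading_garbage /root/.hermes/config.yaml
```

If your config legitimately starts with something else, such as a comment line, skip these helpers and inspect the file by hand.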

api_key_env vs. Actual API Key

In config.yaml, the providers: block has a field called api_key_env. This is the name of the environment variable that holds your key, not the key itself.

Wrong:

providers:
  anthropic:
    api_key_env: sk-ant-abc123...   # ← actual key, WRONG

Right:

providers:
  anthropic:
    api_key_env: ANTHROPIC_API_KEY  # ← variable name, correct

The actual key goes in /root/.hermes/.env. If you paste the raw key into api_key_env, Hermes won't throw an error; it'll silently fail to authenticate and fall through to another provider with no model configured.
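Since Hermes won't flag this mistake, a quick shape check on the value can catch it. The sketch below assumes environment variable names follow the common convention of uppercase letters, digits, and underscores; the helper name is hypothetical.

```shell
# Hypothetical helper: true if the value looks like an environment variable
# name (uppercase letters, digits, underscores; no leading digit).
looks_like_env_name() {
    printf '%s\n' "$1" | grep -Eq '^[A-Z_][A-Z0-9_]*$'
}

# Intended use:
#   looks_like_env_name "ANTHROPIC_API_KEY" && echo "ok: variable name"
#   looks_like_env_name "sk-ant-abc123"     || echo "looks like a raw key, move it to .env"
```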

provider: auto Cascading Failures

Hermes uses auxiliary models for internal tasks: skill management, session compression, triage, and approval. By default, these are set to provider: auto, which means "use whatever the main model uses."

If your main model config is broken, auto cascades the failure to all auxiliary slots. You'll see HTTP 400 errors flooding the logs even after fixing the main model.

Fix by setting explicit models for auxiliary tasks:

hermes config set auxiliary.compression.provider deepseek
hermes config set auxiliary.compression.model deepseek-v4-flash

Repeat for other slots (title_generation, session_search, etc.) or edit config.yaml directly:

auxiliary:
  compression:
    provider: deepseek
    model: deepseek-v4-flash
  title_generation:
    provider: deepseek
    model: deepseek-v4-flash
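Typing two commands per slot gets tedious. As a sketch, the helper below (a hypothetical name) only prints the hermes config set commands for each slot, so you can review them before anything changes. The slot names are the ones mentioned above; check your own config.yaml for the full set.

```shell
# Hypothetical helper: print the two `hermes config set` commands for each slot.
# Usage: aux_commands PROVIDER MODEL SLOT...
aux_commands() {
    aux_provider=$1
    aux_model=$2
    shift 2
    for aux_slot in "$@"; do
        printf 'hermes config set auxiliary.%s.provider %s\n' "$aux_slot" "$aux_provider"
        printf 'hermes config set auxiliary.%s.model %s\n' "$aux_slot" "$aux_model"
    done
}

# Review the output first, then pipe it to sh to apply:
#   aux_commands deepseek deepseek-v4-flash compression title_generation session_search
#   aux_commands deepseek deepseek-v4-flash compression title_generation session_search | sh
```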

Compression Model Warning

If your main model has a large context window (e.g., Claude Sonnet at 500k tokens) but your compression model has a smaller one (e.g., Haiku at 200k), Hermes will warn on every session: "Compression model context smaller than threshold."

Fix:

hermes config set compression.threshold 0.2

This sets the threshold to 20% of the main model's context — well within the compression model's window.
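The arithmetic behind that fix, using the example figures above: 20% of 500,000 tokens is 100,000, which fits inside a 200,000-token window. A minimal sketch for checking other combinations follows. The function names are hypothetical, and percentages are passed as integers (20 for 0.2) because shell arithmetic has no decimals.

```shell
# Hypothetical helper: token count at which compression triggers.
# Usage: threshold_tokens MAIN_CONTEXT PERCENT   (PERCENT is an integer, 20 = 0.2)
threshold_tokens() {
    echo $(( $1 * $2 / 100 ))
}

# True if the trigger point fits inside the compression model's window.
# Usage: threshold_fits MAIN_CONTEXT PERCENT COMPRESSION_CONTEXT
threshold_fits() {
    [ "$(threshold_tokens "$1" "$2")" -le "$3" ]
}

# Intended use:
#   threshold_tokens 500000 20           # prints 100000
#   threshold_fits 500000 20 200000 && echo "fits"
```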

Terminal Backend Set to SSH-to-self

If the setup wizard (or a bad config edit) set terminal.backend to ssh pointing at your own VPS IP, Hermes will try to SSH into itself to run shell commands. This breaks things in creative ways.

Fix:

hermes config set terminal.backend local

Tirith Security Scanner Blocking Safe Commands

Hermes has a built-in security scanner called Tirith that intercepts certain command patterns (like piping curl output into Python). Even with auto-approve enabled, Tirith can still trigger approval prompts.

If your bot's tasks involve fetching data from APIs and you trust the endpoints:

hermes config set security.tirith_enabled false

Note: Tirith and the approval system are separate. Disabling one doesn't disable the other.

Useful Commands

Once Hermes is running, here's your daily toolkit:

sudo systemctl status hermes-gateway     # check if gateway is running
sudo systemctl restart hermes-gateway    # restart after config changes
journalctl -u hermes-gateway -n 50       # view recent logs
hermes config set model.default MODEL    # switch model
hermes gateway setup                     # reconfigure messaging
hermes update                            # update to latest version
hermes doctor                            # diagnose issues

Adding Other Users to Your Bot

This is where Hermes gets confusing. There are several authorization mechanisms, and which one actually controls access depends on your setup.

Method 1: Allowlist in .env (The One That Works)

The TELEGRAM_ALLOWED_USERS variable in /root/.hermes/.env is the real gatekeeper. If this variable exists and has values, it overrides anything in config.yaml.

Check what's there now:

grep -i TELEGRAM_ALLOWED /root/.hermes/.env

To add a user, edit the .env file:

nano /root/.hermes/.env

Find TELEGRAM_ALLOWED_USERS and add the new ID:

TELEGRAM_ALLOWED_USERS=154220189,987654321

Restart:

sudo systemctl restart hermes-gateway

To find the new user's ID: ask them to message the bot, then check logs.

journalctl -u hermes-gateway -n 30 | grep Unauthorized
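If you add users often, the manual .env edit can be scripted. The sketch below assumes the variable is stored as a plain, unquoted, comma-separated list, as in the example above. The helper name is hypothetical, and sed -i is used as on GNU/Linux.

```shell
# Hypothetical helper: append a numeric ID to TELEGRAM_ALLOWED_USERS in an
# env file, skipping IDs already present. Usage: allow_user ENV_FILE USER_ID
allow_user() {
    au_file=$1
    au_id=$2
    # Reject anything that is not a plain number.
    case "$au_id" in
        ''|*[!0-9]*) echo "not a numeric ID: $au_id" >&2; return 1 ;;
    esac
    au_current=$(grep '^TELEGRAM_ALLOWED_USERS=' "$au_file" | head -n 1 | cut -d= -f2-)
    # Already listed: nothing to do.
    case ",$au_current," in
        *",$au_id,"*) return 0 ;;
    esac
    if grep -q '^TELEGRAM_ALLOWED_USERS=' "$au_file"; then
        # Add a comma only when the list already has entries.
        au_new=${au_current:+$au_current,}$au_id
        sed -i "s/^TELEGRAM_ALLOWED_USERS=.*/TELEGRAM_ALLOWED_USERS=$au_new/" "$au_file"
    else
        echo "TELEGRAM_ALLOWED_USERS=$au_id" >> "$au_file"
    fi
}

# Intended use:
#   allow_user /root/.hermes/.env 987654321 && sudo systemctl restart hermes-gateway
```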

Method 2: config.yaml (May or May Not Work)

Some Hermes versions read allowed_chats from config.yaml:

telegram:
  allowed_chats: '154220189,987654321'

But if .env has TELEGRAM_ALLOWED_USERS set, config.yaml gets ignored. Don't spend an hour editing YAML variants, wondering why nothing changes — check .env first.

Method 3: Pairing Flow (Advanced)

Hermes has a pairing system where new users message the bot, receive a code, and the owner approves with hermes pairing approve telegram <CODE>. But pairing only works when there's no allowlist set. If TELEGRAM_ALLOWED_USERS has any value, pairing is disabled entirely.

Pick one approach and stick with it. Allowlist in .env is the simplest.

Where Hermes Stores Config (The Full Picture)

Hermes has three config locations. The same setting can exist in multiple places. When something doesn't work, check all three:

/root/.hermes/config.yaml — model, provider, auxiliary settings, gateway config.
/root/.hermes/.env — API keys, Telegram tokens, allowed user lists.
Hermes auth store — registered providers (check with hermes auth list).

Free Model Limits

Running on free OpenRouter models means:

~20 requests per minute, ~50 per day per model
No image recognition (vision)
Lower quality compared to Claude or GPT
Free models occasionally disappear or get rate-limited

For casual use and experiments, that's absolutely fine. When you outgrow free limits, top up $10 on OpenRouter and switch to a paid model:

hermes config set model.default anthropic/claude-haiku-4.5
sudo systemctl restart hermes-gateway

Claude Haiku 4.5 costs $1/$5 per million tokens — a few dollars a month for typical personal use.
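To put a number on "a few dollars," you can estimate cost from token counts. The sketch below assumes the $1/$5 figures are the input and output prices per million tokens; the helper name is hypothetical, and your real usage will vary.

```shell
# Hypothetical helper: estimate cost in dollars from token counts.
# Usage: estimate_cost INPUT_TOKENS OUTPUT_TOKENS [INPUT_PRICE OUTPUT_PRICE]
# Prices are dollars per million tokens; defaults are the $1/$5 quoted above.
estimate_cost() {
    awk -v in_tok="$1" -v out_tok="$2" -v in_price="${3:-1}" -v out_price="${4:-5}" \
        'BEGIN { printf "%.2f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}

# Intended use: one million input tokens and 200,000 output tokens
#   estimate_cost 1000000 200000    # prints 2.00
```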

What You've Built

After following this guide, you have:

A VPS running 24/7 in a data center
Hermes AI Agent with a Telegram bot interface
A free LLM model ($0 in API costs)
Auto-start on server reboot
Access restricted to your Telegram account only

The bot keeps running even when your laptop is off.

What's Next

Hermes is more than a chat interface. Once you're comfortable:

Add skills — Hermes can learn new capabilities from markdown files you write
Set up cron jobs — schedule automated tasks ("send me a daily summary at 9 AM")
Connect more platforms — Discord, Slack, WhatsApp, Signal
Switch to stronger models — Claude, GPT, DeepSeek for production-quality responses
Use local models — run Ollama on a GPU server for fully private AI

Scale your VPS whenever new skills or a growing database push you past its resource limits. At is*hosting, you can request an upgrade on any VPS plan.

Full documentation: hermes-agent.nousresearch.com/docs
