Adel B.

Managing Netplan with Django in Production (Revisited)


Situation

Changing a server’s IP address remotely is the kind of task that punishes overconfidence. One wrong route, one malformed YAML, and your remote session evaporates.

In field deployments, we don’t always have SSH access. We do have interface swaps, DHCP surprises, static networks that move, and gateways that mysteriously don’t belong to the same subnet.

So we built a Django-based admin workflow that lets operators change IP settings safely: each change is staged, explicitly confirmed, and only then promoted to the live netplan config.

This is how it works.


Problem

Field deployments often require:

Manually editing /etc/netplan/*.yaml is risky because:

  1. A YAML mistake can drop connectivity instantly.
  2. Changing the default route can kill your active session.
  3. Old netplan files or cloud-init can conflict with your intended config.
  4. Operators need a guided, predictable workflow — not shell roulette.

We needed:


Investigation

Architecture Overview

Critical Rule

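Only the default interface carries the default route. DHCP is offered for non-default interfaces only, and routes learned over DHCP are ignored.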

This single constraint prevents most catastrophic lockouts.


Solution

Default Interface from config.cnf

The default interface is the source of truth:

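import configparser
import os

from django.conf import settings


def get_default_interface():
    """Return the default interface named in config.cnf, or None if unavailable."""
    config_parser = configparser.RawConfigParser()
    config_path = os.path.join(settings.BASE_DIR, "config.cnf")

    try:
        # read() silently skips a missing file; get() then falls back to None.
        config_parser.read(config_path)
        iface = config_parser.get("system", "eth", fallback=None)
        if iface:
            return iface.strip()
    except Exception:
        # A malformed or unreadable config is treated as "no default interface".
        pass

    return None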

Interface Filtering

Only Ethernet-style interfaces are allowed:

def get_interfaces():
    all_interfaces = os.listdir("/host_sys/class/net")
    return [
        iface for iface in all_interfaces
        if iface.startswith(("eth", "ens", "enp"))
    ]

The view hard-validates the selection:

known_interfaces = list(get_interfaces())
if selected_interface not in known_interfaces:
    messages.error(request, f"Unknown interface '{selected_interface}'")
    return redirect("/setting#change_ip")

File Permissions and Ownership

Staged and live YAML files are locked down:

def set_permissions(filepath, mode=0o600):
    os.chmod(filepath, mode)
    os.chown(filepath, 0, 0)  # root:root

This ensures predictable ownership and reduces risk of accidental edits.
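
One caveat with chmod-after-write is the short window in which a freshly created file carries the process's default permissions. A minimal sketch of one way to narrow that window, assuming a hypothetical helper named write_private (not part of the original code), is to request the final mode when the file is created:

```python
import os


def write_private(path: str, content: str, mode: int = 0o600) -> None:
    """Create (or truncate) path and write content, requesting `mode` at creation.

    Hypothetical helper for illustration. The mode passed to os.open only
    applies when the file is newly created; an existing file keeps its
    current permissions, so an explicit chmod is still issued afterwards.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    with os.fdopen(fd, "w") as f:
        f.write(content)
    os.chmod(path, mode)  # covers the pre-existing-file case
```

Ownership is left out of the sketch on purpose: os.chown to root:root needs root privileges, which set_permissions above already assumes.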


Building Netplan Config In Memory

Before touching the filesystem:

interface_config = {
    "network": {
        "version": 2,
        "renderer": "networkd",
        "ethernets": {
            selected_interface: {"optional": True}
        },
    }
}

DHCP Mode (Non-Default Only)

interface_config["network"]["ethernets"][selected_interface]["dhcp4"] = True
interface_config["network"]["ethernets"][selected_interface]["addresses"] = []
interface_config["network"]["ethernets"][selected_interface]["dhcp4-overrides"] = {
    "use-routes": False,
    "route-metric": 500,
}

use-routes: False prevents DHCP from hijacking the default route.


Static Mode

interface_config["network"]["ethernets"][selected_interface]["dhcp4"] = False
interface_config["network"]["ethernets"][selected_interface]["addresses"] = [ip]

If it’s the default interface, inject the default route:

interface_config["network"]["ethernets"][selected_interface]["routes"] = [
    {"to": "0.0.0.0/0", "via": gateway}
]
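
Because a gateway outside the interface's subnet is one of the field surprises mentioned earlier, it may be worth rejecting that combination before anything is staged. The following is a sketch only, using the standard-library ipaddress module; the function name gateway_in_subnet is an assumption and does not appear in the original code:

```python
import ipaddress


def gateway_in_subnet(ip_cidr: str, gateway: str) -> bool:
    """Return True if gateway lies inside ip_cidr's network and differs from the interface address."""
    try:
        iface = ipaddress.ip_interface(ip_cidr)  # e.g. "192.168.10.20/24"
        gw = ipaddress.ip_address(gateway)
    except ValueError:
        return False  # malformed address or prefix

    if gw.version != iface.version:
        return False  # mixing IPv4 and IPv6 can never match

    # The gateway must sit in the same network and must not be the
    # interface's own address.
    return gw in iface.network and gw != iface.ip
```

A view could call gateway_in_subnet(ip, gateway) for the default interface and show an error message instead of writing the pending file when it returns False.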

Staging Only (.yaml.pending)

pending_path = f"/host_netplan/{selected_interface}.yaml.pending"

with open(pending_path, "w") as f:
    yaml.dump(interface_config, f, default_flow_style=False)

set_permissions(pending_path, 0o600)

At this point:

Then the UI redirects to the confirmation page.


Confirmation: The Only Commit Point

On confirm:

  1. Backup live YAML (if exists)
  2. Promote pending → live (atomic)
  3. Write ip_reset signal file (atomic)
  4. Store session state for polling

if os.path.exists(yaml_path):
    shutil.copy(yaml_path, backup_path)

os.replace(pending_path, yaml_path)

atomic_write("/code/ip_reset", sig_content)

Atomic write implementation:

def atomic_write(path: str, content: str):
    tmp_path = f"{path}.tmp"
    with open(tmp_path, "w") as f:
        f.write(content)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)

If the default interface IP changed, the browser redirects to the new IP.
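
The backup taken in step 1 is only useful if something can restore it. Here is a rough sketch of the backup-and-promote step together with a matching rollback; commit_pending and rollback are hypothetical names, and the use of shutil.copy2 (which also keeps mode and timestamps) is a choice made for this sketch rather than what the original code does:

```python
import os
import shutil


def commit_pending(pending_path: str, yaml_path: str, backup_path: str) -> bool:
    """Back up the live file (if any), then promote pending to live.

    Returns True if a backup was taken, so the caller knows whether a
    rollback is possible.
    """
    had_backup = False
    if os.path.exists(yaml_path):
        shutil.copy2(yaml_path, backup_path)
        had_backup = True
    # os.replace is atomic when both paths are on the same filesystem.
    os.replace(pending_path, yaml_path)
    return had_backup


def rollback(yaml_path: str, backup_path: str) -> bool:
    """Restore the live file from the backup; returns False if no backup exists."""
    if not os.path.exists(backup_path):
        return False
    tmp_path = f"{yaml_path}.rollback.tmp"
    shutil.copy2(backup_path, tmp_path)
    os.replace(tmp_path, yaml_path)  # same atomic swap as the commit
    return True
```

Copying the backup to a temporary name first keeps the restore atomic as well, so the live path never holds a half-written file.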


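The watcher service picks up the ip_reset signal file, applies the new configuration, and writes the interface status JSON that the UI reads.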

The systemd unit ensures resiliency:

Restart=always
RestartSec=2
StandardOutput=append:/home/.../netplan_result.log
StandardError=append:/home/.../netplan_result.log
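
The watcher's own code is not shown here, so the following is only a sketch of what a signal-file loop of this kind could look like. The function names, the polling interval, and the decision to delete the signal file before running the command are all assumptions:

```python
import os
import subprocess
import time
from typing import List, Optional


def process_signal(signal_path: str, apply_cmd: List[str]) -> Optional[int]:
    """Run apply_cmd once if the signal file exists; return its exit code.

    Returns None when there is nothing to do. The signal file is removed
    before the command runs so that a crash mid-apply does not turn into
    an endless retry loop (an assumed trade-off, not the author's stated one).
    """
    if not os.path.exists(signal_path):
        return None
    os.remove(signal_path)
    # The child inherits stdout/stderr, so its output lands wherever the
    # unit's StandardOutput/StandardError point.
    result = subprocess.run(apply_cmd)
    return result.returncode


def watch(signal_path: str, apply_cmd: List[str], interval: float = 1.0) -> None:
    """Poll forever; systemd restarts the process if it ever exits."""
    while True:
        process_signal(signal_path, apply_cmd)
        time.sleep(interval)
```

On a real host the entry point might be something like watch(signal_path, ["netplan", "apply"]), where signal_path is wherever the ip_reset file is visible to the watcher, which may differ from the /code/ip_reset path used by the Django side.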

UI

Interface Status Endpoint

The UI reads watcher-generated JSON:

def interface_status_view(request):
    INTERFACE_CONFIG_PATH = "/code/interface_config.json"
    ...
    return JsonResponse({
        "interface": iface,
        "ip": iface_data.get("ip", ""),
        "gateway": iface_data.get("gateway", "")
    })

This avoids shelling out during requests.
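
The body of the view is elided above, so the sketch below does not reproduce it. It only illustrates one defensive way to read such a file, under the assumption that the JSON maps interface names to objects with "ip" and "gateway" keys; both that layout and the helper name read_interface_entry are hypothetical:

```python
import json


def read_interface_entry(path: str, iface: str) -> dict:
    """Return the entry for iface from the JSON file, or {} on any problem.

    Assumes a layout like {"eth0": {"ip": "...", "gateway": "..."}}.
    """
    try:
        with open(path) as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError):
        # Missing file or a partially written one: report "no data".
        return {}
    if not isinstance(data, dict):
        return {}
    entry = data.get(iface, {})
    return entry if isinstance(entry, dict) else {}
```

Returning an empty dict keeps the iface_data.get("ip", "") pattern in the view working even when the watcher has not produced a file yet.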


Netplan Log Polling

check_netplan_log:

State matters more than logs. If the interface matches expectation, we treat it as success even if logs are late.
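
The "state over logs" rule can be expressed as a small pure function. This is an illustrative sketch rather than the actual check_netplan_log logic: the name apply_outcome, the three result strings, and the use of the word "error" as a failure marker in the log are all assumptions:

```python
from typing import Optional


def apply_outcome(expected_ip: str, observed_ip: str, log_text: Optional[str]) -> str:
    """Classify an apply attempt as 'success', 'failed', or 'pending'.

    Observed interface state wins over the log: a matching IP counts as
    success even when no log line has arrived yet.
    """
    # Compare addresses without the prefix length, e.g. "10.0.0.5/24" -> "10.0.0.5".
    if observed_ip and observed_ip.split("/")[0] == expected_ip.split("/")[0]:
        return "success"
    # Only consult the log when the state does not match yet.
    if log_text and "error" in log_text.lower():
        return "failed"
    return "pending"
```

Checking the state first means a slow or missing log can delay a failure report but never blocks a success.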


Takeaways


Flow

  1. Navigate to Network Settings
  2. Select interface
  3. Choose DHCP or static
  4. Enter IP/CIDR and gateway (if required)
  5. Click Save (stages config)
  6. Review confirmation page
  7. Click Confirm changes
  8. Wait page monitors apply status
  9. Redirect back (or to new IP if default changed)

Downtime is typically just the duration of netplan apply.


Final Notes

This system works because it treats network changes as a transaction: stage, confirm, commit, verify.

It reduces lockouts, improves auditability, and gives operators a predictable workflow without SSH access.