DevOps

Explore DevOps methodologies and tools to enhance collaboration between development and operations teams. Learn about CI/CD, infrastructure as code, and monitoring solutions.

65 posts

DevOps
|10 min read

Stop Using Random UUIDs as Primary Keys: uuidv7() Lands in PostgreSQL 18

Random UUIDv4 primary keys quietly wreck insert speed and bloat indexes on large tables. PostgreSQL 18 ships a native time-ordered uuidv7() that keeps the upsides of UUIDs without the B-tree penalty. Here are the numbers and how to adopt it.

DevOps
|9 min read

Compute That Lives on Your Database Branch

Neon Functions run your code in the same region as your Postgres, on a per-branch URL. To see why that matters I deployed a small API and timed a query from inside the function versus from a machine across the Atlantic: 1.2 ms against 135 ms. Here is how it works, with the real numbers and the repo.

DevOps
|9 min read

Streaming an AI Agent Without a Function Timeout

Long agent loops and long token streams run into the same wall: a serverless function that hits its execution cap and cuts the connection. Neon Functions hold long-lived streaming connections by default. I deployed two endpoints to prove it: one streamed for 90 seconds, the other streamed an agent token by token starting at 466 ms.

DevOps
|10 min read

I Gave an AI Agent a Database, Compute, Storage, and Models From One CLI

An AI agent usually needs four accounts: a database, somewhere to run, object storage, and a model provider. I wired all four from a single Neon credential and had a deployed image-generating agent in a few minutes. Here is the actual build log, the config that ties it together, and the honest caveats.

DevOps
|9 min read

Neon Is Becoming a Backend Platform, Not Just Postgres

In June 2026 Neon added serverless functions, S3-compatible object storage, and an AI gateway to its database. The interesting part is not any one feature but the through-line: everything branches with your data. Here is what shipped, what it competes with, and where the seams still show.

DevOps
|8 min read

SpaceX Just Bought Cursor for $60B. What That Means If Your Team Lives in It

SpaceX is acquiring Anysphere, the maker of Cursor, in a $60 billion all-stock deal, the largest acquisition of a venture-backed startup ever. The number is the headline. The real question for engineering teams is what it means to build your daily workflow on a tool whose owner just changed.

DevOps
|8 min read

Your First Serverless LLM Call on DigitalOcean in 10 Minutes

DigitalOcean's Inference Engine gives you an OpenAI-compatible endpoint with pay-per-token pricing and no GPU to manage. Here is the fastest path from zero to a working call, with curl, Python, and Node, every snippet run against the live API.

DevOps
|10 min read

AI SRE Agents: What They Actually Fix, and What They Will Happily Break

AI SRE is now its own category, with every incident vendor shipping an agent that investigates and remediates on its own. Here is the honest split: where these agents genuinely earn their keep, where they are oversold, and the one risk nobody puts on the marketing page.

DevOps
|9 min read

Hetzner Doubled Its Prices Again. The AI Memory Crunch Is Why

On June 15, 2026, Hetzner raised prices on new orders by roughly 99% in Germany and 158% in the US, the latest in a string of 2026 increases. It is not greed and it is not just Hetzner: the AI memory supercycle has reached the infrastructure bill of teams that never touch AI.

DevOps
|10 min read

The US Government Pulled Two Frontier Models Overnight. The Real Lesson Is About Your Stack

On June 12, 2026, an export-control directive forced Anthropic to disable Claude Fable 5 and Mythos 5 for every user worldwide, three days after launch. The policy fight is interesting. The operational lesson for anyone building on a single model provider is more urgent.

DevOps
|9 min read

npm v12 Will Stop Running Install Scripts. We Audited Our Repos to See What Actually Breaks

Starting with npm v12 (estimated July 2026), dependency install scripts will not run unless you allowlist them. We ran the new audit tooling on our own production repos: 65 packages flagged, 4 that matter, and a surprising amount of nothing breaking.

DevOps
|8 min read

OpenTofu 1.12: destroy = false Retires the tofu state rm Ritual

OpenTofu 1.12 lets a resource declare that it should be forgotten instead of destroyed, makes prevent_destroy dynamic, and quietly ends the manual providers lock step. Here is what each change does, plus the footguns the release notes will not warn you about.

DevOps
|12 min read

Neon vs Supabase in Production: We Benchmarked the Operations That Page You at 3am

Two benchmark sessions against Neon and Supabase Pro measured what spec sheets never show: compute resizes cost 39 seconds of real downtime on one platform and zero on the other, read replicas differ by 23x, and branch creation has a tail you should know about.

DevOps
|10 min read

Neon vs Supabase Pricing: What the Same App Costs From Launch to Scale

We priced one application through five growth stages on both platforms using verified June 2026 list prices. The result is three distinct cost regimes, two crossover points, and a surprise: at scale the biggest line item is not the database.

DevOps
|11 min read

Neon vs Supabase Free Tiers: We Benchmarked Both So You Don't Have To

We ran 320 timed operations against the Neon and Supabase free tiers from a same-region client: query latency, project creation, cold starts, and branching. The latency race is a tie, and the real differences are nothing like the marketing.

DevOps
|7 min read

node-postgres Silently Ignores Your TLS Config When the URL Says sslmode

If your connection string contains sslmode=require, the pg library throws away the ssl options object where you loaded your CA certificate, and verification fails with "self-signed certificate in certificate chain". Here is the trap, the fix, and the v9 changes coming.

DevOps
|12 min read

Designing Rate Limiting for APIs: Algorithms, Patterns, and Implementation

A practical comparison of token bucket, leaky bucket, fixed window, and sliding window rate limiting, with copy-paste Redis and FastAPI code, nginx config, and guidance on which one to actually use.

DevOps
|10 min read

Shai-Hulud Reaches PyPI: The Hades Wave That Runs Before You Import It

The Shai-Hulud worm jumped to PyPI on June 7. The Hades wave hides in 19 Python packages, runs at interpreter startup through a .pth hook before you import anything, and steals your CI/CD secrets.

DevOps
|11 min read

Is Valkey Ready to Replace Redis in 2026?

Valkey forked from Redis after the 2024 license change and has matured fast. Here is whether it is production-ready, how the migration works, and whether the AGPL question even applies to you.

DevOps
|13 min read

Zero-Downtime Database Migrations for PostgreSQL in Production

A single ALTER TABLE can take down a busy PostgreSQL database for minutes. This post shows why that happens and how to ship schema changes safely with lock timeouts, the expand-and-contract pattern, and copy-paste SQL recipes for indexes, columns, constraints, and type changes.

DevOps
|12 min read

OpenTelemetry Just Graduated: What to Retire from Your Stack This Quarter

On May 21, 2026, CNCF graduated OpenTelemetry. All three core signals (traces, metrics, logs) are now production-ready, the project is the second-most-active in CNCF after Kubernetes itself, and Anthropic, Bloomberg, Capital One, eBay, and Heroku run it at scale. Here is the decision framework for what proprietary agents you can stop running, what is still risky, and the 90-day adoption checklist.

DevOps
|11 min read

How to Build an Effective On-Call Rotation and Escalation Policy

Your phone buzzed at 3:14 AM for a disk warning that auto-resolved by 3:16. Nobody fixes the alert. The next person on rotation hates their life. Here is how to build on-call schedules, escalation policies, and alert rules that respect your engineers.

DevOps
|10 min read

When the Malicious Hook Is in the Other Manifest: 700+ Repos, 8 Packagist Packages, One package.json Trick

On May 22, 2026, Socket disclosed a Composer supply chain attack that hid an npm-style postinstall command inside package.json on PHP projects. composer.json was clean, the PHP review missed it, and 700+ GitHub repos pulled it in. Here is the exact payload, why ecosystem-boundary blindness keeps catching teams, and how to wire your CI to look at both manifests.

DevOps
|11 min read

node-ipc DNS-Tunneling Supply Chain Attack: Your Egress Firewall Probably Missed This

On May 14, 2026, three malicious versions of the node-ipc npm package shipped a payload that hunts AWS, SSH, kubeconfig, and GitHub CLI credentials, then smuggles them out through DNS TXT queries. Most orgs filter HTTPS egress. Almost nobody filters DNS. Here is what the payload does and how to close the gap.

DevOps
|12 min read

AI Is Reshaping DevOps. The Engineers Are Faster Than the Vendors.

GitHub, Datadog, HashiCorp and friends are moving carefully. The engineers running their stacks are wiring AI into kubectl and pull-request review on a Tuesday afternoon. Here is what is actually changing in 2026, what is not, and where the gap between vendors and the engineers using their tools is widest.

DevOps
|9 min read

AntV npm Compromise: The Shai-Hulud Worm Comes for Your Dashboards (May 19, 2026)

A new Shai-Hulud wave landed at 01:56 UTC on May 19 and rode the @antv maintainer account through 323 packages including echarts-for-react. Here is what got published, what it steals, and the lockfile grep that tells you if you are exposed.

DevOps
|11 min read

TanStack npm Worm: The Supply-Chain Attack With a Dead-Man's Switch

On May 11, 2026, attackers republished 14+ official TanStack packages on npm with a worm that signs itself with valid SLSA provenance and arms a dead-man's switch that wipes your home directory the moment you revoke the stolen GitHub token. Here is what happened, how the payload works, and how to check your machine.

DevOps
|11 min read

Distributed Tracing with OpenTelemetry: From Instrumentation to Visualization

A walkthrough of instrumenting a real service with OpenTelemetry, running the Collector, and finding the slow span in Jaeger when a request hops across five microservices.

DevOps
|9 min read

10 GitHub Repositories That Will Actually Teach You DevOps in 2026

Most "top DevOps repos" lists are recycled awesome-list links. This one is a curated set of repositories that will move the needle on your DevOps skills, with star counts, who each one is for, and how to actually use it.

DevOps
|12 min read

CVE-2026-31431 Copy Fail: A 4-Byte Kernel Write That Escapes Containers

A new Linux kernel bug lets any unprivileged process flip 4 bytes in the page cache and break out of a container. runtime-default seccomp does not block it. Here is what to do.

DevOps
|13 min read

GitOps with Argo CD: Structuring Your Repository for Multi-Environment Deployments

A practical guide to laying out your Git repository for Argo CD across dev, staging, and production. See real folder structures, Kustomize and Helm patterns, and the pitfalls that bite teams in production.

DevOps
|9 min read

The MCP Design Flaw That Exposes 150M Downloads to RCE

Researchers at OX Security disclosed an architectural vulnerability in Anthropic MCP that enables remote code execution across Python, TypeScript, Java, and Rust SDKs. Anthropic calls it "by design." Here is how the flaw works, which tools are affected, and what to do if you use Cursor, Claude Code, LangChain, or anything with an MCP server.

DevOps
|8 min read

The Vercel April 2026 Security Incident: What Happened and What to Do About It

Vercel disclosed a security incident that started with a compromised OAuth app at Context.ai, escalated through a Vercel employee Google Workspace account, and reached internal systems plus customer environment variables not marked sensitive. Here is the attack chain, what was exposed, and what to change in your deployments.

DevOps
|10 min read

How Does It Work So Fast? The Engineering Behind Instant UI Responses

Credit card validation, username checks, autocomplete, URL shorteners - they all feel instant. Here is what is actually happening under the hood in each case.

DevOps
|10 min read

SLOs, SLIs, and Error Budgets: A Practical Implementation Guide

Your service went down at 2 AM and nobody could agree on whether it was "bad enough" to page someone. SLOs, SLIs, and error budgets fix that. Here is how to define, measure, and act on them with real Prometheus queries and alerting rules.

DevOps
|14 min read

Best Claude Code Plugins for DevOps Engineers in 2026

A curated guide to Claude Code plugins built for DevOps workflows - from Terraform validation and Kubernetes troubleshooting to security scanning and CI/CD pipeline optimization.

DevOps
|10 min read

Claude Code: Agents, Commands, Skills, and Plugins Explained

A clear breakdown of the four extension types in Claude Code - what each one does, how they differ, and when to use which. No marketing fluff, just practical explanations with examples.

DevOps
|10 min read

CLI vs MCP: When to Use Each for AI-Powered DevOps

CLI tools and MCP servers both let AI agents interact with your infrastructure, but they solve different problems. Here is when to reach for each one and why the answer is usually both.

DevOps
|14 min read

Building an Internal Developer Platform from Scratch

A step-by-step guide to designing and building an internal developer platform that gives your teams self-service infrastructure, faster deployments, and fewer tickets to the platform team.

DevOps
|14 min read

Coolify: Self-Hosted PaaS on DigitalOcean - Deploy Apps Without Vendor Lock-In

Set up Coolify on a DigitalOcean droplet and get your own Vercel-like platform for deploying Next.js apps, databases, and more - with auto SSL, GitHub auto-deploy, and no per-seat pricing.

DevOps
|11 min read

CVE-2025-55182 React2Shell: 766 Next.js Hosts Breached in 24 Hours

A CVSS 10.0 RCE in React Server Components let attackers breach 766 Next.js hosts in a single day, stealing database credentials, SSH keys, and cloud secrets. Here is how it works, who is affected, and what to do right now.

DevOps
|6 min read

Claude Code Source Leaked via npm Source Maps: Lessons for Every DevOps Team

Anthropic accidentally shipped source maps in their npm package, exposing 512,000 lines of Claude Code source. Here is what went wrong and how to prevent it in your own CI/CD pipeline.

DevOps
|7 min read

The Axios Supply Chain Attack: What DevOps Teams Need to Know

A compromised npm maintainer account led to malicious axios versions deploying a RAT across macOS, Windows, and Linux. Here is what happened, how to check if you are affected, and how to prevent this in your pipeline.

DevOps
|8 min read

Claude Code Hidden Features You Probably Missed

From mobile sessions to automated PR reviews, here are the Claude Code features that most engineers overlook but can seriously level up your workflow.

DevOps
|7 min read

5 DevOps Books Worth Reading in 2026

A curated list of DevOps books that are actually worth your time in 2026, from beginner Linux guides to production Kubernetes patterns and the SRE bible.

DevOps
|11 min read

The 3 Infrastructure Decisions That Determine Your Engineering Velocity

Provisioning model, environment strategy, and deployment surface. Everything else is optimization. Here's how to make these foundational choices without killing your team's momentum.

DevOps
|11 min read

When Kubernetes Is the Wrong Default

Most teams adopt Kubernetes too early. Here's a pragmatic framework for deciding between managed platforms, VMs, and Kubernetes based on your team size and workload characteristics.

DevOps
|12 min read

Build vs Buy in 2026: What Still Makes Sense to Build In-House?

A practical guide to infrastructure decisions: when building in-house makes sense, when it wastes resources, and how to calculate the true cost of engineering time.

DevOps
|10 min read

The Hidden Cost of Overengineering Your First 50 Engineers

Service meshes, multi-cloud strategies, and platform teams sound impressive. But for early-stage companies, they often slow delivery and burn cash. A practical guide to progressive complexity adoption.

DevOps
|15 min read

Infrastructure as Code: A Beginner's Guide to IaC Fundamentals

Learn the fundamentals of Infrastructure as Code - what it is, why it matters, key concepts, popular tools, and best practices for managing infrastructure with code.

DevOps
|6 min read

Heroku is Shutting Down: Top Alternatives for Your Apps in 2026

Heroku has announced it is transitioning to a sustaining model with no new features. Here are the best alternatives to migrate your applications.

DevOps
|12 min read

DevOps vs SysAdmin vs SRE: What's the Difference?

Confused about DevOps, SysAdmin, and SRE roles? This beginner-friendly guide uses real-world analogies to explain what each role does, how they differ, and which path might be right for you.

DevOps
|15 min read

What is DevOps? A Complete Beginner's Guide

New to DevOps? This beginner-friendly guide explains what DevOps is, why it matters, and how it transforms the way software is built and delivered - no technical background required.

DevOps
|14 min read

GitOps: Deploy Docker Containers with GitHub Actions and ArgoCD

Learn how to implement a modern GitOps workflow for Docker deployments. This guide covers building images with GitHub Actions, pushing to container registries, and automated deployments with ArgoCD.

DevOps
|12 min read

Deployment Strategies: Blue-Green, Canary, and Rolling Deployments Explained

Learn how to deploy applications safely using blue-green, canary, and rolling deployment strategies. Understand the theory, trade-offs, and decision-making behind each approach.

DevOps
|12 min read

The Hidden Costs of Over-Automation in DevOps

Automation speeds things up, but too much of it can hide failures, slow incident response, and add fragile layers you have to maintain.

DevOps
|5 min read

The 10 Most Common DevOps Mistakes (And How to Avoid Them in 2025)

Explore the top 10 DevOps mistakes made in 2025 and learn how to avoid them to ensure a smoother DevOps journey.

DevOps
|12 min read

A Day in the Life of a DevOps Engineer

Follow a DevOps engineer through a typical day - from morning deployments to midnight hotfixes. Real challenges, real solutions, and real impact on business operations.

DevOps
|4 min read

Why Your CI/CD Pipeline Is Slower Than It Should Be (and How to Fix It)

Small pipeline changes give big wins. Parallelize jobs, cache dependencies, pin images, reuse build artifacts, and run only the tests you need.

DevOps
|6 min read

What is P99 Latency?

P99 latency is the response time at the 99th percentile: 99% of requests finish at or under it, and only the slowest 1% take longer. Learn why P99 is more important than average latency for understanding real user experience.
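
As a rough illustration of that definition, here is a minimal Python sketch that computes a percentile with the nearest-rank method. Nearest-rank is only one of several percentile conventions, and both the helper name `percentile_nearest_rank` and the sample latencies are made up for this example, not taken from the post.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [20] * 98 + [900, 1500]

average_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile_nearest_rank(latencies_ms, 50)
p99_ms = percentile_nearest_rank(latencies_ms, 99)

print(f"average={average_ms:.1f} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

With this made-up data the average stays low because the 98 fast requests dominate it, while the P99 value lands on one of the two slow requests. That gap is why the two numbers can tell very different stories about the same traffic.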

DevOps
|5 min read

Where Does the Convention of Using /healthz for Application Health Checks Come From?

Discover the origins of the /healthz endpoint convention for application health checks and why it has become a standard in modern software development.

DevOps
|12 min read

How I Finally Understood Docker and Kubernetes

Docker and Kubernetes can feel abstract until you see what problems they actually solve. Here's a practical guide to understanding both tools through real examples.

DevOps
|7 min read

Should I Use Vagrant or Docker for Creating an Isolated Environment?

Choosing between Vagrant and Docker depends on your workflow and what kind of isolation you need. This guide walks through real-world use cases to help you decide.