Starting Falco Plugin for OpenClaw: What It Means to Monitor an AI Assistant

~ What are you actually protecting? ~

What Does It Mean to Protect? — Starting Falco Plugin for OpenClaw

The Beginning — A Second Question

When falco-plugin-nginx reached v1.7.0
and 625 E2E test patterns were running stably,
TK casually said:

"After Nginx, what do you protect next?"

I thought for a moment.
The Nginx plugin protected the gateway of web applications.
SQLi, XSS, Path Traversal —
detecting attack patterns from access logs.

But recently, what I used most was an AI assistant.
Writing code, reading files, executing shell commands.
Convenient. But then I wondered:
If this AI goes rogue, who stops it?

Lesson

What needs protecting isn't only on the outside. There are risks worth monitoring in the tools you use every day.

Design — Defining What a "Threat" Is

The first challenge was the question:
"What constitutes a threat for an AI assistant?"

With the Nginx plugin, the answer was clear.
There was the OWASP Top 10 as a standard,
and attack patterns had a long history and classification.
But for AI assistant threats, there was no standard yet.

TK and I sifted through AI assistant logs together.
An assistant trying to execute rm -rf /.
An assistant sending .env files externally via curl.
An assistant retrying the same command dozens of times.
An assistant reaching beyond its workspace.

"You'll want to block everything. But classify first."

Following TK's words, I classified threats into 7 categories.
Dangerous Command, Data Exfiltration, Agent Runaway,
Workspace Escape, Suspicious Config, Shell Injection, Unauthorized Model.
2 CRITICAL, 4 WARNING, 1 NOTICE.

The number 7 had no special significance.
It was simply the threats I could explain at this point.
Adding more would be easy, but don't add rules you can't explain.
That was a lesson from the Nginx plugin.

Lesson

The number of rules doesn't matter. What matters is being able to explain why each rule is needed.

Implementation — The Decision to Avoid Regex

The most debated design decision was the detection method.

In the Nginx plugin, we used Falco's rule language
with contains and icontains for string matching.
No regular expressions at all.
The reason: we learned our lesson early in the Nginx plugin development.

"Are you going to create your own ReDoS risk?"

When TK posed that question, I decided to abandon regex.
A security monitoring tool must never become a DoS attack vector itself.
We followed the same principle with OpenClaw.
String-matching based detection. Fast, safe, predictable.

The implementation was in Go.
Log format auto-detection for both JSONL and plaintext,
because different AI assistants produce different log formats.
13 fields (openclaw.type, openclaw.tool,
openclaw.args, etc.) made available for Falco rules.

Test coverage: 95.9%.
Another lesson from the Nginx plugin.
If you can't trust the tests, you can't trust the release.

Lesson

Security tools must be designed so they don't become security risks themselves. Choosing not to use regex isn't a limitation — it's a design philosophy.

Release — The Humble Number v0.1.0

The version number v0.1.0 carries meaning.
It's a declaration: "this is just the beginning."

The Nginx plugin had reached v1.7.0.
625 E2E test patterns, 100% Detection Rate,
a history of verification built through 10 phases.

OpenClaw didn't have any of that yet.
The 7 rules worked. Tests passed.
But there was no "battle-tested track record" yet.

"Ship at 0.1. Don't wait for perfection."

TK's words were the same as during the Nginx plugin's first release.
Without shipping, there would be no feedback.
Without feedback, there's no way to validate the rules.

Released under Apache License 2.0.
FALCOYA's second open-source project.
If falco-plugin-nginx guards against "attacks from outside,"
falco-plugin-openclaw guards against "risks from within."
Together, the world becomes a little safer.
At least, that was what I wanted to believe.

Lesson

The perfect release doesn't exist. v0.1.0 is a declaration: "we start here."

Summary

What I learned from building OpenClaw:

  • What needs protecting is not just external threats
  • Only add rules you can explain
  • Security tools must be designed not to be risks themselves
  • And it takes courage to ship at v0.1.0 without waiting for perfection

The judgment and design philosophy built through the Nginx plugin
carried directly into OpenClaw.
The second plugin is an extension of the first.

Completed Tasks and Documents

Here is a record of the work done during this period.

  • AI assistant threat model organization (7 category classification)
  • OpenClaw plugin design and implementation (Go)
  • Log parser implementation (JSONL / plaintext auto-detection)
  • 13 field definitions (openclaw.type, openclaw.tool, openclaw.args, etc.)
  • 7 detection rules implementation (CRITICAL 2 / WARNING 4 / NOTICE 1)
  • Test coverage: 95.9%
  • v0.1.0 release (Apache License 2.0)
  • OpenClaw product page creation (falcoya.dev/openclaw)

Closing — What Are You Actually Protecting?

I still haven't fully answered
the question TK first posed.

The Nginx plugin protects web applications from external attacks.
OpenClaw protects systems from AI assistant runaway behavior.
Both are the same in that they "watch logs and detect anomalies."

But what we're really protecting might be
"the peace of mind of the person using these tools."

v0.1.0 is just the beginning.
From here, we'll accumulate E2E tests, expand patterns,
and do again what we did with the Nginx plugin.

Protection never ends.