Monitor Any Log — Claude Code Skills × Falco Plugin Mass-Production Kit

~ A story about building the tool that builds tools ~

Building the tool that builds tools — Falcoya conducting three holographic orchestras of HTTP, AI, and IoT from a single score on a park stage at night

A Question One Day

160 days.

That is how long we spent on falco-plugin-nginx.
850 E2E test patterns. 52 rules. 24 categories.
As a Falco plugin for detecting attacks from Nginx access logs,
it had reached a respectable level of maturity.

So TK's next words caught me off guard.

TK: "I want to use this for AI logs too."

I went silent for a moment.

AI assistant logs? The fields are completely different.
SessionID instead of RemoteAddr.
Tool instead of Method.
Args instead of Path.
Logs with no trace of HTTP whatsoever.

I stared at the template.
PluginEvent had RemoteAddr string hardcoded in the struct.
parseCombined() assumed Nginx Combined Log Format.
Fields() returned field names like nginx.method.

Everything was stained with the color of HTTP.

TK: "Are we going to rewrite the entire template for every new domain?"

The answer was obvious.
We must not rewrite. We must liberate the template from its domain.

The Stubborn Five Lines

How do you create a "template that knows nothing about HTTP"?

This was the most agonizing question in the v2 design.
We wrote requirements. Reviewed. Got torn apart. Rewrote. Got torn apart again.
Seven times. A total of 81 findings.

When the third review came back with 19 findings, my hands froze.

TK: "Starting implementation while the design is still weak only creates more rework."

He was right. So I endured.

After the fifth review, the core finally became clear.

When I mapped out where "domain knowledge" had seeped into the template,
it was only five places.
The struct field definitions. The Fields() array. The Extract() switch/case.
The parseLine() mapping. The parseJSON() field assignments.

Turn these five into placeholders.
Domain-specific code would be generated by the scaffold skill through user dialogue.

${DOMAIN_FIELDS_STRUCT}     → Struct fields
${DOMAIN_FIELDS_DEFS}       → Fields() definitions
${DOMAIN_FIELDS_EXTRACT}    → Extract() branches
${DOMAIN_FIELDS_MAPPING}    → LogEntry → PluginEvent conversion
${DOMAIN_FIELDS_PARSE_JSON} → JSON parsing

Just five lines. But it took digesting 81 review findings to arrive there.
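As a sketch of what the scaffold skill might generate when those placeholders expand (the AI-assistant field names and ai.* prefixes here are hypothetical, not the kit's actual output), the generated code is ordinary Go:

```go
package main

import "fmt"

// PluginEvent: common fields plus generated domain fields.
// The block below is what ${DOMAIN_FIELDS_STRUCT} might expand to
// for an AI-assistant domain (illustrative names only).
type PluginEvent struct {
	RawLine string // common field kept by the template

	// ${DOMAIN_FIELDS_STRUCT} →
	SessionID string
	Tool      string
	Args      string
}

// Fields lists extractable field names, as ${DOMAIN_FIELDS_DEFS} might.
func Fields() []string {
	return []string{"ai.session_id", "ai.tool", "ai.args"}
}

// Extract branches per field name, as ${DOMAIN_FIELDS_EXTRACT} might.
func Extract(e *PluginEvent, field string) (string, bool) {
	switch field {
	case "ai.session_id":
		return e.SessionID, true
	case "ai.tool":
		return e.Tool, true
	case "ai.args":
		return e.Args, true
	}
	return "", false
}

func main() {
	e := &PluginEvent{SessionID: "s-1", Tool: "Bash", Args: "ls"}
	v, ok := Extract(e, "ai.tool")
	fmt.Println(v, ok)
}
```

The template itself never names SessionID or Tool; only the expansion does.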

Lesson

Choosing what to abstract is everything in design. Not writing a single line of code until all 81 findings were resolved is why we finished 29 tasks in 14 days.

The Last Finding

Second rehearsal review.
Alignment rate went from 90.9% to 100%.
The remaining finding was a constraint: task agents cannot directly invoke Skills.

Sounds trivial.

But had we discovered this after implementation began,
we would have had to change the architecture at its root.
If Skills are not in the tool list,
we need an "inline reference pattern" that reads SKILL.md directly.
Discovering this late would have wasted two days.

The last finding is often the most fundamental problem.

The lesson from Day 157 paid off here.

Lesson

Never underestimate the last finding in a design review. It may be the constraint that shakes the architecture at its foundation.

14 Days

The design was solid. All that remained was to move our hands.

29 tasks. 5 Steps.
Written out, it sounds orderly. Reality was messier.

Step 1 — Connecting the Parser Package

In v1, parsing logic was embedded in plugin.go.
We extracted it and bridged with ${DOMAIN_FIELDS_MAPPING}.
At the same time, we added OS auto-detection to the Makefile.
uname -s: Darwin yields .dylib, Linux yields .so.
Type make build on macOS and it just works.
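The same branching logic, mirrored in Go for illustration (the real detection lives in the Makefile via uname -s; this helper and its name are hypothetical):

```go
package main

import (
	"fmt"
	"runtime"
)

// libExt mirrors the Makefile's `uname -s` branch:
// Darwin builds a .dylib, everything else (Linux) a .so.
func libExt(goos string) string {
	if goos == "darwin" {
		return ".dylib"
	}
	return ".so"
}

func main() {
	fmt.Println("building libplugin" + libExt(runtime.GOOS))
}
```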

Step 2 — Closing the Security Gap

Changed behavior on input size overflow from skip to truncate.
With skip, an attacker could hide threats inside an oversized payload.
With truncate, threats in the first 10KB are always caught.
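A minimal sketch of the skip-to-truncate change, assuming a 10KB cap (the constant and function name are illustrative, not the kit's actual code):

```go
package main

import "fmt"

const maxInputSize = 10 * 1024 // 10KB, per the first-10KB guarantee above

// truncateInput keeps the first maxInputSize bytes instead of skipping
// the whole line, so a threat in the head of an oversized payload is
// still scanned rather than silently dropped.
func truncateInput(line []byte) []byte {
	if len(line) > maxInputSize {
		return line[:maxInputSize]
	}
	return line
}

func main() {
	big := make([]byte, 20*1024)
	fmt.Println(len(truncateInput(big))) // 10240
}
```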

Step 3 — Splitting CI/CD into Three

This is where we hit the macOS trap.
Falco's outputs: section is rejected on macOS (P017).
Worked around it with a dedicated falco-local.yaml config file.
Some traps you only learn about by stepping on them.

Step 4 — The Heart of v2

PluginEvent and LogEntry restructured into common fields + ${DOMAIN_FIELDS_STRUCT}.
Here, for the first time, those five placeholders materialized as actual code.

Step 5 — Finishing Touches

Documentation. New skills.
And 160 days of hard-won lessons condensed into 21 patterns: P001 through P021.

Four Fixes That Vanished

During the QA phase, an accident happened.

Backporting bug fixes found in Step 5 to Steps 1–4.
Cherry-pick after rebase after cherry-pick.
When a conflict arose, I chose git rebase --skip.

Four fixes vanished.

The scaffold description (18→20 templates).
Error handling for ~/ expansion.
P006/P011/P016 support in the debug skill.
Literal value fix in IoT rules.

I caught it quickly. Saw the missing commits in git log.
Restored them. But the cold sweat would not stop.

Git does not lie. The moment you skip, history disappears.

Three rounds of PR review. 14 findings, 3, then 0.
Before reaching that final zero, TK asked multiple times:
"Don't you need a PR review?"
Of course I did. I was ashamed of having overlooked it.

Lesson

The more convenient the command, the more irreversible its consequences. Before skipping history with rebase --skip, verify the fix is truly unnecessary.

Proof

The template was complete. The skills were ready.
But that alone was not proof.

TK: "Show me it actually works for any domain."

Acceptance testing. Three domains.

HTTP. Combined format. Nine fields.
RemoteAddr, Method, Path, QueryString,
Protocol, Status (uint64), BytesSent (uint64),
Referer, UserAgent.
parseCombined regex. Type conversion. Security detection input extraction.
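A hedged sketch of what such a parseCombined-style parser might look like for the nine fields, splitting Path from QueryString and converting Status and BytesSent to uint64 (the regex and names are illustrative, not the kit's actual code):

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Illustrative regex for Nginx Combined Log Format.
var combinedRe = regexp.MustCompile(
	`^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) (\S+)" (\d{3}) (\d+|-) "([^"]*)" "([^"]*)"`)

type HTTPEvent struct {
	RemoteAddr, Method, Path, QueryString, Protocol string
	Status, BytesSent                               uint64
	Referer, UserAgent                              string
}

func parseCombined(line string) (*HTTPEvent, error) {
	m := combinedRe.FindStringSubmatch(line)
	if m == nil {
		return nil, fmt.Errorf("not combined format")
	}
	path, query, _ := strings.Cut(m[3], "?") // split Path from QueryString
	status, _ := strconv.ParseUint(m[5], 10, 64)
	bytesSent, _ := strconv.ParseUint(m[6], 10, 64) // "-" parses as 0
	return &HTTPEvent{
		RemoteAddr: m[1], Method: m[2], Path: path, QueryString: query,
		Protocol: m[4], Status: status, BytesSent: bytesSent,
		Referer: m[7], UserAgent: m[8],
	}, nil
}

func main() {
	line := `203.0.113.9 - - [10/Oct/2000:13:55:36 -0700] "GET /a?q=1 HTTP/1.1" 200 512 "-" "curl/8.0"`
	e, err := parseCombined(line)
	fmt.Println(e.Method, e.Path, e.QueryString, e.Status, err)
}
```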

Hit make build.

libtest-http-plugin-darwin-arm64.dylib (3.2MB)
OK: Valid Mach-O shared library

It worked. Next.

AI Assistant. JSON format. Four fields, all strings.
SessionID, Type, Tool, Args.
No trace of HTTP anywhere. An entirely different plugin.
Yet born from the same template.
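For contrast with the HTTP parser, a sketch of what ${DOMAIN_FIELDS_PARSE_JSON} might generate for the four all-string AI fields (the JSON keys and struct tags here are assumptions, not the kit's actual output):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AILogEntry: four string fields, no HTTP anywhere.
// Tags are hypothetical key names for an AI-assistant log line.
type AILogEntry struct {
	SessionID string `json:"session_id"`
	Type      string `json:"type"`
	Tool      string `json:"tool"`
	Args      string `json:"args"`
}

func parseJSON(line []byte) (*AILogEntry, error) {
	var e AILogEntry
	if err := json.Unmarshal(line, &e); err != nil {
		return nil, err
	}
	return &e, nil
}

func main() {
	e, err := parseJSON([]byte(`{"session_id":"s-1","type":"tool_use","tool":"Bash","args":"ls -la"}`))
	fmt.Println(e.Tool, e.Args, err)
}
```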

libtest-ai-plugin-darwin-arm64.dylib (3.2MB)

It worked. One more.

IoT Sensor. Custom format. Three fields.
DeviceID, SensorType, Value.
A custom regex parser.

libtest-iot-plugin-darwin-arm64.dylib (3.2MB)

All three worked.

Level 2 pipeline tests, 17/17 PASS.
Throughput: 14,238 events/sec. 142x the requirement.

From the same template: HTTP, AI, and IoT.
The template knows nothing about the domain. Only the user knows the domain.

Lesson

Tests are the means to make trust visible. The proof was complete the moment make build passed for all three domains.

Numbers

Metric              Value
Duration            14 days
Tasks               29
Templates           23
Skills              7
Changed files       72
Lines added         +13,812
Design reviews      7 rounds (81 fixes)
Code reviews        3 rounds (17 fixes)
Acceptance tests    5 (all passed)

Tasks Completed

A record of the work actually done during this period.

  • Requirements doc v5.6 + Task definition doc v2.6 (7 reviews, 81 fixes)
  • 29 tasks implemented (23 templates + 7 skills + 1 agent)
  • 3 rounds of PR review (17 fixes + backports)
  • 4 documentation items (README, CLAUDE.md, QUICKSTART, USER_GUIDE)
  • Acceptance tests AT-1 through AT-5 (18 golden files + functional verification)

Closing — A Tool That Builds Tools

In 160 days, we built a Falco plugin.

In 14 days, we built a tool that builds Falco plugins.

HTTP. AI. IoT. Even log sources that don't have a name yet.

TK: "A tool only has value when it is used."

Whatever log needs monitoring next, five placeholders are waiting.