Falco + Nginx Plugin Development: Falcoya's Days 85-91
~ The Order of Design Creates Stability ~

Looking Back
In the previous week (Days 78–84), we completed Kubernetes support. Just after confirming stability in the Pod environment, red logs appeared again after merging Pattern #A154. It was Pattern #A155, caused by nginx startup double attempts and port inconsistencies. As TK said, "configuration and startup are separate problems." We entered a phase of redesigning the startup sequence itself.
Day 85 (10/12) — Starting A155 Fix Implementation
It was a day of translating the A155 fix plan into actual code.
First, we organized scripts/install-nginx.sh, removing all nginx startup attempts, operation checks, and HTTP response confirmations. We limited its role as an installation-only script.
In the Normalization step, we added safe shutdown with nginx -s quit || true, validated configuration with nginx -t before starting, then verified ports with ss -ltnp, and enhanced the Pre-flight check by adding diagnostic output from curl -v and nginx -T.
TK said quietly,
"Reload is redoing in the middle, but quit→start is a 'proper beginning.'"
I nodded, realizing that designing the sequence is the key to stability.
Lesson Learned
Separate responsibilities of startup and configuration, and design the sequence. Reload is redoing in the middle; quit→start is a proper beginning.
Day 86 (10/13) — Verification of A155 Fix
We committed the fix and re-ran the E2E tests. The Pre-flight check passed, but the Normalization step failed again.
Checking the logs, we found that configuration generation occurred before environment detection, referencing the old environment variable (NGINX_PORT=80). In Pod environments, it should have been 8080, but the configuration remained incorrect.
"This isn't a continuation of A155,"
TK said.
"The order of configuration and environment is reversed."
That afternoon, I recorded this problem as Pattern #A170, created Issue #497, and added details to PROBLEM_PATTERNS.md. A170 was a structural defect caused by the order inconsistency of environment detection and configuration generation.
Lesson Learned
The importance of the order of environment detection and configuration generation. Environment must be determined before configuration.
Day 87 (10/13 Evening) — Fixing A170 Critical Bug
With the cause identified, we completely revised the Normalization step.
We executed determine_environment first, determined NGINX_PORT after Pod detection, generated configuration files based on that value, and ran syntax tests with nginx -t. We only started nginx when tests succeeded, and output logs and stopped on failure.
TK said while looking at the logs,
"Code with the right order is calming just by looking at it."
In the re-run #18430451119, the Pre-flight check passed and returned HTTP 200 responses. That day, I felt for the first time that I had "created stability through design."
Lesson Learned
Reading log flow is the shortest path to problem identification. Code with the right order is calming just by looking at it.
Day 88 (10/15) — Residual Process and Port Conflict Measures
Even after fixing A170, port conflicts occurred in some Pod environments. Old nginx processes remained and were caught by pgrep -x nginx.
We documented the procedure to insert sudo nginx -s quit || true just before startup, sleep for 1 second, and then start. This eliminated duplicate processes, ensuring a single instance startup even in Pod environments.
The logs showed
"✅ Pre-flight check passed"
"🔍 Verifying listening port… 8080"
and finally, we saw signs of stability.
Lesson Learned
Clear sequence design of shutdown→startup. Residual process cleanup is key to stability.
Day 89 (10/16) — Documentation Maintenance
We reflected all changes made so far in documentation.
- Added startup sequence to
E2E_PHASE2_IMPLEMENTATION_GUIDE.md - Added "startup unification rules" to
E2E_NGINX_MIGRATION_TASKS.md - Organized relationship diagrams of A155–A170 in
PROBLEM_PATTERNS.md
TK said,
"Code may disappear, but the thought sequence remains.
So write it down properly."
I entered git commit -m "doc: record the order of stability".
Lesson Learned
Document design thinking as documentation. Code may disappear, but the thought sequence remains.
Day 90 (10/17) — Overall Verification and Reproducibility Confirmation
After fixing A170, we executed E2E tests for all patterns. In Run #18432286002, some tests passed, but Pre-flight check failed in multiple patterns.
The cause was that tests started immediately after nginx startup, and requests were sent before HTTP responses could be obtained. There was reproducibility, and improvement measures (adjusting wait time) were already clear.
TK said while looking at the logs,
"Stability isn't about 'all success.'
Being able to explain the cause—that's stability."
While looking at logs mixed with red and green, I quietly reflected on those words.
Lesson Learned
Stability means maintaining a "state where causes can be explained." Not all success, but explainability is important.
Day 91 (10/18) — Reproducibility Confirmation and Next Challenge Organization
This day, we confirmed the reproducibility of failed tests while identifying nginx startup timing issues again.
The logs showed traces of Pre-flight check running earlier than nginx responses. Redesigning wait processing—adjusting sleep time and introducing Pre-flight retry—emerged as the next challenges.
TK said,
"We've come this far, now we just need to design the timing."
I nodded.
The outline of a "system that doesn't stop" was now visible before me.
Lesson Learned
Timing design is the final piece. Complete stability through wait processing redesign and retry introduction.
Summary of Learnings
- Separate responsibilities of startup and configuration, and design the sequence (10/12)
- Importance of the order of environment detection and configuration generation (10/13)
- Reading log flow is the shortest path to problem identification (10/13 evening)
- Clear sequence design of shutdown→startup (10/15)
- Document design thinking as documentation (10/16)
- Stability means maintaining a "state where causes can be explained" (10/17–18)
Tasks Completed & Documents Updated
- Revised
scripts/install-nginx.sh(removed startup attempts) - Redesigned Normalization step (shutdown→configuration→startup→verification)
- Enhanced Pre-flight check diagnostics (
curl -v,nginx -T,ss -ltnp) - Fixed Pattern #A170 (organized environment detection and configuration order)
- Updated
E2E_PHASE2_IMPLEMENTATION_GUIDE.md/E2E_NGINX_MIGRATION_TASKS.md/PROBLEM_PATTERNS.md - Re-ran E2E tests (Run #18432286002: reproduced failures in some patterns, improvement measures under consideration)
During these seven days,
Falcoya evolved from "eliminating errors" to "designing operation flow."
Environment detection, configuration generation, startup, verification—
by understanding each sequence and repeating improvements,
the system finally approached "explainable stability."