1
0
Fork 0
forked from infra/ansible

Add an AI review workflow

This adds AI review for pull requests, using the workflow and
pattern already used by several Quality projects. Pull requests
will be reviewed by
https://gitlab.com/redhat/edge/ci-cd/ai-code-review/ whenever the
'ai-review-please' label is applied to them. Also adds a context
file, generated by claude-4.6-opus-high via Cursor, using the
https://github.com/juanje/context-generator skill, as recommended
by ai-code-review upstream.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
Assisted-by: claude-4.6-opus-high
Assisted-by: Cursor-2.6.11
This commit is contained in:
Adam Williamson 2026-03-05 11:06:02 -08:00
commit d4000f3282
2 changed files with 174 additions and 0 deletions

159
.ai_review/project.md Normal file
View file

@ -0,0 +1,159 @@
## Project Overview
**Purpose:** Ansible automation for the entire Fedora Project infrastructure — managing hundreds of bare-metal hosts, VMs, and OpenShift-deployed applications across production and staging environments.
**Type:** Infrastructure as Code (Ansible)
**Domain:** Linux distribution infrastructure (build systems, package repositories, CI/CD, web services, identity management)
**Key Components:** `roles/` (134 reusable roles), `playbooks/` (groups, hosts, openshift-apps, manual), `inventory/` (host/group vars with constructed plugin)
## Technology Stack
### Versions (current as of 2026-03-05)
- **Ansible** — core automation engine; playbooks target Fedora and RHEL/CentOS hosts
- **yamllint** v1.35.1 — pre-commit hook for YAML validation
- **ansible-lint** — CI linting (skips: `yaml`, `role-name[path]`, `var-naming[no-role-prefix]`, `no-changed-when`, `ignore-errors`)
- **OpenShift 4** (OCP4) — hosts ~60 containerized applications via `openshift-apps/` playbooks
### Target Platforms
- **Fedora** (current cycle: F43 stable, F44 branched, F45 rawhide)
- **RHEL/CentOS** 8+ (EPEL 10, current minor: 10.3)
- **OpenShift 4** cluster (rdu3 datacenter)
### Dev Tools
- **Linting:** yamllint (pre-commit) + ansible-lint (CI)
- **CI:** Forgejo Actions on `quay.io/fedora/fedora:latest` — runs yamllint and ansible-lint on changed files only
- **Control host:** batcave01 (`/srv/web/infra/ansible` public, `/srv/private/ansible` private)
## Resource Organization
### Structure
```
├── main.yml # Master playbook — imports all group/host playbooks
├── playbooks/
│ ├── groups/ # 59 group playbooks (one per service group)
│ ├── hosts/ # 2 host-specific playbooks (FQDN.yml)
│ ├── openshift-apps/ # 60 OCP4 application deployments
│ ├── manual/ # 39 admin-run-only playbooks
│ └── include/ # Shared proxy/virt/cert playbook fragments
├── roles/ # 134 roles
│ ├── base/ # Applied to ALL hosts (packages, SSH, SELinux, nftables)
│ ├── openshift/ # OCP4 primitives (project, object, keytab, route, rollout)
│ └── openshift-apps/ # Per-app roles (templates, files for OCP4 deployments)
├── inventory/
│ ├── inventory/ # Host definitions
│ ├── group_vars/ # 150 group variable files (incl. _stg variants)
│ ├── host_vars/ # Per-host overrides
│ └── zzz-inventory.config # Constructed inventory plugin (dynamic groups by datacenter, distro, vmhost)
├── vars/
│ ├── global.yml # Global vars (paths, SSL config, base packages)
│ ├── all/ # Release cycle vars (Fedora versions, freeze states, EPEL)
│ ├── apps/ # Per-application vars (bodhi, mirrormanager, etc.)
│ ├── Fedora.yml # Distro-specific packages/services
│ └── RedHat.yml / CentOS.yml
├── tasks/ # Reusable task snippets (cloud, postfix, yumrepos, etc.)
├── handlers/ # restart_services.yml
├── library/ # Custom modules (delete_old_oci_images.py, virt_boot, etc.)
├── callback_plugins/ # fedora_messaging_callback.py, logdetail.py
├── files/ # Static files/templates organized by service
└── scripts/ # Admin utility scripts (auth-keys-from-fas, freezelist, etc.)
```
### Module/Role Structure
**Standard role layout:** `tasks/main.yml`, `templates/`, `files/`, `handlers/main.yml`, `meta/main.yml`
**OpenShift app pattern** — the dominant pattern for new services:
1. Playbook in `playbooks/openshift-apps/<app>.yml` targets `os_control[0]:os_control_stg[0]`
2. Uses composable `openshift/*` roles (`project`, `object`, `keytab`, `secret-file`, `imagestream`, `route`, `rollout`)
3. App-specific templates in `roles/openshift-apps/<app>/templates/`
4. Staging vs production controlled by `env` variable and `env_suffix`
**Group playbook pattern** — for traditional VM-based services:
1. Playbook in `playbooks/groups/<group>.yml` with `hosts:` matching inventory group
2. Must include standard `vars_files` triplet (see Review Guidance)
### Critical Resources
- **`vars/all/`** — Fedora release cycle variables. Changed every ~6 months during branching/release. Incorrect values break builds, composes, and Bodhi across the entire infrastructure.
- **`roles/base/tasks/main.yml`** (700+ lines) — Applied to every managed host. Changes here have maximum blast radius.
- **`inventory/group_vars/`** — 150 files controlling per-group behavior. Many have `_stg` counterparts for staging.
- **`main.yml`** — Master playbook importing all groups. Nightly `--check --diff` cron runs all playbooks here.
## Review Guidance
### What Reviewers Must Know
- **All playbooks must be idempotent.** They can be run at any time by the nightly cron. The checked-in state must always be the desired state.
- **Standard vars_files triplet is required** in all group/host playbooks:
```yaml
vars_files:
- /srv/web/infra/ansible/vars/global.yml
- "{{ private }}/vars.yml"
- /srv/web/infra/ansible/vars/{{ ansible_distribution }}.yml
```
Plus `include_vars` for `vars/all/` when release cycle vars are needed.
- **Hardcoded paths are standard** — paths like `/srv/web/infra/ansible/` and `/srv/private/ansible/` are the production layout on batcave01. Don't suggest making them relative or configurable.
- **Use `yml` not `yaml`** for Ansible files (per STYLEGUIDE). The exception is `vars/all/*.yaml` which uses `.yaml` historically.
- **Add `.j2` extension** to all Jinja2 templates.
- **YAML indentation is 2 spaces.** Line length is not enforced.
- **Prefer readable multi-line module args** over single-line `module: name=x arg=y` format.
- **Staging uses `_stg` suffixed group_vars** and `env_suffix` variable (empty string for prod, `.stg` for staging).
- **Tags `packages` and `config`** should be applied to relevant tasks. `build` and `rollout` tags use `never` to prevent accidental execution.
- **OpenShift apps target `os_control[0]:os_control_stg[0]`** — always the first control node. Don't suggest targeting all control nodes.
### Do NOT Flag (Known False Positives)
- `ansible-lint` skip list includes `no-changed-when` and `ignore-errors` — these are intentionally suppressed project-wide
- `yaml` rule category is skipped in ansible-lint — yamllint handles YAML validation separately
- Hardcoded absolute paths in playbooks (e.g., `/srv/web/infra/ansible/...`) — this is the expected deployment layout
- `mock_modules` and `mock_roles` in `.ansible-lint` — used to pass syntax checks without all dependencies
- Octal values forbidden in yamllint — intentional policy to avoid ambiguous YAML parsing
- `when: env == "production"` / `when: env == "staging"` conditional duplication in openshift-apps — standard pattern for different scaling/config per environment
### Common Pitfalls
- **Forgetting to update `vars/all/` during release transitions** — these variables control Fedora version numbers, freeze states, and EPEL branches. Multiple files must be updated together (e.g., branching requires changes to `FedoraBranched.yaml`, `00-FedoraCycleNumber.yaml`, `FedoraBranchedBodhi.yaml`, and `Frozen.yaml`).
- **Not testing with staging first** — staging group_vars (`*_stg`) should be updated before production. Changes that work in staging may still break production due to different scaling or secrets.
- **Breaking idempotency** — a task that makes changes on every run will generate noise in the nightly `--check --diff` report and mask real drift.
- **Wrong file extension for templates** — placing a `.yml` file in `templates/` instead of `.yml.j2` means Jinja2 expressions won't be rendered.
- **ansible-lint file/role misclassification** — the `.ansible-lint` `kinds` section maps `tasks/*.yml` and `vars/*.yml` explicitly. New directories with tasks may need similar mappings.
## Internal & Proprietary
- **`/srv/private/ansible/`** — Private vars (secrets, credentials) stored on batcave01. Referenced as `{{ private }}/vars.yml`. Never committed to this repo.
- **`callback_plugins/fedora_messaging_callback.py`** — Custom Ansible callback that publishes play results to Fedora's AMQP message bus. Don't suggest replacing with standard callback plugins.
- **`callback_plugins/logdetail.py`** — Custom detailed logging callback for the nightly check-diff runs.
- **`library/virt_boot`** / **`library/delete_old_oci_images.py`** — Custom Ansible modules for VM management and OCI image cleanup. Not upstream modules.
- **`scripts/auth-keys-from-fas`** — Fetches SSH authorized keys from Fedora Account System (FAS). Referenced by `auth_keys_from_fas` global variable.
- **Constructed inventory plugin** (`zzz-inventory.config`) — Dynamically creates groups by datacenter (`rdu3`), distro, vmhost, and virtualization role. The `zzz-` prefix ensures it loads last.
---
<!-- MANUAL SECTIONS - DO NOT MODIFY THIS LINE -->
## Architecture & Design Decisions
- **Mostly flat role directory with some nesting**: Most roles live directly under `roles/`, but several use subdirectory namespacing — `openshift/` (OCP4 primitives), `openshift-apps/` (per-app deployments), `awx/`, `openqa/`, `rabbit/`, among others.
- **OpenShift apps via Ansible, not Helm/Kustomize**: OCP4 applications are deployed through Ansible roles that template and apply OpenShift objects. This keeps all infrastructure in one tool and one repo.
- **Staging/production parity through group_vars**: Rather than separate inventories, staging hosts are in the same inventory with `_stg` group_vars files providing overrides. The `env` and `env_suffix` variables control behavior.
- **Release cycle managed through simple YAML vars**: Fedora's complex release lifecycle (rawhide, branched, stable, EOL) is encoded in `vars/all/` as a set of interdependent variables rather than a database or API. This is intentional — the variables are updated manually during each release milestone by the release engineering team.
## Business Logic
- **Fedora release cycle states**: Three main states — unbranched (rawhide only), branched (rawhide + pre-release), and post-release. Controlled by `FedoraBranched`, `FedoraCycleNumber`, `FedoraBranchedBodhi` (preenable/prebeta/postbeta), and freeze flags. These cascade through templates across the entire infrastructure.
- **EPEL minor version lifecycle**: EPEL 10+ has minor versions that move through states: `epel_minor` (built against CentOS), `epel_branched_minor` (branched, built against CentOS snapshot), `epel_z_minor` (built against RHEL). Up to three active minor versions at once.
- **Critical path applications**: forge, pagure, mirrormanager, bodhi, koji, dist-git, and ~20 others require two-reviewer PRs and coordinated downtime scheduling for risky changes.
- **Nightly check-diff**: All playbooks under `playbooks/{groups,hosts}` are run nightly with `--check --diff`. The ideal state is zero changes reported.
## Domain-Specific Context
- **batcave01** — The Ansible control host. All playbooks are run from here via `sudo -i ansible-playbook`.
- **env / env_suffix**`env` is `"production"` or `"staging"`. `env_suffix` is `""` for prod, `".stg"` for staging. Used throughout to construct hostnames, queue names, and paths.
- **FAS (Fedora Account System)** — Identity provider for the Fedora community. SSH keys, group memberships, and permissions come from FAS/IPA.
- **Koji** — Fedora's build system. Build hosts (buildvm, buildhw) are managed here. Koji hub is VM-based (`playbooks/groups/koji-hub.yml` + `roles/koji_hub`), not on OpenShift.
- **Bodhi** — Fedora's update management system. Runs on OpenShift with complex RabbitMQ messaging integration.
- **dist-git / Pagure** — Package source repositories. The lookaside cache and git hosting are managed by separate roles.
- **RabbitMQ / fedora-messaging** — AMQP message bus connecting Fedora services. Certificates managed per-service via `openshift/secret-file` role.
## Special Cases
- **`playbooks/openshift-apps/` uses `gather_facts: false`** — OCP4 playbooks target the control node to run `oc` commands, not the apps themselves. Facts aren't needed and would slow execution.
- **`vars/all/*.yaml` uses `.yaml` extension** despite STYLEGUIDE mandating `.yml` — historical exception, don't "fix" this.
- **`ansible-lint` runs in offline mode** (`offline: true`) — dependencies aren't installed during linting. `mock_modules` and `mock_roles` paper over missing dependencies.
- **Some playbooks are excluded from ansible-lint** (e.g., `copr-db.yml`, `list-vms-per-host.yml`) due to known issues with hardcoded paths or unicode errors.
- **`linux-system-roles.network`** is mocked in ansible-lint — it's an external role not available during CI.

View file

@ -0,0 +1,15 @@
---
name: AI Code Review
on:
pull_request_target:
types: [labeled]
jobs:
ai-review:
runs-on: infra-1
if: forgejo.event.label.name == 'ai-review-please'
uses: quality/workflows/.forgejo/workflows/ai-review.yml@main
with:
pr: ${{ forgejo.event.pull_request.number }}
secrets:
GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}