An internal analytics platform built at Citi to aggregate and visualise vulnerability exposure across the organisation's firewall and load balancer estate. Exposure records are enriched with EOV (Evidence of Vulnerability) data scraped from an internal dashboard, since no direct database access is available.
Note: Internal tool — no public demo or source code available.
The Problem
Vulnerability data for network infrastructure lived across multiple systems: some was queryable via SQL, while EOV data was only accessible through an internal web dashboard, not the database. Queries were slow, there was no unified view or trend comparison, and reporting was manual.
Architecture
Data Ingestion
- SQL layer: Long-running queries against internal vulnerability databases, optimised with connection pooling to reduce per-query connection overhead across concurrent sessions (see the pooling sketch after this list)
- EOV layer: Custom scraping layer to pull enrichment data from an internal dashboard, since no direct DB access exists
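A minimal sketch of the pooling setup, assuming SQLAlchemy; the dialect and `DB_URL` here are placeholders for internal connection details:

```python
from sqlalchemy import create_engine

# Placeholder connection string; the real one points at the internal
# vulnerability databases.
DB_URL = "oracle+oracledb://user:pass@internal-host/vulndb"

# One pool per app process: concurrent sessions reuse open connections
# instead of paying connection + auth setup on every query.
engine = create_engine(
    DB_URL,
    pool_size=5,         # connections kept open and reused
    max_overflow=10,     # extra connections allowed under burst load
    pool_pre_ping=True,  # validate a connection before handing it out
)
```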
2-Tier Caching: Streamlit Memory + Shared Parquet
The most significant engineering challenge was query performance: long-running SQL queries made the app unusable on every load. The solution was a two-tier cache:
Tier 1 — Streamlit @st.cache_data: once a DataFrame is loaded into the app process, Streamlit holds it in memory. Filter changes, page switches, and reruns all return the cached object instantly — zero I/O.
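A minimal sketch of the Tier 1 layer; `load_domain` and `read_tier2_or_sql` are hypothetical names (the latter is sketched under Tier 2 below):

```python
import pandas as pd
import streamlit as st

@st.cache_data(show_spinner=False)
def load_domain(domain: str) -> pd.DataFrame:
    # First call in a session pays the I/O cost; every rerun after that
    # (filter change, page switch) returns the in-memory object directly.
    return read_tier2_or_sql(domain)
```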
Tier 2 — Manifest-based Parquet on shared drive: multiple Parquet files (one per data domain) tracked by a central manifest. When Streamlit’s memory cache is cold (new session or new user), the app reads from the shared drive instead of hitting SQL. The manifest is the single source of truth for the cache layer: it records which files exist, when they were written, their status, their schema version, and the set of UI features each snapshot supports.
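A sketch of the Tier 2 read path, assuming a JSON manifest on the shared drive; the path, field names, and `run_sql` helper are illustrative, not the internal implementation:

```python
import json
import pathlib

import pandas as pd

CACHE_DIR = pathlib.Path(r"\\shared\vuln-dashboard\cache")  # illustrative path

def read_tier2_or_sql(domain: str) -> pd.DataFrame:
    """Serve from the shared Parquet cache when the manifest marks it good."""
    manifest = json.loads((CACHE_DIR / "manifest.json").read_text())
    entry = manifest.get(domain)
    if entry and entry["status"] == "ok":
        return pd.read_parquet(CACHE_DIR / entry["file"])
    df = run_sql(domain)        # slow path: hit the database
    write_snapshot(domain, df)  # refresh the shared cache (sketched below)
    return df
```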
Cold cache — on the first run of the day, the app runs SQL, writes Parquet files, and updates the manifest. Distributed: whichever user opens the app first creates the cache for everyone else. No scheduler needed.
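The cold-cache write path, continuing the same sketch; the manifest fields mirror what the text above describes, with illustrative values:

```python
from datetime import date

def write_snapshot(domain: str, df: pd.DataFrame) -> None:
    # Whichever user triggers this first each day creates the cache for
    # everyone else; no scheduler involved.
    today = date.today().isoformat()
    path = CACHE_DIR / f"{domain}_{today}.parquet"
    df.to_parquet(path)
    manifest_path = CACHE_DIR / "manifest.json"
    manifest = json.loads(manifest_path.read_text())
    manifest[domain] = {
        "file": path.name,
        "written": today,
        "status": "ok",
        "schema_version": 3,                    # illustrative value
        "features": ["kpis", "trends", "eov"],  # UI features this snapshot supports
    }
    manifest_path.write_text(json.dumps(manifest, indent=2))
```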
Graceful degradation (data) — if today’s Parquet doesn’t exist yet, the app falls back to the most recent available file. Users always get data; the UI shows a timestamp so they can see if it’s from a prior run.
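With date-stamped filenames, the fallback reduces to a sorted glob (continuing the sketch above):

```python
def latest_snapshot(domain: str) -> pathlib.Path | None:
    # ISO date stamps sort lexicographically in chronological order, so the
    # last file is always the most recent run, today's or not.
    files = sorted(CACHE_DIR.glob(f"{domain}_*.parquet"))
    return files[-1] if files else None

# In the UI, the snapshot's write date is surfaced so stale data is visible:
# st.caption(f"Data as of {entry['written']}")
```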
Graceful degradation (UI) — when users step back to historical snapshots, the data shape may have been different: certain domains weren’t yet added, certain columns didn’t exist. Rather than crashing or rendering empty charts, UI sections are conditionally rendered based on the manifest’s features list for that snapshot. If a feature wasn’t present when the snapshot was written, the UI element simply doesn’t appear. Old dates show the dashboard as it existed then; new dates show the full current feature set — with no special casing required.
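Conditional rendering then reduces to a one-line guard per section; `render_eov_charts` stands in for a real section renderer:

```python
def render_if_supported(feature: str, entry: dict, render_fn) -> None:
    # Draw a section only if the selected snapshot declared the feature
    # when it was written; older snapshots simply omit newer sections.
    if feature in entry.get("features", []):
        render_fn()

# render_if_supported("eov", entry, render_eov_charts)
```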
Custom Time-Series Parquet
A dedicated time-series Parquet structure stores snapshots over time, enabling a custom date lookback feature: users can step back to any previous snapshot date available on disk, making historical trend comparison possible without hitting the database.
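A sketch of the lookback control, continuing the date-stamped naming above; `summary` stands in for a real data domain:

```python
import streamlit as st

dates = sorted(p.stem.rsplit("_", 1)[-1] for p in CACHE_DIR.glob("summary_*.parquet"))
selected = st.select_slider("Snapshot date", options=dates, value=dates[-1])
df = pd.read_parquet(CACHE_DIR / f"summary_{selected}.parquet")  # no database hit
```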
Rolling 8-Day Pruned Parquet
A bespoke rolling Parquet file maintains an eight-day window of select columns (the current day plus the seven prior days), engineered specifically so that page-level filters apply correctly to delta values. Without it, global slicers would filter absolute counts but not the week-over-week change calculations, because the prior-period rows would not be present in the filtered frame. The rolling file keeps prior-period data in the same filter context as the current period; a sketch follows.
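A sketch of the delta calculation this enables, assuming `snapshot_date` is a datetime column; the same slicer mask is applied to both periods before the subtraction:

```python
from datetime import timedelta

import pandas as pd

def week_over_week(rolling: pd.DataFrame, mask: pd.Series, metric: str) -> float:
    """Delta with page filters applied to BOTH periods, not just the current one."""
    latest = rolling["snapshot_date"].max()
    prior = latest - timedelta(days=7)
    cur = rolling.loc[mask & (rolling["snapshot_date"] == latest), metric].sum()
    prev = rolling.loc[mask & (rolling["snapshot_date"] == prior), metric].sum()
    return cur - prev
```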
Visualisation
Built with Plotly for interactivity (a minimal wiring sketch follows this list):
- Interactive chart legends — toggle individual series on/off
- Global slicers — filter by asset group, severity, region, or any dimension
- All KPIs and charts update in real time from slicer selections
- Delta comparisons (week-over-week changes) adjust correctly when filters are applied, using the rolling Parquet as the comparison baseline
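A sketch of the slicer-to-chart wiring; the column names (`region`, `severity`, `open_vulns`) are illustrative:

```python
import plotly.express as px
import streamlit as st

regions = st.multiselect("Region", sorted(df["region"].unique()))
view = df[df["region"].isin(regions)] if regions else df

fig = px.line(view, x="snapshot_date", y="open_vulns", color="severity")
st.plotly_chart(fig, use_container_width=True)  # clicking a legend item toggles that series
```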
Tech Stack
| Layer | Technology |
|---|---|
| UI / App | Streamlit |
| Visualisation | Plotly |
| Data processing | Python, Pandas |
| Tier 1 cache | Streamlit @st.cache_data (in-memory) |
| Tier 2 cache | Parquet files on shared network drive |
| Database access | SQL + connection pooling |
| EOV enrichment | Custom internal dashboard scraper |
Engineering Highlights
- Manifest as SSOT: the manifest is the single source of truth for the cache layer, recording file paths, write dates, status, schema version, and the declared feature set per domain
- Manifest-driven UI rendering: UI sections check the manifest before rendering; if a feature wasn’t present in the selected snapshot, the element is conditionally excluded rather than crashing or displaying empty charts
- Schema versioning: each manifest entry carries a schema version, enabling the app to reason about data shape across historical snapshots without runtime errors
- Graceful degradation (data): app always serves data — falls back to the latest available Parquet if today’s cache isn’t ready yet
- Graceful degradation (UI): historical lookback is clean — stepping back to a date before a feature existed shows the dashboard as it was then, not a broken version of the current UI
- Distributed cold cache: first-run-creates-cache pattern means no central scheduler needed; the app is self-maintaining
- File-specific troubleshooting: per-domain Parquet files mean a failure in one data source is isolatable and recoverable without invalidating everything else
- Connection pooling: cuts end-to-end load time by reusing open database connections instead of establishing a new one per query
- Rolling delta Parquet: bespoke solution ensuring filter context is preserved for trend calculations — not a standard pattern
- Date-indexed Parquet store: full historical replay without database queries
Built Solo Under Shifting Requirements
This project was built entirely alone. Neither my manager nor the other team member works in Python, so there was no code review, no pair programming, and no shared technical context to lean on. All architecture decisions, debugging, and implementation were done independently.
There was no sprint planning. Requirements were discussed in a weekly call where the director could — and frequently did — change direction entirely: new data sources needed, new KPIs, new visualisations, new comparisons. The manifest-based architecture was a direct response to this: adding a new data domain means one new file and one manifest entry. Adding a new UI feature means adding its name to the manifest’s feature list on write — old snapshots don’t have it, the UI handles that automatically, and no migration or backward-compatibility work is needed.
The result is a system where requirements could shift every single week, at the data layer, the cache layer, and the UI layer, without any one layer breaking the others.
Designing for adaptability from the start turned out to be the right call.
Status
In active development and use within Citi’s network security function.