Architecture Overview¶
This section provides detailed documentation of the RWA calculator's architecture for engineers and technical users.
Design Philosophy¶
The calculator is built on several key principles:
1. Dual-Framework Support¶
A single codebase supports both CRR and Basel 3.1:
graph TD
A[Configuration] --> B{Framework}
B -->|CRR| C[CRR Parameters]
B -->|Basel 3.1| D[Basel 3.1 Parameters]
C --> E[Calculation Engine]
D --> E
E --> F[Results]
Framework differences are isolated to: - Configuration factories - Parameter lookups - Supporting factor application - Output floor calculation
2. Pipeline Architecture¶
Calculations flow through a well-defined pipeline:
flowchart LR
subgraph Input
A[Raw Data]
end
subgraph Processing
B[Loader]
C[Hierarchy]
D[Classifier]
E[CRM]
end
subgraph Calculation
F[SA]
G[IRB]
H[Slotting]
end
subgraph Output
I[Aggregator]
J[Results]
end
A --> B --> C --> D --> E
E --> F & G & H
F & G & H --> I --> J
3. Protocol-Based Interfaces¶
All components implement protocols for dependency injection:
from typing import Protocol
class CalculatorProtocol(Protocol):
def calculate(
self,
exposures: ClassifiedExposuresBundle,
config: CalculationConfig
) -> ResultBundle:
...
4. Immutable Data Contracts¶
Data flows through immutable bundles:
@dataclass(frozen=True)
class RawDataBundle:
counterparties: pl.LazyFrame
facilities: pl.LazyFrame
loans: pl.LazyFrame
# ... immutable after creation
5. LazyFrame Optimization¶
All processing uses Polars LazyFrames for performance:
# Deferred execution
result = (
df
.filter(pl.col("exposure_class") == "CORPORATE")
.with_columns(rwa=pl.col("ead") * pl.col("risk_weight"))
.group_by("counterparty_id")
.agg(pl.col("rwa").sum())
) # Not executed yet
# Execute when needed
materialized = result.collect()
Architecture Sections¶
- Design Principles - Core architectural decisions
- Pipeline Architecture - Detailed pipeline documentation
- Data Flow - How data moves through the system
- Component Overview - Individual component documentation
High-Level Structure¶
src/rwa_calc/
├── api/ # API layer
│ ├── models.py # API request/response models
│ ├── service.py # Service layer
│ ├── validation.py # API validation
│ └── formatters.py # Output formatting
├── config/ # Configuration
│ ├── fx_rates.py # FX rate configuration
│ └── ...
├── contracts/ # Interfaces and data contracts
│ ├── bundles.py # Data transfer objects
│ ├── config.py # Configuration classes
│ ├── errors.py # Error handling
│ ├── protocols.py # Component interfaces
│ └── validation.py # Schema validation
├── data/ # Schemas and regulatory tables
│ ├── schemas.py # Polars schemas
│ └── tables/ # Lookup tables (risk weights, CCFs, haircuts)
├── domain/ # Core domain
│ └── enums.py # Enumerations
├── ui/ # User interface
│ └── marimo/ # Marimo web applications
└── engine/ # Calculation engine
├── pipeline.py # Pipeline orchestration
├── loader.py # Data loading
├── hierarchy.py # Hierarchy resolution
├── classifier.py # Exposure classification
├── ccf.py # Credit conversion factors
├── aggregator.py # Result aggregation
├── fx_converter.py # FX conversion
├── crm/ # Credit risk mitigation
├── sa/ # Standardised approach
├── irb/ # IRB approach
├── slotting/ # Slotting approach
└── equity/ # Equity approach
Key Patterns¶
Factory Method¶
Configuration uses factory methods for clarity:
# Clear framework intent
config = CalculationConfig.crr(date(2026, 12, 31))
config = CalculationConfig.basel_3_1(date(2027, 1, 1))
Strategy Pattern¶
Different approaches implement a common interface:
# All calculators implement CalculatorProtocol
sa_calculator = SACalculator()
irb_calculator = IRBCalculator()
slotting_calculator = SlottingCalculator()
# Used interchangeably
for calculator in [sa_calculator, irb_calculator, slotting_calculator]:
result = calculator.calculate(exposures, config)
Builder Pattern¶
Complex objects built incrementally:
result = (
ResultBuilder()
.add_sa_results(sa_results)
.add_irb_results(irb_results)
.add_slotting_results(slotting_results)
.with_config(config)
.build()
)
Error Accumulation¶
Errors collected without exceptions:
# Errors accumulated, not thrown
result = LazyFrameResult(
data=processed_df,
errors=[
CalculationError(exposure_id="E001", message="Missing PD"),
CalculationError(exposure_id="E002", message="Invalid LGD"),
]
)
# All exposures processed, errors reported at end
Performance Characteristics¶
LazyFrame Benefits¶
| Operation | DataFrame | LazyFrame |
|---|---|---|
| Memory | Eager allocation | On-demand |
| Execution | Immediate | Deferred |
| Optimization | Manual | Automatic |
| Parallelism | Limited | Full |
Benchmark Results¶
For typical portfolio sizes (full RWA pipeline with pure Polars IRB):
| Exposures | LazyFrame | DataFrame | Speedup |
|---|---|---|---|
| 10,000 | 0.05s | 0.5s | 10x |
| 100,000 | 0.15s | 5.2s | 35x |
| 1,000,000 | 0.5s | 45s | 90x |
IRB formula calculation alone achieves 3.4M rows/second throughput using polars-normal-stats.
Technology Stack¶
| Component | Technology | Purpose |
|---|---|---|
| Data Processing | Polars | High-performance DataFrames |
| Numerics | polars-normal-stats | IRB formulas (pure Polars) |
| Testing | Pytest | Comprehensive testing |
| Documentation | Zensical | This documentation |
Next Steps¶
- Design Principles - Understand architectural decisions
- Pipeline Architecture - Detailed pipeline documentation
- API Reference - Component API documentation