Intermediate Schemas¶
This page documents the schemas for intermediate data structures produced during pipeline execution. Each schema represents the state of exposure data after a specific pipeline stage.
Why these schemas matter: Column names accumulate through the pipeline — each stage adds columns to the exposure records rather than replacing them. Understanding what each stage contributes is essential for debugging calculation results and writing correct joins.
Source of truth: All schemas are defined in src/rwa_calc/data/schemas.py. Column names throughout the codebase use _reference suffixes (never _id).
Raw Exposure Schema¶
This schema describes exposure data after the loader unifies facilities, loans, and contingents into a single exposure stream.
Each record gets an exposure_reference synthesised from the original entity key:
- Loans: loan_reference → exposure_reference
- Contingents: contingent_reference → exposure_reference
- Facility undrawn: facility_reference + "_UNDRAWN" → exposure_reference
Source: RAW_EXPOSURE_SCHEMA in data/schemas.py, HierarchyResolver._unify_exposures() in engine/hierarchy.py
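The mapping above can be sketched in plain Python. This is a minimal illustration, not the code in HierarchyResolver._unify_exposures(): the helper name synthesise_exposure_reference and the dict-shaped records are assumptions made for the example.

```python
def synthesise_exposure_reference(record: dict) -> str:
    """Hypothetical helper: derive exposure_reference from the original entity key."""
    exposure_type = record["exposure_type"]
    if exposure_type == "loan":
        return record["loan_reference"]
    if exposure_type == "contingent":
        return record["contingent_reference"]
    if exposure_type == "facility_undrawn":
        # Undrawn facility records get a suffix so they cannot collide
        # with a reference that is already in use.
        return record["facility_reference"] + "_UNDRAWN"
    raise ValueError(f"unknown exposure_type: {exposure_type!r}")


# Illustrative records only; the reference values are made up.
raw_records = [
    {"exposure_type": "loan", "loan_reference": "L001"},
    {"exposure_type": "contingent", "contingent_reference": "CT001"},
    {"exposure_type": "facility_undrawn", "facility_reference": "F001"},
]
exposure_references = [synthesise_exposure_reference(r) for r in raw_records]
```

Raising on an unknown exposure_type is a design choice of the sketch: a missing branch then surfaces immediately instead of producing a null key that silently breaks later joins.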
| Column | Type | Description |
|---|---|---|
| exposure_reference | String | Unique identifier (synthesised from loan/contingent/facility reference) |
| exposure_type | String | "loan", "contingent", or "facility_undrawn" |
| product_type | String | Product classification |
| book_code | String | Portfolio/book classification |
| counterparty_reference | String | Foreign key to counterparty |
| value_date | Date | Origination date |
| maturity_date | Date | Contractual maturity date |
| currency | String | Exposure currency |
| drawn_amount | Float64 | Drawn balance (0 for facility undrawn records) |
| interest | Float64 | Accrued interest (adds to on-balance-sheet EAD) |
| undrawn_amount | Float64 | Undrawn commitment (limit − drawn for facilities) |
| nominal_amount | Float64 | Total nominal (for contingents) |
| lgd | Float64 | Internal LGD estimate (A-IRB modelled, if available) |
| beel | Float64 | Best estimate expected loss |
| seniority | String | "senior" or "subordinated"; affects F-IRB supervisory LGD |
| risk_type | String | "FR", "MR", "MLR", "LR"; determines CCF (CRR Art. 111) |
| ccf_modelled | Float64 | A-IRB modelled CCF (0.0–1.5) |
| is_short_term_trade_lc | Boolean | Short-term LC for goods movement; 20% CCF under F-IRB |
| is_buy_to_let | Boolean | BTL property lending; excluded from SME supporting factor |
| original_currency | String | Currency before FX conversion (audit trail) |
| original_amount | Float64 | Amount before FX conversion (audit trail) |
| fx_rate_applied | Float64 | Rate used for conversion (null if no conversion) |
Resolved Hierarchy Schema¶
After hierarchy resolution, exposures gain counterparty hierarchy, facility hierarchy, rating inheritance, and lending group columns. This is the most column-rich intermediate stage, adding 14+ columns to each exposure record.
Source: RESOLVED_HIERARCHY_SCHEMA in data/schemas.py, HierarchyResolver.resolve() in engine/hierarchy.py
Counterparty hierarchy columns¶
| Column | Type | Description |
|---|---|---|
| counterparty_has_parent | Boolean | Whether counterparty is part of an org hierarchy |
| parent_counterparty_reference | String | Immediate parent in org structure |
| ultimate_parent_reference | String | Top-level parent (for group-level analysis) |
| counterparty_hierarchy_depth | Int8 | Levels from ultimate parent (0 = top) |
Rating inheritance columns¶
| Column | Type | Description |
|---|---|---|
| rating_inherited | Boolean | Whether rating came from a parent counterparty |
| rating_source_counterparty | String | Counterparty whose rating was used |
| rating_inheritance_reason | String | "own_rating", "parent_rating", "group_rating", "unrated" |
Per-type rating columns¶
The hierarchy resolver resolves internal and external ratings independently via
_build_rating_inheritance_lazy(). Each rating type is inherited separately
(own → parent → ultimate parent), producing per-type columns:
| Column | Type | Description |
|---|---|---|
| internal_pd | Float64 | Best internal PD (own or inherited from parent) |
| internal_rating_value | String | Internal rating grade |
| external_cqs | Int8 | Best external CQS (own or inherited from parent) |
| external_rating_value | String | External rating grade |
| internal_model_id | String | Model ID from rating inheritance (links to model_permissions) |
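The per-type inheritance order (own → parent → ultimate parent) amounts to a first-non-null lookup applied once per rating type. The following is a rough sketch of that idea in plain Python, not the body of _build_rating_inheritance_lazy(); the function name first_available and the sample values are assumptions.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def first_available(own: Optional[T], parent: Optional[T], ultimate_parent: Optional[T]) -> Optional[T]:
    """Hypothetical helper: return the first non-null value in inheritance order."""
    for candidate in (own, parent, ultimate_parent):
        if candidate is not None:
            return candidate
    return None


# Each rating type is resolved independently, so one exposure can take its
# internal PD from the parent and its external CQS from the ultimate parent.
internal_pd = first_available(None, 0.012, 0.010)
external_cqs = first_available(None, None, 2)
unrated_cqs = first_available(None, None, None)
```

Testing `is not None` rather than truthiness matters here: a CQS or PD of 0 is a value, not a gap, and should not trigger inheritance.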
Derived convenience columns:
| Column | Type | Description |
|---|---|---|
| cqs | Int8 | Alias for external_cqs (CQS is an external-only concept) |
| pd | Float64 | Alias for internal_pd (PD is an internal-only concept) |
| rating_value | String | Coalesce of external then internal rating value |
| has_internal_rating | Boolean | internal_pd IS NOT NULL; gates IRB approach eligibility |
| has_external_rating | Boolean | external_cqs IS NOT NULL |
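The derived columns follow directly from the per-type columns. As a minimal sketch in plain Python (the function name add_convenience_columns and the dict-shaped row are assumptions; the pipeline's own expressions may differ):

```python
def add_convenience_columns(row: dict) -> dict:
    """Hypothetical helper: derive the convenience columns from per-type rating columns."""
    derived = dict(row)
    derived["cqs"] = row["external_cqs"]  # alias
    derived["pd"] = row["internal_pd"]    # alias
    # Coalesce: the external rating value wins, the internal one is the fallback.
    external_value = row["external_rating_value"]
    derived["rating_value"] = (
        external_value if external_value is not None else row["internal_rating_value"]
    )
    derived["has_internal_rating"] = row["internal_pd"] is not None
    derived["has_external_rating"] = row["external_cqs"] is not None
    return derived


# Illustrative row: internally rated only, no external rating.
resolved_row = add_convenience_columns({
    "internal_pd": 0.012,
    "internal_rating_value": "BB",
    "external_cqs": None,
    "external_rating_value": None,
})
```

For this row, rating_value falls back to the internal grade and only has_internal_rating is true.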
Facility hierarchy columns¶
| Column | Type | Description |
|---|---|---|
| exposure_has_parent | Boolean | Whether exposure is child of a facility |
| parent_facility_reference | String | Parent facility reference |
| root_facility_reference | String | Top-level facility in hierarchy |
| facility_hierarchy_depth | Int8 | Levels from root facility (0 = top) |
Lending group columns¶
| Column | Type | Description |
|---|---|---|
| lending_group_reference | String | Lending group parent if applicable |
| lending_group_total_exposure | Float64 | Aggregated exposure across lending group |
| lending_group_adjusted_exposure | Float64 | Excludes residential RE for retail threshold test |
| residential_collateral_value | Float64 | Residential RE collateral securing this exposure |
| exposure_for_retail_threshold | Float64 | This exposure's contribution (excl. residential RE) |
Classified Exposure Schema¶
After classification, each exposure has a regulatory exposure class, approach assignment,
and entity flags. The classifier joins counterparty attributes (prefixed cp_*) and derives
classification in five phases.
Source: CLASSIFIED_EXPOSURE_SCHEMA in data/schemas.py, ExposureClassifier.classify() in engine/classifier.py
| Column | Type | Description |
|---|---|---|
| exposure_reference | String | Unique exposure identifier |
| exposure_type | String | "loan", "contingent", or "facility_undrawn" |
| counterparty_reference | String | Foreign key to counterparty |
| currency | String | Exposure currency |
| drawn_amount | Float64 | Drawn balance |
| interest | Float64 | Accrued interest |
| undrawn_amount | Float64 | Undrawn commitment |
| seniority | String | "senior" or "subordinated" |
| risk_type | String | CCF category ("FR", "MR", "MLR", "LR") |
| ccf_modelled | Float64 | A-IRB modelled CCF |
| is_short_term_trade_lc | Boolean | Short-term LC for goods movement |
| is_buy_to_let | Boolean | BTL property lending flag |
| exposure_class | String | Regulatory exposure class (see values below) |
| exposure_class_reason | String | Explanation of classification decision |
| approach_permitted | String | "standardised", "foundation_irb", "advanced_irb" based on IRB permissions |
| approach_applied | String | Actual approach used for this exposure |
| approach_selection_reason | String | Why this approach was selected |
| cqs | Int8 | Credit Quality Step (1–6, 0 for unrated) |
| pd | Float64 | Probability of default (for IRB exposures) |
| rating_agency | String | Source of external rating |
| rating_value | String | Original rating value |
| is_sme | Boolean | SME classification flag |
| is_retail_eligible | Boolean | Meets retail criteria |
Valid exposure_class values:
- central_govt_central_bank
- institution
- corporate
- corporate_sme
- retail_mortgage
- retail_qrre
- retail_other
- specialised_lending
- equity
- defaulted
- pse
- mdb
- rgla
- other
Valid approach_applied values:
- standardised: Standardised Approach
- foundation_irb: Foundation IRB
- advanced_irb: Advanced IRB
- slotting: Slotting Approach
CRM Adjusted Schema¶
After CRM processing, exposures include the full EAD waterfall: provisions → CCF → collateral → guarantees → final EAD. The CRM processor also determines LGD values (supervisory for F-IRB, modelled for A-IRB with optional floors).
Source: CRM_ADJUSTED_SCHEMA in data/schemas.py, CRMProcessor.apply_crm() in engine/crm/processor.py
EAD calculation columns¶
| Column | Type | Description |
|---|---|---|
| drawn_amount | Float64 | Original drawn balance |
| interest | Float64 | Accrued interest |
| undrawn_amount | Float64 | Undrawn commitment |
| ccf_applied | Float64 | Credit conversion factor applied |
| converted_undrawn | Float64 | undrawn_amount × ccf_applied |
| gross_ead | Float64 | drawn_amount + interest + converted_undrawn |
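The two formulas in this table can be checked with a few lines of plain Python. This is a sketch only: the function name compute_gross_ead is an assumption, and the input figures are illustrative.

```python
def compute_gross_ead(
    drawn_amount: float,
    interest: float,
    undrawn_amount: float,
    ccf_applied: float,
) -> dict:
    """Hypothetical helper: apply the converted_undrawn and gross_ead formulas."""
    converted_undrawn = undrawn_amount * ccf_applied
    gross_ead = drawn_amount + interest + converted_undrawn
    return {"converted_undrawn": converted_undrawn, "gross_ead": gross_ead}


# 10m drawn, no accrued interest, 5m undrawn at a 50% CCF.
ead_columns = compute_gross_ead(
    drawn_amount=10_000_000.0,
    interest=0.0,
    undrawn_amount=5_000_000.0,
    ccf_applied=0.5,
)
```

With these inputs, converted_undrawn is 2,500,000 and gross_ead is 12,500,000.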
Collateral impact columns¶
| Column | Type | Description |
|---|---|---|
| collateral_gross_value | Float64 | Total market value before haircuts |
| collateral_haircut_applied | Float64 | Weighted average haircut percentage |
| fx_haircut_applied | Float64 | FX mismatch haircut (8% or 0%) |
| collateral_adjusted_value | Float64 | Net collateral value after haircuts |
| ead_after_collateral | Float64 | EAD after collateral deduction |
Guarantee impact columns¶
| Column | Type | Description |
|---|---|---|
| guarantee_coverage_pct | Float64 | Percentage of exposure guaranteed |
| guaranteed_amount | Float64 | Amount covered by guarantee |
| ead_after_guarantee | Float64 | Portion not guaranteed |
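One plausible reading of how these three columns relate is sketched below. Two points are assumptions rather than facts from the schema: that guarantee_coverage_pct is stored as a fraction between 0 and 1, and that coverage is applied to the EAD remaining after collateral. The function name split_by_guarantee is also hypothetical.

```python
def split_by_guarantee(ead_after_collateral: float, guarantee_coverage_pct: float) -> dict:
    """Hypothetical helper: split EAD into guaranteed and unguaranteed parts.

    Assumes guarantee_coverage_pct is a fraction in [0, 1].
    """
    if not 0.0 <= guarantee_coverage_pct <= 1.0:
        raise ValueError("guarantee_coverage_pct must be between 0 and 1")
    guaranteed_amount = ead_after_collateral * guarantee_coverage_pct
    ead_after_guarantee = ead_after_collateral - guaranteed_amount
    return {
        "guaranteed_amount": guaranteed_amount,
        "ead_after_guarantee": ead_after_guarantee,
    }


# Illustrative figures: half of a 4.66m exposure is guaranteed.
guarantee_split = split_by_guarantee(4_660_000.0, 0.5)
```

Whatever the exact convention, the two outputs should always sum back to the input EAD, which makes a useful invariant when debugging.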
Final EAD and LGD columns¶
| Column | Type | Description |
|---|---|---|
| final_ead | Float64 | Final EAD for RWA calculation |
| lgd_type | String | "supervisory" (F-IRB) or "modelled" (A-IRB) |
| lgd_value | Float64 | LGD for calculation |
| lgd_floor | Float64 | Applicable LGD floor (Basel 3.1 only) |
| lgd_floored | Float64 | max(lgd_value, lgd_floor) |
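The lgd_floored rule is a simple maximum. The sketch below assumes that lgd_floor is null when no floor applies (for example outside Basel 3.1), in which case the LGD passes through unchanged; that null-handling, the function name apply_lgd_floor, and the sample values are illustrative assumptions.

```python
from typing import Optional


def apply_lgd_floor(lgd_value: float, lgd_floor: Optional[float]) -> float:
    """Hypothetical helper: lgd_floored = max(lgd_value, lgd_floor), skipping a null floor."""
    if lgd_floor is None:
        return lgd_value
    return max(lgd_value, lgd_floor)


# Illustrative values only, not regulatory parameters.
floored_up = apply_lgd_floor(0.20, 0.25)     # modelled LGD below the floor
unchanged = apply_lgd_floor(0.40, 0.25)      # modelled LGD already above the floor
no_floor = apply_lgd_floor(0.20, None)       # no floor applies
```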
Pre/Post CRM columns (regulatory reporting)¶
These columns support COREP dual-view reporting — pre-CRM shows the original borrower exposure, post-CRM shows the split between borrower (unguaranteed) and guarantor (guaranteed).
Source: CRM_PRE_POST_COLUMNS in data/schemas.py
| Column | Type | Description |
|---|---|---|
| pre_crm_counterparty_reference | String | Original borrower reference |
| pre_crm_exposure_class | String | Original exposure class before substitution |
| post_crm_counterparty_guaranteed | String | Guarantor reference for guaranteed exposures |
| post_crm_exposure_class_guaranteed | String | Derived from guarantor's entity_type |
| is_guaranteed | Boolean | Whether exposure has effective guarantee |
| guaranteed_portion | Float64 | EAD covered by guarantee |
| unguaranteed_portion | Float64 | EAD not covered by guarantee |
| guarantor_reference | String | Foreign key to guarantor counterparty |
| pre_crm_risk_weight | Float64 | Borrower's RW before guarantee substitution |
| guarantor_rw | Float64 | Guarantor's RW (SA lookup or IRB-calculated) |
| guarantee_benefit_rw | Float64 | RW reduction from guarantee |
| rwa_irb_original | Float64 | IRB RWA before guarantee substitution |
| risk_weight_irb_original | Float64 | IRB RW before guarantee substitution |
| guarantee_method_used | String | "SA_RW_SUBSTITUTION", "PD_SUBSTITUTION", or "NO_GUARANTEE" |
| is_guarantee_beneficial | Boolean | Whether guarantee reduces RWA |
| guarantee_status | String | Detailed status (incl. non-beneficial flag) |
Specialised Lending Schema¶
For slotting approach exposures. The slotting category and type are determined during classification based on counterparty and product attributes.
Source: SLOTTING_RESULT_SCHEMA in data/schemas.py, SlottingCalculator in engine/slotting/
| Column | Type | Description |
|---|---|---|
| exposure_reference | String | Exposure identifier |
| ... | ... | (all CRM adjusted columns carried forward) |
| sl_type | String | Type of specialised lending (see values below) |
| slotting_category | String | Supervisory category (see values below) |
| is_hvcre | Boolean | High Volatility CRE indicator |
| remaining_maturity_years | Float64 | Remaining maturity for CRR maturity-band differentiation |
Valid sl_type values:
- project_finance
- object_finance
- commodities_finance
- ipre: Income-producing real estate
- hvcre: High volatility CRE
Valid slotting_category values:
- strong
- good
- satisfactory
- weak
- default
Transformation Examples¶
Hierarchy Resolution¶
import polars as pl
# Input: counterparties with parent relationships
counterparties = pl.DataFrame({
"counterparty_reference": ["C001", "C002"],
"entity_type": ["corporate", "corporate"],
})
# Org mappings define the hierarchy
org_mappings = pl.DataFrame({
"parent_counterparty_reference": ["C001"],
"child_counterparty_reference": ["C002"],
})
# After resolution, hierarchy columns are added to exposures
resolved_exposure = {
"exposure_reference": "L001",
"counterparty_reference": "C002",
"counterparty_has_parent": True,
"parent_counterparty_reference": "C001",
"ultimate_parent_reference": "C001",
"counterparty_hierarchy_depth": 1,
"rating_inherited": True,
"rating_source_counterparty": "C001",
"rating_inheritance_reason": "parent_rating",
}
Classification¶
# After classification, exposures get regulatory class and approach.
# Values use the lowercase identifiers listed under the Classified Exposure Schema.
classified = {
    "exposure_reference": "L001",
    "counterparty_reference": "C002",
    "exposure_class": "corporate_sme",  # Turnover < EUR 50m
    "exposure_class_reason": "corporate with turnover < 50m EUR",
    "approach_applied": "standardised",
    "approach_permitted": "standardised",
    "is_sme": True,
}
CRM Application¶
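# After CRM, exposures have the full EAD waterfall
crm_adjusted = {
    "exposure_reference": "L001",
    "drawn_amount": 10_000_000,
    "interest": 0,                           # implied by gross_ead below
    "undrawn_amount": 5_000_000,             # implied by converted_undrawn / ccf_applied
    "ccf_applied": 0.5,
    "converted_undrawn": 2_500_000,          # 5,000,000 x 0.5
    "gross_ead": 12_500_000,                 # 10,000,000 + 0 + 2,500,000
    "collateral_gross_value": 8_000_000,
    "collateral_haircut_applied": 0.02,      # 2% for 1-5yr govt bond
    "collateral_adjusted_value": 7_840_000,  # 8,000,000 x (1 - 0.02)
    "ead_after_collateral": 4_660_000,       # 12,500,000 - 7,840,000
    "final_ead": 4_660_000,                  # no guarantee in this example
}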
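The figures in the CRM example reconcile step by step, which can be verified with a short plain-Python check. The check relies on several assumptions that the example implies but does not state: accrued interest is zero, the undrawn amount is 5,000,000 (converted_undrawn divided by ccf_applied), no FX mismatch haircut applies, the haircut reduces collateral as gross value × (1 − haircut), and there is no guarantee.

```python
import math

# Figures taken from the CRM example; the two inputs marked "assumed"
# are implied by the example rather than shown in it.
drawn_amount = 10_000_000
interest = 0                      # assumed
undrawn_amount = 5_000_000        # assumed
ccf_applied = 0.5
collateral_gross_value = 8_000_000
collateral_haircut_applied = 0.02

# Recompute each stage of the waterfall.
converted_undrawn = undrawn_amount * ccf_applied
gross_ead = drawn_amount + interest + converted_undrawn
collateral_adjusted_value = collateral_gross_value * (1 - collateral_haircut_applied)
ead_after_collateral = gross_ead - collateral_adjusted_value
final_ead = ead_after_collateral  # no guarantee in this example

# Compare against the values shown in the example, tolerating float rounding.
waterfall_checks = {
    "converted_undrawn": math.isclose(converted_undrawn, 2_500_000),
    "gross_ead": math.isclose(gross_ead, 12_500_000),
    "collateral_adjusted_value": math.isclose(collateral_adjusted_value, 7_840_000),
    "ead_after_collateral": math.isclose(ead_after_collateral, 4_660_000),
    "final_ead": math.isclose(final_ead, 4_660_000),
}
```

Using math.isclose rather than == avoids spurious mismatches from floating-point rounding in the haircut multiplication.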