Lakehouse BRD — Chapter 1: Overview
Strategic goals, project rationale, data ecosystem, Lakehouse scope, and success criteria for a unified data platform.
2026-03-17
1.1 Strategic goals
The Lakehouse project is a core initiative in the organization’s digital transformation plan. Overall goals:
- Build a unified data platform that serves multiple purposes: operations, reporting, risk management, partner integration, and AI/ML.
- Meet growing requirements for access speed, accuracy, and data quality control.
- Ensure legal compliance, in particular applicable personal data protection regulations.
In line with organization-level strategy, the Lakehouse system contributes to better data-driven decisions, stronger credit and operational risk control, deeper integration with digital partners and wider API adoption, and stronger AI capability.
1.2 Project rationale
When data systems remain fragmented, common issues include:
- Slow data access: Manual aggregation of reports from many sources.
- Lack of standardization and lineage: Hard to trace data origin.
- Limited AI/ML scale: Models depend on manually aggregated data.
- Constrained partner integration: APIs cannot support real-time queries.
- Personal data compliance risk: Insufficient masking and access control.
1.3 Data ecosystem overview
| Main component | Typical systems |
|---|---|
| CRM | In-house (may inherit the existing system) |
| Core Lending | Internal business applications |
| Payment | E-wallet, bank partner integration |
| Risk Engine | Python + SQL + dashboard |
| Reporting | Power BI + Excel rollups |
| DWH | MSSQL DWH, BigQuery (or equivalent) |
| API Layer | OpenAPI via Gateway |
| Logging / Audit | Distributed; needs consolidation |
1.4 Lakehouse application scope
| Subsystem / Business area | Application scope |
|---|---|
| Operations reporting | Multi-dimensional: by store, region, staff, product group |
| Financial analysis | Revenue, cost, profit, margin |
| Credit risk | Scoring, overdue debt, LTV analysis |
| Customer analysis | Lifecycle, behavior, target segments |
| AI & ML | Segmentation, risk prediction, up-sell |
| Partner integration | B2B API, controlled data sharing |
| Data control | Data masking, audit log, lineage, ownership |
1.5 Definition of success
| Criteria group | Success criteria |
|---|---|
| Technical | Process >10M records/day; ingest latency <5 min; query response time <3 s |
| Business | Real-time operations dashboard covering all service points |
| ML & AI | Clean, complete data for at least 3 AI models in operations |
| Governance | 100% of tables have a data steward and full metadata |
| Legal | Compliance with applicable data protection regulations and internal security standards |
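
The measurable rows of the table above (Technical, ML & AI, Governance) could be tracked with an automated check. The following is a minimal sketch, not a prescribed implementation: the `PlatformMetrics` class, its field names, and the sample values are hypothetical, and it assumes each metric is already collected as a single daily figure. The BRD does not say whether the query target refers to an average or a percentile, so the sketch treats it as one number. The Business and Legal rows are qualitative and are left out.

```python
from dataclasses import dataclass


@dataclass
class PlatformMetrics:
    """Hypothetical daily snapshot; all field names are illustrative assumptions."""
    records_per_day: int
    ingest_latency_min: float
    query_seconds: float
    models_in_operation: int
    tables_total: int
    tables_with_steward_and_metadata: int


def evaluate_success(m: PlatformMetrics) -> dict[str, bool]:
    """Compare a snapshot against the thresholds in the success-criteria table."""
    return {
        # Technical: >10M records/day, ingest latency <5 min, query <3 s
        "throughput": m.records_per_day > 10_000_000,
        "ingest_latency": m.ingest_latency_min < 5,
        "query_time": m.query_seconds < 3,
        # ML & AI: data supports at least 3 AI models in operations
        "ml_models": m.models_in_operation >= 3,
        # Governance: 100% of tables have a data steward and full metadata
        "governance": m.tables_total > 0
        and m.tables_with_steward_and_metadata == m.tables_total,
    }


# Made-up sample values, for illustration only.
snapshot = PlatformMetrics(
    records_per_day=12_500_000,
    ingest_latency_min=3.5,
    query_seconds=2.1,
    models_in_operation=2,
    tables_total=400,
    tables_with_steward_and_metadata=400,
)

results = evaluate_success(snapshot)
failed = [name for name, ok in results.items() if not ok]
print(failed)  # ['ml_models']
```

Keeping each criterion as a separate boolean makes it easy to report which target is missed rather than a single pass/fail flag.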
