Machine Learning in Finance
Application of algorithms that learn from financial data to make predictions and automate decisions.
FAQs
What is the difference between supervised and unsupervised machine learning in finance?
Supervised learning trains models on labeled historical examples (e.g., loans labeled 'defaulted' or 'repaid') to predict outcomes for new cases. The model learns which features (credit score, income, debt ratio) predict the labeled outcome. Common supervised applications in finance include credit scoring, fraud detection (when confirmed fraud labels are available), and churn prediction. Unsupervised learning finds patterns in unlabeled data without predefined outcomes; it discovers structure the analyst did not specify in advance. Common unsupervised applications include customer segmentation (clustering customers by transaction behavior), anomaly detection (identifying unusual transactions without predefined fraud rules), and market regime identification (detecting shifts between bull, bear, and volatile market states).
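The contrast above can be illustrated with a minimal sketch using only the Python standard library. All data values, names (labeled_loans, fit_stump, flag_anomalies), and the 3.5 cutoff are assumptions chosen for illustration; production systems would use far richer features and models. The supervised half learns a single debt-ratio threshold from labeled loans. The unsupervised half flags unusual transaction amounts using a modified z-score based on the median absolute deviation, with no labels at all.

```python
from statistics import median

# --- Supervised: labels are given (1 = defaulted, 0 = repaid). Toy data, hypothetical. ---
labeled_loans = [(0.10, 0), (0.20, 0), (0.25, 0), (0.30, 0),
                 (0.35, 0), (0.45, 1), (0.50, 1), (0.60, 1)]

def fit_stump(examples):
    """Pick the debt-ratio threshold that misclassifies the fewest labeled loans."""
    values = sorted({x for x, _ in examples})
    # Candidate thresholds sit halfway between neighboring observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        # Predict "default" when debt ratio exceeds t; count disagreements with labels.
        return sum((x > t) != bool(y) for x, y in examples)

    return min(candidates, key=errors)

threshold = fit_stump(labeled_loans)

def predict_default(debt_ratio):
    """Apply the learned threshold to a new, unlabeled applicant."""
    return debt_ratio > threshold

# --- Unsupervised: no labels, only transaction amounts. Toy data, hypothetical. ---
transactions = [42.0, 38.5, 51.0, 47.2, 40.1, 44.9, 980.0, 39.7]

def flag_anomalies(amounts, cutoff=3.5):
    """Flag amounts whose modified z-score (median/MAD based) exceeds the cutoff."""
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []  # no spread in the data, so nothing can be ranked as unusual
    return [a for a in amounts if 0.6745 * abs(a - med) / mad > cutoff]

print(predict_default(0.55))         # classify a new applicant
print(flag_anomalies(transactions))  # amounts that stand out from the rest
```

The supervised function can only be fitted because each loan carries an outcome label. The anomaly detector needs no labels, but it also cannot say whether a flagged transaction is actually fraudulent, only that it is unusual.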
What is alternative data in machine learning for finance?
Alternative data refers to non-traditional data sources used as inputs to machine learning models in financial applications, that is, data beyond conventional financial statements, price data, and credit bureau information. Sources include satellite imagery (analyzing parking lot activity to estimate retail sales before earnings), credit card transaction data (tracking consumer spending patterns at specific merchants), mobile location data (measuring foot traffic at retail locations), shipping container data (monitoring trade flows), social media sentiment, job postings (signaling company hiring and growth plans), web-scraped data (pricing, product launches), and sensor data (utility consumption as a proxy for industrial activity). Alternative data commands significant market value because it can provide an investment signal before traditional financial reporting reflects business developments.
How do regulators view machine learning models in credit underwriting?
Regulators require that ML credit models comply with fair lending and consumer credit laws (ECOA, FCRA) and that lenders be able to explain decisions to applicants who are denied credit. Under the Equal Credit Opportunity Act, lenders must provide adverse action notices stating the specific reasons for credit denial, which requires the model to produce interpretable factors, not just a score. Regulators also scrutinize whether ML models create disparate impact on protected classes even when protected characteristics are not used directly as inputs, because proxy variables can produce discriminatory outcomes. Supervisory guidance on model risk management (the Federal Reserve's SR 11-7, issued by the OCC as Bulletin 2011-12) calls for rigorous model validation. Lenders address explainability either by using interpretable models (logistic regression, decision trees, scorecard models) or by applying post-hoc explanation tools such as SHAP and LIME to complex ML models.
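As a rough sketch of how an interpretable model can yield the factors behind a denial, the example below ranks the features of a logistic-regression scorecard by how far each pushes an applicant's log-odds of default above a reference applicant. It also computes a simple approval-rate ratio between groups as a first-pass disparate impact screen. The coefficients, baseline values, feature names, and sample decisions are all hypothetical, and neither function reflects a regulatory standard or a legal test.

```python
import math

# Hypothetical scorecard: coefficients on the log-odds of default.
COEFFICIENTS = {"debt_ratio": 3.0, "late_payments": 0.8, "credit_utilization": 1.5}
INTERCEPT = -3.0
# Hypothetical reference applicant against which contributions are measured.
BASELINE = {"debt_ratio": 0.30, "late_payments": 0.0, "credit_utilization": 0.40}

def default_probability(applicant):
    """Logistic model: sigmoid of the intercept plus weighted features."""
    z = INTERCEPT + sum(COEFFICIENTS[f] * applicant[f] for f in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))

def reason_codes(applicant, top_n=2):
    """Return the features that raise default risk most relative to the baseline."""
    contributions = {
        f: COEFFICIENTS[f] * (applicant[f] - BASELINE[f]) for f in COEFFICIENTS
    }
    adverse = [(f, c) for f, c in contributions.items() if c > 0]
    adverse.sort(key=lambda item: item[1], reverse=True)
    return [f for f, _ in adverse[:top_n]]

def approval_rate_ratio(decisions):
    """Lowest group approval rate divided by the highest.

    decisions is a list of (group_label, approved) pairs.
    """
    totals, approved = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + int(ok)
    rates = [approved[g] / totals[g] for g in totals]
    return min(rates) / max(rates) if max(rates) > 0 else float("nan")

applicant = {"debt_ratio": 0.55, "late_payments": 2, "credit_utilization": 0.45}
print(round(default_probability(applicant), 3))
print(reason_codes(applicant))

# Hypothetical decisions: group A approved 8 of 10, group B approved 5 of 10.
decisions = ([("A", True)] * 8 + [("A", False)] * 2 +
             [("B", True)] * 5 + [("B", False)] * 5)
print(round(approval_rate_ratio(decisions), 3))
```

Because the model is linear in log-odds, each feature's contribution is simply its coefficient times its distance from the baseline, which is what makes reason codes straightforward to derive. For non-linear models, tools such as SHAP or LIME play the analogous role of attributing a prediction to individual inputs.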
Related Terms
Large Language Model
AI system trained on vast text data to understand and generate human language across many tasks.
Natural Language Processing
AI field enabling computers to understand, interpret, and generate human language from text or speech.
Neural Network
Computational system loosely inspired by brain neurons, capable of learning complex patterns from data.
AI Hallucination
AI model generating confident but factually incorrect or fabricated information not grounded in reality.