Question 1

How many examples are needed for effective few-shot prompting?

Accepted Answer

The optimal number of examples depends on task complexity, model capability, and available context window. Simple, well-defined tasks (straightforward classification, standard format extraction) often work well with 2–3 examples. More complex tasks (multi-step reasoning, nuanced classification, formats with many fields) benefit from 5–10 examples. Very long documents reduce the practical number of examples because each example consumes a significant share of the context window. Research shows diminishing returns beyond 8–10 examples in most cases: additional examples rarely improve performance proportionally and may introduce conflicting patterns. The quality and representativeness of examples matter far more than the exact number; a few diverse, precisely specified examples outperform many similar or ambiguous ones.
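The trade-off between document length and example count can be sketched as a simple budgeting step. The snippet below is a minimal illustration, not a definitive method: the function names, the cap of 10 examples, and the rough "4 characters per token" heuristic are all assumptions, and a real system would use the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. Replace with a real tokenizer.
    return max(1, len(text) // 4)


def select_examples(examples, document, context_limit, reserved_for_output,
                    max_examples=10):
    """Pick examples in priority order until the budget or the cap is reached."""
    # Tokens left for examples once the document and the answer are accounted for.
    budget = context_limit - reserved_for_output - estimate_tokens(document)
    chosen = []
    for example in examples:
        if len(chosen) >= max_examples:
            break  # past this point, returns tend to diminish
        cost = estimate_tokens(example)
        if cost > budget:
            continue  # skip examples that no longer fit
        chosen.append(example)
        budget -= cost
    return chosen


if __name__ == "__main__":
    pool = ["x" * 400] * 12          # twelve hypothetical examples, ~100 tokens each
    long_document = "d" * 4000       # ~1000 tokens
    picked = select_examples(pool, long_document, context_limit=2000,
                             reserved_for_output=500)
    print(len(picked))
```

Ordering the pool so the most diverse, highest-quality examples come first means that a long document drops the least valuable examples rather than arbitrary ones.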

Question 2

What makes a good few-shot example for financial document extraction?

Accepted Answer

Effective few-shot examples for financial extraction share five properties: input text that matches production (the same document structure the model will encounter), precisely formatted output (exactly the JSON structure, field names, and value formats expected), explicit handling of ambiguous cases (demonstrating what to do when a field is missing or unclear: null vs. 'N/A' vs. omitting the field), numerical formatting specifications (number of decimal places, handling of thousands separators, currency symbols), and coverage of edge cases likely to appear in production (non-standard date formats, negative values, scaled values where '2.5B' means 2,500,000,000). Examples that don't represent real edge cases provide false confidence and fail at deployment time when those cases appear.
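As a rough sketch of how these properties might look in practice, the snippet below builds a small extraction prompt whose examples show a null for a missing field, a negative value written in parentheses, and scaled values such as '2.5B'. Everything here is hypothetical: the field names (`revenue`, `net_income`), the sample sentences, and the helper functions are invented for illustration. The `normalize_amount` helper is one possible way to keep the example outputs consistent with the stated number format.

```python
import json
from decimal import Decimal

# Suffix multipliers for scaled values such as "2.5B" (assumed convention).
SCALE = {"K": Decimal(10) ** 3, "M": Decimal(10) ** 6, "B": Decimal(10) ** 9}


def normalize_amount(raw):
    """Convert strings like '$2.5B' or '(1,234.50)' to plain numbers; None stays None."""
    if raw is None:
        return None
    text = raw.strip().replace(",", "").replace("$", "")
    # Accounting notation: parentheses mean a negative value.
    negative = text.startswith("(") and text.endswith(")")
    if negative:
        text = text[1:-1]
    if not text:
        return None
    multiplier = Decimal(1)
    if text[-1].upper() in SCALE:
        multiplier = SCALE[text[-1].upper()]
        text = text[:-1]
    value = Decimal(text) * multiplier
    if negative:
        value = -value
    return int(value) if value == value.to_integral_value() else float(value)


# Hypothetical examples: one with a missing field, one with a negative value.
EXAMPLES = [
    {"input": "Total revenue was $2.5B for fiscal 2023.",
     "output": {"revenue": normalize_amount("$2.5B"), "net_income": None}},
    {"input": "Net loss of $(1,234.50) on revenue of $10M.",
     "output": {"revenue": normalize_amount("$10M"),
                "net_income": normalize_amount("$(1,234.50)")}},
]


def build_prompt(document):
    parts = ["Extract revenue and net_income as JSON. "
             "Use null when a field is absent. Expand scaled values (2.5B = 2500000000)."]
    for example in EXAMPLES:
        parts.append(f"Document: {example['input']}\nJSON: {json.dumps(example['output'])}")
    parts.append(f"Document: {document}\nJSON:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("Revenue rose to $3M while net income was not disclosed."))
```

Generating the example outputs through the same normalizer that validates production results is one way to avoid examples that quietly contradict the required format.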

Question 3

Can few-shot learning work for classifying financial risk factors?

Accepted Answer

Yes. Few-shot learning is effective for financial risk factor classification, particularly when categories are well-defined and examples clearly distinguish between them. For a model classifying 10-K risk factor paragraphs by type (market risk, credit risk, operational risk, regulatory risk, etc.), providing 2–3 labeled examples per category, each with clear, representative text demonstrating that category's defining characteristics, improves classification accuracy substantially. The key challenges are that categories may have fuzzy boundaries (a paragraph can involve both market and credit risk), real examples may not be cleanly representative, and the model's training data may not align well with the company's specific taxonomy. Human-reviewed 'gold standard' examples that clearly represent each category are worth the investment for production systems.
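One way to assemble such a prompt is sketched below. It is a minimal illustration under stated assumptions: the example paragraphs are invented placeholders rather than real 10-K text, and the function name, the `per_category` parameter, and the prompt wording are hypothetical. The builder refuses to run if any category lacks enough reviewed examples, which helps catch taxonomy gaps early.

```python
from collections import defaultdict

CATEGORIES = ["market risk", "credit risk", "operational risk", "regulatory risk"]

# Hypothetical gold-standard examples (text, label); not taken from real filings.
GOLD_EXAMPLES = [
    ("Fluctuations in interest rates could reduce the value of our portfolio.", "market risk"),
    ("Changes in foreign exchange rates may adversely affect reported revenue.", "market risk"),
    ("Our customers may fail to pay amounts owed to us when due.", "credit risk"),
    ("A counterparty default could result in losses on our receivables.", "credit risk"),
    ("A failure of our IT systems could disrupt order processing.", "operational risk"),
    ("We depend on a single supplier for a key component.", "operational risk"),
    ("New data privacy laws could increase our compliance costs.", "regulatory risk"),
    ("Changes in tax legislation may raise our effective tax rate.", "regulatory risk"),
]


def build_classification_prompt(paragraph, examples, categories, per_category=2):
    """Build a few-shot prompt with a fixed number of labeled examples per category."""
    by_label = defaultdict(list)
    for text, label in examples:
        if label not in categories:
            raise ValueError(f"Unknown label: {label}")
        by_label[label].append(text)

    # Every category needs enough reviewed examples before the prompt is built.
    missing = [c for c in categories if len(by_label[c]) < per_category]
    if missing:
        raise ValueError(f"Not enough examples for: {', '.join(missing)}")

    parts = ["Classify the risk factor paragraph into exactly one of: "
             + ", ".join(categories) + "."]
    for category in categories:
        for text in by_label[category][:per_category]:
            parts.append(f"Paragraph: {text}\nCategory: {category}")
    parts.append(f"Paragraph: {paragraph}\nCategory:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_classification_prompt(
        "Rising commodity prices could compress our margins.",
        GOLD_EXAMPLES, CATEGORIES))
```

For paragraphs that straddle two categories, the instruction line would need to state the tie-breaking rule (for example, allowing a primary and a secondary label), and the examples should demonstrate it.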

Few-Shot Learning

Related Terms

Zero-Shot Learning

Prompt Engineering

Fine-Tuning

Large Language Model

Tools for this concept

Workday Adaptive Planning

Prophix

Jedox