C5: Math & Derivations

AI can show its work — but can you check it?

~50 min · Econ Workflows · Math required

Learning Objectives

By the end of this module, you should be able to:

Identify the types of mathematical tasks where AI is genuinely helpful (intuition, formatting, study scaffolding) versus where it is unreliable (novel derivations, verifying conditions)
Recognize common failure patterns in AI-generated math: sign errors, unchecked conditions, and plausible-but-wrong steps
Apply a verification framework to check AI-produced derivations using boundary cases, dimensional analysis, and special cases
Use AI as a math study tool that supports your learning rather than replacing it

AI and Math: Impressively Formatted, Dangerously Unreliable

Ask an AI to derive the Slutsky equation, and you’ll get a beautifully typeset sequence of steps with clean LaTeX, clear notation, and a confident conclusion. It will probably be correct — this is a textbook staple, and the model has seen hundreds of versions in its training data.

Now ask it to derive the comparative statics of a slightly unusual model — say, an optimization problem with a non-standard constraint or an uncommon functional form. The output will look exactly as polished. The LaTeX will be just as clean. The steps will flow just as smoothly. But there’s a meaningful chance that one of those steps is wrong.

This is the core problem with AI and math: the formatting is always confident, whether or not the content is correct. A wrong derivation in neat LaTeX looks more convincing than a correct derivation in messy handwriting.

Economist’s Analogy

Think of AI-generated math like a regression table produced by code you didn’t write. The numbers are nicely formatted, the standard errors look reasonable, and the stars are in the right place. But if you don’t understand the specification — what’s being estimated, what the identifying assumption is, whether the standard errors are clustered correctly — the pretty table tells you nothing. Pretty math is the same: the formatting is not evidence of correctness.

What AI Is Good At

Explaining intuition

This may be AI’s single best math-related capability. Ask:

“What does the envelope theorem mean intuitively? Explain it like I understand optimization but haven’t seen this theorem before.”

AI excels here because it has processed thousands of explanations of the envelope theorem — textbooks, lecture notes, forum posts, blog entries. It can synthesize these into a clear, plain-language explanation that connects the formalism to economic reasoning.

Similarly:

“Why does the Lagrange multiplier represent the shadow price of the constraint? What’s the economic intuition?”

These “explain the concept” prompts are low-risk and high-value. The model is doing what it does best — producing fluent text that synthesizes patterns from its training data — and the task is one where plausibility and correctness tend to align.

Showing steps

“Walk me through the derivation of the Slutsky equation step by step, starting from the expenditure minimization problem.”

For well-known derivations, AI will typically produce a correct step-by-step walkthrough. This is useful for studying — you can see the derivation laid out more explicitly than most textbooks, and you can ask follow-up questions like “Why did you substitute $h(\mathbf{p}, u) = x(\mathbf{p}, e(\mathbf{p}, u))$?” to drill into specific transitions.

The key caveat: verify each step yourself. Don’t just read it and think “that looks right.” Work through each transition with pencil and paper. The point is to use AI as scaffolding for your own understanding, not as a substitute for it.
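As an illustration of the kind of transition worth checking by hand, here is a sketch of where the substitution quoted above leads. It assumes the functions are differentiable, writes $m = e(\mathbf{p}, u)$ for income, and uses goods indices $i$ and $j$ (notation introduced here, not in the prompt). Differentiating the identity for good $i$ with respect to $p_j$ by the chain rule:

\[\frac{\partial h_i}{\partial p_j} = \frac{\partial x_i}{\partial p_j} + \frac{\partial x_i}{\partial m} \cdot \frac{\partial e}{\partial p_j}\]

Shephard’s lemma gives $\partial e / \partial p_j = h_j$, which equals $x_j$ at $m = e(\mathbf{p}, u)$, so rearranging yields the Slutsky equation:

\[\frac{\partial x_i}{\partial p_j} = \frac{\partial h_i}{\partial p_j} - x_j \frac{\partial x_i}{\partial m}\]

Each of the two moves (the chain rule and Shephard’s lemma) is a transition you should be able to justify yourself rather than accept from the output.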

LaTeX formatting

This is purely mechanical and AI handles it well:

“Convert this plain-English description to LaTeX: max utility of x to the alpha times y to the 1 minus alpha subject to px times x plus py times y equals m”

Output:

\[\max_{x, y} \; x^{\alpha} y^{1-\alpha} \quad \text{s.t.} \quad p_x x + p_y y = m\]

AI is also excellent at fixing LaTeX syntax errors, converting between notation styles, and reformatting equations for slides versus papers. These are tasks where correctness is easy to verify by inspection.

Checking your work (first pass)

You can use AI as a first-pass sanity check:

“Here’s my derivation of the demand function for $x$ from Cobb-Douglas utility. Is each step correct?”

AI will often catch algebra errors, misapplied rules, and sign mistakes. This is useful — but treat it as a first-pass screen, not a proof verifier. AI catches errors by pattern-matching against what correct derivations look like, not by logically verifying each step. It can miss subtle errors that “look right,” and it can flag correct steps that use unusual approaches.

Analogies and connections

“How is the firm’s cost minimization problem structurally similar to the consumer’s expenditure minimization problem?”

AI is genuinely good at drawing parallels between mathematical structures. It can point out that the firm minimizes $wL + rK$ subject to $f(L,K) = q$ while the consumer minimizes $p_1 x_1 + p_2 x_2$ subject to $u(x_1, x_2) = \bar{u}$, and that the duality results carry over. These structural insights can deepen your understanding of why the same mathematical tools appear across different economic models.

Where AI Fails with Math

Sign errors and algebra mistakes

This is the most common failure mode. AI will carry a negative sign correctly through five steps and then drop it on the sixth. It will expand a product correctly but combine terms incorrectly. It will differentiate a function properly on line 3 and then substitute the result with the wrong sign on line 7.

Consider a simple example. Suppose we’re finding the comparative static effect of a tax $t$ on output, and the derivation involves:

\[\frac{\partial q^*}{\partial t} = -\frac{f'(L^*)}{f''(L^*)} \cdot \frac{\partial w_{\text{eff}}}{\partial t}\]

AI might correctly derive the numerator and denominator but lose the negative sign, flipping the direction of the effect. In a 15-line derivation, this kind of error is easy to miss — especially because everything else looks perfect.

The Pattern

AI math errors rarely look like errors. The wrong step is formatted identically to the right steps. There is no hesitation, no asterisk, no hedge. Every line looks equally confident. You have to check each transition yourself.

Incorrect application of theorems

AI knows the statements of theorems. It does not reliably check whether the conditions are satisfied.

For example, AI might invoke the implicit function theorem to sign a comparative static without verifying that the relevant function is continuously differentiable or that the derivative it divides by is non-zero. It might apply the second welfare theorem without checking convexity of preferences. It might use L’Hôpital’s rule without confirming that the original expression is actually an indeterminate form.
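A small illustration of the last point, using a made-up limit rather than anything from an economics model. The expression below is not an indeterminate form at $x = 0$, so direct substitution gives the answer:

\[\lim_{x \to 0} \frac{x + 1}{x + 2} = \frac{1}{2}\]

Differentiating numerator and denominator anyway, as L’Hôpital’s rule prescribes for a $0/0$ or $\infty/\infty$ form, produces a different and incorrect value:

\[\lim_{x \to 0} \frac{(x + 1)'}{(x + 2)'} = \frac{1}{1} = 1\]

Both lines typeset equally cleanly. Only checking the rule’s condition tells them apart.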

In an intermediate micro or econometrics context, this matters most with:

Second-order conditions: AI frequently skips verification that a critical point is actually a maximum (or minimum) rather than a saddle point
Constraint qualification: In constrained optimization, the Lagrangian first-order conditions are guaranteed to hold at an optimum only under a constraint qualification (e.g., with a single equality constraint, the constraint gradient must be non-zero there). AI almost never checks this.
Existence and uniqueness: AI will produce “the solution” to an optimization problem without verifying that a solution exists or is unique

Plausible-but-wrong steps

This is the most insidious failure. AI can produce a derivation where each individual step looks reasonable but the transitions don’t actually follow logically. The output reads smoothly, the notation is consistent, the conclusion is stated confidently — but somewhere in the middle, there’s a step that doesn’t follow from the previous one.

This happens because the model is generating text that looks like a correct derivation, not executing a logical proof. Each line is produced by predicting “what typically comes next in a math derivation that looks like this.” Usually that prediction is correct. Sometimes it isn’t.

Overconfidence

This deserves its own section because it compounds every other problem. When you make a sign error, you might notice because you hesitate — “wait, should this be positive?” AI never hesitates. Every step is delivered with identical confidence, whether it’s trivially correct or subtly wrong.

Key Insight

AI output carries no visible uncertainty signal for math. A human doing a derivation will slow down at hard steps, double-check algebra, and feel uncertain when something seems off. AI writes every step in the same assured tone, so the confidence you see on the page never varies. This means you cannot use the model’s apparent confidence as a signal of correctness.

Economics Math Examples

Let’s work through concrete cases to see where AI succeeds and where it breaks down.

Example 1: Consumer optimization (AI usually gets this right)

Problem: A consumer maximizes $U(x,y) = x^{\alpha} y^{1-\alpha}$ subject to $p_x x + p_y y = m$. Find the demand functions for $x$ and $y$.

This is a textbook problem. AI will almost certainly produce the correct derivation:

Set up the Lagrangian: $\mathcal{L} = x^{\alpha} y^{1-\alpha} + \lambda(m - p_x x - p_y y)$
First-order conditions: \[\frac{\partial \mathcal{L}}{\partial x} = \alpha x^{\alpha - 1} y^{1-\alpha} - \lambda p_x = 0\] \[\frac{\partial \mathcal{L}}{\partial y} = (1-\alpha) x^{\alpha} y^{-\alpha} - \lambda p_y = 0\]
Divide the FOCs to get the MRS condition: \[\frac{\alpha}{1-\alpha} \cdot \frac{y}{x} = \frac{p_x}{p_y}\]
Solve with the budget constraint to get: \[x^* = \frac{\alpha m}{p_x}, \qquad y^* = \frac{(1-\alpha) m}{p_y}\]

AI handles this reliably because it has seen thousands of versions in its training data. The Cobb-Douglas demand function is one of the most common derivations in economics.

Sanity checks you should still run:

Does demand increase with income? Yes — $\partial x^*/\partial m = \alpha / p_x > 0$.
Does demand decrease with own price? Yes — $\partial x^*/\partial p_x = -\alpha m / p_x^2 < 0$.
Do expenditure shares sum to 1? $p_x x^*/m + p_y y^*/m = \alpha + (1-\alpha) = 1$. Yes.

Even when the derivation is correct, running these checks builds good habits.
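The sanity checks above can also be run numerically. The following is a minimal sketch in plain Python, with arbitrary parameter values chosen for illustration (they are not from the module) and hypothetical helper names `demand` and `utility`. It plugs the candidate demands back into the budget constraint and the tangency condition, compares utility against other bundles on the budget line, and signs the income and own-price effects by finite differences.

```python
# Numerical spot-check of the Cobb-Douglas demands
# x* = alpha*m/p_x and y* = (1-alpha)*m/p_y.
# All parameter values are illustrative assumptions.

def demand(alpha, px, py, m):
    """Candidate closed-form demands to be checked."""
    return alpha * m / px, (1 - alpha) * m / py

def utility(x, y, alpha):
    return x**alpha * y**(1 - alpha)

alpha, px, py, m = 0.3, 2.0, 5.0, 100.0
x_star, y_star = demand(alpha, px, py, m)

# Work backwards: the bundle should exhaust the budget.
budget_gap = px * x_star + py * y_star - m

# Tangency: (alpha/(1-alpha)) * (y/x) should equal the price ratio.
mrs_gap = alpha / (1 - alpha) * y_star / x_star - px / py

# Optimality: no other bundle on the budget line should do better.
u_star = utility(x_star, y_star, alpha)
best_on_grid = max(
    utility(x, (m - px * x) / py, alpha)
    for x in (m / px * k / 1000 for k in range(1, 1000))
)

# Signs: central finite differences for income and own-price effects.
h = 1e-4
dx_dm = (demand(alpha, px, py, m + h)[0] - demand(alpha, px, py, m - h)[0]) / (2 * h)
dx_dpx = (demand(alpha, px + h, py, m)[0] - demand(alpha, px - h, py, m)[0]) / (2 * h)

print(f"budget gap = {budget_gap:.2e}, MRS gap = {mrs_gap:.2e}")
print(f"u* = {u_star:.6f}, best on grid = {best_on_grid:.6f}")
print(f"dx/dm = {dx_dm:.4f}, dx/dpx = {dx_dpx:.4f}")
```

A check like this cannot prove the formula for all parameter values, but a non-zero gap or a wrong sign at even one parameter point is enough to show that a derivation has gone wrong somewhere.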

Example 2: A subtler problem (where AI may stumble)

Problem: A worker chooses hours of labor $L$ to maximize $U = C^{\gamma} (T-L)^{1-\gamma}$ where $C = wL + V$ (wage income plus non-labor income $V$), and $T$ is total time. Derive the labor supply elasticity with respect to the wage.

This is still a standard-ish problem, but it involves more steps and some algebra that’s easy to get wrong. Here’s where to watch AI carefully:

Step 1: FOC. The worker maximizes: \[U = (wL + V)^{\gamma}(T - L)^{1-\gamma}\]

Taking the derivative with respect to $L$, setting it equal to zero, and moving the negative term to the right-hand side: \[\gamma w (wL + V)^{\gamma - 1}(T-L)^{1-\gamma} = (1-\gamma)(wL+V)^{\gamma}(T-L)^{-\gamma}\]

Watch for: AI may write $1-\alpha$ where it should be $1-\gamma$ (mixing up parameter names). It may also have sign errors when differentiating $(T-L)^{1-\gamma}$ because of the implicit negative from the chain rule.

Step 2: Solving for $L^*$. After simplification, optimal labor supply is: \[L^* = \gamma T - \frac{(1-\gamma)V}{w}\]

Watch for: AI should get the algebra right on a good day, but check the sign on the $V/w$ term. A positive $V$ (non-labor income) should reduce labor supply (income effect), so the negative sign is correct.
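The simplification skipped in Step 2 can be filled in as follows. This is a sketch assuming an interior solution, with $wL + V > 0$ and $0 < L < T$. Dividing both sides of the first-order condition for $U = (wL+V)^{\gamma}(T-L)^{1-\gamma}$ by $(wL+V)^{\gamma-1}(T-L)^{-\gamma}$ gives:

\[\gamma w (T - L) = (1-\gamma)(wL + V)\]

Expanding and collecting the terms in $L$ on the right:

\[\gamma w T - (1-\gamma)V = \gamma w L + (1-\gamma) w L = wL\]

Dividing by $w$ recovers $L^* = \gamma T - (1-\gamma)V/w$. The two places to look for a slip are the cancellation of exponents in the first line and the sign of the $V$ term when it moves across the equality in the second.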

Step 3: The elasticity. The labor supply elasticity with respect to the wage is: \[\varepsilon_{L,w} = \frac{\partial L^*}{\partial w} \cdot \frac{w}{L^*} = \frac{(1-\gamma)V}{w^2} \cdot \frac{w}{L^*} = \frac{(1-\gamma)V}{wL^*}\]

Where AI might go wrong: This is where it gets tricky. AI will often produce a correct-looking elasticity formula but fail to discuss what it means. Notice that:

If $V = 0$ (no non-labor income), the elasticity is zero — labor supply doesn’t respond to the wage
If $V > 0$, the elasticity is positive — higher wages increase labor supply
The magnitude depends on the ratio of non-labor income to labor income

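An AI that gets the algebra right might still say something misleading about interpretation, for example by claiming this represents “the substitution effect.” It is actually the uncompensated elasticity, which nets the substitution effect against the income effect of the wage change. With $V = 0$ the two effects cancel exactly, and $V > 0$ tips the balance toward substitution.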
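The labor supply result and its elasticity can be spot-checked the same way. The following is a minimal sketch in plain Python. The values of `gamma`, `T`, `V`, and `w` are illustrative assumptions chosen so that $0 < L^* < T$, and the helper names are hypothetical.

```python
# Spot-check of L* = gamma*T - (1-gamma)*V/w and its wage elasticity.
# All parameter values are illustrative assumptions.

def utility(L, w, V, T, gamma):
    return (w * L + V) ** gamma * (T - L) ** (1 - gamma)

def labor_supply(w, V, T, gamma):
    """Candidate closed form to be checked."""
    return gamma * T - (1 - gamma) * V / w

gamma, T, V, w = 0.4, 16.0, 20.0, 10.0
L_star = labor_supply(w, V, T, gamma)

# Optimality: compare against a fine grid of feasible hours in (0, T).
grid = [T * k / 10000 for k in range(1, 10000)]
L_grid = max(grid, key=lambda L: utility(L, w, V, T, gamma))

# Elasticity: closed form versus a central finite difference.
h = 1e-5
dL_dw = (labor_supply(w + h, V, T, gamma) - labor_supply(w - h, V, T, gamma)) / (2 * h)
eps_numeric = dL_dw * w / L_star
eps_formula = (1 - gamma) * V / (w * L_star)

# Boundary case: with V = 0, hours should not respond to the wage.
L_low_wage = labor_supply(5.0, 0.0, T, gamma)
L_high_wage = labor_supply(50.0, 0.0, T, gamma)

print(f"L* = {L_star:.4f}, grid maximizer = {L_grid:.4f}")
print(f"elasticity: formula = {eps_formula:.6f}, numeric = {eps_numeric:.6f}")
print(f"V = 0: L at w=5 is {L_low_wage}, L at w=50 is {L_high_wage}")
```

The grid search checks the algebra independently of the first-order condition, which is useful because an error in the FOC would otherwise carry through to every later step.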

Why This Matters

In economics, getting the math right is necessary but not sufficient. You also need to interpret the result correctly. AI is better at algebra than interpretation, so the most likely failure mode is: correct formula, wrong economic meaning.

Example 3: Comparative statics and second-order conditions

Problem: A firm maximizes profit $\pi = p \cdot f(L) - wL$ where $f'(L) > 0$ and $f''(L) < 0$. Find how optimal labor demand changes when the wage increases, and verify the second-order conditions.

AI will typically set up the FOC correctly: \[p \cdot f'(L^*) = w\]

And then apply the implicit function theorem: \[\frac{\partial L^*}{\partial w} = \frac{1}{p \cdot f''(L^*)}\]

Since $f''(L^*) < 0$ (diminishing returns), this is negative — higher wages reduce labor demand. So far, so good.
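The implicit function theorem step can be spelled out in one line. Treat $L^*$ as a function of $w$ and differentiate both sides of the first-order condition with respect to $w$. This sketch holds $p$ fixed and assumes $f$ is twice continuously differentiable with $f''(L^*) \neq 0$:

\[p \cdot f''(L^*) \cdot \frac{\partial L^*}{\partial w} = 1\]

Dividing by $p \cdot f''(L^*)$ gives the expression above. Writing the step out makes the hidden requirement visible: the division is only legitimate when $p \cdot f''(L^*)$ is non-zero.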

Where to check AI carefully:

Did it verify the SOC? The second-order condition requires $p \cdot f''(L^*) < 0$, which holds by assumption given $p > 0$. AI often states the SOC result without actually checking it, or skips it entirely. If the problem didn’t assume $f''(L) < 0$, would AI notice that the SOC might fail?
Did it check that the implicit function theorem applies? The IFT requires that the FOC function is continuously differentiable and that the derivative with respect to $L$ (i.e., $pf''(L^*)$) is non-zero at the optimum. AI almost never mentions these conditions.
Try a modification. Ask AI to redo the problem with $f(L) = L^{\beta}$ for general $\beta$. Does AI notice that when $\beta \geq 1$, the SOC fails and there is no interior maximum?

This kind of follow-up question is a good stress test. AI handles the mechanics of well-posed problems but often misses the conditions that make the problem well-posed in the first place.
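You can run the same stress test yourself before asking AI. The following is a minimal sketch in plain Python for $f(L) = L^{\beta}$, with illustrative values of `p` and `w` and hypothetical helper names. It solves the FOC in closed form, compares profit at that point with profit just to either side, and checks the implicit function theorem formula against a finite difference.

```python
# Stress test for pi(L) = p*L**beta - w*L: is the FOC point really a maximum?
# p, w, and the two beta values are illustrative assumptions.

def profit(L, p, w, beta):
    return p * L**beta - w * L

def foc_point(p, w, beta):
    """Solve p*beta*L**(beta-1) = w for L (requires beta != 1)."""
    return (w / (p * beta)) ** (1 / (beta - 1))

def classify(p, w, beta, step=1e-3):
    """Compare profit at the FOC point with profit just to either side."""
    L0 = foc_point(p, w, beta)
    here = profit(L0, p, w, beta)
    neighbors = (profit(L0 - step, p, w, beta), profit(L0 + step, p, w, beta))
    if here > max(neighbors):
        return "local max"
    if here < min(neighbors):
        return "local min"
    return "inconclusive"

p, w = 4.0, 1.0
for beta in (0.5, 2.0):
    print(f"beta = {beta}: {classify(p, w, beta)}")

# With beta = 2 profit keeps rising in L, so no profit maximum exists.
unbounded = profit(100.0, p, w, 2.0) > profit(10.0, p, w, 2.0)

# For beta = 0.5, compare 1/(p*f''(L*)) with a finite difference in w.
beta, h = 0.5, 1e-6
L0 = foc_point(p, w, beta)
ift = 1 / (p * beta * (beta - 1) * L0 ** (beta - 2))
numeric = (foc_point(p, w + h, beta) - foc_point(p, w - h, beta)) / (2 * h)
print(f"dL*/dw: IFT = {ift:.4f}, finite difference = {numeric:.4f}")
```

With $\beta = 2$ the first-order condition still has a solution, which is exactly why a derivation that stops at the FOC looks complete even though the point it finds is a profit minimum.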

A Verification Framework for AI Math

When AI hands you a derivation, don’t just read it and nod. Run these checks:

1. Check boundary cases

Does the formula give sensible results at extreme values?

If a price goes to zero, does demand grow without bound (as Cobb-Douglas demand $x^* = \alpha m / p_x$ does)?
If income goes to zero, does demand go to zero?
If a parameter goes to 1, does the general case reduce to a familiar special case?

2. Check dimensions and units

Economists sometimes call this “units analysis,” and it catches a surprising number of errors.

An elasticity must be dimensionless (a ratio of percentages)
A marginal effect must have units of $\Delta y / \Delta x$
A price must be in dollars per unit, a quantity in units

If AI produces an elasticity that has units of dollars, something went wrong.
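For instance, take the labor supply elasticity from Example 2, with $L$ measured in hours and $w$ in dollars per hour (units assumed here for illustration). Writing square brackets for “units of”:

\[\left[\frac{\partial L^*}{\partial w} \cdot \frac{w}{L^*}\right] = \frac{\text{hours}}{\text{dollars/hour}} \cdot \frac{\text{dollars/hour}}{\text{hours}} = 1\]

The same check applies to $L^* = \gamma T - (1-\gamma)V/w$ itself. With $V$ in dollars, the term $V/w$ is in hours, which matches $\gamma T$. A version of the formula with $V \cdot w$ or $V/w^2$ in that position would fail this check immediately.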

3. Check special cases

Does the general result reduce to a known result in a familiar special case?

Does a CES demand function become Cobb-Douglas as $\sigma \to 1$?
Does a general equilibrium result reduce to the partial equilibrium result when cross-effects are zero?
Does a dynamic model’s steady state match the static model’s solution?
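The first of these special cases is easy to test numerically. The following is a minimal sketch in plain Python. It assumes the CES utility $U = (a x^{\rho} + b y^{\rho})^{1/\rho}$ with $\sigma = 1/(1-\rho)$, for which demand for $x$ is $m (a/p_x)^{\sigma} / (a^{\sigma} p_x^{1-\sigma} + b^{\sigma} p_y^{1-\sigma})$. The weights `a` and `b`, the parameter values, and the function names are illustrative choices, not part of the module.

```python
# Special-case check: CES demand for x should approach the Cobb-Douglas
# demand alpha*m/p_x as sigma -> 1, with alpha = a/(a+b).
# The CES parameterization and all numbers are illustrative assumptions.

def ces_demand_x(a, b, sigma, px, py, m):
    denom = a**sigma * px ** (1 - sigma) + b**sigma * py ** (1 - sigma)
    return m * (a / px) ** sigma / denom

def cobb_douglas_demand_x(alpha, px, m):
    return alpha * m / px

a, b, px, py, m = 0.3, 0.7, 2.0, 5.0, 100.0
target = cobb_douglas_demand_x(a / (a + b), px, m)

gaps = {}
for sigma in (2.0, 1.5, 1.1, 1.01, 1.001):
    x = ces_demand_x(a, b, sigma, px, py, m)
    gaps[sigma] = abs(x - target)
    print(f"sigma = {sigma:<6} x = {x:.4f}  gap = {gaps[sigma]:.4f}")
```

If the gap does not shrink as $\sigma$ approaches 1, either the general formula or the special case is wrong, and you know to recheck both.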

4. Work backwards

Does the answer actually satisfy the original equation?

Plug the optimal $x^*$ and $y^*$ back into the budget constraint: does it hold?
Plug the optimal $L^*$ back into the FOC: is the FOC satisfied?
Substitute the demand function into the indirect utility function: do you get the expected form?
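As a worked instance of the second check, take $L^* = \gamma T - (1-\gamma)V/w$ from Example 2. Its first-order condition reduces to $\gamma w (T - L) = (1-\gamma)(wL + V)$ once the common powers are divided out (assuming an interior solution). Substituting $L^*$ into each side separately:

\[\gamma w (T - L^*) = \gamma w \cdot \frac{(1-\gamma)(wT + V)}{w} = \gamma (1-\gamma)(wT + V)\]

\[(1-\gamma)(wL^* + V) = (1-\gamma)\left(\gamma w T - (1-\gamma)V + V\right) = \gamma (1-\gamma)(wT + V)\]

The two sides agree, so the candidate solution satisfies the condition it was derived from. A dropped sign on the $V$ term would make them differ.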

5. Check signs

Does the result have economically sensible comparative statics?

A price increase should reduce demand for a normal good (law of demand)
Higher wages should increase labor supply through the substitution effect (holding utility constant)
An increase in marginal cost should reduce output
A positive income shock should increase demand for normal goods and decrease it for inferior goods

If any sign contradicts economic intuition, either the derivation is wrong or you’ve found something interesting that needs explanation.

Economist’s Analogy

This verification framework is like running specification checks on a regression. You don’t just run the regression and report the coefficient — you check robustness, examine residuals, test overidentifying restrictions, and see if the results survive placebo tests. Math derivations deserve the same skepticism. The “plausibility check” is not optional.

AI as a Study Tool vs. a Crutch

There is a productive way and an unproductive way to use AI for mathematical economics.

The good workflow

Try the problem yourself first. Get as far as you can. If you solve it completely, great — now you can use AI to check your work.
When you get stuck, ask for a targeted hint. Not “solve this for me” but “I’m stuck on the step where I need to go from the FOC to the demand function. What technique should I use here?”
After the hint, continue yourself. Work through the rest of the problem with the hint in mind.
Compare your final answer to AI’s. If they differ, figure out where and why.

The bad workflow

Paste the problem into AI.
Copy the solution.
Turn it in (or “study” by reading it once).
Learn nothing.

This is not just an academic integrity issue — it’s a learning issue. Mathematical economics builds cumulatively. If you don’t understand the Lagrangian method in intermediate micro, you will be lost in graduate micro. If you can’t do comparative statics by hand, you won’t understand structural estimation. There is no shortcut.

The “explain it to me” test

Here’s a simple rule: if you can’t explain why each step follows from the previous one — in your own words, without looking at the AI output — you don’t understand the derivation. Reading a correct solution is not the same as being able to produce one. Recognizing correct math is easier than generating correct math, and exams test the latter.

The Fluency Trap

Reading AI-generated math feels like understanding it. The steps flow logically, the notation is clean, and each line seems to follow naturally from the last. This creates an illusion of comprehension. You might think “I totally follow this” when what you really mean is “I can’t find an error.” Those are not the same thing. Test yourself by closing the AI output and re-deriving from scratch.

Exercise: Testing AI on Economics Math

Part 1: Solve it yourself (~15 min)

Choose one of the following problems (or use one from your current coursework):

Option A (Intermediate Micro): A consumer has utility $U(x,y) = \ln(x) + \ln(y)$ and faces prices $p_x$, $p_y$ with income $m$. Derive the Marshallian demand functions and the indirect utility function.

Option B (Intermediate Micro/Macro): A firm has production function $f(L) = AL^{\alpha}$ with $0 < \alpha < 1$ and faces output price $p$ and wage $w$. Find the profit-maximizing labor demand and show how it changes when the wage increases.

Option C (Econometrics): Show that the OLS estimator $\hat{\beta} = (X'X)^{-1}X'y$ is unbiased under the assumption $E[\varepsilon \mid X] = 0$.

Attempt the problem. Write out your steps. Note where you get stuck.

Part 2: Ask AI to solve it (~5 min)

Paste the same problem into your AI tool. Compare:

Did you and the AI take the same approach?
Did AI include steps you skipped?
Did you catch any errors in the AI’s solution?
Did the AI skip any steps that you think are important (e.g., checking SOCs)?

Part 3: Introduce an error (~10 min)

Now deliberately modify the problem in a subtle way and see if AI notices:

Change a sign in the utility function (e.g., $U(x,y) = \ln(x) - \ln(y)$) and see if AI comments on the unusual specification
Give a production function that violates the standard assumptions (e.g., $f(L) = AL^2$, which has increasing returns) and see if AI flags that the profit maximum doesn’t exist
Ask AI to check a derivation where you’ve deliberately introduced a wrong step — does it catch the mistake?

Part 4: Test AI as a checker (~10 min)

Write out a derivation with one deliberate error — for example, swap two terms in a key equation. Paste it into AI and ask “Is each step in this derivation correct?” Record whether AI catches your error, misses it, or incorrectly flags a correct step as wrong.

For example, you might write:

“Here’s my derivation of the demand function from $U = x^{0.5} y^{0.5}$ with budget constraint $p_x x + p_y y = m$. Is each step correct?”

Step 1: $\mathcal{L} = x^{0.5} y^{0.5} + \lambda(m - p_x x - p_y y)$

Step 2: $\frac{\partial \mathcal{L}}{\partial x} = 0.5 x^{-0.5} y^{0.5} - \lambda p_x = 0$

Step 3: $\frac{\partial \mathcal{L}}{\partial y} = 0.5 x^{0.5} y^{-0.5} - \lambda p_y = 0$

Step 4: Dividing: $\frac{y}{x} = \frac{p_y}{p_x}$ (this is the deliberate error — the ratio should be flipped)

Step 5: Substituting into the budget constraint…

The question: does AI catch your intentional mistake? In our experience, AI catches obvious errors (wrong formula entirely) but often misses sign and ratio errors — exactly the kind of errors that matter most.
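For reference when scoring the AI’s response, dividing Step 2 by Step 3 (after moving the $\lambda$ terms to the right-hand side) cancels $0.5$ and $\lambda$ and leaves:

\[\frac{0.5 x^{-0.5} y^{0.5}}{0.5 x^{0.5} y^{-0.5}} = \frac{y}{x} = \frac{p_x}{p_y}\]

This implies $p_x x = p_y y$, and the budget constraint then gives $x^* = m/(2p_x)$ and $y^* = m/(2p_y)$. A response that carries the flipped ratio from Step 4 through to a final answer without comment has missed the error.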

What This Exercise Reveals

Most students find that AI is a decent first-pass checker — it catches maybe half of intentional errors. But it also sometimes “finds” errors that aren’t there, suggesting corrections that would make the derivation wrong. This is because the model is pattern-matching, not proving. It’s useful, but it’s not a proof assistant.

Discussion Questions

If AI can always produce the derivation for you, what does it mean to “know” the math? Is the ability to execute a derivation by hand still valuable, or is understanding the intuition and being able to verify results sufficient?
Consider two students: one who does every derivation by hand and one who uses AI to generate solutions and then carefully verifies each step. Will they perform differently on an exam? In a research setting? Is one approach better, or does it depend on the context?
AI is better at well-known derivations (textbook problems) than novel ones (research-level problems). What does this imply about the value of mathematical training as you move from coursework to original research?

Key Takeaways

AI math looks confident whether or not it’s correct. The clean LaTeX and smooth step-by-step flow create an illusion of rigor. Every derivation — right or wrong — comes out looking like a textbook solution. You cannot use formatting quality as a signal of mathematical correctness.
AI is better at common problems than novel ones. Textbook derivations are usually right because the model has seen many versions. Non-standard problems are where errors creep in — and those are exactly the problems that matter most in research and advanced coursework.
Use AI for intuition, formatting, and first-pass checking — not as a proof verifier. The highest-value uses are explaining concepts, converting to LaTeX, and catching obvious errors. The lowest-value use is blindly trusting a full derivation.
Build verification habits now. Check boundary cases, check units, check special cases, work backwards, check signs. These habits will serve you whether you’re checking AI output, a coauthor’s work, or your own derivations at 2 AM before a deadline.

For instructors: This module works well as an in-class workshop where students interact with AI tools in real time. The exercise is designed so that Parts 3 and 4 produce unexpected results — AI will miss some deliberate errors and “find” some non-errors, and the class discussion about why is where the real learning happens.

Adaptation for intermediate micro: Focus on Examples 1 and 2 (consumer optimization, labor supply) and skip Example 3 (comparative statics with implicit function theorem) if students haven’t covered it. The verification framework is appropriate at any level.

Adaptation for econometrics: Replace the micro examples with matrix algebra and OLS derivations. The same failure modes apply — AI can typeset $\hat{\beta} = (X'X)^{-1}X'y$ beautifully but may mishandle the conditions under which it’s unbiased, consistent, or efficient.

Assessment idea: Have students submit a “verification report” for an AI-generated derivation. Give them a derivation that contains 2-3 subtle errors and ask them to identify and correct each one. Grade on the quality of their mathematical reasoning, not on whether they found all the errors.

Connection to other modules: This builds directly on A1 (why AI produces confident-but-wrong output) and C2 (the distinction between mechanical tasks and analytical judgment). The verification framework here parallels the code-checking framework in C2 — in both cases, the core skill is reading AI output critically rather than accepting it at face value.

Prerequisite math: Students should be comfortable with constrained optimization (Lagrangians), basic calculus (partial derivatives, chain rule), and ideally some exposure to comparative statics. The module is pitched at students in intermediate micro/macro or introductory econometrics.