Multi-factor Productivity (MFP): The Solow Residual

The Puzzle

In the late 1990s, something remarkable was happening to Australian productivity. Multi-factor productivity growth – the portion of economic growth not explained by adding more capital or labour – was running at over 2% per year. A decade later, it had collapsed to near zero, and by some measures had turned negative. What happened?

This question matters enormously for policy. Productivity, in Paul Krugman's memorable phrase, "isn't everything, but in the long run it is almost everything." It determines living standards, fiscal sustainability, and the scope for wage growth without inflation. Yet when we try to measure it, we encounter a fundamental problem: productivity isn't directly observable. We can only infer it as a residual – the growth left over after accounting for measurable inputs.

This residual approach, pioneered by Robert Solow in 1957, has become the workhorse of productivity measurement. It has also been called "the measure of our ignorance." Understanding both its power and its limitations is essential for anyone working with macroeconomic data.


A Brief History

In 1956, Robert Solow published a deceptively simple model of economic growth that would earn him the Nobel Prize. The following year, he applied it empirically to US data from 1909 to 1949. His finding was startling: only about 12% of the growth in output per worker could be attributed to capital accumulation. The remaining 88% was... something else.

"I am using the phrase 'technical change' as a shorthand expression for any kind of shift in the production function. Thus slowdowns, speed-ups, improvements in the education of the labor force, and all sorts of things will appear as 'technical change.'"

– Robert Solow, 1957

Moses Abramovitz, reviewing similar findings, was more blunt. He called the residual "a measure of our ignorance about the causes of economic growth." This tension – between the residual as a genuine measure of technological progress and the residual as a statistical dustbin – persists today.

The method has evolved since 1957. Dale Jorgenson and Zvi Griliches refined input measurement. Economists developed growth accounting frameworks that distinguish between labour quality and quantity, ICT and non-ICT capital, and multiple types of intermediate inputs. But the core insight remains: we measure productivity by subtraction, not direct observation.


The Framework

The standard approach begins with a Cobb-Douglas production function, which assumes output (Y) is produced by combining capital (K) and labour (L) with some level of technology or efficiency (A):

$$Y = A \cdot K^{\alpha} \cdot L^{(1-\alpha)}$$

Here α represents capital's share of income – typically around 0.30 in developed economies. Taking logarithms and differentiating with respect to time converts this into a growth accounting equation:

$$g_Y = g_A + \alpha \cdot g_K + (1-\alpha) \cdot g_L$$
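
The intermediate step is worth spelling out. Taking logs of the production function gives

$$\ln Y = \ln A + \alpha \ln K + (1-\alpha) \ln L$$

and differentiating with respect to time turns each log level into a growth rate, because the time derivative of the log of a variable is its proportional growth rate.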

Rearranging to solve for the residual:

$$g_{MFP} = g_Y - \alpha \cdot g_K - (1-\alpha) \cdot g_L$$

This says: MFP growth equals GDP growth minus the weighted contributions of capital and labour growth. The weights are the capital and labour factor income shares. This is growth accounting, not growth theory: it decomposes observed outcomes into factor contributions but does not identify the underlying causal mechanisms driving productivity change.
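
In code, the calculation is almost a one-liner. Here is a minimal Python sketch in the spirit of the notebook mentioned in the technical notes; the names gdp, capital, and hours are assumed quarterly pandas Series of levels, with growth measured as quarterly log differences:

```python
import numpy as np
import pandas as pd

ALPHA = 0.30  # assumed capital share of income


def mfp_growth(gdp: pd.Series, capital: pd.Series, hours: pd.Series,
               alpha: float = ALPHA) -> pd.Series:
    """Solow residual: output growth less share-weighted input growth.

    All inputs are quarterly levels (chain volume GDP, net capital
    stock, hours worked); growth rates are quarterly log differences.
    """
    g_y = np.log(gdp).diff()
    g_k = np.log(capital).diff()
    g_l = np.log(hours).diff()
    return g_y - alpha * g_k - (1 - alpha) * g_l
```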


What the assumptions imply

This framework embeds strong assumptions. Constant returns to scale means doubling all inputs exactly doubles output. Competitive factor markets means capital and labour are paid their marginal products, so income shares equal output elasticities. Hicks-neutral technical change means technology improves all factors proportionally.

When these assumptions fail – and they often do – the residual captures the violation as well as true efficiency gains. Market power, increasing returns, factor hoarding, and mismeasurement all end up in the same bucket.


What's Actually in the Residual?

The MFP residual is often loosely called "technology" or "efficiency," but it captures far more than genuine innovation. Understanding what's in there is crucial for interpretation.


Things we want to measure

Technological progress: New production methods, better machines, improved processes. This is what we hope the residual mainly reflects.

Reallocation effects: Resources moving from less productive to more productive uses. When workers shift from low-productivity agriculture to high-productivity manufacturing, aggregate MFP rises even if no individual firm becomes more efficient.

Knowledge spillovers: Ideas developed in one firm or sector that benefit others. These are a genuine source of growth not captured in measured inputs.


Things we'd rather separate out

Capacity utilisation: In recessions, firms use capital and labour less intensively. This shows up as falling MFP, even though underlying efficiency hasn't changed. The 2020 COVID recession in Australian data illustrates this dramatically.

Measurement error: If we undercount capital services or mismeasure labour input, the residual absorbs the error. This is particularly problematic for intangible capital – software, databases, organisational knowledge – which is often poorly measured or not measured at all.

Quality improvements: A laptop today is vastly more capable than one from 2000, but if we don't fully adjust for quality in our capital stock measures, some of that improvement leaks into MFP.

Terms of trade effects: For commodity exporters like Australia, rising export prices increase GDP without any change in physical production efficiency. When iron ore prices soared in the 2000s, this flattered measured productivity; when they fell, productivity appeared to collapse.

Labour composition: A workforce with more education and experience is more productive. Unless we adjust for this (as the ABS does in some measures), these gains appear as MFP rather than labour input.


Application: The Australian Experience

Applying this framework to Australian data from 1985 to 2025 reveals a striking story. The inputs – GDP, capital stock, and hours worked – show generally upward trends, with a dramatic interruption in 2020.

The data comes from three ABS sources: National Accounts (5206) for GDP chain volume measures, the Modellers' Database (1364) for net capital stock, and Labour Force (6202) for hours actually worked. Using hours rather than employment matters: it captures changes in part-time work and average hours that simple headcounts miss.
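
The hours conversion itself is a one-line aggregation. A pandas sketch, assuming hours_m is the monthly hours-worked series with a DatetimeIndex:

```python
import pandas as pd


def monthly_hours_to_quarterly(hours_m: pd.Series) -> pd.Series:
    """Sum monthly hours worked into quarterly totals aligned with
    the National Accounts (quarters ending Mar/Jun/Sep/Dec)."""
    # "QE-DEC" is the pandas >= 2.2 alias; older versions use "Q-DEC".
    return hours_m.resample("QE-DEC").sum()
```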


The productivity boom and bust

When we calculate the raw MFP residual and smooth it with an HP filter (λ=1600, standard for quarterly data), a clear pattern emerges. Trend MFP growth rose steadily through the early 1990s, peaked at around 2.1% per annum in 1997-98, then declined almost continuously until hitting roughly -0.2% by 2007-08.
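
The smoothing step, sketched with the statsmodels HP filter applied to the residual series (mfp_q here is assumed to come from the earlier mfp_growth function):

```python
from statsmodels.tsa.filters.hp_filter import hpfilter

# hpfilter returns (cycle, trend); lamb=1600 is the standard
# smoothing parameter for quarterly data.
cycle, trend = hpfilter(mfp_q.dropna(), lamb=1600)

# Express trend MFP growth in per cent per annum
# (quarterly log growth x 4 quarters x 100).
trend_mfp_pa = trend * 4 * 100
```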

The 1990s surge is often attributed to microeconomic reforms: tariff reductions, deregulation, privatisation, enterprise bargaining, and competition policy. It may also reflect delayed benefits from IT adoption, favourable industry composition effects, and currency dynamics – the AUD moved significantly through this period, affecting tradeable sector incentives in ways that likely influenced measured productivity.

The 2000s decline is more contested. Several factors likely contributed: the mining investment boom absorbed capital but took years to generate output; drought affected agriculture severely; terms of trade volatility complicated measurement; the AUD appreciated dramatically (the TWI rose from around 50 in 2001 to 80 by 2011), reshaping tradeable sector dynamics; and perhaps the low-hanging reform fruit had been picked.

A modest recovery appeared in the 2010s, with trend MFP growth returning to around 0.6% by 2018-19. Post-COVID, the picture is cloudier. The pandemic created extreme volatility in hours worked and capacity utilisation, making trend extraction unreliable at recent endpoints.


Decomposing GDP growth

Breaking GDP growth into contributions from capital, labour, and MFP reveals the changing composition of Australian growth. In the 1990s, MFP contributed roughly one percentage point annually to growth of around 4%. By the 2000s and 2010s, the MFP contribution had shrunk dramatically, with capital deepening (particularly mining investment) and labour force growth doing most of the work.
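
A sketch of the decomposition, reusing the same log-difference growth rates (again, the series names are assumptions, not the notebook's exact code):

```python
import numpy as np
import pandas as pd


def growth_contributions(gdp, capital, hours, alpha=0.30):
    """Split annualised GDP growth (percentage points) into the
    share-weighted contributions of capital and labour, with MFP
    left over as the residual."""
    g = lambda s: np.log(s).diff() * 4 * 100  # annualised quarterly growth
    out = pd.DataFrame({
        "gdp": g(gdp),
        "capital": alpha * g(capital),
        "labour": (1 - alpha) * g(hours),
    })
    out["mfp"] = out["gdp"] - out["capital"] - out["labour"]
    return out
```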

This matters for sustainability. Growth driven by capital accumulation eventually faces diminishing returns. Growth driven by labour force expansion is constrained by demographics and participation. Only MFP growth – doing more with what we have – can sustain rising living standards indefinitely.


Sensitivity to the capital share

One underappreciated issue is sensitivity to the assumed capital share (α). Testing values of 0.25, 0.30, and 0.35 shows the results are reasonably robust for broad trends, but levels and turning points can shift noticeably. With α = 0.25, more of the residual is attributed to capital, reducing measured MFP growth during capital-deepening periods. With α = 0.35, the opposite occurs.

In practice, most Australian analyses use α between 0.28 and 0.33, based on observed income shares. The choice is not arbitrary, but income shares themselves fluctuate with commodity prices and economic conditions.
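
The sensitivity test is a simple loop over candidate shares, reusing the mfp_growth and hpfilter sketches above:

```python
# Re-run the decomposition under alternative capital shares to gauge
# how sensitive trend MFP is to the assumed alpha.
for alpha in (0.25, 0.30, 0.35):
    mfp = mfp_growth(gdp, capital, hours, alpha=alpha)
    _, trend = hpfilter(mfp.dropna(), lamb=1600)
    print(f"alpha={alpha:.2f}: mean trend MFP growth "
          f"{(trend * 4 * 100).mean():+.2f}% p.a.")
```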


A Cautionary Tale: The Output Gap

It's tempting to extend this framework to estimate potential output and the output gap – the difference between actual and potential GDP. If we have trend MFP, we can construct trend factor inputs, combine them through the production function, and generate a potential output series.

The results look plausible at first glance. The output gap shows recessions (1990-91, 2020) as large negative gaps, and boom periods as small positive gaps. But a simple cross-check reveals a problem.

First, however, a word about re-anchoring. When you construct potential output by accumulating trend growth rates, small errors compound over time. The HP filter can allow potential GDP to drift persistently above or below actual GDP, implying implausibly large and sustained output gaps. Re-anchoring addresses this by periodically forcing potential output to equal actual output at selected "anchor points" – typically cycle peaks before major downturns (1990, 2000, 2008, 2019 in the Australian case). The logic is that at a cyclical peak the economy is likely operating near capacity, so the gap should be approximately zero.

This is admittedly ad hoc: it imposes the assumption that gaps close at peaks rather than testing it. But without some form of re-anchoring, the output gap becomes dominated by cumulative drift rather than genuine cyclical variation. A minimal sketch of the mechanics follows.
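
In this sketch, gdp is actual GDP, trend_growth is the assumed quarterly log growth of potential (built from the production function with trend inputs), and the anchor dates are assumed to be present in the index:

```python
import numpy as np
import pandas as pd


def reanchored_potential(gdp: pd.Series, trend_growth: pd.Series,
                         anchors) -> pd.Series:
    """Accumulate trend growth into a potential-output level, then
    shift that level so potential equals actual GDP at each anchor
    date (assumed cycle peaks). The level before the first anchor
    inherits an arbitrary base, so anchoring the sample start as
    well is a common choice."""
    log_pot = trend_growth.fillna(0).cumsum()
    log_gdp = np.log(gdp)
    for anchor in sorted(pd.to_datetime(anchors)):
        offset = log_gdp.loc[anchor] - log_pot.loc[anchor]
        log_pot.loc[anchor:] += offset  # force a zero gap at the anchor
    return np.exp(log_pot)


# Output gap in per cent of potential (log approximation):
# gap = 100 * (np.log(gdp) - np.log(potential))
```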


The Phillips curve test

If our output gap is economically meaningful, it should correlate with inflation. Positive gaps (demand exceeding supply) should produce above-target inflation; negative gaps should produce disinflation. This is the Phillips curve relationship.

Plotting our estimated output gap against trimmed mean inflation reveals a weak, unstable relationship. The correlation is only about 0.16 – barely distinguishable from noise. The 5-year rolling correlation swings wildly from -0.8 to +0.8, suggesting no stable structural relationship.
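
The cross-check itself is straightforward, assuming gap and trimmed mean inflation are quarterly pandas Series on a common index:

```python
# Full-sample relationship between the gap and trimmed mean inflation.
print(f"full-sample correlation: {gap.corr(inflation):.2f}")

# 5-year (20-quarter) rolling correlation to check stability.
rolling_corr = gap.rolling(window=20).corr(inflation)
```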

The message is clear: output gaps derived purely from filtered production function residuals are "notional only." They're not disciplined by inflation outcomes, so persistent gaps don't generate the price pressures we'd expect. For policy-relevant output gap estimates, prefer models that incorporate inflation information – my joint output gap and NAIRU model, for instance.


Known Limitations

Any honest treatment of the Solow residual must confront its limitations head-on.

HP filter end-point bias: The Hodrick-Prescott filter, widely used to extract trends, is known to perform poorly at sample endpoints. In the interior of the sample the two-sided filter "looks ahead" using future observations; at the end of the sample those observations don't yet exist. Recent trend estimates should therefore be treated with particular caution and will revise as new data arrives.

Potential circularity: We use factor income shares to back out productivity, but income shares might themselves be affected by technology and market structure. If technology is skill-biased, or if firms have market power, income shares become endogenous and the decomposition becomes less clean.

Aggregation issues: The aggregate production function is a useful fiction. Individual firms have heterogeneous technologies, and aggregation can create spurious productivity patterns even when no individual firm's efficiency changes.

Capital measurement: Converting heterogeneous capital goods (buildings, vehicles, computers, intellectual property) into a single stock measure requires assumptions about depreciation, retirement, and service flows. These assumptions affect both the capital stock level and its growth rate.

Labour input: Hours worked is better than employment counts, but still ignores effort, skill, and composition. Quality-adjusted labour input measures exist but add complexity and their own measurement challenges.


Conclusion

The Solow residual remains, nearly 70 years after its introduction, the standard approach to productivity measurement. Its elegance is undeniable: with minimal data (output, capital, labour, and an assumed production function), we can decompose growth into interpretable components.

Its limitations are equally undeniable. The residual is not "technology" or "efficiency" in any pure sense. It's everything not explained by measured inputs – including measurement error, capacity utilisation, terms of trade effects, and model misspecification.

For Australian data, the framework reveals important patterns: the 1990s productivity surge, the 2000s slowdown, the modest 2010s recovery. These align with known reform periods and structural changes. But the specific numbers should be held lightly. They're sensitive to methodology, and derived quantities like output gaps lack the discipline of inflation cross-checks.

The honest practitioner uses the Solow residual for what it does well – providing a consistent framework for thinking about growth sources over time – while acknowledging what it cannot do: identify "true" technology separately from the many other factors that affect our ability to transform inputs into outputs.

As Abramovitz knew, measuring productivity is largely measuring our ignorance. The goal is to be ignorant in a structured, comparable, and informative way.


Technical Notes

Scope: This is a basic aggregate Cobb-Douglas decomposition, not a KLEMS-style analysis. Capital is not decomposed by asset type (ICT vs structures vs machinery), labour is not quality-adjusted for education or experience, and intermediate inputs are not included. These refinements matter – the ABS and Productivity Commission produce more granular measures – but the aggregate approach is sufficient to illustrate the core framework and its limitations.

Data sources: ABS 5206.0 (National Accounts, GDP chain volume measures), ABS 1364.0.15.003 (Modellers' Database, net capital stock CVM), ABS 6202.0 (Labour Force, monthly hours worked converted to quarterly). Sample period 1985Q1 to latest available.

Parameters: Capital share α = 0.30. HP filter smoothing parameter λ = 1600 (standard for quarterly data). Sensitivity tests use α ∈ {0.25, 0.30, 0.35}.

Inflation cross-check: Trimmed mean CPI from ABS 6401.0. Target band 2-3% with midpoint 2.5%.

Code availability: Full Python implementation available as a Jupyter notebook with data fetching, decomposition, filtering, and visualisation.
