Introduction
Small rounding errors in automated market makers can distort prices, drain pools, or create exploitable conditions. Balancer’s StableSwap vulnerability shows how minor mathematical inconsistencies can lead to real risk. This article explains how differential fuzzing in Python can reveal discrepancies between high-precision calculations and Solidity implementations.
Balancer official report
Root cause
The pool receives an input amount and returns an amount of the other token. Precision is lost when the contract multiplies amounts by factors: fixed-point multiplication discards the fractional remainder, so the result can be lower than the mathematically correct value.
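To make the loss of resolution concrete, here is a minimal Python sketch. It assumes 18-decimal fixed-point arithmetic and a hypothetical factor of 1.05. The names mul_down, mul_exact, and ONE and the sample values are illustrative and are not taken from the Balancer code.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60

ONE = 10**18  # assumed 18-decimal fixed-point unit


def mul_down(a: int, b: int) -> int:
    """Fixed-point multiply that truncates, like integer division in Solidity."""
    return (a * b) // ONE


def mul_exact(a: int, b: int) -> Decimal:
    """The same product computed without truncation."""
    return Decimal(a) * Decimal(b) / Decimal(ONE)


amount = 17                          # a tiny raw token amount (illustrative)
factor = 1_050_000_000_000_000_000   # hypothetical factor of 1.05

truncated = mul_down(amount, factor)
exact = mul_exact(amount, factor)
lost = exact - Decimal(truncated)    # the fraction that truncation discarded
print(truncated, exact, lost)
```

The smaller the amount, the larger the discarded fraction is relative to the result, which is why tiny balances and tiny swaps deserve attention in tests.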
How to correct rounding
There are two ways to fix the problem:
- Round input token amounts in the correct direction.
- Use the same values consistently for both the calculation and the transfer.
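Both fixes can be sketched in a few lines of Python. This is an illustration under assumptions, not the Balancer patch: it assumes 18-decimal fixed point, a hypothetical factor of 1.05, and the common convention that input amounts round up (in the pool's favor). The helper names are invented for this example.

```python
ONE = 10**18  # assumed 18-decimal fixed-point unit


def mul_down(a: int, b: int) -> int:
    """Fixed-point multiply, discarding any remainder."""
    return (a * b) // ONE


def mul_up(a: int, b: int) -> int:
    """Fixed-point multiply, rounding up whenever there is a remainder."""
    product = a * b
    return 0 if product == 0 else (product - 1) // ONE + 1


def scale_amount_in(raw_amount_in: int, factor: int) -> int:
    """Scale an input amount once, rounding up so the pool is never short."""
    return mul_up(raw_amount_in, factor)


factor = 1_050_000_000_000_000_000  # hypothetical factor of 1.05

# Scale once and keep the result: the same value should then feed both the
# swap calculation and the transfer, so the two cannot disagree.
scaled_in = scale_amount_in(17, factor)
print(mul_down(17, factor), scaled_in)
```

When the product has no remainder, mul_up and mul_down agree, so rounding up only changes results that would otherwise have been truncated.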
The math behind the swap
StableSwap invariant
The StableSwap invariant is defined as follows:
A * n^n * S + D = A * D * n^n + D^(n+1) / (n^n * P)
where:
- A is the amplification factor, which bends the curve toward constant-sum behavior.
- n is the number of tokens in the pool.
- S is the sum of the balances: Σ(x_i)
- D is the invariant, analogous to the pool's total liquidity.
- P is the product of the balances: Π(x_i)
The contract uses this relationship to calculate the output token amount when one balance changes.
For more on the math, see the Cyfrin Updraft Curve Course.
Note: In the Balancer code, amplificationParameter represents A multiplied by n^(n−1) rather than A alone. Use this scaled value in your tests to avoid false mismatches.
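As a rough illustration of a high-precision reference, the sketch below solves the invariant for D with Newton's method in Decimal and then checks how far the two sides of the equation are apart. It assumes the convention from the note above (amp_param = A * n^(n−1), so A * n^n = amp_param * n) and ignores any additional fixed-point scaling the real contract applies to the parameter. The function names are hypothetical and this is not the appendix code.

```python
from decimal import Decimal, getcontext

getcontext().prec = 80


def _ann_s_nnp(amp_param, balances):
    """Return A*n^n, the balance sum S, and n^n * P for the given balances."""
    n = len(balances)
    ann = Decimal(amp_param) * n          # A * n^n when amp_param = A * n^(n-1)
    s = sum(balances, Decimal(0))
    p = Decimal(1)
    for x in balances:
        p *= x
    return ann, s, Decimal(n) ** n * p


def invariant_residual(amp_param, balances, d):
    """Left side minus right side of the invariant; zero when d is exact."""
    n = len(balances)
    ann, s, nnp = _ann_s_nnp(amp_param, balances)
    return (ann * s + d) - (ann * d + d ** (n + 1) / nnp)


def solve_d(amp_param, balances, max_iterations=100):
    """Newton iteration on f(D) = D^(n+1)/(n^n P) + (A n^n - 1) D - A n^n S."""
    n = len(balances)
    ann, s, nnp = _ann_s_nnp(amp_param, balances)
    d = s  # the sum of balances is an upper bound and a safe starting point
    for _ in range(max_iterations):
        f = d ** (n + 1) / nnp + (ann - 1) * d - ann * s
        f_prime = (n + 1) * d ** n / nnp + (ann - 1)
        d_next = d - f / f_prime
        if abs(d_next - d) < Decimal(10) ** -40:
            return d_next
        d = d_next
    return d


balances = [Decimal(1000), Decimal(2000)]
d = solve_d(200, balances)
print(d, invariant_residual(200, balances, d))
```

For a perfectly balanced pool the solver returns the sum of the balances, which is a quick sanity check on the parameter convention.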
In this article, we focus on testing the correctness of the math through differential fuzzing rather than explaining the mathematical structure itself. Some code summaries may contain AI-generated comments and should be reviewed carefully.
The appendix provides an example Python test for a Balancer pool.
The test includes:
- Manually guided fuzzing (MGF) logic
- Contract calls and their Python equivalents
- Functions named *pure_math_quiet, which use Python's Decimal type for accurate calculations
How differential testing works:
- Python calculates the value with high precision using Decimal.
- Solidity calculates the same value.
- The test compares the two results and determines whether the deviation is acceptable.
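The three steps above can be sketched as a small harness. In a real test the second function would be a call into the deployed or forked contract; here a pure-Python integer function stands in for it so the example runs on its own. The function names, the input ranges, and the tolerance of one unit are assumptions chosen for illustration.

```python
import random
from decimal import Decimal, getcontext

getcontext().prec = 80


def reference_mul_div(a: int, b: int, c: int) -> Decimal:
    """High-precision reference: a * b / c without truncation."""
    return Decimal(a) * Decimal(b) / Decimal(c)


def contract_like_mul_div(a: int, b: int, c: int) -> int:
    """Stand-in for the Solidity side: integer math that truncates."""
    return a * b // c


def differential_fuzz(runs=1000, max_deviation=Decimal(1), seed=0):
    """Compare both implementations on random inputs and collect mismatches."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(runs):
        a, b, c = (rng.randint(1, 10**27) for _ in range(3))
        expected = reference_mul_div(a, b, c)
        actual = contract_like_mul_div(a, b, c)
        deviation = abs(expected - Decimal(actual))
        if deviation > max_deviation:
            mismatches.append((a, b, c, expected, actual))
    return mismatches


print(len(differential_fuzz()))
```

Collecting mismatches instead of raising on the first one keeps the campaign running, so a single report shows every input that exceeded the tolerance.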
The function get_token_balance_pure_math_quiet uses Python's sqrt(), which gives a more direct and accurate solution than the Newton iteration used in Solidity.
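A closed form exists because, once D and the other balances are fixed, the invariant is a quadratic in the unknown balance y: y^2 + (b − D) * y − c = 0, with b = S' + D / (A * n^n) and c = D^(n+1) / (n^n * P' * A * n^n), where S' and P' are the sum and product of the other balances. The sketch below solves it with Decimal's sqrt(). It is an independent illustration under the amplificationParameter convention noted earlier, not the appendix function, and the name token_balance_closed_form is hypothetical.

```python
from decimal import Decimal, getcontext

getcontext().prec = 80


def token_balance_closed_form(amp_param, other_balances, d):
    """Solve the invariant for the one unknown balance via the quadratic formula."""
    n = len(other_balances) + 1
    ann = Decimal(amp_param) * n              # A * n^n when amp_param = A * n^(n-1)
    s_other = sum(other_balances, Decimal(0))
    p_other = Decimal(1)
    for x in other_balances:
        p_other *= x
    b = s_other + d / ann
    c = d ** (n + 1) / (Decimal(n) ** n * p_other * ann)
    k = b - d
    # Positive root of y^2 + k*y - c = 0
    return (-k + (k * k + 4 * c).sqrt()) / 2


# Balanced two-token pool: with D = 2000 and one balance at 1000,
# the other balance must also be 1000.
y = token_balance_closed_form(200, [Decimal(1000)], Decimal(2000))
print(y)
```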
Key points about fuzzing
Standard fuzzing samples random values within a defined range. If that range does not include extreme cases, the campaign will never expose them. High-quality fuzzing needs values that drive the system into extreme states, but spending test time on such values does not always find problems efficiently.
Edge case testing
Two fuzzing approaches are useful. The first tests normal conditions with typical values. The second tests edge cases with extreme values to see where the system breaks.
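One way to keep the two approaches separate is to give each its own input generator. The sketch below is an assumption-laden illustration: the ranges, the list of edge values, and the names typical_amount and edge_amount are invented for this example. Note that sampling uniformly over the full uint256 range would almost never produce a value below 10^18, which is why extreme values are listed explicitly.

```python
import random

ONE = 10**18
UINT256_MAX = 2**256 - 1

# Hand-picked boundary values (illustrative, not exhaustive)
EDGE_VALUES = [0, 1, ONE - 1, ONE, ONE + 1, UINT256_MAX - 1, UINT256_MAX]


def typical_amount(rng: random.Random) -> int:
    """Amounts a normal user might swap: 0.001 to 1,000,000 tokens."""
    return rng.randint(ONE // 1000, 10**6 * ONE)


def edge_amount(rng: random.Random) -> int:
    """Amounts chosen to sit on boundaries where rounding or overflow bites."""
    return rng.choice(EDGE_VALUES)


rng = random.Random(42)
normal_campaign = [typical_amount(rng) for _ in range(1000)]
edge_campaign = [edge_amount(rng) for _ in range(1000)]
print(min(normal_campaign), max(normal_campaign), sorted(set(edge_campaign))[:3])
```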
Edge case testing answers essential questions. Can users withdraw even if an overflow occurs? Can the protocol be paused and resumed properly? The system must handle errors and recover without causing further losses.
A common mistake is to stop testing when an edge case fails. Instead, keep exploring to figure out which components continue to work and which don’t.
It is possible to combine general and special case tests, but this may introduce unnecessary complexity. In general, clear tests are better than complex tests.
Test environment
Integration tests run against the actual external protocols. They are slow, but they demonstrate real-world behavior. A fork test works on a local copy of the mainnet state. Fork testing is much faster and lets you set up conditions that would never occur on mainnet.
Mainnet states rarely contain extreme values. When testing edge cases in a fork, you have to create the extreme conditions yourself.
Manually guided fuzzing (MGF) helps achieve these results. With MGF, testers direct fuzzing to specific scenarios that need to be explored.
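The idea can be shown with a toy model. The sketch below does not use any fuzzing framework's API, and ToyPool is a hypothetical constant-product pool chosen only to keep the example short; it is not StableSwap and not the Balancer contract. The tester first forces the pool into an extreme imbalance, then lets random tiny swaps explore that state while an invariant is checked after every step.

```python
import random


class ToyPool:
    """Hypothetical constant-product pool standing in for a forked contract."""

    def __init__(self, x: int, y: int):
        self.x, self.y = x, y

    def swap_x_for_y(self, dx: int) -> int:
        out = self.y * dx // (self.x + dx)  # integer division truncates
        self.x += dx
        self.y -= out
        return out


def guided_campaign(runs=500, seed=0):
    rng = random.Random(seed)
    pool = ToyPool(10**24, 10**24)

    # Guided step: one huge swap creates an imbalance that random
    # sampling from typical ranges would be very unlikely to reach.
    pool.swap_x_for_y(10**30)

    zero_outputs = 0
    for _ in range(runs):
        k_before = pool.x * pool.y
        out = pool.swap_x_for_y(rng.randint(1, 1000))
        if out == 0:
            zero_outputs += 1
        # With output rounded down, the product must never decrease.
        assert pool.x * pool.y >= k_before
    return pool, zero_outputs


pool, zero_outputs = guided_campaign()
print(pool.y, zero_outputs)
```

In this extreme state every tiny swap returns zero tokens, the kind of behavior that only shows up once the fuzzer has been steered there.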
Aside: ERC-4626 fuzzing
The ERC-4626 vault standard requires precise accounting for deposits, withdrawals, and share issuance. Large withdrawals may affect rounding and potentially result in loss of funds. Fuzzing these vaults requires logic that tracks balances, total supply, and share behavior. This makes ERC-4626 testing more challenging than standard fuzzing.
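A simplified sketch of the kind of property such a test tracks: shares minted for a deposit round down, shares burned for a withdrawal round up, so moving the same amount in and out should never favor the user. The function names are hypothetical, the vault state is reduced to two integers, and the state change between the deposit and the withdrawal is ignored to keep the example short.

```python
import random


def shares_minted_for_deposit(assets: int, total_assets: int, total_supply: int) -> int:
    """Shares for a deposit, rounded down (in the vault's favor)."""
    return assets * total_supply // total_assets


def shares_burned_for_withdraw(assets: int, total_assets: int, total_supply: int) -> int:
    """Shares for a withdrawal, rounded up (in the vault's favor)."""
    return -((-assets * total_supply) // total_assets)


def check_rounding_favors_vault(runs=1000, seed=1) -> bool:
    rng = random.Random(seed)
    for _ in range(runs):
        total_assets = rng.randint(1, 10**30)
        total_supply = rng.randint(1, 10**30)
        assets = rng.randint(1, 10**30)
        minted = shares_minted_for_deposit(assets, total_assets, total_supply)
        burned = shares_burned_for_withdraw(assets, total_assets, total_supply)
        # Withdrawing an amount must cost at least the shares a deposit earned.
        assert burned >= minted
    return True


print(check_rounding_favors_vault())
```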
Conclusion
Testing swap functionality requires two checks: the math must produce the correct output, and the contract must transfer the correct amount. Differential fuzzing helps check both.
Good fuzzing should reflect the mathematical formulas that define the system. Python is well suited for this task because it provides high-precision tools. Differential testing compares theoretical results with the behavior of the contract.
Several properties need to be checked. The invariant must hold. Output calculations must agree when computed in either direction. All intermediate values must be verified. This is the only reliable way to ensure that the math and the implementation match.
Manually Guided Fuzzing (MGF) combines human insight with automated exploration. This approach finds subtle bugs that random fuzzing often misses.
Appendix
https://gist.github.com/meditationduck/5b51b49b23cda2220672bdd004f131b9
