Re-Thinking Carbon Offset Qualification

Carbon offset standards to date have applied a threshold-based approach to variables like offset additionality and other criteria. A project either passes the threshold and qualifies as an offset, or it doesn't. This is similar to the verdict in a criminal trial, which finds someone either guilty or not guilty. It doesn't find them “perhaps guilty” or “perhaps not guilty.”

There are three big problems with using a threshold-based test to qualify carbon offsets.

  • First, threshold judgments can be very subjective, given that they're based on evaluating the reasonableness or veracity of a project’s baseline, and involve deciding how to weigh missing information and inevitable uncertainties. Whether a project passes or fails the test can depend on who's doing the evaluation, and what side of the bed they got out of that morning.

  • Second, once a ton is approved for the carbon offset market, there is no way for potential purchasers to differentiate between that ton and any other ton in terms of how well they satisfied the various offset criteria. Once it’s an offset, it’s as good as any other offset. This is why so much attention ends up focusing on the “co-benefits” of offsets, but co-benefits don’t tell you anything about how well the ton satisfies the criteria of additionality, permanence, and no leakage.

  • Third, and unfortunately for the integrity of carbon markets, almost all the forces at work in carbon markets encourage the pass/fail threshold to be set as low as possible. A lower threshold lets more projects into the market to generate more offsets, keeping offset prices down and allowing more people to claim to have offset their emissions. Once one offset provider heads “toward the bottom,” others tend to follow in order to maintain their market share.

Efforts to "fix" carbon offsets have commonly focused on setting out to do a better job of defining the threshold for an acceptable carbon offset, usually using the same ideas of “additionality, permanence, and leakage.” But it is now clear that a key problem with carbon offsets is not so much the way quality thresholds are being implemented, it's the threshold approach itself. When a project that just misses getting over the threshold is "bad," and a project that just manages to get over the threshold is "good," project developers will go to great lengths to tell the right "project story" that gets them over the threshold. And since there is no way for purchasers to differentiate between a project that just squeeks by the threshold, as opposed to one that leaves it far behind, there is no way for the market to self-police in terms of favoring higher quality offsets.

This suggests another way to "fix" carbon offsets when it comes to assessing their additionality and overall quality as a climate change mitigation tool. What if we focused on evaluating how confident we are in the additionality, permanence, and lack of leakage associated with a ton, and turned that information into a “quality score” that could be communicated to the offset market? Since we can never be 100% sure about an offset's additionality (or lack thereof), why not communicate how sure we are and let purchasers decide whether they're comfortable with that confidence level?

How would such a scoring method work? On a 1 to 1,000 scale, for example, a score of 900 would suggest a very high quality offset. A project that scores 200 stands a much lower chance of being a quality offset (it's not impossible, but it's much less likely than for the project scoring 900). A carbon offset buyer should be willing to pay more for the project that scores 900, and the market might on its own filter out projects that score 200, not by disqualifying them through a least-common-denominator threshold test, but by offering a transparent look at how they perform against the agreed-upon criteria of carbon offset quality.
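
To make this concrete, here is a minimal sketch of how per-criterion confidence ratings might be rolled up into a single 1-to-1,000 quality score. The criteria weights and the simple weighted-average combination rule are illustrative assumptions for this sketch, not the actual methodology discussed in this article.

```python
# Hypothetical sketch of a 1-to-1000 offset quality score. The weights and
# the weighted-average rule below are illustrative assumptions only.

CRITERIA_WEIGHTS = {
    "additionality": 0.5,  # confidence the reductions would not have happened anyway
    "permanence": 0.3,     # confidence the reductions will not be reversed
    "leakage": 0.2,        # confidence the reductions are not displaced elsewhere
}

def quality_score(confidences: dict) -> int:
    """Combine per-criterion confidences (each 0.0 to 1.0) into a 1-1000 score."""
    weighted = sum(CRITERIA_WEIGHTS[c] * confidences[c] for c in CRITERIA_WEIGHTS)
    return max(1, round(weighted * 1000))

# A project with strong evidence on all three criteria scores high...
print(quality_score({"additionality": 0.95, "permanence": 0.85, "leakage": 0.90}))  # 910
# ...while a project with weak additionality evidence scores much lower.
print(quality_score({"additionality": 0.20, "permanence": 0.60, "leakage": 0.50}))  # 380
```

The point of the sketch is the transparency: a buyer comparing the 910 project to the 380 project can see exactly which criteria drive the difference, rather than being told both tons are simply "offsets."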


For regulated markets, policy-makers could set the threshold for acceptable offsets at a specific level, perhaps requiring a minimum score of, say, 700. Yes, moral hazard might lead projects that really should score 680 to receive a slightly higher score (perhaps 710). It would be very difficult, however, for a 400-score project to be scored at 700.

The graphic above shows how this scoring system applies to riparian reforestation, where many projects are likely to receive higher scores based on the underlying additionality continuum.

This second graphic shows that for energy efficiency, on the other hand, it would be more difficult to get a high score, reflecting the reality that there are many reasons people pursue energy efficiency other than carbon markets.


Such a scoring system was developed more than a decade ago, and it demonstrated that the reliability of an offset “confidence score” will almost always be higher than the reliability of a threshold-based approval. One view of the spreadsheet used to implement the scoring system is provided below, although it does not include the detailed questions used to score the different variables shown in the spreadsheet, nor does it lay out the trick used to adequately weight the additionality criterion without having it overwhelm all of the other criteria in the scoring process.
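
That weighting trick is not disclosed here, but purely as a hypothetical illustration of the underlying design problem, one way to weight additionality heavily without letting it swamp the other criteria is to treat the additionality confidence as a multiplicative gate on the rest of the score:

```python
# Purely hypothetical construction, NOT the undisclosed weighting method
# referenced above: additionality confidence acts as a multiplicative gate,
# so weak additionality caps the total score while permanence and leakage
# still differentiate projects from one another.

def gated_score(additionality: float, permanence: float, leakage: float) -> int:
    """All inputs are confidences from 0.0 to 1.0; returns a 1-1000 score."""
    other_criteria = 0.6 * permanence + 0.4 * leakage  # illustrative weights
    return max(1, round(additionality * other_criteria * 1000))

print(gated_score(0.9, 0.8, 0.7))  # 684: strong additionality, solid score
print(gated_score(0.3, 0.8, 0.7))  # 228: same profile otherwise, but weak additionality caps it
```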


When the methodology was originally developed, it was embraced by key players at the carbon trading firm for which it was developed, but it was rejected by others over fears that too many of the firm's carbon offset projects would score quite low.

For a more in-depth White Paper exploring this scoring approach, contact us at info@climatographer.com.