
You can’t govern what you can’t see
Every conversation about AI governance eventually arrives at the same uncomfortable truth: the people building these systems don’t fully understand them either.
That’s not a criticism. It’s a structural feature of how modern AI works. But it reframes the regulatory pressure that’s been building, and accelerating, in ways that most founders aren’t accounting for.
The EU AI Act’s transparency provisions take effect August 2. The instinct is to treat that as a compliance deadline, something to hand off to legal and revisit in Q3. That’s the wrong instinct. The European Commission’s draft guidelines on Article 50 extend principles that have been visible since the Digital Services Act required disclosure of whether a human or automated system made a consequential account decision. The AI Act isn’t a new direction. It’s the next layer of a framework that has been under construction for years, and the organizations that understood the DSA’s intent, not just its text, are already ahead on AI Act readiness.
The more important question isn’t whether you comply by August 2. It’s whether you’re reading where regulation is headed, or just where it stands today.
The black box problem isn’t getting smaller
The reason AI governance regulation keeps moving in this direction is that the black box problem is real and it isn’t getting smaller.
Anthropic’s CEO Dario Amodei has been direct about this. His case for interpretability research isn’t framed around regulatory risk; he presents interpretability as the most consequential unsolved problem in the field. Models trained on human-generated content absorb human tendencies, including the incentive structures and power-seeking behavior embedded in that data. Without visibility into what a model is actually doing, you have no reliable way to distinguish sound design from a fluke. The outputs might look right. The underlying logic might not be.
Reagan’s Cold War principle — trust, but verify — applies cleanly here. Interpretability is the verification mechanism. Without it, AI governance is a set of policies applied to a system no one can fully audit. That’s not governance. It’s aspiration.
Most AI governance frameworks miss this. They treat transparency as a deliverable, something you produce for regulators or include in a vendor pitch, rather than as infrastructure that makes everything else function. Yale’s Chief Executive Leadership Institute mapped the governance challenge across eight variables: transparency, accountability, bias, data privacy, reversibility, shareholder impact, scope, and regulatory prescription level. These aren’t new concepts. They’re existing business and legal principles applied to AI systems, which means most organizations already have the foundation for a workable AI governance framework. The accountability variable is worth sitting with specifically, because it addresses the “it wasn’t me, it was the AI” problem directly. Someone has to own the outcome when a system behaves unexpectedly. That ownership has to be defined before an incident, not negotiated afterward.
The human side of explainability is what most frameworks underweight
McKinsey’s work on AI trust makes the point precisely: explainable AI doesn’t reduce the need for skilled humans; it raises the bar for them. A model can surface its reasoning. But if the person evaluating that reasoning lacks the domain knowledge to interrogate it, the explanation is theater. A CFO using AI to develop financial projections needs someone who understands the underlying assumptions, the business context, and the limits of the model’s training data. Without that, the output isn’t a projection. It’s a liability with a confidence score attached.
The same model should speak differently to a CFO than to a risk analyst reviewing identical data, because what they need to evaluate the output intelligently is different. Explainability calibrated to the wrong audience produces a particular kind of risk: outputs that sound right to people who don’t know enough to push back.
None of this is theoretical. New research from Gong across more than 2,000 business leaders found that 46% of planned AI investments are currently stalled due to trust concerns, with lack of explainability and model transparency cited as the primary barriers. The instinct is to treat that as a deployment problem, something you solve by training people on the interface or running better pilots. It isn’t. It’s a design problem. Trust doesn’t emerge naturally once a system is deployed. It has to be built into the system from the start, on both sides: with the employees who have to use it every day and with the customers and regulators who need to understand what it’s doing and why.
The organizations treating interpretability as infrastructure, something designed in from the beginning rather than added when someone asks for it, are the ones that will move fastest. They’ll close enterprise deals that require AI governance documentation. They’ll absorb regulatory pressure that stops their less-prepared competitors. They’ll catch misaligned behavior before it becomes an incident.
The black box is a technical problem, a trust problem, a governance problem, and increasingly, a market access problem. The window to build ahead of it is still open, but the companies that wait to see how enforcement shakes out are making the same bet they made on the DSA, and it isn’t a better bet the second time.
If you can’t explain what your system is doing, you can’t scale it. And the clock is running.
About the author: Charles

Charles Costa, MLIS is a researcher, strategist, and founder of Lexora Labs, where he works on AI adoption, knowledge management, and the future of expert