Sustainable AI Models: Who’s Leading the Way?

As artificial intelligence becomes a core part of business operations, its environmental footprint is attracting growing attention. Training and running large language models (LLMs) consumes vast amounts of electricity and water, not only during the training phase but also throughout their daily use.

With AI adoption accelerating across industries, businesses face a new responsibility: choosing models that balance performance with sustainability. Using sustainable AI practices, from selecting energy-efficient architectures to optimising deployment, has become essential for organisations aiming to reduce their environmental impact. Yet, not all providers are equally transparent about their impact, which makes it difficult for organisations to make well-informed choices.

Beyond the sheer environmental costs, there is also the reputational dimension. Companies that integrate AI into their workflows without accounting for sustainability may face criticism from customers, employees, and regulators. Much like the debates around supply chains and renewable energy, AI sustainability is becoming part of a broader conversation about corporate responsibility. A proactive stance today could make the difference between being seen as an innovator or a laggard tomorrow.

The Transparency Gap

Currently, many leading AI companies, such as OpenAI (GPT-4, GPT-5), Google DeepMind (Gemini), and Anthropic (Claude), emphasise that they operate certain systems on cloud infrastructure that is carbon-neutral. While this signals some progress, these companies typically stop short of publishing detailed lifecycle assessments.

Independent estimates suggest that a single medium-sized GPT-5 response may use up to 40 watt-hours of electricity, which is roughly eight times more than GPT-4. But without official disclosures, companies using these models cannot accurately measure the environmental impact of their own AI-driven services.

As AI systems become embedded in customer-facing workflows, investors, regulators, and consumers have been demanding visibility into environmental costs. Just as most companies now audit and publish their supply-chain emissions, pressure is mounting to extend the same accountability to digital operations. In the near future, AI emissions reporting may become a formal requirement, not just a best practice.


Energy Efficiency: What the Numbers Say

Recent studies and reports give us a clearer, though still imperfect, picture of energy efficiency across major AI models:

Google Gemini: A single text prompt consumes about 0.24 Wh of electricity, emits roughly 0.03 g CO₂, and uses about 0.26 mL of water, according to Google’s own technical report. The same report claims a 33× improvement in energy efficiency and a 44× reduction in carbon footprint for the median Gemini Apps text prompt over the preceding year.

However, Google uses market-based carbon accounting and reports only data-centre electricity, excluding upstream emissions from power generation, water consumption, and chip manufacturing. Independent researchers can estimate energy use by combining hardware specs, token throughput, and data-centre efficiency (PUE), but in the end these are still approximations.
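
To make that bottom-up method concrete, here is a minimal sketch of the usual estimation formula: generation time multiplied by attributed GPU power and PUE. Every default below (token count, throughput, GPU wattage, utilisation share, PUE) is an illustrative assumption, not a measured value.

```python
# Rough bottom-up per-query energy estimate:
# generation time x attributed GPU power x data-centre PUE.

def estimate_query_energy_wh(
    output_tokens: int = 400,         # assumed length of one response
    tokens_per_second: float = 50.0,  # assumed per-request decode throughput
    gpu_power_watts: float = 700.0,   # e.g. an NVIDIA H100 at full load
    utilisation: float = 0.5,         # share of GPU power attributed to this request
    pue: float = 1.2,                 # data-centre power usage effectiveness
) -> float:
    generation_seconds = output_tokens / tokens_per_second
    gpu_energy_wh = gpu_power_watts * utilisation * generation_seconds / 3600
    return gpu_energy_wh * pue        # PUE folds in cooling and facility overhead

print(f"~{estimate_query_energy_wh():.2f} Wh per query")  # ~0.93 Wh with these defaults
```

Swap in different hardware, throughput, or PUE figures and the estimate moves by multiples, which is a large part of why published numbers disagree.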

GPT-4o and Other LLMs: A short query typically uses around 0.42 Wh, roughly 40% more than a classic Google search. An active daily user (8 queries) consumes ~3.7 Wh, scaling up to ~9.7 Wh for longer queries. Models like Claude 3.7 Sonnet are considered eco-efficient, while o3 and DeepSeek-R1 are extremely energy-intensive, at over 33 Wh per long prompt, more than 70× the usage of GPT-4.1 nano.

Open-Weight Models: A query to LLaMA 3.1 8B consumes ~57 joules (0.016 Wh) for the model alone, doubling when cooling overhead is included. By contrast, the much larger LLaMA 3.1 405B model can require ~1.86 Wh per query.

Training Costs: GPT-3’s training consumed about 1,287 MWh of electricity, equivalent to powering ~120 U.S. homes for a year, and emitted roughly 552 metric tons of CO₂. BERT-Large required ~287 MWh, and AlphaGo about 1,000 MWh. More efficient models like BLOOM kept training emissions closer to 25 tons of CO₂.

Exact numbers vary widely depending on hardware, data-centre efficiency, and local energy sources, so real-world efficiency can swing by 2–10×. And while inference efficiency improves every year, total energy use keeps rising because adoption is exploding.

The same model can also be 3–5× more efficient depending on the GPU type (H100 vs A100), data-centre cooling, or grid energy mix (renewables vs coal-heavy regions). Water usage also depends heavily on local climate: a Gemini query served in Arizona will consume more cooling water than one served in Norway.
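
The sketch below shows how far the footprint of the same 0.42 Wh query can move under different grid intensities and cooling overheads; the intensity and PUE values are rough, illustrative assumptions rather than measured figures.

```python
# Same per-query energy, very different footprints depending on where it runs.

PER_QUERY_WH = 0.42  # the short-query figure quoted above

scenarios = {
    # name: (grid carbon intensity in gCO2/kWh, PUE) -- illustrative assumptions
    "hydro-heavy grid, efficient cooling": (30, 1.1),
    "average mixed grid": (370, 1.3),
    "coal-heavy grid, hot climate": (700, 1.5),
}

for name, (g_per_kwh, pue) in scenarios.items():
    grams_co2 = PER_QUERY_WH * pue / 1000 * g_per_kwh
    print(f"{name}: ~{grams_co2:.3f} g CO2 per query")
```

Between the best and worst cases here, the same query differs by more than 30× in emissions before any difference in hardware efficiency is counted.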

Emerging Leaders in Sustainable AI

Several challengers are starting to differentiate themselves by publishing credible environmental data or by adopting innovative model designs:

  • Mistral AI has gone further than most by releasing a full lifecycle audit of its Large 2 model, covering carbon emissions and water usage per prompt. This sets a new bar for transparency in the industry.
  • DeepSeek claims its models use only one-tenth of the compute required by Meta’s LLaMA 3.1, which directly translates into lower electricity consumption and reduced cooling demands.
  • Microsoft (BitNet and Azure) is innovating with ultra-efficient 1.58-bit models and building greener infrastructure in Azure data centres, helping reduce the overall footprint of enterprise AI.
  • Hugging Face is democratising AI with open-source models and tools, while also providing transparency and encouraging efficiency-first designs across its community-driven platform.

Comparing Chatbot Options: LLaMA vs BitNet vs MoE

For companies evaluating chatbots, the sustainability trade-offs become clearer when comparing these different approaches:

LLaMA
A single response from LLaMA-3-70B consumes approximately 1.7 watt-hours (Wh). While Meta has made strides in offsetting carbon emissions by using renewable energy during training, the sheer scale of these models remains the challenge.

BitNet
Microsoft’s BitNet introduces a more energy-efficient architecture by employing 1-bit (and ternary 1.58-bit) precision in its computations. This allows BitNet to operate with up to 96% less energy than traditional full-precision models. BitNet’s compatibility with standard CPUs also reduces the need for specialised, energy-intensive hardware. However, there are trade-offs:

  1. Accuracy may be slightly lower on tasks requiring fine-grained language understanding or numerical reasoning.
  2. Its ecosystem is smaller, with fewer pre-trained models and community resources.
  3. Hardware and framework compatibility can affect energy savings.

It may be less suitable for very large or highly specialised models, and it is less mature in real-world deployments.
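
To make the low-precision idea concrete, below is a toy sketch of absmean ternary quantisation, the weight representation behind BitNet b1.58. It is an illustration only: real BitNet models are trained under this constraint rather than quantised after the fact.

```python
# Toy illustration of ternary (1.58-bit) weights: every value becomes -1, 0 or +1
# plus one per-tensor scale, following the absmean scheme.
import numpy as np

def absmean_ternary_quantise(w: np.ndarray):
    scale = np.abs(w).mean()                         # per-tensor scaling factor
    w_ternary = np.clip(np.round(w / scale), -1, 1)  # round, then clamp to {-1, 0, +1}
    return w_ternary.astype(np.int8), scale

w = np.random.randn(4, 4).astype(np.float32)
w_q, scale = absmean_ternary_quantise(w)
print(w_q)                             # only -1, 0 and +1 remain
print(np.abs(w - w_q * scale).mean())  # average quantisation error
```

Because multiplying by -1, 0, or +1 reduces matrix multiplication to additions and subtractions, this representation is where most of the energy savings come from.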

Mixture of Experts (MoE)

MoE models activate only a subset of parameters during inference, improving energy efficiency without compromising performance. Google’s GLaM demonstrates the point: despite being far larger than GPT-3 in total parameters, it reportedly required about one-third of the energy to train.
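
As a rough illustration of how that works, the sketch below implements top-k routing with random stand-in experts; the sizes and weights are illustrative, not taken from any real model.

```python
# Minimal top-k MoE routing sketch: score all experts, run only the best k.
import numpy as np

rng = np.random.default_rng(0)
NUM_EXPERTS, TOP_K, D = 8, 2, 16

router = rng.standard_normal((D, NUM_EXPERTS))              # stand-in routing matrix
experts = [rng.standard_normal((D, D)) for _ in range(NUM_EXPERTS)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    logits = x @ router                                      # score every expert
    top = np.argsort(logits)[-TOP_K:]                        # keep only the k best
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over the winners
    # Only TOP_K of the NUM_EXPERTS expert networks execute; the rest sit idle,
    # which is the source of the energy saving.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

print(moe_forward(rng.standard_normal(D)).shape)  # (16,)
```

The total parameter count can grow with the number of experts, but the compute (and energy) per token tracks only the experts that actually run.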

Depends on the Use Case

For sustainability alone, BitNet is highly attractive, and MoE models also offer significant efficiency gains. However, practical considerations such as performance, ecosystem support, and hardware compatibility may lead organisations to choose LLaMA or MoE in certain cases.

Why It Matters for Business

AI adoption is no longer only about what a model can do; it is also about how sustainably it does it. For businesses committed to ESG goals, model choice directly impacts their carbon footprint. Opting for smaller models, quantised architectures, or transparent vendors can reduce both operational costs and environmental impact. It also strengthens ESG reporting and reduces reputational risk. Practical considerations include:

Right-sizing models: Choosing a smaller or task-specific model can reduce emissions by up to 90% while still meeting business needs.

Vendor selection: Working with providers who disclose lifecycle data offers confidence and accountability.

Operational efficiency: Practices like batching queries, caching frequent responses, or scheduling compute during low-carbon grid periods can further cut emissions; a small caching sketch follows this list.

Hardware and infrastructure choices: Microsoft’s Azure sustainability initiatives can significantly lower compute-related emissions. Microsoft’s stated goal was 100% renewable energy by 2025, though it is not yet clear whether that goal has been met.

Open-source innovation: Platforms like Hugging Face encourage efficient, reusable models that reduce redundant compute.
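
As noted under operational efficiency, caching is one of the cheapest wins. Here is a minimal sketch; call_llm is a hypothetical stand-in for whichever model API an organisation actually uses.

```python
# Minimal response cache: only a cache miss spends GPU time (and energy).
from functools import lru_cache

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model API call."""
    return f"<model response to: {prompt}>"

@lru_cache(maxsize=1024)
def cached_answer(normalised_query: str) -> str:
    return call_llm(normalised_query)  # runs only when the query is new

def answer(query: str) -> str:
    # Normalising case and whitespace lets trivially different phrasings
    # ("Opening hours?" vs "opening  hours?") share one cache entry.
    return cached_answer(" ".join(query.lower().split()))
```

For a customer-service chatbot where a handful of questions dominate traffic, even this simple layer can eliminate a large share of inference calls.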

Looking Ahead: Regulation and Responsibility

Governments are beginning to recognise the hidden costs of AI. In the EU, draft regulations may soon require companies to disclose the environmental impact of advanced AI systems. This means that transparency and efficiency will not only be market differentiators but potentially legal requirements. Businesses that adapt early will be better positioned to thrive.

At the same time, consumer expectations are shifting. As climate awareness grows, end-users increasingly want to know whether the digital services they use align with sustainable practices. Organisations that can demonstrate leadership in this space are more likely to earn long-term trust.

We may also see new industry alliances emerge. Just as organisations have formed coalitions around renewable energy and ethical supply chains, AI providers and adopters may establish collective standards for reporting and efficiency. Such collaboration could help create benchmarks that make comparisons more meaningful and accelerate industry-wide progress.

Key Takeaways

  • Big tech isn’t sharing enough: The largest LLM providers rarely publish full lifecycle emissions data.
  • Transparency is becoming a competitive advantage: Pioneers like Mistral and Hugging Face are setting a higher bar for accountability.
  • Efficiency is achievable: Microsoft’s BitNet proves that sustainability and performance can align.
  • Sustainability is strategic: Choosing environmentally conscious AI models strengthens ESG credentials and reduces operational risks.
  • Adaptation will be rewarded: Early movers that prioritise sustainability today are better positioned to avoid regulatory shocks tomorrow.
