
GreenInfer

Sustainable AI Inference Framework

✅ Project Complete!

🎉 Launched & Deployed

GreenInfer is a production-ready green AI orchestration framework that routes every prompt to the most energy-efficient model capable of answering it accurately, cutting AI compute energy by up to 97%.

97% Max Energy Saved
73% Avg Reduction
98.9% Classifier Accuracy
5,072 Total Users

How GreenInfer Works

Every prompt passes through a 7-layer orchestration pipeline that scores complexity, optimizes tokens, and routes to the right model tier.

01

📝 Prompt In

User sends their query to the system

02

🔬 Complexity Score

DistilBERT classifier rates each query from 0 to 100 with 98.9% accuracy

03

⚡ Prompt Optimize

T5 model removes filler tokens (avg 35% reduction)

04

🌎 Carbon Route

ERCOT grid check for carbon-aware scheduling

05

🎯 Model Route

Routes to Small / Medium / Large tier based on complexity

06

✓ Answer

Returns response with complete energy metrics
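The routing step in stage 05 can be sketched as a threshold lookup on the 0-100 complexity score. This is a minimal illustration, not GreenInfer's actual code: the cut-offs (35 and 70) and the name `route_tier` are assumptions, since the page does not state where the tier boundaries sit.

```python
from bisect import bisect_right

# Hypothetical score boundaries; the real GreenInfer cut-offs are not stated.
# Scores below 35 go to small, 35-69 to medium, 70 and above to large.
TIER_BOUNDS = [35, 70]
TIER_NAMES = ["small", "medium", "large"]


def route_tier(complexity_score: int) -> str:
    """Map a 0-100 complexity score to a model tier name."""
    if not 0 <= complexity_score <= 100:
        raise ValueError("complexity score must be in the range 0-100")
    # bisect_right counts how many boundaries are <= the score,
    # which is exactly the index of the tier to use.
    return TIER_NAMES[bisect_right(TIER_BOUNDS, complexity_score)]
```

Keeping the boundaries in a sorted list means a fourth tier could be added by appending one boundary and one name, without touching the lookup.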

Model Tiers

Small Tier

Eco

Model: Llama 3.2 1B

Energy: 0.9 mWh per query

Use Cases: Simple queries, factual questions, definitions

Traffic: 55% of all queries

Medium Tier

Balanced

Model: Llama 3.1 8B

Energy: 3.8 mWh per query

Use Cases: Reasoning tasks, explanations, analysis

Traffic: 30% of all queries

Large Tier

Heavy

Model: Llama 3.3 70B

Energy: 48.0 mWh per query

Use Cases: Complex reasoning, code generation, expert tasks

Traffic: 15% of all queries
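As a rough worked example, the per-tier energy and traffic figures above can be combined into a traffic-weighted average energy per query. The sketch below assumes a baseline in which every query goes to the Large tier; that baseline and the function names are assumptions for illustration, and the calculation ignores prompt optimization, cascade retries, and carbon-aware deferral.

```python
# Per-query energy (mWh) and traffic share for each tier, from the figures above.
TIERS = {
    "small": {"energy_mwh": 0.9, "share": 0.55},
    "medium": {"energy_mwh": 3.8, "share": 0.30},
    "large": {"energy_mwh": 48.0, "share": 0.15},
}


def blended_energy_mwh(tiers: dict) -> float:
    """Traffic-weighted average energy per query, in mWh."""
    return sum(t["energy_mwh"] * t["share"] for t in tiers.values())


def reduction_vs_large(tiers: dict) -> float:
    """Fractional saving versus an assumed always-large baseline."""
    baseline = tiers["large"]["energy_mwh"]
    return 1.0 - blended_energy_mwh(tiers) / baseline


if __name__ == "__main__":
    print(f"blended: {blended_energy_mwh(TIERS):.3f} mWh per query")
    print(f"saving vs always-large: {reduction_vs_large(TIERS):.1%}")
```

With the listed figures this gives about 8.8 mWh per query, roughly an 82% saving against the assumed baseline. The 73% average reported above need not match this simplified estimate, since the page does not state its baseline or measurement method.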

Key Features

🔬

Complexity Scoring

A DistilBERT classifier trained on 600 labeled examples rates every prompt from 0 to 100, with 98.9% validation accuracy across 4 tiers.

T5 Prompt Optimizer

Silently rewrites prompts to remove filler words before inference, cutting input tokens by an average of 35% with no quality loss.

🌎

Carbon-Aware Routing

Uses hourly ERCOT grid carbon-intensity estimates to defer expensive queries while the grid's carbon intensity is high, reducing CO₂ further.
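One way to picture the deferral decision is as a search over an hourly intensity forecast. The sketch below is an assumption-laden illustration: the 400 gCO₂/kWh threshold, the 6-hour wait limit, the rule that only the large tier is deferred, and the name `hours_to_defer` are all hypothetical, and the forecast is passed in as a plain list rather than fetched from any real ERCOT data source.

```python
from typing import Sequence

# Hypothetical threshold in gCO2/kWh; the value GreenInfer uses is not stated.
DIRTY_GRID_THRESHOLD = 400.0


def hours_to_defer(
    tier: str,
    hourly_intensity: Sequence[float],
    max_wait_hours: int = 6,
    threshold: float = DIRTY_GRID_THRESHOLD,
) -> int:
    """Return how many hours to delay a query (0 means run now).

    hourly_intensity[0] is the current hour's estimate; later entries are
    forecasts for the following hours. Only the large tier is deferred.
    """
    if not hourly_intensity:
        raise ValueError("need at least the current hour's intensity")
    if tier != "large" or hourly_intensity[0] <= threshold:
        return 0
    window = list(hourly_intensity[: max_wait_hours + 1])
    # First hour within the wait window that is clean enough to run.
    for offset, intensity in enumerate(window):
        if intensity <= threshold:
            return offset
    # No clean hour in the window: fall back to the cleanest one available.
    return min(range(len(window)), key=window.__getitem__)
```

Capping the wait keeps latency bounded: if no hour in the window falls under the threshold, the query runs in the cleanest available hour instead of waiting indefinitely.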

🤺

Cascade Engine

Starts with the smallest model and only escalates to medium or large if confidence is below threshold, inspired by FrugalGPT.
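The escalation logic can be sketched as a loop over models ordered from cheapest to most expensive. This is a minimal sketch in the spirit of the cascade described above, not the project's implementation: the 0.8 threshold, the `cascade` function, and the convention that each model returns an `(answer, confidence)` pair are assumptions for illustration.

```python
from typing import Callable, Sequence, Tuple

# A model is any callable that returns (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cascade(
    prompt: str,
    models: Sequence[Tuple[str, Model]],
    threshold: float = 0.8,  # hypothetical confidence cut-off
) -> Tuple[str, str, float]:
    """Try models from cheapest to most expensive; stop at the first confident answer.

    Returns (tier_name, answer, confidence). The last model's answer is
    returned even if it is below the threshold, since there is nothing
    larger to escalate to.
    """
    if not models:
        raise ValueError("at least one model is required")
    for name, model in models:
        answer, confidence = model(prompt)
        if confidence >= threshold:
            break
    return name, answer, confidence


if __name__ == "__main__":
    # Stub models standing in for real inference calls.
    stubs = [
        ("small", lambda p: ("maybe 4", 0.55)),
        ("medium", lambda p: ("4", 0.93)),
        ("large", lambda p: ("4", 0.99)),
    ]
    print(cascade("What is 2 + 2?", stubs))  # ('medium', '4', 0.93)
```

Note that every escalation adds the energy of the failed attempt to the total, so a cascade only saves energy when most queries are resolved by the smaller tiers.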

💡

Smart Preview Mode

For complex queries, shows a summary and outline first. The user confirms before the full, more expensive response runs.

📊

Per-Response Metrics

Every answer shows energy used in mWh, CO₂ emitted, tokens saved, and a Green Efficiency Score from 0 to 100 with an improvement tip.
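Two of these metrics follow directly from unit conversion and subtraction, which the sketch below illustrates. The `ResponseMetrics` class and its field names are hypothetical, and the Green Efficiency Score is left out because its formula is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseMetrics:
    """Hypothetical container for the per-response figures described above."""

    energy_mwh: float      # energy used by this response, in milliwatt-hours
    grid_g_per_kwh: float  # grid carbon intensity, in gCO2 per kWh
    tokens_before: int     # prompt tokens before optimization
    tokens_after: int      # prompt tokens after optimization

    @property
    def co2_grams(self) -> float:
        # 1 mWh = 1e-6 kWh, so grams = mWh * 1e-6 * (g per kWh)
        return self.energy_mwh * 1e-6 * self.grid_g_per_kwh

    @property
    def tokens_saved(self) -> int:
        return self.tokens_before - self.tokens_after
```

For example, a 48.0 mWh response on a grid at an assumed 400 gCO₂/kWh corresponds to about 0.019 g of CO₂.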

Environmental Impact

40,136 Prompts Optimized
42.4 Wh Energy Saved
14.8g CO₂ Avoided
183g CO₂ Avoided (Benchmark)
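The first three totals above can be cross-checked with simple arithmetic: dividing energy saved by prompts gives the average saving per prompt, and dividing CO₂ avoided by energy saved gives the grid intensity those totals imply. The variable names are illustrative, and the benchmark figure is left out because its basis is not described here.

```python
# Totals reported above.
PROMPTS_OPTIMIZED = 40_136
ENERGY_SAVED_WH = 42.4
CO2_AVOIDED_G = 14.8

# Average energy saved per prompt, converted from Wh to mWh.
saved_per_prompt_mwh = ENERGY_SAVED_WH * 1000 / PROMPTS_OPTIMIZED

# Carbon intensity implied by the two totals, in gCO2 per kWh.
implied_intensity_g_per_kwh = CO2_AVOIDED_G / (ENERGY_SAVED_WH / 1000)

if __name__ == "__main__":
    print(f"{saved_per_prompt_mwh:.2f} mWh saved per prompt")
    print(f"{implied_intensity_g_per_kwh:.0f} gCO2/kWh implied grid intensity")
```

This works out to roughly 1.06 mWh saved per prompt and an implied intensity of about 349 gCO₂/kWh.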

Technical Implementation

Framework Components:

Open Source: The entire framework is available on GitHub for developers to integrate green AI into their applications.

Project Links

Try Live Demo
GitHub Repository
HuggingFace Space
View Documentation