Green Data Pipelines: AI-Driven Energy Optimization for Sustainable Cloud Workloads
Session
Computer Science and Communication Engineering
Description
The rapid proliferation of cloud computing workloads has intensified concerns about the high energy demand and environmental impact of large-scale data processing. As infrastructures expand, data centers now account for a growing share of global electricity consumption and carbon emissions. This study presents a green data-pipeline architecture that integrates AI-driven scheduling with advanced resource-management techniques to reduce energy use in cloud-native environments. The architecture combines modern data-engineering platforms, including Apache Airflow for workflow orchestration, Databricks (Spark) for computation, PostgreSQL as an analytical warehouse, and dbt for data transformation. It is coordinated by a reinforcement-learning agent that dynamically optimizes workload placement and resource allocation. Using real-time monitoring and predictive modeling, the AI scheduler aligns task execution with renewable-energy availability and workload fluctuations. Simulation results show meaningful reductions in energy consumption and carbon emissions compared with conventional static scheduling, while maintaining performance and operational stability. Although further research is required to validate scalability and generalizability across heterogeneous cloud settings, the proposed framework demonstrates strong potential to enhance the sustainability of data pipelines and promote environmentally responsible computing practices across the industry.
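As a rough illustration of what aligning task execution with renewable-energy availability can mean in practice, the sketch below picks a start hour for a deferrable batch task from an hourly forecast. It is a simple greedy heuristic, not the reinforcement-learning agent the study describes. The function name `pick_start_hour`, its parameters, and the forecast values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def pick_start_hour(renewable_share: Sequence[float], duration: int, deadline: int) -> int:
    """Return the start hour whose run window has the highest mean renewable share.

    renewable_share[h] is an assumed forecast in [0, 1] for hour h.
    The task runs for `duration` consecutive hours and must finish
    no later than hour `deadline` (exclusive upper bound).
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    # Latest hour at which the task can start and still meet the deadline.
    last_start = min(deadline, len(renewable_share)) - duration
    if last_start < 0:
        raise ValueError("task does not fit before the deadline")

    best_start, best_score = 0, float("-inf")
    for start in range(last_start + 1):
        # Mean renewable share over the hours the task would occupy.
        score = sum(renewable_share[start:start + duration]) / duration
        if score > best_score:  # strict comparison keeps the earliest window on ties
            best_start, best_score = start, score
    return best_start


# Illustrative forecast only; these are not measurements from the study.
forecast = [0.10, 0.15, 0.40, 0.70, 0.65, 0.30, 0.20, 0.10]
start = pick_start_hour(forecast, duration=2, deadline=8)
print(f"Start the task at hour {start}")
```

A learned policy of the kind the abstract describes would replace the fixed scoring rule with a decision that also accounts for workload fluctuations and resource allocation. In an orchestrator such as Apache Airflow, the chosen hour could then be used to delay the start of the corresponding task.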
Keywords:
Sustainable computing, green data pipelines, cloud energy efficiency, reinforcement learning, dynamic scheduling
Proceedings Editor
Edmond Hajrizi
ISBN
978-9951-982-41-2
Location
UBT Lipjan, Kosovo
Start Date
25-10-2025 9:00 AM
End Date
26-10-2025 6:00 PM
DOI
10.33107/ubt-ic.2025.94
Recommended Citation
Salihu, Erina and Syla, Anduena, "Green Data Pipelines: AI-Driven Energy Optimization for Sustainable Cloud Workloads" (2025). UBT International Conference. 26.
https://knowledgecenter.ubt-uni.net/conference/2025UBTIC/CS/26
