
Measuring the carbon footprint of your Python applications

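With environmental targets an ever-growing concern, building applications that are as efficient as possible is more important than ever. Data is key to understanding the environmental impact of an application.

It can inform design and architectural choices throughout the software development lifecycle. For example, it can help answer questions such as whether moving a component from a platform like Amazon EC2 to a serverless offering like AWS Lambda is worthwhile.

With this in mind, how do you start measuring the carbon footprint of your software, and start turning that data into a more readable format? In the case of Python applications, hopefully this post will help.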

What is CodeCarbon?

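CodeCarbon is an open-source Python package that was built to compare the carbon footprint of various machine learning models, but it can be used for more general applications. It takes the infrastructure hosting the code, the location of the host and the execution time to estimate the amount of carbon dioxide equivalent (CO2e) emitted in a single execution.

The package does this by monitoring the energy consumption of the application and then multiplying it by the carbon intensity. Different energy mixes are used around the world (and even in individual data centres), each with a different carbon intensity. This varying energy mix means that the statistics need to be localised, as some regions get more of their electricity from fossil fuels, and some fossil fuels have a higher carbon intensity than others.

All of this information can be used to better understand the environmental impact of, and resources being used by, the software, which allows you to optimise it to reduce them.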
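
The calculation described here comes down to a single multiplication. The sketch below is not CodeCarbon's implementation; the function name and the carbon intensity figures are made up purely to show why the same workload can have a very different footprint depending on where it runs.

```python
def estimate_emissions_kg(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Return estimated emissions in kg CO2e for the energy consumed."""
    return energy_kwh * intensity_g_per_kwh / 1000  # grams -> kilograms


# Hypothetical carbon intensities in gCO2e per kWh (not real regional data).
ILLUSTRATIVE_INTENSITY = {"low_carbon_grid": 50.0, "fossil_heavy_grid": 700.0}

energy_used_kwh = 2.0  # the same workload, run in two different locations
for grid, intensity in ILLUSTRATIVE_INTENSITY.items():
    print(grid, estimate_emissions_kg(energy_used_kwh, intensity), "kg CO2e")
```

With these illustrative numbers, the fossil-heavy grid produces fourteen times the emissions for identical energy use, which is why the figures need to be localised.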

How to install CodeCarbon

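You can use the Python Package Index (PyPI) repository to install CodeCarbon. This can be done by including the package name, codecarbon, in a requirements file used in your install scripts, or by running the command pip install codecarbon.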

How to monitor emissions

Implementing CodeCarbon may feel familiar, a bit like using a timer or a progress bar. There are three different ways to use CodeCarbon:

  • As an object
  • As a decorator
  • As a context manager

These can be combined to provide more granular data as well as a broader overview.

Adding the instrumentation does come with an overhead. A test using the object method added around 1.2 seconds to the execution time, which resulted in 0.000043 kWh of additional energy used. While this overhead is regrettable, the benefits of collecting the data outweigh the small cost.

As an object

The object implementation is likely to be the most commonly used. If you want to use CodeCarbon on a function as a service (FaaS) platform such as AWS Lambda, you can initialise the object outside the handler function so that it is reused in subsequent invocations. The code for this example is available on GitHub.
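
As a rough sketch of this pattern, the snippet below creates a tracker once at module level and reuses it inside a Lambda-style handler. To keep it runnable without CodeCarbon installed, it uses a hypothetical StandInTracker class; the handler name and the shape of the event are also assumptions. With the real package you would create an EmissionsTracker from codecarbon instead. Whether a stopped tracker can be started again cleanly may depend on the CodeCarbon version, so treat this as an illustration rather than the project's actual code.

```python
class StandInTracker:
    """Hypothetical stand-in with the start()/stop() shape of a tracker object."""

    def __init__(self):
        self.runs = 0

    def start(self):
        self.runs += 1

    def stop(self):
        return 0.0  # a real tracker would return the estimated kg of CO2e


# Created once at import time, outside the handler, so warm invocations reuse it.
# With CodeCarbon installed this would be: tracker = EmissionsTracker()
tracker = StandInTracker()


def handler(event, context):
    """Hypothetical Lambda-style handler that tracks each invocation."""
    tracker.start()
    try:
        result = sum(event["values"])  # placeholder for the business logic
    finally:
        emissions_kg = tracker.stop()  # always stop, even if the logic raises
    return {"result": result, "emissions_kg": emissions_kg}
```

Stopping the tracker in a finally block means a failed invocation is still measured.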

As a decorator

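The second option is to use a decorator. The decorator option is useful if you only want to track the emissions of a limited number of functions, or of the application as a whole. This approach is unlikely to be suitable for monitoring a whole application in a FaaS context, as the output file may be lost.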

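
CodeCarbon ships its own decorator, track_emissions, which is placed above the function to be measured. To show what a decorator like this does without requiring the package, the sketch below hand-rolls an equivalent around any object with start() and stop() methods. It is not CodeCarbon's code; tracked, StandInTracker and resize_images are hypothetical names.

```python
import functools


class StandInTracker:
    """Hypothetical stand-in; a real tracker would measure energy use."""

    def start(self):
        pass

    def stop(self):
        return 0.0  # estimated kg of CO2e in a real tracker


def tracked(tracker_factory):
    """Hand-rolled decorator: wrap each call in a fresh tracker's start/stop."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            tracker = tracker_factory()
            tracker.start()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the figure for the most recent call on the wrapper itself.
                wrapper.last_emissions_kg = tracker.stop()
        wrapper.last_emissions_kg = None
        return wrapper
    return decorator


@tracked(StandInTracker)
def resize_images(count):
    """Placeholder for the one function we want to measure."""
    return [f"image-{i}" for i in range(count)]
```

Only the decorated function is measured, which is what makes this option a good fit when a handful of functions matter more than the rest.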

As a context manager

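Finally, we can use a context manager to wrap the code we want to monitor. The context manager works much like the object method, but is a little more concise. The downside is that, when used in a FaaS context, the tracker is not reused in subsequent invocations.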

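
The trade-off can be seen in a small sketch. It again uses a hypothetical StandInTracker so that it runs without CodeCarbon installed; with the real package, the with statement would wrap an EmissionsTracker from codecarbon. The counter exists only to show that a new tracker is built on every call.

```python
class StandInTracker:
    """Hypothetical stand-in that supports the with statement."""

    created = 0  # counts how many trackers have been built

    def __init__(self):
        StandInTracker.created += 1

    def __enter__(self):
        return self  # a real tracker would start measuring here

    def __exit__(self, exc_type, exc, tb):
        return False  # a real tracker would stop here; exceptions still propagate


def handler(event, context):
    """Hypothetical Lambda-style handler using a tracker per invocation."""
    # A new tracker is built on every call, so nothing carries over between invocations.
    with StandInTracker():
        return sum(event["values"])  # placeholder for the business logic
```

Calling handler twice leaves StandInTracker.created at 2, whereas the object approach builds its tracker only once.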

How to make your emissions data more visible

Each of these implementations outputs a file. The file contains data including the emissions in kilograms (kg), the energy consumption in kilowatt-hours (kWh) and the country in which the infrastructure is located.


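Once this file is available, we can extract the data to make it more visible and track it over time. The file lets us see whether a change to the business logic has had a positive or negative effect on the carbon footprint. It also tells us how large that impact is.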
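
As a minimal sketch of that extraction step, the snippet below parses a CSV file with Python's standard library and compares the first and last runs. The column names (timestamp, emissions, energy_consumed, country_name) are assumed to match the file CodeCarbon writes, so check them against the header of your own file. The sample rows are invented for illustration.

```python
import csv
import io

# Illustrative sample only; a real file has more columns and measured values.
SAMPLE_CSV = """timestamp,emissions,energy_consumed,country_name
2024-01-01T10:00:00,0.00012,0.00050,United Kingdom
2024-01-02T10:00:00,0.00010,0.00042,United Kingdom
"""


def read_runs(csv_file):
    """Parse each row into a dict with numeric emissions and energy values."""
    return [
        {
            "timestamp": row["timestamp"],
            "emissions_kg": float(row["emissions"]),
            "energy_kwh": float(row["energy_consumed"]),
            "country": row["country_name"],
        }
        for row in csv.DictReader(csv_file)
    ]


runs = read_runs(io.StringIO(SAMPLE_CSV))
# A negative change means the latest run emitted less than the first one.
change = runs[-1]["emissions_kg"] - runs[0]["emissions_kg"]
print("improved" if change < 0 else "not improved", f"by {abs(change):.5f} kg CO2e")
```

To read a real file, pass an open file object to read_runs instead of the in-memory sample.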

One way to make the data more visible is to submit it as custom metrics to Amazon CloudWatch. A code snippet showing an example of this is available on GitHub.
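
A hedged sketch of what such a submission might look like is shown below. It builds the payload in plain Python and hands it to a client passed in as an argument, so it runs without AWS access. The metric names, the CarbonFootprint namespace, the Country dimension and the shape of the run dictionary are all assumptions made for illustration. The client is expected to offer the put_metric_data call that boto3's CloudWatch client provides.

```python
def build_metric_data(run):
    """Turn one run's figures into CloudWatch custom metric entries."""
    dimensions = [{"Name": "Country", "Value": run["country"]}]
    return [
        {
            "MetricName": "EmissionsKg",  # hypothetical metric name
            "Value": run["emissions_kg"],
            "Unit": "None",  # CloudWatch has no dedicated unit for kg
            "Dimensions": dimensions,
        },
        {
            "MetricName": "EnergyKWh",  # hypothetical metric name
            "Value": run["energy_kwh"],
            "Unit": "None",
            "Dimensions": dimensions,
        },
    ]


def publish(cloudwatch_client, run, namespace="CarbonFootprint"):
    """Send the entries using a client that offers put_metric_data."""
    cloudwatch_client.put_metric_data(
        Namespace=namespace, MetricData=build_metric_data(run)
    )
```

With boto3 installed and AWS credentials configured, the client would typically be created with boto3.client("cloudwatch"). Passing it in as an argument also makes the function easy to test with a fake client.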

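You can even add these CloudWatch metrics to a dashboard, making them easy to access and monitor. In terms of pricing, Amazon’s free tier provides up to 3 dashboards, with each additional dashboard costing $3 per month. It is worth noting that custom metrics are not included in the free tier, and are billed according to the number of custom metrics tracked and the number of API requests made to record values against those metrics.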

We’ve shared example project code on GitHub.

How does this help my organisation?

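Political and business leaders around the world use large amounts of data from a variety of sources to make informed decisions. This data could be anything, from unemployment figures to recycling rates. Decisions become difficult when there is plenty of data, but no dashboards or reports that make the insights it contains easy to see.

Some organisations are starting to use carbon budgets to help meet their environmental targets. Meanwhile, researchers and politicians have begun to consider taxing businesses on their emissions. A carbon tax would penalise the organisations with the largest carbon footprints, while providing more funding for green innovation.

With carbon budgets in place, greater data visibility is useful for digital leaders who want to monitor this data more closely. Emissions figures could be used like code coverage, with the aim of keeping emissions close to or below the level they were at before a new change was introduced. The figures could also influence design or architectural choices, particularly with regard to proofs of concept.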
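
To make the code coverage comparison concrete, here is a minimal sketch of a check that could run alongside a test suite. The function name, the 5% tolerance and the figures are all hypothetical; a real threshold would come from your own carbon budget.

```python
def within_budget(baseline_kg, current_kg, tolerance=0.05):
    """Return True if current emissions are at most the baseline plus the tolerance."""
    return current_kg <= baseline_kg * (1 + tolerance)


baseline_kg, current_kg = 0.00012, 0.00015  # made-up figures for two builds
if not within_budget(baseline_kg, current_kg):
    print("Emissions rose beyond the allowed tolerance; review the change.")
```

As with a coverage threshold, the point is less the exact number and more that a regression becomes visible at the moment the change is introduced.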

Both of these uses help drive down emissions. Knowing the numbers makes people more aware of the resources used, which means we don’t make decisions that cost the Earth.

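Today we’ve taken a deep dive into Python applications, but this is just one of many ways organisations can use data to reduce their energy consumption. If you’re interested in assessing the environmental impact of your entire digital estate, or harnessing the potential of your data to make better decisions, we’d love to help. You can find out more about our data services, or just drop us a line.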
