Categories: Privacy

Google’s TensorFlow Drops YAML Support Due to Code Execution Flaw

Google’s TensorFlow Drops YAML Support Due to Code Execution Flaw

TensorFlow, a popular Python-based machine learning and artificial intelligence project developed by Google has dropped support for YAML, to patch a critical code execution vulnerability.

YAML or Yet Another Markup Language is a convenient choice among developers looking for a human-readable data serialization language for handling configuration files and data in transit.

Untrusted deserialization vulnerability in TensorFlow

Maintainers behind both TensorFlow and Keras, a wrapper project for TensorFlow, have patched an untrusted deserialization vulnerability that stemmed from unsafe parsing of YAML.

Tracked as CVE-2021-37678,  the critical flaw enables attackers to execute arbitrary code when an application deserializes a Keras model provided in the YAML format.

Deserialization vulnerabilities typically occur when an application reads malformed or malicious data originating from inauthentic sources.

After an application reads and deserializes the data, it may crash resulting in a Denial of Service (DoS) condition, or worse, execute the attacker’s arbitrary code.

This YAML deserialization vulnerability, rated a 9.3 in severity, was responsibly reported to TensorFlow maintainers by security researcher Arjun Shibu.

Also Read: Top 11 Ultimate Cold Calling Guidelines To Boost Your Sales

And the source of the flaw, you ask? The notorious “yaml.unsafe_load()” function in TensorFlow code:

Vulnerable yaml.unsafe_load function call in TensorFlow (GitHub)

The “unsafe_load” function is known to deserialize YAML data rather liberally—it resolves all tags, “even those known to be unsafe on untrusted input.”

This means, ideally “unsafe_load” should only be called on input that comes from a trusted source and is known to be free of any malicious content.

Should that not be the case, attackers can exploit the deserialization mechanism to execute code of their choice by injecting malicious payload in the YAML data which is yet to be serialized.

An example Proof-of-Concept (PoC) exploit shared in the vulnerability advisory demonstrates just this:

from tensorflow.keras import models

payload = '''
!!python/object/new:type
args: ['z', !!python/tuple [], {'extend': !!python/name:exec }]
listitems: "__import__('os').system('cat /etc/passwd')"
'''
  
models.model_from_yaml(payload)

TensorFlow drops YAML altogether in favor of JSON

After the vulnerability was reported, TensorFlow decided to drop YAML support altogether and use JSON deserialization instead.

“Given that YAML format support requires a significant amount of work, we have removed it for now,” say the project maintainers in the same advisory.

“The methods `Model.to_yaml()` and `keras.models.model_from_yaml` have been replaced to raise a `RuntimeError` as they can be abused to cause arbitrary code execution,” also explain the release notes associated with the fix.

“It is recommended to use JSON serialization instead of YAML, or, a better alternative, serialize to H5.”

It is worth noting, TensorFlow is not the first or only project found to be using YAML’s unsafe_load. The function’s use is rather prevalent in Python projects.

Also Read: Management Training PDF for Effective Managers and Leaders

GitHub shows thousands of search results referencing the function, with some developers proposing improvements:

Many repos on GitHub have used and use YAML’s unsafe load function (GitHub)

Fix for CVE-2021-37678 is expected to arrive in TensorFlow version 2.6.0, and will also be backported into prior versions 2.5.1, 2.4.3, and 2.3.4, state the maintainers.

Privacy Ninja

Recent Posts

Enhancing Website Security: The Importance of Efficient Access Controls

Importance of Efficient Access Controls that every Organisation in Singapore should take note of. Enhancing…

2 weeks ago

Prioritizing Security Measures When Launching Webpage

Prioritizing Security Measures When Launching a Webpage That Every Organisation in Singapore should take note…

2 weeks ago

The Importance of Regularly Changing Passwords for Enhanced Online Security

Importance of Regularly Changing Passwords for Enhance Online Security that every Organisation in Singapore should…

3 weeks ago

Mitigating Human Errors in Organizations: A Comprehensive Approach to Data Protection and Operational Integrity

Comprehensive Approach to Data Protection and Operational Integrity that every Organsiation in Singapore should know…

3 weeks ago

The Importance of Pre-Launch Testing in IT Systems Implementation

Here's the importance of Pre-Launch Testing in IT Systems Implementation for Organisations in Singapore. The…

4 weeks ago

Understanding Liability in IT Vendor Relationships

Understanding Liability in IT Vendor Relationships that every Organisation in Singapore should look at. Understanding…

1 month ago