tensorflow-metal ReLU activation fails to clip negative values on M4 Apple Silicon

Environment:

Hardware: Mac M4

OS: macOS Sequoia 15.7.4

TensorFlow-macOS Version: 2.16.2

TensorFlow-metal Version: 1.2.0

Description:

When using the tensorflow-metal plug-in for GPU acceleration on M4, the ReLU activation function (both as a layer and as an activation argument) fails to correctly clip negative values to zero. The same code works correctly when forced to run on the CPU.

Reproduction Script:

import numpy as np
import tensorflow as tf

# weights and biases = -1
weights = [np.ones((10, 5)) * -1, np.ones(5) * -1]

# input = 1
data = np.ones((1, 10))

# comment this line => GPU => get negative values
# uncomment this line => CPU => no negative values
# tf.config.set_visible_devices([], 'GPU') 

# create model
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(5, activation='relu')
])

# set weights
model.layers[0].set_weights(weights)

# get output
output = model.predict(data)

# check if negative is present
print(f"min value: {output.min()}")
print(f"is negative present? {np.any(output < 0)}")
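As a sanity check, the expected result can be computed in plain NumPy: each pre-activation is 10 * (-1) + (-1) = -11, so ReLU should clip every output to 0.

```python
import numpy as np

# Reference forward pass without TensorFlow.
W = np.ones((10, 5)) * -1    # weights = -1
b = np.ones(5) * -1          # biases = -1
x = np.ones((1, 10))         # input = 1

z = x @ W + b                # pre-activation: every entry is -11
expected = np.maximum(z, 0)  # ReLU: everything clipped to 0

print(z.min())               # -11.0
print(expected.min())        # 0.0
```

The negative values reported on the GPU path are consistent with the pre-activations passing through unclipped.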

It sounds like you're encountering an issue with the ReLU activation function on the Mac M4 using TensorFlow-Metal. Let's go through a few steps to troubleshoot and potentially resolve this issue:

Potential Causes and Solutions

TensorFlow-Metal Version Compatibility:

Ensure that you are using the latest compatible version of tensorflow-metal; bugs like this are sometimes fixed in newer releases. Check for updates via pip:

pip install --upgrade tensorflow-macos tensorflow-metal

ReLU Implementation in Metal:

tensorflow-metal may route the built-in ReLU through a different (and here, apparently faulty) GPU kernel than the CPU path uses. As a workaround, you can implement ReLU manually with tf.maximum, which may avoid the problematic kernel:

def custom_relu(x):
    return tf.maximum(x, 0.0)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(5, activation=custom_relu)
])

Environment and Configuration:

Double-check which devices TensorFlow actually sees. To confirm the GPU path is the culprit, uncomment the line that hides the GPU and verify that the CPU output contains no negative values:

tf.config.set_visible_devices([], 'GPU')

Restart your Python interpreter or Jupyter kernel after changing the configuration so it takes effect.
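A minimal sketch for inspecting device visibility (on Apple Silicon with tensorflow-metal installed, the Metal device shows up as a GPU):

```python
import tensorflow as tf

# Show every device TensorFlow registered; with tensorflow-metal
# installed, the Apple GPU appears as a 'GPU' physical device.
print(tf.config.list_physical_devices())

# Hide the GPU for this process (must be called before TensorFlow
# initializes its devices), then confirm only CPUs remain visible.
tf.config.set_visible_devices([], 'GPU')
print(tf.config.list_logical_devices())
```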

Test with Explicit Data Types:

Behavior can differ across floating-point types. Try specifying the dtype explicitly (np.float32, or np.float16 to test a lower-precision path) and see if the behavior changes:

data = np.ones((1, 10), dtype=np.float32)
weights = [np.ones((10, 5), dtype=np.float32) * -1,
           np.ones(5, dtype=np.float32) * -1]

Check for Known Issues:

Look up any known issues or discussions related to TensorFlow-Metal on GitHub or community forums. There might be specific patches or advice for your hardware configuration.

Fallback to CPU:

As a temporary measure, you can force the model to run on the CPU to bypass the issue until a fix is available:

tf.config.set_visible_devices([], 'GPU')
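Alternatively, here is a sketch that pins only the forward pass to the CPU with a tf.device scope rather than hiding the GPU for the whole process (the model is rebuilt as in the reproduction script; whether this avoids the bug depends on where Keras actually places the kernels, so the global set_visible_devices call above is the more reliable option):

```python
import numpy as np
import tensorflow as tf

# Rebuild the model from the reproduction script.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(5, activation='relu'),
])
model.layers[0].set_weights([np.ones((10, 5), dtype=np.float32) * -1,
                             np.ones(5, dtype=np.float32) * -1])

data = np.ones((1, 10), dtype=np.float32)

# Pin just this forward pass to the CPU.
with tf.device('/CPU:0'):
    output = model(data).numpy()

print(output.min())  # expected 0.0 on the CPU path (negatives clipped)
```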

Conclusion

Implement these suggestions and see if any of them resolve the issue with negative values not being clipped to zero. If the problem persists, consider reaching out to the TensorFlow-Metal maintainers or community for further assistance, providing them with details about your setup and the reproduction script.
