Hey,
Thank you for your detailed response.
I've prepared two standalone scripts that try to replicate the issue using randomly generated data. In both scripts, I generate synthetic data of varying sizes to mirror the dynamic input sizes in my actual project. This synthetic data is then passed through a Keras model that includes a tf.keras.layers.Resizing layer.
The first script uses tf.data.Dataset to feed the model, while the second uses a generator function to yield the data in batches.
Interestingly, the memory issue seems to occur only in the script that uses tf.data.Dataset (memory grows until it fills the cache) and does not seem to occur with the generator (~1.5 GB of memory). However, in my actual code, where I use the generator approach, I do observe the memory issue (usage exceeds the memory cache). Furthermore, the issue is absent when running on CPU or on an Nvidia GPU (via Google Colab), both of which stay below 1.5 GB of memory. Anyway, you can find both scripts below.
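For reference, the memory figures above are process-level observations. A small callback like the following (a sketch assuming psutil is available; it is not part of the scripts below) can be passed to model.fit via callbacks=[...] to log resident memory per epoch:

```python
import os

import psutil  # assumed available; any process-memory monitor would work
import tensorflow as tf

class MemoryLogger(tf.keras.callbacks.Callback):
    """Print the resident set size of the training process after each epoch."""
    def on_epoch_end(self, epoch, logs=None):
        rss = psutil.Process(os.getpid()).memory_info().rss
        print(f"epoch {epoch}: {rss / 1e9:.2f} GB resident")
```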
Script using tf.data.Dataset:
```python
import numpy as np
import tensorflow as tf


def generate_data(num_samples, max_size):
    """Generate synthetic data of varying sizes."""
    data = []
    labels = []
    for _ in range(num_samples):
        size = np.random.randint(1, max_size + 1)
        data.append(np.ones((size, size)) * 255)
        labels.append(np.random.randint(0, 2))
    return data, labels


class DynamicResizeModel(tf.keras.Model):
    """A model that includes a resizing layer."""

    def __init__(self, target_size):
        super().__init__()
        self.target_size = target_size
        # Add a channel dimension, then resize to a fixed target size
        self.expand_dims = tf.keras.layers.Lambda(lambda x: tf.expand_dims(x, -1))
        self.resize = tf.keras.layers.Resizing(*target_size)
        self.flatten = tf.keras.layers.Flatten()
        self.dense = tf.keras.layers.Dense(1, activation='sigmoid')

    def call(self, inputs):
        x = self.expand_dims(inputs)
        x = self.resize(x)
        x = self.flatten(x)
        return self.dense(x)


# Synthetic training data with per-sample sizes between 1 and 1024
train_data, train_labels = generate_data(100, 1024)

# Ragged tensors accommodate the varying spatial dimensions
train_data = tf.ragged.constant(train_data)
train_labels = tf.constant(train_labels)

train_dataset = tf.data.Dataset.from_tensor_slices((train_data, train_labels))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(8)

model = DynamicResizeModel(target_size=(128, 32))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=1000)
```
Script using a generator:
```python
import numpy as np
import tensorflow as tf


def generate_data(num_samples, max_size):
    """Generate synthetic data of varying sizes."""
    data = []
    labels = []
    for _ in range(num_samples):
        size = np.random.randint(1, max_size + 1)
        data.append(np.ones((size, size)) * 255)
        labels.append(np.random.randint(0, 2))
    return data, labels


def data_generator(data, labels, batch_size):
    """Create a generator that yields batches of data indefinitely."""
    num_samples = len(data)
    indices = np.arange(num_samples)
    while True:
        for i in range(0, num_samples, batch_size):
            batch_indices = indices[i:i + batch_size]
            batch_data = tf.ragged.constant([data[idx] for idx in batch_indices], dtype=tf.float32)
            batch_labels = np.array([labels[idx] for idx in batch_indices], dtype=np.float32)
            yield batch_data, batch_labels
        # Reshuffle after each full pass over the data
        np.random.shuffle(indices)


class DynamicResizeModel(tf.keras.Model):
    """A model that includes a resizing layer."""

    def __init__(self, target_size):
        super().__init__()
        self.target_size = target_size
        # Add a channel dimension, then resize to a fixed target size
        self.expand_dims = tf.keras.layers.Lambda(lambda x: tf.expand_dims(x, -1))
        self.resize = tf.keras.layers.Resizing(*target_size)
        self.flatten = tf.keras.layers.Flatten()
        self.dense = tf.keras.layers.Dense(1, activation='sigmoid')

    def call(self, inputs):
        x = self.expand_dims(inputs)
        x = self.resize(x)
        x = self.flatten(x)
        return self.dense(x)


num_samples = 100
max_size = 1024
train_data, train_labels = generate_data(num_samples, max_size)

model = DynamicResizeModel(target_size=(1024, 128))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

batch_size = 8
train_generator = data_generator(train_data, train_labels, batch_size)

model.fit(train_generator, steps_per_epoch=num_samples // batch_size, epochs=1000)
```
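In case it helps narrow things down, the same generator with every sample padded to a fixed (max_size, max_size) shape would give the model only static input shapes. A minimal sketch (padded_generator is a hypothetical variant, not one of the scripts I ran):

```python
import numpy as np

def padded_generator(data, labels, batch_size, max_size):
    """Yield batches zero-padded to (max_size, max_size) so every batch has a static shape."""
    num_samples = len(data)
    indices = np.arange(num_samples)
    while True:
        for i in range(0, num_samples, batch_size):
            batch_indices = indices[i:i + batch_size]
            # Copy each variable-size sample into a fixed-size buffer
            batch_data = np.zeros((len(batch_indices), max_size, max_size), dtype=np.float32)
            for j, idx in enumerate(batch_indices):
                h, w = data[idx].shape
                batch_data[j, :h, :w] = data[idx]
            batch_labels = np.array([labels[idx] for idx in batch_indices], dtype=np.float32)
            yield batch_data, batch_labels
        np.random.shuffle(indices)
```

If memory stays flat with this variant on the Metal backend, that would suggest the growth is tied to the varying input shapes rather than to the data pipeline itself.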
Given these findings, I would like to understand whether this behavior is expected with the tensorflow-metal plugin or whether it is indeed an anomaly. If it is the former, could you provide guidance on optimizing my code to avoid the memory issue while using tensorflow-metal?
Looking forward to your insights.