02-CPU jobs

ARTEMISA has some resources reserved for CPU-only jobs.

Modern frameworks like TensorFlow or Keras allow the creation of data pipelines that combine processing on the CPU (e.g. data augmentation) with processing on the GPU (e.g. training). Some tasks, however, such as data preparation and augmentation, need no GPU at all and can be run in parallel across the available CPU cores, which speeds up the process.
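As a rough illustration of this idea, the sketch below spreads a simple augmentation (a horizontal flip) over several worker processes using only the Python standard library. The nested-list "images", the flip_horizontal and augment_all helpers and the worker count are hypothetical and are not part of the tutorial script, which relies on Keras instead.

```python
#!/usr/bin/env python3
# Minimal sketch (not part of the tutorial script): parallel horizontal flip on the CPU.
from multiprocessing import Pool


def flip_horizontal(image):
    """Mirror an image given as a list of rows (each row is a list of pixels)."""
    return [row[::-1] for row in image]


def augment_all(images, workers=2):
    """Flip every image, distributing the work over `workers` processes."""
    with Pool(processes=workers) as pool:
        return pool.map(flip_horizontal, images)


if __name__ == "__main__":
    # Two hypothetical 2x3 single-channel images
    images = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [0, 0, 1]]]
    print(augment_all(images))
```

Processes are used here rather than threads because a pure-Python transformation is CPU-bound, so threads would not run it in parallel.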

We propose a basic example of data preparation requesting only CPU resources, to be used in an image classifier. The target dataset is the more challenging CIFAR-10 image dataset.

Local CPU execution

First, activate the conda environment created in the first tutorial

$ conda activate artuto

We will need the following packages

(artuto) $ pip install tensorflow-datasets scipy

We are going to run the following Python code: augment_data_cpu.py

#!/usr/bin/env python3

# https://stepup.ai/train_data_augmentation_keras/

import os
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt

# Helper function to inspect the first images in a dataset
def visualize_data(images, categories, class_names, file_name):
    fig = plt.figure(figsize=(14, 6))
    fig.patch.set_facecolor('white')
    for i in range(3 * 7):
        plt.subplot(3, 7, i+1)
        plt.xticks([])
        plt.yticks([])
        plt.imshow(images[i])
        class_index = categories[i].argmax()
        plt.xlabel(class_names[class_index])
    fig.savefig(file_name)

# Hide all GPUs from TensorFlow so that the script runs on the CPU only
os.environ['CUDA_VISIBLE_DEVICES'] = "-1"

# CIFAR-10 Dataset
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(class_names)

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

x_train = x_train / 255.0
y_train = to_categorical(y_train, num_classes)

x_test = x_test / 255.0
y_test = to_categorical(y_test, num_classes)

# The above code first downloads the dataset. The included preprocessing rescales the images into the
# range [0, 1] and converts each label from a class index (integers 0 to 9) to a one-hot encoded
# categorical vector.

# Show the first images of the training set
visualize_data(x_train, y_train, class_names, 'train_samples.png')


# Specification of the augmentation parameters:
#   width shift: Randomly shift the image left or right by up to 3 pixels (3/32 of the width)
#   height shift: Randomly shift the image up or down by up to 3 pixels (3/32 of the height)
#   horizontal flip: Randomly flip the image horizontally.

width_shift = 3/32
height_shift = 3/32
flip = True

datagen = ImageDataGenerator(
    horizontal_flip=flip,
    width_shift_range=width_shift,
    height_shift_range=height_shift,
    )
datagen.fit(x_train)

# Output directory (created only if it does not exist yet)
path = "./augmented_data"
os.makedirs(path, exist_ok=True)
print("Output directory: %s" % path)

my_batch_size=32
it = datagen.flow(x_train, y_train, shuffle=False, batch_size=my_batch_size,
        save_to_dir=path, save_prefix='artemisa_ex2')

# To generate more images, call next(it) repeatedly:
#   num_images_generated = my_batch_size * repetitions
# repetitions = 100
# for _ in range(repetitions):
#     batch_images, batch_labels = next(it)
batch_images, batch_labels = next(it)

# Show samples of the augmented data
visualize_data(batch_images, batch_labels, class_names, 'augmented_samples.png')

Now we can run it in the UI without the gpurun tool, so that only the CPU is used.

(artuto) $ python augment_data_cpu.py

The code above implements data augmentation. We use the CIFAR-10 dataset, which consists of 32x32 pixel images in 10 classes. The data is split into 50k training and 10k test images.
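The preprocessing and the shift parameters used in the script can be sketched without any dependency as follows. This is only an illustration: the helpers rescale, one_hot and max_shift_pixels are hypothetical names, and the script itself uses the array division and to_categorical from Keras.

```python
# Dependency-free sketch of the preprocessing step; all helper names are hypothetical.

def rescale(pixels, max_value=255.0):
    """Map raw pixel intensities in [0, max_value] to floats in [0, 1]."""
    return [p / max_value for p in pixels]


def one_hot(class_index, num_classes=10):
    """Return a one-hot list with 1.0 at position `class_index`."""
    if not 0 <= class_index < num_classes:
        raise ValueError("class index out of range")
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]


def max_shift_pixels(fraction, size=32):
    """Largest shift in pixels for a fractional shift range on a `size`-pixel axis."""
    return round(fraction * size)


if __name__ == "__main__":
    print(rescale([0, 51, 255]))
    print(one_hot(3))               # index 3 is 'cat' in the class_names list
    print(max_shift_pixels(3 / 32))
```

The last helper shows why the script passes 3/32 as the shift range: the generator takes a fraction of the image size, and 3/32 of a 32-pixel axis is 3 pixels.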

The data generated can be found under the created directory ./augmented_data/. The script also generates two figures with sets of samples: the augmented and corresponding original images, illustrating the transformations.

Original images

Augmented set

Execution in a Worker Node

Now we will run our augmentation process in a remote Worker Node. We will make use of the following submit description file, augment_data_cpu.sub:

universe = vanilla

executable              = augment_data_cpu.py

log                     = condor_logs/test.log
output                  = condor_logs/outfile.$(Cluster).$(Process).out
error                   = condor_logs/errors.$(Cluster).$(Process).err

getenv = True

queue

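The executable referenced in the .sub file is the Python script itself, augment_data_cpu.py, whose first line (#!/usr/bin/env python3) allows it to be run directly.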

Caution

Don’t forget to give execute permission to the file referenced by executable: chmod +x augment_data_cpu.py

And prepare the directory for the output files

(artuto) $ mkdir condor_logs

Finally launch the job through HTCondor.

(artuto) $ condor_submit augment_data_cpu.sub
Submitting job(s).
1 job(s) submitted to cluster 904340.

You can check the status of this and the rest of your launched jobs with condor_q:

(artuto) [artemisa_user@mlui01 02_cpu]$ condor_q

-- Schedd: ----.----.--.-- : <---.---.---.---:----?... @ 07/23/25 14:43:35
OWNER     BATCH_NAME    SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
user      ID: 904340   7/23 14:43      _      1      _      1 904340.0

Total for query: 1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
Total for user: 1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
Total for all users: 79 jobs; 0 completed, 0 removed, 30 idle, 32 running, 17 held, 0 suspended

Again, it is possible to check the results generated in the ./augmented_data/ directory, along with train_samples.png and augmented_samples.png figures.
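One possible way to check the results once the job has finished is a small helper script like the one below. It is only a sketch: count_outputs is a hypothetical helper, and the default directory and prefix are taken from the values used in augment_data_cpu.py.

```python
#!/usr/bin/env python3
# Sketch of a post-job check; count_outputs is a hypothetical helper.
from pathlib import Path


def count_outputs(directory="./augmented_data", prefix="artemisa_ex2"):
    """Count the files in `directory` whose name starts with `prefix`."""
    folder = Path(directory)
    if not folder.is_dir():
        return 0
    return sum(1 for f in folder.iterdir() if f.is_file() and f.name.startswith(prefix))


if __name__ == "__main__":
    print("augmented images:", count_outputs())
    for figure in ("train_samples.png", "augmented_samples.png"):
        print(figure, "found" if Path(figure).is_file() else "missing")
```

Since the script draws a single batch of 32 images, roughly 32 files would be expected after a first run; files from earlier runs stay in the directory and are counted as well.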

Summary

Recap

ARTEMISA also provides CPU-only resources.
If we don't request a GPU explicitly, only CPU resources are used.
With getenv = True, the job runs with the same set of environment variables that the user had at submit time.
Don’t forget to activate the required virtual environment (e.g. conda activate artuto) before submitting the job.
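To see which environment a job actually received, a few lines like the following could be placed at the top of the script or run on their own. This is a sketch under assumptions: describe_environment is a hypothetical helper, and it relies on conda exporting CONDA_DEFAULT_ENV when an environment is activated.

```python
#!/usr/bin/env python3
# Sketch: report the environment variables that matter for this tutorial.
import os


def describe_environment(env=None):
    """Collect the relevant variables from an environment mapping (default: os.environ)."""
    env = os.environ if env is None else env
    return {
        "conda_env": env.get("CONDA_DEFAULT_ENV", "<not set>"),
        "cuda_visible_devices": env.get("CUDA_VISIBLE_DEVICES", "<not set>"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

When run inside a job, the lines it prints end up in the output file configured in the submit description file, where they can be compared with the values seen on the UI.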