06-Natural Language Processing

This tutorial shows how to run Natural Language Processing (NLP) tasks on ARTEMISA. It focuses on the operational side, that is, preparing the scripts and submitting the jobs, and is heavily based on this tutorial from Hugging Face, which covers the underlying material in more depth.

Setup

First, make sure your conda environment is activated:

$ conda activate artemisa-tuto

Now, install transformers together with its sentencepiece extra:

(artemisa-tuto) $ pip install "transformers[sentencepiece]"

Pipelines

A pipeline connects a model with the preprocessing and postprocessing steps it needs, so that we can pass in raw text and get back an intelligible answer. Here is a simple script, simple_pipeline.py:

#!/usr/bin/env python

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
res1 = classifier("I've been waiting for a HuggingFace course my whole life.")
res2 = classifier(
    ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!"]
)

print("Result 1:", res1)
print("Result 2:", res2)

And the corresponding submission file, simple_pipeline.sub:

universe = vanilla

executable              = simple_pipeline.py

log                     = condor_logs/log.log
output                  = condor_logs/outfile.out
error                   = condor_logs/errors.err

notify_user = artemisa.user@ific.uv.es
notification = always

getenv = True

request_gpus = 1

queue
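The request_gpus = 1 line asks the scheduler for one GPU. A quick way to see what the job was actually given is to print the GPU list from inside the script. The following is only a minimal sketch: it assumes that the batch system exports the CUDA_VISIBLE_DEVICES environment variable for GPU jobs, which this tutorial does not state, and visible_gpus is a hypothetical helper name.

```python
#!/usr/bin/env python
import os


def visible_gpus(environ=os.environ):
    """Return the GPU identifiers listed in CUDA_VISIBLE_DEVICES, if any."""
    raw = environ.get("CUDA_VISIBLE_DEVICES", "")
    # An unset or empty variable gives an empty list here; on its own
    # this does not prove that the node has no GPU.
    return [item.strip() for item in raw.split(",") if item.strip()]


if __name__ == "__main__":
    gpus = visible_gpus()
    print("Visible GPUs:", gpus if gpus else "none reported")
```

Calling visible_gpus() at the top of a job script and printing the result leaves a record in the job's output file that can be checked after the run.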

Caution

Because the Python file is used directly as the job executable, it must have execute permission: chmod +x simple_pipeline.py

Now you can submit the job:

(artemisa-tuto) $ condor_submit simple_pipeline.sub

In the output file (condor_logs/outfile.out) we can see the results of the sentiment analysis:

Result 1: [{'label': 'POSITIVE', 'score': 0.9598046541213989}]
Result 2: [{'label': 'POSITIVE', 'score': 0.9598046541213989}, {'label': 'NEGATIVE', 'score': 0.9994558691978455}]
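Each score above is the probability the model assigns to the reported label. To make the postprocessing step of a pipeline more concrete, here is a minimal sketch of how raw model outputs (logits) could be turned into such a label and score, assuming a softmax over two classes. The logit values are invented for illustration, and the softmax and postprocess functions are hypothetical helpers, not part of transformers.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits, labels=("NEGATIVE", "POSITIVE")):
    """Pick the most probable label and return it in the pipeline's output shape."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "score": probs[best]}


# Made-up logits for illustration only, not taken from a real model run
print(postprocess([-1.5, 1.6]))
```

The larger the gap between the two logits, the closer the score gets to 1, which is why a strongly worded sentence such as "I hate this so much!" receives a score very close to 1.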

Another available pipeline is text generation. The process is analogous. First, the Python script that performs the task, text_gen.py:

#!/usr/bin/env python

from transformers import pipeline

generator = pipeline("text-generation")
res = generator("I hope ARTEMISA helps me in")

print(res)

Give it execute permission with chmod +x text_gen.py.

And the submission file, text_gen.sub:

universe = vanilla

executable              = text_gen.py

log                     = condor_logs/log.log
output                  = condor_logs/outfile.out
error                   = condor_logs/errors.err

notify_user = artemisa.user@ific.uv.es
notification = always

getenv = True

request_gpus = 1

queue
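The two submission files in this tutorial differ only in the executable line, so they could be generated from a single template. The following is a minimal sketch of that idea. SUBMIT_TEMPLATE, render_submit and write_submit are hypothetical names introduced here, and the default values are simply copied from the files above.

```python
from pathlib import Path

# Template mirroring the submission files shown in this tutorial
SUBMIT_TEMPLATE = """\
universe = vanilla

executable              = {executable}

log                     = {log_dir}/log.log
output                  = {log_dir}/outfile.out
error                   = {log_dir}/errors.err

notify_user = {email}
notification = always

getenv = True

request_gpus = {gpus}

queue
"""


def render_submit(executable, email="artemisa.user@ific.uv.es",
                  log_dir="condor_logs", gpus=1):
    """Fill the template with the values for one job."""
    return SUBMIT_TEMPLATE.format(
        executable=executable, email=email, log_dir=log_dir, gpus=gpus
    )


def write_submit(executable, directory="."):
    """Write <script stem>.sub into `directory` and return its path."""
    path = Path(directory) / (Path(executable).stem + ".sub")
    path.write_text(render_submit(executable))
    return path
```

For example, write_submit("text_gen.py") would create text_gen.sub in the current directory, ready to be passed to condor_submit.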

After submitting the job with condor_submit text_gen.sub and waiting for it to finish, enjoy the randomness of the outcome:

[{'generated_text': 'I hope ARTEMISA helps me in my career," he said.\n\nParsons said he is disappointed with Parma\'s decision to delay the transfer for two months. "I will never play in a club for two months in any other'}]
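The outcome changes from run to run most likely because the generator samples each next word from a probability distribution instead of always taking the most probable one. The toy sketch below illustrates that idea, and how fixing a random seed makes the draws repeatable. The vocabulary and probabilities are invented, and sample_next_word and sample_sequence are hypothetical helpers that do not call any model.

```python
import random

# Hypothetical next-word probabilities, invented for this illustration
NEXT_WORD_PROBS = {
    "my": 0.4,
    "the": 0.3,
    "this": 0.2,
    "every": 0.1,
}


def sample_next_word(rng):
    """Draw one word at random, weighted by its probability."""
    words = list(NEXT_WORD_PROBS)
    weights = list(NEXT_WORD_PROBS.values())
    return rng.choices(words, weights=weights, k=1)[0]


def sample_sequence(seed, length=5):
    """Draw `length` words from a generator initialised with `seed`."""
    rng = random.Random(seed)
    return [sample_next_word(rng) for _ in range(length)]


# The same seed reproduces the same draws
print(sample_sequence(seed=0) == sample_sequence(seed=0))
```

For real model runs, transformers provides a set_seed helper that should play the same role when called before the pipeline is used.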