Running Your First Python Script on Azure ML
Hey! If you haven’t created an Azure Machine Learning resource within the Azure Portal yet, you’ll need to do that to follow along with this. Here’s a quick guide.
Introduction
Today, we’ll learn how to run Python scripts in the cloud on Azure ML. All Azure ML is doing is taking some code we wrote, spinning up a Virtual Machine (VM), installing any dependencies we need, and running the script we asked it to run.
So, naturally, the first step is to write a Python script that works on your local machine. We are going to write a script that:
- Accepts an optional command line argument `--message`. The default will be "Hello, world!".
- Takes the value of `--message` and saves it to a file `logs/message.txt`. If the `logs/` directory doesn't exist, it'll create it.

On your local computer, the name of the `logs/` directory is meaningless. But, on Azure ML, this is a reserved directory specifically for saving things from your Run. So, don't change it unless you wanna have a bad time. 😉
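Before we write the real script, here's the directory-creation behavior described above sketched in isolation. This is a minimal illustration (it works inside a temporary directory so it doesn't touch your real `logs/` folder) showing that `Path.mkdir(parents=True, exist_ok=True)` is safe to call even when the directory already exists:

```python
import tempfile
from pathlib import Path

# Work inside a throwaway directory so this sketch is side-effect free
with tempfile.TemporaryDirectory() as tmp:
    logdir = Path(tmp) / 'logs'

    # First call creates logs/ (and any missing parents)...
    logdir.mkdir(exist_ok=True, parents=True)
    # ...and a second call is a no-op rather than an error
    logdir.mkdir(exist_ok=True, parents=True)

    outfile = logdir / 'message.txt'
    outfile.write_text('Hello, world!')
    message = outfile.read_text()

print(message)  # -> Hello, world!
```

This idempotence is why re-running the script never fails, whether or not `logs/` survived a previous run.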
Writing a Simple Example
Let's set up a very simple example to understand the basics. First, make a new directory `azureml-examples` and prepare some files within. Ignore `run_on_azureml.py` for now, but create the rest of the files shown below.
```
azureml-examples
├── config.json          # AzureML Resource config.json file from the portal
├── requirements.txt     # File that lists Python dependencies
├── run_on_azureml.py    # The file we'll use to execute code on AzureML
└── simple_example       # Directory holding your script + any related files
    └── run.py           # The script we'll run
```
- `config.json`

```json
{
    "subscription_id": "<YOUR SUBSCRIPTION ID>",
    "resource_group": "<YOUR RESOURCE GROUP>",
    "workspace_name": "<YOUR WORKSPACE NAME>"
}
```
- `requirements.txt`

```
azureml-defaults
mlflow
azureml-mlflow
torch
torchvision
pytorch-lightning
cmake
```
- `simple_example/run.py`

```python
import os
from argparse import ArgumentParser
from pathlib import Path
from pprint import pprint

if __name__ == '__main__':
    parser = ArgumentParser()
    parser.add_argument('--message', type=str, default='Hello, world!')
    args = parser.parse_args()

    logdir = Path('./logs')
    logdir.mkdir(exist_ok=True, parents=True)

    outfile = logdir / 'message.txt'
    outfile.write_text(args.message)

    print(f"Message: {args.message}")
    print('-' * 40)
    print()
    pprint(dict(os.environ))
    print('-' * 40)
    print(f"Current Directory: {Path.cwd()}")
    print('-' * 40)
```
Note - we won't be using the dependencies in `requirements.txt` in this first example. But, by specifying them now, we can avoid the lengthy environment preparation phase as we move on to more complex examples. More on this later!
Run it locally
First, I suggest setting up a virtual environment and installing the `requirements.txt` file we defined by running:

```
pip install -r requirements.txt
```

Then, from the `azureml-examples` directory, you should be able to run:

```
python simple_example/run.py
```
You should see a new folder appear called `logs/` with your file `logs/message.txt` saved within. If you open up that file, you should see:

```
Hello, world!
```
Run the file again, this time supplying a different value for our message argument:

```
python simple_example/run.py --message Howdy!
```

Your `message.txt` file should now contain the message you passed:

```
Howdy!
```
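If you'd like to sanity-check the argument handling without touching the filesystem, you can exercise the same parser logic directly. The parser below is rebuilt here for illustration (it mirrors the one in `run.py`, it isn't imported from it):

```python
from argparse import ArgumentParser

# Same parser definition as in simple_example/run.py
parser = ArgumentParser()
parser.add_argument('--message', type=str, default='Hello, world!')

# No args supplied -> the default kicks in
print(parser.parse_args([]).message)                        # Hello, world!

# An explicit value overrides the default
print(parser.parse_args(['--message', 'Howdy!']).message)   # Howdy!
```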
Cool, now that that's working, let's try to run it on Azure ML.
Running Scripts via the `azureml` Python SDK
Logical Flow
Here's the logic for taking that script you just ran locally and running it on Azure ML via the `azureml` Python SDK:
- Authenticate with your AzureML Workspace via the `config.json` file.
- Create or reference a Compute Target. These can be CPU-only or GPU-enabled machines. We'll use cheap CPU-only instances in this tutorial, since our code isn't using the GPU.
- Define the script you'd like to run on Azure. In our case, that's `simple_example/run.py`.
- Define the folder that holds your code and any other related helper files. In our case, that's `simple_example/`.
- Define the `Environment` in which your code will run. This is where you can point to a `requirements.txt` file if ya have one (which we do).
Below, you can see a higher-level overview of the components involved in our task today. We aren't going to use the components marked in grey for now.
Writing the runner script
This script can technically be run from anywhere. We'll run it locally, but you could also run it from a Function App, a Logic App, another VM, the backend code of your API, etc.
Keep in mind that you'll need your `config.json` file in the same directory as this script in order for it to work.
First, let's start off with some imports:
```python
from pathlib import Path

from azureml.core import (
    Environment,
    Experiment,
    Run,
    ScriptRunConfig,
    Workspace,
)
from azureml.core.compute import AmlCompute, ComputeTarget
```
We can authenticate with our AzureML instance by creating a `Workspace` object using the `config.json` file you downloaded earlier.

```python
workspace_config = "config.json"  # the path to your config.json file
ws = Workspace.from_config(workspace_config)
```
Next, we can either create or reference a `ComputeTarget`. This is where we specify the size of the VM we want to use for our runs. We do that by supplying an "instance type", which is essentially the identifier of an available machine size from Azure. You can see a list of them and their prices here.
Let's say we want to name our target `"my-compute"`. We can first check if a target with that name exists. If it does, let's use that. If not, we'll create a new `ComputeTarget` with that name given the instance type we want. Here's what that would look like:
```python
def find_or_create_compute_target(
    workspace,
    name,
    vm_size="Standard_D8_v3",
    min_nodes=0,
    max_nodes=1,
    idle_seconds_before_scaledown=900,
    vm_priority="lowpriority",
):
    if name in workspace.compute_targets:
        return ComputeTarget(workspace=workspace, name=name)
    else:
        config = AmlCompute.provisioning_configuration(
            vm_size=vm_size,
            min_nodes=min_nodes,
            max_nodes=max_nodes,
            vm_priority=vm_priority,
            idle_seconds_before_scaledown=idle_seconds_before_scaledown,
        )
        target = ComputeTarget.create(workspace, name, config)
        target.wait_for_completion(show_output=True)
        return target


compute_target_name = "my-compute"
compute_target = find_or_create_compute_target(ws, compute_target_name)
```
Next, we can define any dependencies needed by our script by creating an `Environment` and supplying the path to our `requirements.txt` file.

```python
requirements_file = 'requirements.txt'  # Assumes it's in the current directory
env = Environment.from_pip_requirements("my-pip-env", requirements_file)
```
Now we can configure the run by creating a `ScriptRunConfig`.

- `source_directory` - a directory that holds your `script` and any related files it may need. We use pathlib to specify this as the direct parent directory of our script, `simple_example/`.
- `script` - the path of the script you'd like to run, relative to `source_directory`. In our case, we can use pathlib to specify this as just the name of our file, `run.py`.
- `arguments` - any args to pass to your `script` when it executes on Azure. These are in `argparse` format, so you have to pass a list like this: `["--message", "Howdy!"]`.
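As a quick sketch of the pathlib trick we use for `source_directory` and `script`: given a single relative path to the script, `.parent` and `.name` split it into exactly the two values `ScriptRunConfig` wants:

```python
from pathlib import Path

script_path = Path('simple_example/run.py')

# .parent gives the directory to upload as source_directory
print(script_path.parent)  # simple_example

# .name gives the script path relative to that directory
print(script_path.name)    # run.py
```

This way, there's a single source of truth for where the script lives.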
```python
script_path = 'simple_example/run.py'
script_args = ['--message', 'Howdy!']

run_config = ScriptRunConfig(
    source_directory=Path(script_path).parent,
    script=Path(script_path).name,
    arguments=script_args,
    compute_target=compute_target,
    environment=env,
)
```
Finally, we submit the run, grouping it under an `Experiment`. Experiments let you group related runs together. In our case, we'll submit the run under an experiment called `simple-example`. Don't worry - if it doesn't exist, Azure will create it for you.

```python
experiment_name = 'simple-example'
exp = Experiment(ws, experiment_name)
run = exp.submit(run_config)
```
Optionally, we can stream the logs of the run directly to our terminal using the `run.wait_for_completion()` method, and then delete the cluster so it doesn't keep costing us money.

```python
# Wait for the run to finish, streaming its logs to the terminal
run.wait_for_completion(show_output=True)

# Delete our compute target so it doesn't cost us money
compute_target.delete()
```
Putting it all together
- The full script

```python
from pathlib import Path

from azureml.core import (
    Environment,
    Experiment,
    Run,
    ScriptRunConfig,
    Workspace,
)
from azureml.core.compute import AmlCompute, ComputeTarget


def find_or_create_compute_target(
    workspace,
    name,
    vm_size="Standard_D8_v3",
    min_nodes=0,
    max_nodes=1,
    idle_seconds_before_scaledown=900,
    vm_priority="lowpriority",
):
    if name in workspace.compute_targets:
        return ComputeTarget(workspace=workspace, name=name)
    else:
        config = AmlCompute.provisioning_configuration(
            vm_size=vm_size,
            min_nodes=min_nodes,
            max_nodes=max_nodes,
            vm_priority=vm_priority,
            idle_seconds_before_scaledown=idle_seconds_before_scaledown,
        )
        target = ComputeTarget.create(workspace, name, config)
        target.wait_for_completion(show_output=True)
        return target


workspace_config = 'config.json'
requirements_file = 'requirements.txt'
compute_target_name = 'my-compute'
experiment_name = 'simple-example'
script_path = 'simple_example/run.py'
script_args = ['--message', 'Howdy!']

# Authenticate with your AzureML Resource via its config.json file
ws = Workspace.from_config(workspace_config)

# The experiment in this workspace under which our runs will be grouped.
# If an experiment with the given name doesn't exist, it will be created.
exp = Experiment(ws, experiment_name)

# The compute cluster you want to run on and its settings.
# If it doesn't exist, it'll be created.
compute_target = find_or_create_compute_target(ws, compute_target_name)

# The Environment lets us define any dependencies needed to make our script run
env = Environment.from_pip_requirements("my-pip-env", requirements_file)

# A run configuration is how you define what you'd like to run.
# We give it the directory where our code is, the script we want to run,
# the environment, and the compute info.
run_config = ScriptRunConfig(
    source_directory=Path(script_path).parent,
    script=Path(script_path).name,
    arguments=script_args,
    compute_target=compute_target,
    environment=env,
)

# Submit our configured run under our experiment
run = exp.submit(run_config)

# Wait for the run to finish
run.wait_for_completion(show_output=True)

# Delete our compute target so it doesn't cost us money
compute_target.delete()
```
What happens when you run it?
It’ll first spin up a cluster for you. You can watch its status in the portal from the compute tab.
Once a node is up, the environment preparation should finish and your code will start running. Check out the experiment and the run created for you from the Experiments tab.
Once it finishes running, you should be able to see the output file we wrote in the `./logs` directory via the "Outputs + Logs" tab of the run. It should contain the message you supplied via the `--message` arg.
Conclusion
In this tutorial, you learned:
- The basics of the Azure ML Python SDK
- How to run a simple Python script on Azure ML
- How to save files from your runs so you can view/download/use them later
In the next tutorial, we’ll learn the basics of deploying models to API endpoints on AzureML.