
Asynchronous programming in Python tutorial

Asynchronous programming in Python improves efficiency for I/O-bound applications, but it's not a performance cure-all. Here's how async in Python works and when to use it.

Asynchronous programming in Python enables programmers to write code that makes progress on multiple tasks concurrently without using multiple threads or processes.

The asyncio library, and asynchronous constructs such as coroutines, help Python applications efficiently perform nonblocking I/O operations. This is especially useful for tasks such as handling thousands of network requests, file operations or other I/O-bound processes where waiting for responses bogs down total execution time.

However, async is often misunderstood as multithreading, which leads to incorrect assumptions about its performance benefits.

Async vs. multithreading: Similarities and differences

A common misconception is that asynchronous programming is the same as multithreading. Although both approaches aim to achieve concurrency, they are fundamentally different in how they work.

In multithreading, multiple threads can run in parallel, ideally utilizing multiple CPU cores. However, in CPython, the standard Python implementation, the Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, which prevents true parallel execution of Python code across threads. As a result, threads in Python are often limited by context-switching overhead and the GIL itself, particularly when performing CPU-bound tasks.

Asynchronous programming, on the other hand, is not about parallelism, but rather efficient management of tasks that involve waiting, such as I/O operations. Async code enables a program to continue running other tasks while it waits for an operation such as a network request or file read to complete.

Importantly, an asyncio event loop runs in a single thread, just as ordinary synchronous code does. Async functions do not turn Python into a parallel-processing machine. Instead, they help that single thread handle more operations concurrently by pausing tasks that await external events and resuming them when ready. This makes async ideal for I/O-bound tasks, but not for CPU-intensive work.
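
The single-thread point can be checked directly. The following is a minimal sketch, assuming short simulated delays stand in for real I/O; the worker() and run_workers() names are illustrative only. Each coroutine records the ID of the thread it runs on before and after it pauses.

```python
import asyncio
import threading


async def worker(name: str, delay: float, seen_threads: set) -> str:
    # Record which OS thread runs this coroutine before and after the pause.
    seen_threads.add(threading.get_ident())
    await asyncio.sleep(delay)  # Hands control back to the event loop while "waiting"
    seen_threads.add(threading.get_ident())
    return name


async def run_workers() -> tuple:
    seen_threads = set()
    results = await asyncio.gather(
        worker("a", 0.03, seen_threads),
        worker("b", 0.02, seen_threads),
        worker("c", 0.01, seen_threads),
    )
    return results, seen_threads


results, seen_threads = asyncio.run(run_workers())
print(results)            # ['a', 'b', 'c'] (gather() keeps argument order)
print(len(seen_threads))  # 1 (all three coroutines shared one thread)
```

All three workers overlap their waiting time, yet only one thread ID is ever recorded.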

Python async examples

The best use cases for async programming in Python are those dominated by I/O operations. Examples include the following:

  • Web scraping large numbers of pages.
  • Making concurrent API requests.
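  • Reading and writing large files, typically through a thread pool or a third-party library, because asyncio has no built-in asynchronous file API.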
  • Handling multiple client connections in a web server.
  • Database queries, especially over a network.

These scenarios all spend a lot of time waiting for external systems to respond. Traditional synchronous code would block and wait, and effectively stop the execution of the program. Async code, however, can initiate many such operations simultaneously, and efficiently switch between them as responses come in.
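
The difference between blocking on each operation and starting them all at once can be sketched as follows. This is a rough illustration, assuming a 0.1-second asyncio.sleep() stands in for a real network call; fake_request(), sequential() and concurrent() are hypothetical names.

```python
import asyncio
import time


async def fake_request(i: int) -> int:
    await asyncio.sleep(0.1)  # Stand-in for time spent waiting on a remote system
    return i * 2


async def sequential(n: int) -> list:
    # Each request is awaited before the next one starts.
    return [await fake_request(i) for i in range(n)]


async def concurrent(n: int) -> list:
    # All requests are started together and awaited as a group.
    return await asyncio.gather(*(fake_request(i) for i in range(n)))


def timed(coro_factory, n: int) -> tuple:
    start = time.perf_counter()
    result = asyncio.run(coro_factory(n))
    return result, time.perf_counter() - start


seq_result, seq_time = timed(sequential, 5)
con_result, con_time = timed(concurrent, 5)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

The sequential version waits for roughly the sum of the delays, while the concurrent version waits for roughly the longest single delay. Both return the same results.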

When not to use async in Python

On the other hand, tasks that are mostly CPU-bound are poor candidates for async. Here are some examples:

  • Image processing.
  • Data analysis and mathematical computation.
  • Machine learning model training.
  • Complex algorithms, such as simulations or sorting large data sets.

Async won't speed up these tasks because the bottleneck is the processor, not I/O. Attempts to use async in CPU-heavy scenarios most likely will degrade performance, adding overhead without any benefit. For these cases, parallelism through multiprocessing or other options that bypass the GIL is a better approach.
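
One way to see why async does not help here is to watch what happens to other coroutines while a computation runs. The sketch below is illustrative only; cpu_heavy(), heartbeat() and demo() are made-up names. A background task tries to tick every millisecond, but it gets no chance to run while the computation holds the only thread.

```python
import asyncio


def cpu_heavy(n: int) -> int:
    # Pure computation with no await, so the event loop cannot switch away.
    return sum(i * i for i in range(n))


async def heartbeat(ticks: list) -> None:
    # Records a tick whenever the event loop gets a chance to run this task.
    while True:
        ticks.append(1)
        await asyncio.sleep(0.001)


async def demo() -> tuple:
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0.01)      # Let the heartbeat tick a few times first
    before = len(ticks)
    total = cpu_heavy(2_000_000)   # Occupies the single thread until it finishes
    during = len(ticks) - before   # Ticks recorded while the computation ran
    beat.cancel()
    await asyncio.gather(beat, return_exceptions=True)
    return total, before, during


total, before, during = asyncio.run(demo())
print(before >= 1)  # True (the heartbeat ran while the loop was idle)
print(during)       # 0 (no ticks were recorded during the computation)
```

Wrapping cpu_heavy() in an async def would not change this result, because the work itself never yields to the event loop.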

Implementing async in Python

Python provides the asyncio library as the standard way to write asynchronous code. With asyncio, a developer can define coroutines, which are special functions that can be paused and resumed, to run concurrently within a single thread.

The basics of how asyncio works are as follows:

  • async def defines an asynchronous function, also called a coroutine.
  • await suspends execution of the current coroutine until the awaited operation (a coroutine, task or future) completes.
  • asyncio.run() runs the top-level coroutine and automatically manages the event loop.

Example:

import asyncio


async def fetch_data():
    print("Fetching data...")
    await asyncio.sleep(2)  # Simulate an I/O delay without blocking the event loop
    print("Data fetched")
    return "Sample Data"


async def main():
    result = await fetch_data()
    print(result)


asyncio.run(main())

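This simple program simulates a data fetch without blocking the event loop. The asyncio.sleep() call is a nonblocking delay: while fetch_data() is paused, the event loop is free to run other coroutines. Because main() awaits only one coroutine, the program as written still takes about two seconds to finish.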
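
To show other coroutines running during that delay, the sketch below schedules two fetches with asyncio.create_task(). It is a variation on the example above, not part of the original program; the label and delay parameters and the log list are additions for illustration.

```python
import asyncio

log = []


async def fetch_data(label: str, delay: float) -> str:
    log.append(f"start {label}")
    await asyncio.sleep(delay)  # While this coroutine sleeps, the other one runs
    log.append(f"done {label}")
    return f"{label} data"


async def main() -> list:
    # create_task() schedules both coroutines on the event loop right away.
    slow = asyncio.create_task(fetch_data("slow", 0.2))
    fast = asyncio.create_task(fetch_data("fast", 0.1))
    return [await slow, await fast]


results = asyncio.run(main())
print(log)      # ['start slow', 'start fast', 'done fast', 'done slow']
print(results)  # ['slow data', 'fast data']
```

Both fetches start before either finishes, and the shorter one completes first even though it was scheduled second.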

Coroutines and subprocesses

Coroutines are the foundation of Python async programming. These functions can pause execution at await expressions and resume later, which enables efficient concurrency. They are lighter and more memory-efficient than threads.

Subprocesses, on the other hand, run separate programs or commands in their own operating system processes. In asyncio, subprocesses can be managed asynchronously, so the main application can stay responsive while it waits for external programs to complete.

Example:

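import asyncio


async def run_command():
    # 'ls -l' assumes a Unix-like shell; substitute another command elsewhere
    process = await asyncio.create_subprocess_shell(
        'ls -l',
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE
    )
    # communicate() waits for the process to exit and collects its output
    stdout, stderr = await process.communicate()
    print(stdout.decode())


asyncio.run(run_command())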
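
A variant that does not depend on a Unix shell might look like the following sketch. It uses asyncio.create_subprocess_exec(), which passes arguments directly to the program instead of through a shell. As an assumption for portability, it launches the current Python interpreter as the child process; run_python_snippet() is a hypothetical helper name.

```python
import asyncio
import sys


async def run_python_snippet(code: str) -> tuple:
    # sys.executable is the running interpreter, so no external command is assumed.
    process = await asyncio.create_subprocess_exec(
        sys.executable, "-c", code,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await process.communicate()
    # returncode is set once communicate() has waited for the process to exit.
    return process.returncode, stdout.decode().strip(), stderr.decode().strip()


returncode, out, err = asyncio.run(run_python_snippet("print('hello from child')"))
print(returncode, out)  # 0 hello from child
```

Checking the return code and stderr, rather than only printing stdout, makes failures in the child process visible to the caller.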

Network I/O and process communication

One of the most powerful uses of asyncio is to handle network I/O. Whether it's to manage HTTP requests, WebSockets or TCP connections, async enables efficient processing of thousands of connections without spawning thousands of threads.

For interprocess communication, asyncio can also read from and write to pipes and sockets asynchronously, so processes can exchange data without blocking the event loop.

The following example shows network I/O using aiohttp, a third-party HTTP client library that must be installed separately:

import asyncio

import aiohttp  # Third-party package


async def fetch_url(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()


async def main():
    content = await fetch_url('https://example.com')
    print(content)


asyncio.run(main())
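
For the TCP case, a standard-library-only sketch is shown below. It assumes a local echo-style server on 127.0.0.1 with an OS-assigned port; handle_echo() and ask() are illustrative names. One event loop serves three client connections without creating any threads.

```python
import asyncio


async def handle_echo(reader, writer):
    # Read one line from the client, send it back upper-cased, then close.
    data = await reader.readline()
    writer.write(data.upper())
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def ask(port: int, message: str) -> str:
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(message.encode() + b"\n")
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply.decode().strip()


async def main() -> list:
    # Port 0 asks the operating system for any free port.
    server = await asyncio.start_server(handle_echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        return await asyncio.gather(*(ask(port, f"client {i}") for i in range(3)))


replies = asyncio.run(main())
print(replies)  # ['CLIENT 0', 'CLIENT 1', 'CLIENT 2']
```

The server and all three clients share one event loop here only to keep the sketch self-contained; in practice they would usually be separate programs.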

Code synchronization and queues

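Async applications often need synchronization mechanisms, especially when multiple coroutines produce or consume shared resources. The asyncio library provides asyncio.Queue, which supports nonblocking queue operations that are safe to share among coroutines in the same event loop. It is not designed for sharing across threads.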

Generally, one implements this using the producer-consumer design pattern. Producers can add data to the queue while consumers process it concurrently, all within the same event loop.

The following example shows how to produce and consume data without blocking the flow of the program:

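import asyncio


async def producer(queue):
    for i in range(5):
        await queue.put(i)
        print(f'Produced {i}')
        await asyncio.sleep(1)


async def consumer(queue):
    while True:
        item = await queue.get()
        print(f'Consumed {item}')
        queue.task_done()


async def main():
    queue = asyncio.Queue()
    # Run the consumer in the background; on its own it would loop forever
    consumer_task = asyncio.create_task(consumer(queue))
    await producer(queue)
    await queue.join()      # Wait until every queued item has been processed
    consumer_task.cancel()  # Stop the consumer so the program can exit


asyncio.run(main())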
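
Queues are not the only synchronization mechanism asyncio offers. As a further sketch, asyncio.Semaphore can cap how many coroutines use a shared resource at once. The limit of two, the job count and the state dictionary below are arbitrary choices for illustration.

```python
import asyncio


async def limited_job(i: int, semaphore: asyncio.Semaphore, state: dict) -> int:
    async with semaphore:  # Only `limit` coroutines can be inside this block at once
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # Simulated I/O while holding a slot
        state["active"] -= 1
    return i


async def main(limit: int = 2, jobs: int = 6) -> tuple:
    semaphore = asyncio.Semaphore(limit)
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(limited_job(i, semaphore, state) for i in range(jobs))
    )
    return results, state["peak"]


results, peak = asyncio.run(main())
print(results)  # [0, 1, 2, 3, 4, 5]
print(peak)     # 2 (never more than two jobs held the semaphore at once)
```

A cap like this is a common way to avoid overwhelming a remote service or exhausting a connection pool when many coroutines are started together.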

Async for Python: Best for apps reliant on I/O, not CPU

Asynchronous programming in Python is a powerful tool to improve performance in I/O-bound applications. Apps that employ asyncio can handle thousands of tasks concurrently, which vastly improves efficiency for web servers, scrapers and network services.

However, it is crucial to understand that async is not multithreading. The event loop runs async code in a single thread, and async provides no advantages for CPU-bound tasks. In fact, trying to use async in computationally heavy scenarios can introduce unnecessary complexity and degrade performance.

In summary, async is best used for tasks that spend time waiting on external resources. For CPU-bound problems, traditional multiprocessing or compiled extensions remain the better choice.

David "Walker" Aldridge is a programmer with 40 years of experience in multiple languages and remote programming. He is also an experienced systems admin and infosec blue team member with interest in retrocomputing.
