Protocols vs Abstract Base Classes in Python
Over the years, I’ve worked on many Python projects, from large-scale enterprise systems to modular libraries. One consistent challenge has been defining and enforcing the behavior of objects in a way that’s clear, maintainable, and scalable. Python offers two powerful tools for this purpose: Protocols and Abstract Base Classes (ABCs).
While both help define what an object should do, they cater to different scenarios and mindsets. In this post, I’ll walk you through what they are, how they work, and when I’ve found them most useful.
Protocols for Dynamic Duck-Typing
If you’ve ever worked with Python’s dynamic duck-typing approach, you’ve probably experienced the freedom (and chaos) of relying on objects “quacking” a certain way. Protocols take this idea and formalize it with type hints.
Protocols, introduced in Python’s typing module (3.8+), provide a way to define an interface without requiring explicit inheritance. A protocol defines a set of methods or attributes an object must implement to be considered “compatible.” What makes protocols special is that they don’t care about inheritance—any object that has the required methods or attributes satisfies the protocol.
A Data Processing Example
Let’s take an example inspired by a financial analytics tool I worked on. The system processed data from various sources like APIs, databases, and CSV files. Each data source object had read
and write
methods, but they didn’t share a common parent class. We needed a way to ensure that these objects worked with shared processing functions—without forcing a refactor. Enter protocols.
from typing import Protocol
class DataSource(Protocol):
def read(self) -> str:
...
def write(self, data: str) -> None:
...
class APIClient:
def read(self) -> str:
return "Data from API"
def write(self, data: str) -> None:
print(f"Sending data to API: {data}")
class CSVHandler:
def read(self) -> str:
return "Data from CSV"
def write(self, data: str) -> None:
print(f"Writing to CSV: {data}")
def process_data(source: DataSource) -> None:
data = source.read()
print(f"Processing: {data}")
source.write("Processed data")
api_client = APIClient()
csv_handler = CSVHandler()
process_data(api_client) # Works with APIClient
process_data(csv_handler) # Works with CSVHandler
This approach allows the process_data
function to accept any object that satisfies the DataSource
protocol. Here’s why I think protocols work well in situations like this:
- The
DataSource
protocol defines theread
andwrite
methods required by any object passed toprocess_data
. - Both
APIClient
andCSVHandler
implement these methods but don’t need to inherit from a common base class. - This flexibility ensures the system is extensible—you can add new data source types without modifying existing code.
In my experience, protocols are especially helpful when working with legacy code or integrating third-party libraries. Since they don’t require inheritance, they can provide type safety and enforce behavior without forcing you to restructure existing systems.
How Protocols Work
Under the hood, Python uses metaclasses to make protocols function as both type hints and runtime validators. When you define a protocol using Protocol
, Python creates a special metaclass that handles structural type checking. This means an object is considered a virtual subclass of a protocol if it implements the required methods and attributes.
print(issubclass(APIClient, DataSource)) # True
print(isinstance(csv_handler, DataSource)) # True
Neither APIClient
nor CSVHandler
explicitly inherits from DataSource
, but Python’s metaclass mechanism ensures that they qualify because they implement the read
and write
methods.
Note that if you need to validate a protocol at runtime, you must use the @runtime_checkable
decorator from the typing
module. Without it, isinstance
and issubclass
checks won’t work:
from typing import runtime_checkable
@runtime_checkable
class DataSource(Protocol):
def read(self) -> str:
...
def write(self, data: str) -> None:
...
print(isinstance(api_client, DataSource)) # True
This flexibility makes protocols particularly powerful for type checking while keeping your code dynamic and extensible.
Abstract Base Classes for Design-Time Structure
Protocols are great for flexibility, but sometimes you need a more structured approach. This is where Abstract Base Classes (ABCs) come in. ABCs are a tool for enforcing consistent behavior by defining a strict interface that subclasses must implement. Unlike protocols, ABCs require explicit inheritance, making them a better choice when you want a clearly defined hierarchy in your code.
I’ve found ABCs to be particularly useful during the design phase of a system, where you’re building things from scratch and want to ensure that all subclasses adhere to a common contract.
A Reporting Plugin Example
Suppose we’re building a system where each plugin generates a report and requires specific configuration. Here, we can use an ABC to enforce a structure where all plugins implement the methods generate_report
and configure
.
from abc import ABC, abstractmethod
class ReportPlugin(ABC):
@abstractmethod
def generate_report(self, data: dict) -> str:
"""Generate a report based on the given data."""
pass
@abstractmethod
def configure(self, settings: dict) -> None:
"""Configure the plugin with specific settings."""
pass
class PDFReportPlugin(ReportPlugin):
def generate_report(self, data: dict) -> str:
return f"PDF Report for {data['name']}"
def configure(self, settings: dict) -> None:
print(f"Configuring PDF Plugin with: {settings}")
class HTMLReportPlugin(ReportPlugin):
def generate_report(self, data: dict) -> str:
return f"HTML Report for {data['name']}"
def configure(self, settings: dict) -> None:
print(f"Configuring HTML Plugin with: {settings}")
def run_plugin(plugin: ReportPlugin, data: dict, settings: dict) -> None:
plugin.configure(settings)
report = plugin.generate_report(data)
print(report)
pdf_plugin = PDFReportPlugin()
run_plugin(pdf_plugin, {"name": "John Doe"}, {"font": "Arial"})
html_plugin = HTMLReportPlugin()
run_plugin(html_plugin, {"name": "Jane Smith"}, {"color": "blue"})
In this example:
- Structure is Enforced: All plugins must explicitly inherit from
ReportPlugin
and implement thegenerate_report
andconfigure
methods. - Behavior is Predictable: The
run_plugin
function operates on any plugin without needing to know its details. - Extensibility is Simple: Adding new plugins is straightforward, and the shared interface ensures consistency.
When to Use Protocols vs. ABCs
The choice between protocols and ABCs isn’t always black and white. In my experience, it often depends on the context of the project and your goals. Here’s a general guideline to help you decide which to use:
Use Protocols When:
- You’re working with existing code or integrating third-party libraries.
- Flexibility is a priority, and you don’t want to enforce rigid hierarchies.
- Objects from unrelated class hierarchies need to share behavior.
Use ABCs When:
- You’re designing a system from scratch and need to enforce structure.
- Relationships between classes are predictable, and inheritance makes sense.
- Shared functionality or default behavior can reduce duplication and improve consistency.
Reflecting on the Tools
In my experience, protocols and Abstract Base Classes aren’t competing tools—they’re complementary. I’ve used protocols to retrofit type safety into legacy systems without requiring extensive refactoring. On the other hand, I’ve relied on ABCs when building systems from the ground up, where structure and consistency were critical.
When deciding which to use, think about your project’s flexibility needs and long-term goals. Protocols provide flexibility and seamless integration, while ABCs help establish structure and consistency. By understanding their strengths, you can choose the right tool to build robust, maintainable Python systems.