Here are the general steps to follow when using Test-Driven Development (TDD) in Python:
1. Understand the Requirement
- Clearly understand what functionality or feature you're going to implement. Define the expected behavior or output.
2. Write a Test First (Red Phase)
- Start by writing a unit test that defines a function or method you haven't implemented yet. The test should reflect the requirements.
- Use a testing framework like
unittest
orpytest
to write the test. - Example with
pytest
:
3. Run the Test and See it Fail
- Run the test to confirm that it fails, as the function does not yet exist or has not been implemented. This step ensures that the test is valid.
- In
pytest
, you can run the test with:
4. Write the Minimal Code to Pass the Test (Green Phase)
- Implement just enough code to make the test pass. The focus here is on simplicity and correctness.
- Example:
5. Run the Tests Again
- Execute all tests to verify that the new code passes the test. If it does, you’ve confirmed that the functionality works as intended.
- In
pytest
:
6. Refactor the Code (Refactor Phase)
- Improve the code structure while keeping the functionality the same. Clean up duplications, improve readability, and follow coding best practices.
- Make sure to rerun the tests after refactoring to ensure nothing breaks.
7. Repeat
- Continue the cycle: write another failing test, implement just enough to pass, and then refactor.
By following these steps in TDD, you will incrementally develop your code while ensuring that it is well-tested and reliable.
Example
Here's an example of using TDD to build and test a Python pipeline that retrieves data from several APIs. We'll use the unittest
framework along with unittest.mock
to mock the API calls.
Scenario:
Imagine we have a pipeline that fetches data from two different APIs, processes it, and returns the combined results.
Step 1: Write a Failing Test
We'll start by writing a test for a pipeline that retrieves data from two APIs. We'll mock the API responses, as we don't want to make real requests in the test environment.
import unittest
from unittest.mock import patch
# Assuming the pipeline function we want to test is called 'fetch_data_pipeline'
class TestFetchDataPipeline(unittest.TestCase):
@patch('pipeline.requests.get')
def test_fetch_data_pipeline(self, mock_get):
# Mock API responses
mock_get.side_effect = [
# First API response
unittest.mock.Mock(status_code=200, json=lambda: {"data": "API 1 data"}),
# Second API response
unittest.mock.Mock(status_code=200, json=lambda: {"data": "API 2 data"})
]
# Import the pipeline function
from pipeline import fetch_data_pipeline
# Call the function and check the result
result = fetch_data_pipeline()
expected_result = {
"api_1": "API 1 data",
"api_2": "API 2 data"
}
self.assertEqual(result, expected_result)
if __name__ == '__main__':
unittest.main()
Step 2: Run the Test
At this point, running the test will fail because the fetch_data_pipeline
function doesn't exist yet.
Step 3: Write Minimal Code to Pass the Test
Now, we will create the fetch_data_pipeline
function that fetches data from the APIs. We'll use requests
for making HTTP calls.
import requests
def fetch_data_pipeline():
api_1_response = requests.get('https://api1.example.com/data')
api_2_response = requests.get('https://api2.example.com/data')
return {
"api_1": api_1_response.json().get("data"),
"api_2": api_2_response.json().get("data")
}
Step 4: Run the Tests Again
Now, when you run the tests, they should pass.
Step 5: Refactor (if necessary)
Once the test passes, you can refactor the code to improve readability or efficiency. For example, you could handle errors from the APIs or make the function more flexible. But for now, our code is simple and works fine.
Complete Example
Pipeline Code (pipeline.py
):
import requests
def fetch_data_pipeline():
api_1_response = requests.get('https://api1.example.com/data')
api_2_response = requests.get('https://api2.example.com/data')
return {
"api_1": api_1_response.json().get("data"),
"api_2": api_2_response.json().get("data")
}
Test Code (test_pipeline.py
):
import unittest
from unittest.mock import patch
class TestFetchDataPipeline(unittest.TestCase):
@patch('pipeline.requests.get')
def test_fetch_data_pipeline(self, mock_get):
# Mock API responses
mock_get.side_effect = [
unittest.mock.Mock(status_code=200, json=lambda: {"data": "API 1 data"}),
unittest.mock.Mock(status_code=200, json=lambda: {"data": "API 2 data"})
]
from pipeline import fetch_data_pipeline
result = fetch_data_pipeline()
expected_result = {
"api_1": "API 1 data",
"api_2": "API 2 data"
}
self.assertEqual(result, expected_result)
if __name__ == '__main__':
unittest.main()
Explanation:
- Mocking API calls: The
@patch('pipeline.requests.get')
decorator replacesrequests.get
with a mock object, so we don't actually hit real APIs. - Mock responses: We use
side_effect
to simulate two different API responses. The first call returns data from "API 1", and the second call returns data from "API 2". - Assertions: We check if the function returns the expected combined data from both APIs.
This approach ensures that the tests run quickly and independently of external services.