Functions and Modules in Python: Writing Reusable and Organized Code

Shambhavi Thakur

February 2, 2024

Are you tired of writing the same Python code over and over again? Do you copy and paste the same lines of code from one script to another, resulting in a confusing mess of files and functions?

Well, here’s good news! You can transform your lengthy, scattered scripts into shorter, well-organized, and reusable codebases using functions and modules. This tutorial provides techniques to help you achieve modularity and organization in your projects. It covers the following sections:

Understanding Functions: Definition and Declaration
Function Use in Real-World Scenarios
Demystifying Function Scope
What Is a Module?
Accessing Built-In and Third-Party Modules
Organizing Custom Modules into Packages
Key Takeaways

Get ready to level up your Python skills. By the end of this tutorial, you will know how to transform your amateur scripts into modular code that any team will be proud of.

Start by exploring what functions are and how they are written.

Understanding Functions: Definition and Declaration

Functions help to organize code, improve readability, and promote code reuse. A function is a self-contained, reusable block of code with a name. It can have optional parameters that allow you to pass in data, which the function can then use or manipulate. Additionally, a function can return data as output. If you use a code block repeatedly to perform a specific task, consolidate the block into a function.

In Python, a function declaration includes the following elements:

A def statement: This statement forms the first line of the function. It names the function and binds it to the code block below it. By convention, function names use snake_case (lowercase words separated by underscores). Here is an example of a def statement:
```
def add_two_numbers():
```
Input parameters: These parameters are an optional set of variable names separated by commas. They appear within the parentheses in the def statement. They are placeholders for the values that a caller passes to the function when calling it. The values passed to the function are called arguments. If a parameter has a default value and no argument is provided for it during a function call, the function uses the default value.
Here is an example of a def statement with input parameters without default values:
```
def add_two_numbers(num_1, num_2):
```
Below is a basic function that has a default value set for one of its two parameters:
```
def greet(name, message="Guten Abend"):
    print(f"{message}, {name}!")

# Calling the function with both arguments
greet("Joe", "Hi")
# Output: Hi, Joe!

# Calling the function with only the name argument
greet("Emma")
# Output: Guten Abend, Emma! (uses the default value for message)
```
In the example above, name and message are parameters of the greet function, and “Joe”, “Hi”, and “Emma” are arguments.
A docstring: This is a descriptive string enclosed by three quotes. It appears right below the def statement and explains what the function does. Docstrings are not required. However, if you want others to understand a function at first glance, include a docstring in it.
Here is an example of a def statement followed by a docstring:
```
def add_two_numbers(num_1, num_2):
  """
  Add two numbers together and return the result.
  Args:
    num_1 (int or float): The first number to add
    num_2 (int or float): The second number to add
  Returns:
    int or float: The sum of num_1 and num_2
  """
```

A code block: This is the part of a function that defines its executable logic. It comes after the def statement. By convention, developers indent the first level lines of the block four spaces to the right and each following level an additional four spaces.

Here is an incomplete function with a code block:

def add_two_numbers(num_1, num_2):
  """
  Add two numbers together and return the result.
  Args:
    num_1 (int or float): The first number to add
    num_2 (int or float): The second number to add
  Returns:
    int or float: The sum of num_1 and num_2
  """
  sum = num_1 + num_2

A return statement: This optional statement, usually placed at the end of a function, sends a value back to the caller and ends the function’s execution. In a standard return statement, you include a single return value. To return multiple values, separate them with commas, and Python bundles them into a tuple that counts as a single return object. A function that does not execute a return statement still runs its code but returns None when called.

Here is a complete function with a return statement:

def add_two_numbers(num_1, num_2):
  """
  Add two numbers together and return the result.
  Args:
    num_1 (int or float): The first number to add
    num_2 (int or float): The second number to add
  Returns:
    int or float: The sum of num_1 and num_2
  """
  sum = num_1 + num_2
  return sum
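
The return behavior described above can be sketched as follows. The function names log_message and min_and_max are hypothetical and chosen only for illustration. The snippet shows that a function without a return statement hands back None, and that several values can be returned as one tuple and unpacked by the caller:

```python
def log_message(text):
    """Print a message; there is no return statement."""
    print(f"LOG: {text}")


def min_and_max(values):
    """Return the smallest and largest items bundled in one tuple."""
    return min(values), max(values)


result = log_message("started")  # The call still prints the message
print(result)                    # None, because nothing was returned

lowest, highest = min_and_max([4, 9, 1])  # Unpack the returned tuple
print(lowest, highest)
```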

Reflect on what you have learned about function components and identify the function and its components in the code below. What does the function do?

import math
def calculate_circle_area(radius):
  """Calculate and return the area of a circle from the radius"""    
  area = math.pi * radius**2

  return area

my_area = calculate_circle_area(5)
print(my_area)

The import statement above lets us leverage Python’s built-in math module in the calculate_circle_area function. You will learn more about modules later in this tutorial.

The calculate_circle_area function computes the area of a circle using the radius input and stores the result in a variable named area. The function then returns this variable as output. You can reuse this function within Python environments to compute circle areas.

Next, look at another function that performs a mathematical calculation and returns two values.

def calculate_compound_interest(principal, rate, years):
  """
  Calculate and return compound interest amount and future value.

  Args:
    principal (float): The initial deposit amount
    rate (float): The annual interest percentage rate
    years (int): The number of years of compounding

  Returns:
    tuple:
      float: The compound interest amount calculated
      float: The total future value after compounding
  """
  # Future value grows by the annual rate, compounded once per year
  future_value = principal * (1 + rate/100) ** years
  # Interest earned is the growth over the initial deposit
  compound_amt = future_value - principal
  return (compound_amt, future_value)

amount, total = calculate_compound_interest(1000, 5, 10)
print(amount, total)

This function takes the principal, rate, and years arguments and calculates the compound interest earned and the total future value. It returns both values bundled in a tuple. When you call the function, you can unpack the tuple into two variables, as shown above.

Now that you have learned how to write functions, consider how to use them in real-world scenarios.

Function Use in Real-World Scenarios

Python, a versatile programming language, allows for the automation of repetitive tasks, saving time. However, if you do not encapsulate the code for these tasks using functions, it can become lengthy, complex, and difficult to manage. To increase the manageability and decrease the length and complexity of the code, you must convert it into reusable functions.

Consider the following three situations where using functions can be beneficial:

Data Cleaning: It’s a good idea to standardize your data cleaning process. For example, you can create a function called ‘standardize_data()’ that encapsulates the code for loading, validating, and normalizing messy datasets. Run this function on different datasets with confidence, knowing that you have thoroughly tested it.
Data Visualization: Do not copy and paste similar Matplotlib chart code in different cells of a notebook and across different scripts. Instead, encapsulate the styles and logic into functions that plot performance and present forecasts. Call these functions wherever you need, enhancing efficiency and saving time.
Machine Learning: Streamline your data preprocessing and model training workflows by using reusable functions that prepare data and train models. By doing so, you ensure consistent implementations across contexts, propagate isolated improvements throughout your codebase, reduce the overall code through reuse, and improve readability through abstraction.
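
To make the data cleaning scenario concrete, here is a minimal sketch of what a standardize_data() function might look like. It is an illustration under stated assumptions, not a definitive implementation: it works on plain lists of dictionaries instead of a real data library, it skips the loading step, and the cleaning rules (lowercase keys, trimmed strings, dropped incomplete records) are hypothetical.

```python
def standardize_data(records):
    """Validate and normalize a list of dictionaries (hypothetical rules)."""
    cleaned = []
    for record in records:
        # Normalize keys: strip whitespace and lowercase them
        normalized = {key.strip().lower(): value for key, value in record.items()}
        # Normalize string values: strip surrounding whitespace
        normalized = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in normalized.items()
        }
        # Validate: skip records with missing or empty values
        if any(value is None or value == "" for value in normalized.values()):
            continue
        cleaned.append(normalized)
    return cleaned


# The same function is reused on two unrelated datasets
sales = [{" Name ": " Ada ", "Amount": 10}, {"Name": "", "Amount": 5}]
survey = [{"CITY": "Pune ", "Score": 4}, {"CITY": None, "Score": 3}]

clean_sales = standardize_data(sales)
clean_survey = standardize_data(survey)
print(clean_sales)
print(clean_survey)
```

Because the cleaning rules live in one place, a fix or improvement to standardize_data() automatically applies to every dataset that passes through it.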

After you define a function in a PY file, you can call it from any place in the file below its definition. You can also call it from other files by importing it. However, code outside the function cannot see or access the variables created inside it. To understand which parts of your code can work with which names, you need to be familiar with the concept of function scope. Keep reading to learn more about function scope.

Demystifying Function Scope

Functions are more than mere tools for code reuse—they form the backbone of structured, maintainable codebases. But the moment we define these neatly arranged logic packets, we have to determine the scope of their contents. Who gets to access their internal workings? And under what circumstances? The balance of your program’s ecosystem depends on a clear understanding of which of its parts can access code within functions. This understanding affects the design and security of your applications.

Think of a program or a PY file as an estate bounded by a wall. Your functions are individual rooms in the grand mansion within the estate. The rooms represent local scopes, safe havens where variables and parameters coexist, shielded from the elements outside. The local scope keeps internal changes internal, protecting the rest of your program from side effects.

Despite the sanctity of the local scope, the relationship between a function and the code outside it is asymmetric. While the code that lives externally cannot intrude upon a function’s local territory uninvited, the function itself holds a special key. It can reach out to the global scope, the sprawling gardens surrounding the hypothetical grand mansion where globally accessible variables and constants bask in the open air. Functions can freely read from this global scope to inform their behavior, but to reassign a global variable, a function must first declare it with the global keyword.

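Here is an example of a function that updates a globally defined variable in addition to a locally defined one. Because the function assigns to global_count, it must declare the name with the global keyword. Without that declaration, Python treats global_count as a local variable and raises an UnboundLocalError:

global_count = 0 # Global variable
def increment():
   global global_count # Declare that the name refers to the global variable
   global_count += 1 # Update global_count
   local_count = 1 # Local variable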
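
The difference between reading a global variable and reassigning it is easy to miss, so here is a small sketch of both cases. The names counter, read_counter, broken_increment, and increment are hypothetical. The sketch assumes the code runs at the top level of a script:

```python
counter = 0  # Global variable


def read_counter():
    # Reading a global name needs no declaration
    return counter


def broken_increment():
    # The assignment makes `counter` local, so reading it first fails
    counter += 1


def increment():
    global counter  # Rebind the module-level name instead
    counter += 1


try:
    broken_increment()
except UnboundLocalError as error:
    print(f"Failed as expected: {error}")

increment()
print(read_counter())
```

In larger programs, passing values in as arguments and returning results is usually easier to follow than modifying globals, but the global keyword is there when you need it.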

As we explore the structural intricacies of Python, we encounter namespaces—the meticulously organized realms where variables are named and housed. When you run a Python script (a PY file), its functions, variables, and imported modules populate the global namespace. The elements that reside within the script know everything about this namespace. But call a function into action, and it conjures its local namespace—a bubble where its internal variables are born, live, and perish, untouchable to the outside world unless explicitly shared.

The global namespace is the tapestry that blankets your entire application, visible and accessible by all of your code’s components. The local namespaces, on the other hand, are akin to intricate stitches that create specific designs within the tapestry, each unique to a function and invisible to the outside world unless explicitly shared.

Python’s LEGB rule—a compass for navigating scopes—governs the interplay between namespaces. This rule defines the order in which Python looks up names and decides which variables you’re referring to within your functions:

Local (L) pertains to the names assigned within a function, which cannot be accessed from outside the function unless returned.
Enclosing (E) refers to the names in the scope of any enclosing functions, which is significant in the context of nested functions, where one function is wrapped within another.
Global (G) is about the names established at the top level of a PY file or explicitly declared as global within a function of the file, shared across the entire file.
Built-in (B) designates Python’s built-in names, which are globally available in any Python script.

These layers of scope—from the most immediate Local to the widely encompassing Built-in—ensure a structured and sensible approach to name resolution. Python checks scope layers in sequence, first looking for a variable name in the function’s local scope, then in any enclosing scopes, next in the global scope, and finally in the built-in scope. This ensures variables are resolved from the most specific context to the most general.
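
The lookup order can be traced in a short sketch. The variable name label and the function names below are hypothetical. The same name is defined at three levels so that each function shows which scope Python finds first:

```python
label = "global"  # G: defined at the top level of the file


def outer():
    label = "enclosing"  # E: local to outer, enclosing for the inner functions

    def inner():
        label = "local"  # L: local to inner, found first
        return label

    def inner_without_local():
        return label  # No local name, so the enclosing one is found

    return inner(), inner_without_local()


def uses_global():
    return label  # No local or enclosing name, so the global one is found


def uses_builtin():
    return len("abc")  # `len` is defined only in the built-in scope


print(outer())
print(uses_global())
print(uses_builtin())
```

Note that assigning label inside inner() creates a new local name and leaves the enclosing and global variables untouched.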

By mastering this hierarchy, you can write functions that not only interact seamlessly within themselves but also with the broader scope of the application, all while mitigating the risk of conflicts and enhancing collaborative capabilities.

Now that you understand the scope of functions, explore the concept of modules.

What Is a Module?

Modules in Python are a way to organize and reuse code across projects. A module is a PY file containing related Python code elements, such as functions, variables, and classes, designed to work together for a specific task. Python groups reusable code into two main types of modules, which are described below.

Built-in Modules: In Python, built-in modules are the modules of the standard library, a collection of pre-built utilities that you can access via import statements. Because these modules ship with Python, you do not need to install them. Examples of built-in modules include math and random. The following code snippet imports both:
```
import math # Math operations
import random # Random number generation
```
Custom Modules: To create your own module, all you need to do is add the necessary functions, variables, and classes to a PY file and save it using a unique name. For example, you can create a custom module for data analysis that includes functions to clean, visualize, and model data. Or, you can create a module for web scraping that bundles requests or parsing logic for collecting online content. Here’s an example of how you can import a custom module:
```
# my_module.py file
def clean_data(df):
    # Data cleaning code goes here
    return df

# Another script saved in the same folder as my_module.py
import my_module
my_module.clean_data(my_dataframe)
```

Custom modules help to organize complex projects into interoperable components, simplifying code modification and maintenance.
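
To see that a custom module is nothing more than a PY file on Python’s search path, consider the following self-contained sketch. It writes a hypothetical module named name_tools.py into a temporary folder and then imports it. The module name, the clean_names function, and the use of a temporary folder are assumptions made for illustration. In a real project, you would simply save the file next to your script.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source code for a hypothetical custom module
module_source = '''
def clean_names(names):
    """Strip whitespace and capitalize each name."""
    return [name.strip().title() for name in names]
'''

with tempfile.TemporaryDirectory() as folder:
    # Saving the code in a .py file is what makes it a module
    Path(folder, "name_tools.py").write_text(module_source)

    sys.path.insert(0, folder)     # Let Python find the new file
    importlib.invalidate_caches()  # Make sure the new file is noticed
    try:
        name_tools = importlib.import_module("name_tools")
    finally:
        sys.path.remove(folder)

    cleaned = name_tools.clean_names(["  ada ", "GRACE"])
    print(cleaned)
```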

Knowing what modules are and how to use them to organize code is essential for writing efficient and reusable Python programs. In the next section, you will learn how to access the multitude of built-in and third-party modules available in Python.

Accessing Built-In and Third-Party Modules

The Python Package Index (PyPI) hosts hundreds of thousands of third-party packages that extend Python well beyond its standard library. PyPI is home to many popular packages, such as NumPy, Pandas, Matplotlib, and TensorFlow. Below is an example of how to install and import Pandas and TensorFlow.

# Run these commands in a terminal to install the libraries
pip install pandas
pip install tensorflow

# Then import the libraries in your Python code
import pandas as pd # Data analysis
import tensorflow as tf # Deep learning
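
The import statement comes in a few common forms, and they work the same way for built-in and third-party modules. The sketch below uses only standard library modules (math, random, and statistics), so it runs without installing anything. The seed value and the sample numbers are arbitrary choices for illustration.

```python
import math                  # Import the whole module
import random as rnd         # Import a module under a shorter alias
from statistics import mean  # Import a single name from a module

print(math.sqrt(16))         # Access a function through the module name

rnd.seed(42)                 # Seed the generator so runs are repeatable
roll = rnd.randint(1, 6)     # A whole number from 1 to 6, inclusive
print(roll)

average = mean([2, 4, 6])    # The imported name is used directly
print(average)
```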

Python provides a feature that allows for the creation of higher-level “packages” to organize related modules. This feature is crucial because it helps to keep related code together, simplifying its maintenance and management. In the upcoming section, we will delve into practical approaches to setting up packages.

Organizing Custom Modules into Packages

In Python, a package is a means of organizing modules that are related. Essentially, a package is a folder that contains one or more Python modules, as well as an __init__.py file that marks the folder as a package. It could also contain additional resources, such as documentation.

In some cases, packages nest related modules into subpackages. Each subpackage is a subfolder that contains an __init__.py file, one or more modules, and other resources.

Some advantages of Python packages include:

Modularity: Packages help in organizing and grouping related modules together, making it easier to manage and maintain large-scale projects.
Namespace management: Packages provide a way to manage namespaces, which helps avoid naming conflicts and makes it easier to use modules with similar names.
Code reuse: Packages make it easy to reuse code across multiple projects. By keeping related modules in a package, you can easily import them into other projects that require the same functionality.
Distribution: Packages provide a way to distribute and share code with others. By packaging your code into a distributable package, you can share it with others who can then easily install and use it in their projects.
Encapsulation: Packages provide a way to encapsulate related functionality together, making it easier to test and debug the code. By keeping related functionality in a package, you can easily isolate it from the rest of the codebase and test it independently.

When developing a Python package, it is best practice to group related custom modules within distinct subpackages. This approach enhances module accessibility by providing clear and logical groupings for easier navigation. It also mitigates the risk of naming conflicts. This is achieved by creating a structured namespace hierarchy where the same module name can be reused across different subpackages without ambiguity.

To illustrate, let’s consider a Python package called analytics_utils with two subpackages: data_processing and model_evaluation. Both subpackages include a module named utils.py, each serving a distinct purpose relevant to its respective subpackage.

Here’s the package structure:

analytics_utils/
    __init__.py
    data_processing/
        __init__.py
        utils.py  # Utilities for data processing
        preprocessing.py
    model_evaluation/
        __init__.py
        utils.py  # Utilities for model evaluation
        evaluation.py

To use the utils.py module from the data_processing subpackage, you can import it as an alias using the following code snippet:

from analytics_utils.data_processing import utils as dp_utils

# Now you can use functions from the data_processing utils module
dp_utils.some_data_processing_function()

Similarly, to use the utils.py module from the model_evaluation subpackage, you can import it as an alias using the following code snippet:

from analytics_utils.model_evaluation import utils as me_utils

# Now you can use functions from the model_evaluation utils module
me_utils.some_model_evaluation_function()

By providing aliases such as dp_utils and me_utils when importing the modules, you can ensure that it is clear which utils.py module you are referring to in your code, avoiding any confusion despite both modules sharing the same name.
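
To try this out without setting up a real project, the following sketch builds a stripped-down version of the analytics_utils package in a temporary folder and imports both utils modules under their aliases. It is a demonstration under assumptions: the function bodies are placeholders that return the subpackage name, and the temporary folder stands in for your project directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    package = Path(folder, "analytics_utils")

    # Create each subpackage with an __init__.py and a utils.py module
    for subpackage, function_name in [
        ("data_processing", "some_data_processing_function"),
        ("model_evaluation", "some_model_evaluation_function"),
    ]:
        subfolder = package / subpackage
        subfolder.mkdir(parents=True)
        (subfolder / "__init__.py").write_text("")
        (subfolder / "utils.py").write_text(
            f"def {function_name}():\n    return '{subpackage}'\n"
        )
    (package / "__init__.py").write_text("")

    sys.path.insert(0, folder)     # Let Python find the package
    importlib.invalidate_caches()  # Make sure the new files are noticed
    try:
        from analytics_utils.data_processing import utils as dp_utils
        from analytics_utils.model_evaluation import utils as me_utils
    finally:
        sys.path.remove(folder)

    print(dp_utils.some_data_processing_function())
    print(me_utils.some_model_evaluation_function())
    # The full module names differ even though both files are utils.py
    print(dp_utils.__name__, me_utils.__name__)
```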

In summary, Python’s package system and structured namespace hierarchy enable the seamless organization of modules within packages and subpackages. As a result, you can reuse module names in different contexts without ambiguity or naming conflicts.

You have now gained knowledge about functions and modules, including how to write functions. You have also learned about the practical applications of custom functions, how to access built-in and third-party modules, and how to structure custom modules into packages. Keep reading for the main points to take away from this tutorial.

Key Takeaways

Functions encapsulate reusable logic
Modules group related functions/tools into files
Packages provide higher-level collections of modules

Together they allow us to craft extensible and maintainable data applications with clean, shared components.