To find all files with a .txt extension in a directory, you can use the glob module in Python. The glob module provides a function called glob that can find files and directories matching a specified pattern.
Here is an example of how to use glob to find all .txt files in a directory:
import glob
# Find all .txt files in the current directory
files = glob.glob('*.txt')
# Print the list of files
print(files)
The glob function returns a list of file paths matching the pattern. In this case, the pattern '*.txt' matches all files in the current directory with a .txt extension, except hidden files whose names start with a dot, which glob skips by default.
You can also search a specific directory by including the path in the pattern. For example, to find all .txt files in the /path/to/dir directory, you can use the following pattern: '/path/to/dir/*.txt'
If you want to search for files with a .txt extension in subdirectories as well, you can use the '**' wildcard together with the recursive=True option:
import glob
# Find all .txt files in the current directory and its subdirectories
files = glob.glob('**/*.txt', recursive=True)
# Print the list of files
print(files)
3 Ways to Find All Files by Extension in Python
Here are three ways to find all files with a specific extension in Python:
- The ‘glob’ Module
- The ‘os.listdir’ Function
- The ‘os.walk’ Function
1. The ‘glob’ Module
The glob module in Python provides a function called glob that can be used to find files and directories matching a specified pattern. The glob function returns a list of file paths matching the pattern.
Here is an example of how to use the glob function to find all .txt files in a directory:
import glob
# Find all .txt files in the current directory
files = glob.glob('*.txt')
# Print the list of files
print(files)
You can also search a specific directory by including the path in the pattern. For example, to find all .txt files in the /path/to/dir directory, you can use the following pattern: '/path/to/dir/*.txt'
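As a minimal sketch of the directory-specific pattern described above, the helper below builds the pattern with os.path.join instead of hard-coding it. The name find_txt_in_dir is hypothetical and used only for illustration; glob.escape is applied so that special characters in the directory name are not treated as wildcards.

```python
import glob
import os


def find_txt_in_dir(directory):
    """Return a sorted list of .txt file paths directly inside `directory`.

    Hypothetical helper for illustration; it does not search subdirectories.
    """
    # Escape the directory part so characters such as '[' or '*' in its
    # name are matched literally, then append the wildcard for the file name
    pattern = os.path.join(glob.escape(directory), '*.txt')
    # glob.glob returns matches in arbitrary order, so sort for stable output
    return sorted(glob.glob(pattern))


if __name__ == '__main__':
    print(find_txt_in_dir('.'))
```

Sorting is optional; it is included here only because glob makes no promise about the order of its results.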
If you want to search for files with a .txt extension in subdirectories as well, you can use the recursive=True option:
import glob
# Find all .txt files in the current directory and its subdirectories
files = glob.glob('**/*.txt', recursive=True)
# Print the list of files
print(files)
The glob module also provides a function called iglob that works similarly to glob, but returns an iterator instead of a list. This can be more memory-efficient when working with large directories.
import glob
# Find all .txt files in the current directory
files = glob.iglob('*.txt')
# Print each file as the iterator yields it
for file in files:
    print(file)
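One place where the iterator helps is when you only need the first few matches. The sketch below assumes a hypothetical helper named first_matches; it combines iglob with itertools.islice so that the iterator is not consumed beyond the requested number of results and no full result list is built.

```python
import glob
import itertools


def first_matches(pattern, limit):
    """Return at most `limit` paths matching `pattern`.

    Hypothetical helper for illustration: iglob yields paths one at a
    time, and islice stops consuming it after `limit` items.
    """
    return list(itertools.islice(glob.iglob(pattern), limit))


if __name__ == '__main__':
    # Show at most five .txt files from the current directory
    for path in first_matches('*.txt', 5):
        print(path)
```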
2. Recursive Search with ‘os.listdir()’
To perform a recursive search for files with a specific extension using the os module, you can use the following approach:
import os
def find_files(extension, path):
    # Initialize an empty list to store the paths of the matching files
    matching_files = []
    # Iterate over the files and directories in the specified path
    for item in os.listdir(path):
        # Get the full path of the item
        full_path = os.path.join(path, item)
        # If the item is a file and has the desired extension, add it to the list of matching files
        if os.path.isfile(full_path) and full_path.endswith(extension):
            matching_files.append(full_path)
        # If the item is a directory, recursively search for files with the desired extension
        elif os.path.isdir(full_path):
            matching_files.extend(find_files(extension, full_path))
    return matching_files
# Find all .txt files in the current directory and its subdirectories
files = find_files('.txt', '.')
# Print the list of files
print(files)
This function uses the os.listdir function to iterate over the files and directories in the specified path. If an item is a file with the desired extension, it is added to the list of matching files. If an item is a directory, the function calls itself recursively to search for files with the desired extension in that directory.
This approach can be slow for large directory trees: every entry costs extra os.path.isfile and os.path.isdir calls on top of the directory listing, and very deep trees can run into Python's recursion limit. The os.walk function is usually a better fit for these cases.
3. The ‘os.walk’ Function
The os.walk function in Python is a powerful tool for traversing directory trees. It returns a generator that yields one tuple for each directory in the tree, describing that directory's subdirectories and files.
Here is an example of how to use the os.walk function to find all .txt files in a directory tree:
import os
# Find all .txt files in the directory tree rooted at '.'
for root, dirs, files in os.walk('.'):
    for file in files:
        if file.endswith('.txt'):
            print(os.path.join(root, file))
The os.walk function iterates over all subdirectories, starting from the specified root directory, and yields a tuple for each directory containing the following information:
- The path of the current directory (root)
- A list of subdirectories in the current directory (dirs)
- A list of files in the current directory (files)
In the example above, we use a nested loop to iterate over the files list and print the full path of each file that ends with .txt.
You can also modify the dirs list in place to control which subdirectories are visited. This works because os.walk traverses the tree top-down by default. For example, to skip subdirectories starting with '.', you can use the following code:
import os
# Find all .txt files in the directory tree rooted at '.', but skip subdirectories starting with '.'
for root, dirs, files in os.walk('.'):
    dirs[:] = [d for d in dirs if not d.startswith('.')]
    for file in files:
        if file.endswith('.txt'):
            print(os.path.join(root, file))
The os.walk function is generally more efficient than a recursive function built on os.listdir: since Python 3.5 it uses os.scandir internally, which can usually read each entry's type from the directory listing itself instead of making a separate system call per entry. It is also more flexible, since you can prune the traversal by editing the dirs list in place.
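To make the os.walk approach reusable, it can be wrapped in a generator function. This is a minimal sketch; the name iter_files_with_extension and its parameters are hypothetical, and it prunes hidden directories in the same way as the example above.

```python
import os


def iter_files_with_extension(root_dir, extension):
    """Yield paths under `root_dir` whose file names end with `extension`.

    Hypothetical helper for illustration; directories whose names start
    with '.' are skipped.
    """
    for root, dirs, files in os.walk(root_dir):
        # Edit dirs in place so os.walk does not descend into hidden directories
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for name in files:
            if name.endswith(extension):
                yield os.path.join(root, name)


if __name__ == '__main__':
    for path in iter_files_with_extension('.', '.txt'):
        print(path)
```

Because the function yields paths one at a time, the caller can stop early or wrap the call in list() when a full list is needed.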
Wrap up
There are several ways to find all files with a specific extension in Python, depending on your needs and the complexity of the directory tree you are working with.
The glob module provides a simple and efficient way to find files matching a specified pattern, and the recursive=True option also allows you to search for files in subdirectories.
The os module provides several functions that can be used to search for files, including os.listdir and os.walk. These functions allow you to iterate over the files and directories in a specified directory and can be used to perform a recursive search if needed.
The pathlib module provides a convenient and object-oriented interface for working with filesystem paths and includes a glob method that can be used to find files matching a specified pattern.
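For completeness, here is a minimal sketch of the pathlib approach. The function name find_with_pathlib and its parameters are hypothetical; Path.glob and Path.rglob are the standard pathlib methods, with rglob searching subdirectories as well.

```python
from pathlib import Path


def find_with_pathlib(directory, extension='.txt', recursive=False):
    """Return a sorted list of Path objects whose names end with `extension`.

    Hypothetical helper for illustration.
    """
    base = Path(directory)
    pattern = f'*{extension}'
    # rglob searches the whole tree; glob only looks in `directory` itself
    matches = base.rglob(pattern) if recursive else base.glob(pattern)
    # Keep regular files only, in case a directory name matches the pattern
    return sorted(p for p in matches if p.is_file())


if __name__ == '__main__':
    for path in find_with_pathlib('.', '.txt', recursive=True):
        print(path)
```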
Regardless of your chosen approach, consider your solution’s performance and memory efficiency, especially if you are working with large directory trees.
Thanks for reading. Happy coding!