How to Log File Movements into a CSV Report with Python

How to Log File Movements into a CSV Report with Python

Often in Python, you may want to automatically sort files into folders based on their type. We’ve seen how to organize desktops or unzip archives. But what if you need to keep a meticulous record of every file that moves? For compliance, auditing, or simply tracking changes, logging file movements is crucial. This tutorial will show you how to enhance your file organization scripts to generate a detailed CSV report of every file moved, including its original location, new location, and the timestamp of the move.

Introduction

Automating file organization is efficient, but sometimes it feels like a “black box” operation. Files move, and you trust the script, but there’s no paper trail. By integrating CSV logging, you can create a transparent record of all file system modifications your script performs. This not only provides accountability but also makes it easier to troubleshoot or revert changes if necessary.

Setting Up Your Environment

Make ensure you have Python installed. If not, get it from python.org. We’ll be using Python’s built-in os, shutil, csv, and datetime modules.

Python Script

Integrate the CSV logging into a file organization script, and save this as organizer_with_log.py.

import os
import shutil
import csv
from datetime import datetime

def setup_csv_logger(log_file_path):
    """
    Sets up the CSV file for logging. Writes headers if the file is new.
    """
    fieldnames = ['Timestamp', 'Original Path', 'New Path', 'File Name', 'Category', 'Status']
    file_exists = os.path.isfile(log_file_path)

    with open(log_file_path, 'a', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        if not file_exists:
            writer.writeheader()  # Write header only if file is new
    
    return log_file_path

def log_movement(log_file_path, original_path, new_path, file_name, category, status="Success"):
    """
    Logs a file movement or status into the CSV report.
    """
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    fieldnames = ['Timestamp', 'Original Path', 'New Path', 'File Name', 'Category', 'Status']
    
    with open(log_file_path, 'a', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writerow({
            'Timestamp': timestamp,
            'Original Path': original_path,
            'New Path': new_path,
            'File Name': file_name,
            'Category': category,
            'Status': status
        })

def organize_files_with_logging(directory_path, log_file_path):
    """
    Organizes files in the given directory into subfolders based on file type
    and logs all movements to a CSV file.
    """
    setup_csv_logger(log_file_path) # Ensure log file and headers are ready

    file_categories = {
        "Images": ['.jpg', '.jpeg', '.png', '.gif', '.bmp', '.tiff', '.webp'],
        "Videos": ['.mp4', '.mov', '.avi', '.mkv', '.flv', '.wmv'],
        "Audio": ['.mp3', '.wav', '.aac', '.flac', '.ogg'],
        "Documents": ['.pdf', '.doc', '.docx', '.xls', '.xlsx', '.ppt', '.pptx', '.txt', '.rtf'],
        "Archives": ['.zip', '.rar', '.7z', '.tar', '.gz'],
        "Executables": ['.exe', '.dmg', '.app', '.bat'],
        "Code": ['.py', '.js', '.html', '.css', '.php', '.java', '.cpp', '.c', '.json'],
        "Design": ['.psd', '.ai', '.sketch', '.xd'],
        "Spreadsheets": ['.csv', '.ods'],
    }

    other_folder_name = "Others"
    other_folder_path = os.path.join(directory_path, other_folder_name)
    os.makedirs(other_folder_path, exist_ok=True)

    print(f"Organizing files in: {directory_path} and logging to {log_file_path}")

    # Create a list of items to process to avoid issues if files are moved
    # while iterating over os.listdir.
    items_to_process = list(os.listdir(directory_path))

    for item in items_to_process:
        original_item_path = os.path.join(directory_path, item)

        if not os.path.isfile(original_item_path):
            # Skip directories or items that are no longer files (e.g. moved by previous iteration)
            if os.path.isdir(original_item_path) and (item in file_categories.keys() or item == other_folder_name):
                # Skip the category folders we are creating or the 'Others' folder
                continue
            print(f"Skipping non-file or already processed item: '{item}'")
            continue
        
        file_extension = os.path.splitext(item)[1].lower()
        moved = False
        target_category = "Uncategorized" # Default for logging

        for category, extensions in file_categories.items():
            if file_extension in extensions:
                target_folder_path = os.path.join(directory_path, category)
                os.makedirs(target_folder_path, exist_ok=True)

                destination_path = os.path.join(target_folder_path, item)
                
                # Check if a file with the same name already exists in the destination
                if os.path.exists(destination_path):
                    status_message = f"File already exists in '{category}/'. Skipping."
                    print(f"Warning: '{item}' - {status_message}")
                    log_movement(log_file_path, original_item_path, destination_path, item, category, status_message)
                    moved = True # Consider it 'handled'
                    
                    break # Move to next item
                
                try:
                    shutil.move(original_item_path, destination_path)
                    print(f"Moved '{item}' to '{category}/'")
                    log_movement(log_file_path, original_item_path, destination_path, item, category, "Success")
                    moved = True
                    target_category = category
                    
                    break # Move to next item

                except shutil.Error as e:
                    status_message = f"Error moving '{item}': {e}"
                    print(status_message)
                    log_movement(log_file_path, original_item_path, destination_path, item, category, status_message)
                    moved = True # Prevent it from going to 'Others' if an error occurred in a specific category
                    
                    break # Move to next item

        if not moved:
            # If no category matched or an error occurred that prevented definitive move
            destination_path = os.path.join(other_folder_path, item)
            
            # Check if a file with the same name already exists in 'Others'
            if os.path.exists(destination_path):
                status_message = f"File already exists in '{other_folder_name}/'. Skipping."
                print(f"Warning: '{item}' - {status_message}")
                log_movement(log_file_path, original_item_path, destination_path, item, other_folder_name, status_message)
            else:
                try:
                    shutil.move(original_item_path, destination_path)
                    print(f"Moved '{item}' to '{other_folder_name}/'")
                    log_movement(log_file_path, original_item_path, destination_path, item, other_folder_name, "Success")
                except shutil.Error as e:
                    status_message = f"Error moving '{item}': {e}"
                    print(status_message)
                    log_movement(log_file_path, original_item_path, destination_path, item, other_folder_name, status_message)

    print("\nFile organization and logging complete!")

if __name__ == "__main__":
    # !!! IMPORTANT: Replace this with the actual path you want to organize !!!
    # It's recommended to test on a dedicated test folder first.
    target_directory = os.path.expanduser("~/Desktop/test_folder_for_logging")

    # Define the path for your log file.
    # It's good practice to place it outside the target_directory
    # or ensure it's not moved by the script itself.
    log_file = os.path.join(os.path.expanduser("~"), "file_movement_log.csv")

    # Ensure the target directory exists for testing
    os.makedirs(target_directory, exist_ok=True) 

    # Create some dummy files for testing
    # You might want to remove these lines after initial testing
    dummy_files = [
        "document.pdf", "image.jpg", "video.mp4", "audio.mp3",
        "archive.zip", "script.py", "presentation.pptx", "report.txt",
        "unknown_file.xyz", "another_image.png", "old_doc.doc"
    ]
    for filename in dummy_files:
        with open(os.path.join(target_directory, filename), 'w') as f:
            f.write("dummy content")

    print("Dummy files created in test folder for demonstration.")
    
    organize_files_with_logging(target_directory, log_file)
Understanding the Core Logic

Our enhanced script will:

  1. Define the target directory: The folder to organize.
  2. Define file categories: Similar to the desktop organizer.
  3. Initialize a CSV logger: Set up a CSV file to write our log entries.
  4. Iterate and move files: Scan the directory, identify file types, and move them.
  5. Log each movement: After each successful file move, record the details (original path, new path, timestamp, file category) into the CSV file.
  6. Handle existing log file: Append to an existing log or create a new one with headers.
How the Script Works

Let’s break down the new logging components of the script.

1. Importing New Modules
import csv
from datetime import datetime
  • csv: This module provides classes to read and write tabular data in CSV (Comma Separated Values) format. Perfect for our structured log.
  • datetime: From the datetime module, we import datetime to get the current timestamp for each log entry.
2. Setting Up the CSV Logger
def setup_csv_logger(log_file_path):
    fieldnames = ['Timestamp', 'Original Path', 'New Path', 'File Name', 'Category', 'Status']
    file_exists = os.path.isfile(log_file_path)

    with open(log_file_path, 'a', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        if not file_exists:
            writer.writeheader()
    
    return log_file_path
  • fieldnames: Defines the column headers for our CSV report.
  • os.path.isfile(log_file_path): Checks if the log file already exists.
  • open(log_file_path, 'a', newline=''): Opens the CSV file in append mode ('a'). newline='' is crucial to prevent blank rows in Windows.
  • csv.DictWriter(csvfile, fieldnames=fieldnames): Creates a DictWriter object. This is handy because we can write rows using dictionaries, where keys match fieldnames.
  • writer.writeheader(): If the file is new, this writes the fieldnames as the first row in the CSV.
3. Logging Each Movement
def log_movement(log_file_path, original_path, new_path, file_name, category, status="Success"):
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    fieldnames = ['Timestamp', 'Original Path', 'New Path', 'File Name', 'Category', 'Status']
    
    with open(log_file_path, 'a', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writerow({
            'Timestamp': timestamp,
            'Original Path': original_path,
            'New Path': new_path,
            'File Name': file_name,
            'Category': category,
            'Status': status
        })
  • datetime.now().strftime(...): Gets the current date and time and formats it into a readable string.
  • writer.writerow({...}): Writes a dictionary as a row to the CSV. Each key-value pair corresponds to a column and its data. We capture the original_path, new_path, file_name, assigned category, and a status (e.g., “Success”, “Error”).
4. Integration into File Organization Logic

The organize_files_with_logging function now calls setup_csv_logger at the beginning and log_movement after each successful move or in case of an error during a move. This ensures every action (or failed action) is recorded.

Running the Script
  1. Save the code: Save the Python script (e.g., organizer_with_log.py).
  2. Specify target directory: Set target_directory to the folder you want to organize. Always test with a dummy folder first.
  3. Specify log file path: Set log_file to where you want your CSV report to be saved. It’s often best to save this outside the directory being organized.
  4. Open your terminal or command prompt.
  5. Navigate to the script’s directory:

    cd /path/to/your/script
    
  6. Run the script:
    python organizer_with_log.py
    

You’ll see messages in your terminal about file movements. After the script completes, navigate to the log_file path you specified. You should find a file_movement_log.csv (or whatever you named it).

You’ve successfully enhanced your Python file organization script to generate a detailed CSV log of all file movements. This adds a crucial layer of accountability and transparency to your automated tasks. Whether for personal record-keeping, professional auditing, or debugging, having a log of file system changes is invaluable.

Additional Resources

The following tutorials explain how to perform other common file-handling tasks in Python: