Detect Schema Changes Across Datasets (Python)
Compare the schema of two datasets and identify any differences.
import duckdb


def compare_schemas(file1, file2):
    """
    Fetch the schemas of two datasets so they can be compared.

    Args:
        file1 (str): Path to the first dataset (CSV/Parquet).
        file2 (str): Path to the second dataset (CSV/Parquet).

    Returns:
        dict: DESCRIBE rows for each file, under the keys
            "file1_schema" and "file2_schema".
    """
    con = duckdb.connect()

    def describe(path):
        # Choose the reader by extension; anything that is not .parquet is read as CSV.
        reader = "read_parquet" if path.lower().endswith(".parquet") else "read_csv_auto"
        quoted = path.replace("'", "''")  # escape single quotes in the SQL string literal
        return con.execute(f"DESCRIBE SELECT * FROM {reader}('{quoted}')").fetchall()

    try:
        return {"file1_schema": describe(file1), "file2_schema": describe(file2)}
    finally:
        con.close()


# Example usage
differences = compare_schemas("data1.csv", "data2.csv")
print(differences)
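The snippet returns both schemas side by side and leaves the actual comparison to the reader. One way to turn that output into an explicit list of differences is sketched below. The helper name `diff_schemas` and the sample rows are hypothetical and not part of the original snippet; the sketch assumes each row starts with the column name followed by the column type, as DuckDB's DESCRIBE rows do.

```python
def diff_schemas(schema1, schema2):
    """Classify differences between two lists of DESCRIBE-style rows.

    Assumption: each row begins with (column_name, column_type, ...).
    """
    types1 = {row[0]: row[1] for row in schema1}
    types2 = {row[0]: row[1] for row in schema2}
    return {
        # Columns present in the first schema but missing from the second.
        "only_in_file1": sorted(set(types1) - set(types2)),
        # Columns present in the second schema but missing from the first.
        "only_in_file2": sorted(set(types2) - set(types1)),
        # Columns present in both whose declared types differ.
        "type_changes": {
            name: (types1[name], types2[name])
            for name in types1
            if name in types2 and types1[name] != types2[name]
        },
    }


# Hand-written rows standing in for DESCRIBE output (illustrative data only).
old_schema = [
    ("id", "BIGINT", "YES", None, None, None),
    ("name", "VARCHAR", "YES", None, None, None),
    ("price", "DOUBLE", "YES", None, None, None),
]
new_schema = [
    ("id", "VARCHAR", "YES", None, None, None),
    ("name", "VARCHAR", "YES", None, None, None),
    ("created_at", "DATE", "YES", None, None, None),
]

result = diff_schemas(old_schema, new_schema)
print(result)
```

To use it with the snippet above, pass the two lists it returns: `diff_schemas(differences["file1_schema"], differences["file2_schema"])`. Column order and nullability are ignored in this sketch; only names and types are compared.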