Detect Schema Changes Across Datasets (Python)Python
Compare the schema of two datasets and identify any differences.
Execute this Python
import duckdb def compare_schemas(file1, file2): """ Compare schemas of two datasets and find differences. Args: file1 (str): Path to the first dataset (CSV/Parquet). file2 (str): Path to the second dataset (CSV/Parquet). Returns: list: Schema differences. """ con = duckdb.connect() schema1 = con.execute(f"DESCRIBE SELECT * FROM read_csv_auto('{file1}')").fetchall() schema2 = con.execute(f"DESCRIBE SELECT * FROM read_csv_auto('{file2}')").fetchall() return {"file1_schema": schema1, "file2_schema": schema2} # Example Usage differences = compare_schemas("data1.csv", "data2.csv") print(differences)Copy code
Export/import/share DuckDB UI Notebooks
The DuckDB UI stores notebook content in an internal database called _duckdb_ui. You can query and export notebook content, as well as insert new definitions into the database. Warning: Modifying the internal database may lead to corruption and data loss. Be cautious and use it at your own risk!
Export a notebook definition to JSONSQL
copy ( select "json" from _duckdb_ui.notebook_versions where 1=1 and title = 'MySingleNotebook' and expires is null ) to 'exported-notebook.json'Copy code
Import notebook definitionSQL
set variable notebook_content = (select json from 'exported-notebook.json'); set variable notebook_id = uuid(); set variable current_timestamp = now(); begin transaction; insert into _duckdb_ui.notebooks (id, name, created) select getvariable('notebook_id'), 'notebook_' || getvariable('notebook_id'), getvariable('current_timestamp') ; insert into _duckdb_ui.notebook_versions (notebook_id, version, title, json, created, expires) select getvariable('notebook_id'), 1, 'imported-notebook-' || getvariable('current_timestamp'), getvariable('notebook_content'), getvariable('current_timestamp'), null ; commit;Copy code
DuckDB to geojsonSQL
Snippet helps put correctly formatted geojson file using spatial and json extensions from any table/query.
Execute this SQL
COPY ( SELECT *, -- all these field will become properties in resulting geojson (doesnt work well with map-like columns) ST_GeomFromText(wkt) as geometry -- this is the main geometry entry FROM temp_table ) TO 'test.geojson' WITH (FORMAT GDAL, DRIVER 'GeoJSON', LAYER_NAME 'test');Copy code
Shabbir Marzban
Expand
Share link