Back to Code Snippets
Fetch HackerNews Stories with read_json()
You can read JSON from any endpoint into DuckDB. This uses read_json() to fetch all HackerNews 'Who is hiring' stories and create a table with all stories and the comment IDs. It works by: - creating a list of URLs - reading the list and creating the table - querying the table
Create JSON URLsSQL
-- https://hn.algolia.com/api with urls as ( select 'https://hn.algolia.com/api/v1/search_by_date?tags=author_whoishiring&page=' || page_num || '&query=hiring?&attributesToRetrieve=title,tags,comment_text,created_at,object_id,story_id,story_title,children,story_text' as url from unnest(range(0,10)) as t(page_num) ) select array_to_json(list(url)) from urlsCopy code
Read JSON from API endpointSQL
create or replace table hn_stories as ( select hit.title as story_title, hit.story_text, hit.created_at, hit.objectId, hit.children, from read_json( -- your_url_list_from_above -- ), unnest(hits) as t(hit) )Copy code
Dumky de Wilde
Expand
Share link