Getting Started

Tadpole is a lightweight DSL and scraper engine. The DSL is built on KDL, and the CLI is the primary way to run your .kdl scripts.

To follow along, you need:

  • Node.js
  • A modern version of Chrome or Chromium (Tadpole uses CDP, the Chrome DevTools Protocol, for browser automation)

You can install the Tadpole CLI globally using your preferred package manager:

pnpm add -g @tadpolehq/cli

Create a file named hello.kdl. We’ll use a simple script to grab “Today’s featured article” from the Wikipedia main page.

main {
  new_page {
    goto "https://en.wikipedia.org"
    extract data {
      article {
        $ "#mp-tfa"
        text
      }
    }
  }
}

  • main: The execution root of the script.
  • new_page: Creates a new browser tab and initiates a unique CDP session.
  • goto: Navigates to a URL. It automatically waits for the load event before proceeding.
  • extract: Transforms page content into a JSON object. By default, it starts at the document level.
  • $: Scopes the extraction to a specific CSS selector (in this case, #mp-tfa).
  • text: Pulls the innerText from the currently selected node and assigns it to the property name (in this case, article).

Run the script from your terminal:

tadpole run hello.kdl --auto --headless

You should see output similar to:

{
  "data": {
    "article": "Opifex fuscus is a species of mosquito that is endemic..."
  }
}
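
If you want to use the result from another program, the JSON shown above can be parsed directly. The following is a minimal Python sketch, not part of Tadpole itself: the helper names parse_article and run_script are hypothetical, and run_script assumes the CLI writes only the JSON result to stdout, which this guide does not confirm.

```python
import json
import subprocess


def parse_article(raw_output: str) -> str:
    """Return the "article" field from JSON shaped like the output above."""
    result = json.loads(raw_output)
    return result["data"]["article"]


def run_script(path: str) -> str:
    """Run a .kdl script with the same flags used above and return stdout.

    Assumption: the CLI prints only the JSON result to stdout.
    """
    completed = subprocess.run(
        ["tadpole", "run", path, "--auto", "--headless"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Parse the sample output from this guide (no browser or CLI needed).
sample = '{"data": {"article": "Opifex fuscus is a species of mosquito that is endemic..."}}'
print(parse_article(sample))
```

With the CLI installed, parse_article(run_script("hello.kdl")) would chain the two steps, provided the stdout assumption holds.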