Getting Started

Tadpole is a lightweight DSL and scraper engine. The DSL is powered by KDL, and the CLI is the primary way to execute your .kdl scripts.

Tadpole aims to reduce the complexity of web scraping and automation with:

  • Abstraction: Realistic human behavior (Bézier curves, easing) is simulated through high-level composed actions.
  • Zero Config: Scraper modules are imported and shared directly via Git, bypassing npm/registry overhead.
  • Reusability: Actions and evaluators can be composed through slots to create more complex workflows.

Before installing, make sure you have:

  • Node.js
  • A modern version of Chrome or Chromium (Tadpole uses CDP for browser automation)

You can install the Tadpole CLI globally using your preferred package manager:

Terminal window
pnpm add -g @tadpolehq/cli

Create a file named hello.kdl. We’ll use a simple script to grab “Today’s featured article” from the Wikipedia main page.

main {
  new_page {
    goto "https://en.wikipedia.org"
    extract data {
      article {
        $ "#mp-tfa"
        text
      }
    }
  }
}

  • main: The execution root of the script.
  • new_page: Creates a new browser tab and initiates a unique CDP session.
  • goto: Navigates to a URL. It automatically waits for the load event before proceeding.
  • extract: Transforms page content into a JSON object. By default, it starts at the document level.
  • $: Scopes the extraction to a specific CSS selector (in this case, #mp-tfa).
  • text: Pulls the innerText from the currently selected node and assigns it to the property name (in this case, article).
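
To make the data flow concrete, here is a rough model of the extract step in plain Python. It is not how Tadpole is implemented; it only mimics the mapping described above (property name, then selector, then text) using the standard library. The names TextById, extract, and SAMPLE_HTML are made up for this sketch, the HTML is a hypothetical fragment rather than real Wikipedia markup, and only simple "#id" selectors on well-formed HTML are handled.

```python
import json
from html.parser import HTMLParser

# Elements with no closing tag; they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TextById(HTMLParser):
    """Collects the text inside the first element with the given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            self.done = self.depth == 0

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract(html, fields):
    """Maps each property name to the text under its '#id' selector."""
    data = {}
    for name, selector in fields.items():
        parser = TextById(selector.lstrip("#"))
        parser.feed(html)
        parser.close()
        # Collapse whitespace runs; a rough stand-in for innerText.
        data[name] = " ".join("".join(parser.chunks).split())
    return {"data": data}


# Hypothetical page fragment, not the real Wikipedia markup.
SAMPLE_HTML = """
<body>
  <div id="mp-tfa"><p><i>Opifex fuscus</i> is a species of mosquito.</p></div>
  <div id="mp-dyk">Unrelated section</div>
</body>
"""

result = extract(SAMPLE_HTML, {"article": "#mp-tfa"})
print(json.dumps(result, indent=2))
```

The nested shape of the result mirrors the nesting in the script: the outer key comes from the name given to extract, and the inner key comes from the property block that wraps the selector.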

Run the script from your terminal:

Terminal window
tadpole run hello.kdl --auto --headless

The output should look like this:

{
  "data": {
    "article": "Opifex fuscus is a species of mosquito that is endemic..."
  }
}
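
If you want to use the result in another program, the JSON can be parsed like any other. Here is a minimal Python sketch, assuming the CLI output has been captured as a single JSON string (for example, by redirecting it to a file). The helper name read_article is hypothetical, and the sample string is copied from the output above.

```python
import json

# Sample output copied from the run above. In practice this would be whatever
# the CLI produced (assumed here to be a single JSON document).
raw_output = """
{
  "data": {
    "article": "Opifex fuscus is a species of mosquito that is endemic..."
  }
}
"""


def read_article(raw):
    """Returns the extracted article text, or None if the key is missing."""
    payload = json.loads(raw)
    return payload.get("data", {}).get("article")


article = read_article(raw_output)
print(article)
```

Using .get with a default keeps the helper from raising a KeyError when a selector matched nothing and the property is absent. Whether Tadpole omits the key or emits an empty value in that case is not covered here.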