Powerful web scraping and data extraction API with AI-powered structured output.
All API requests should be made to the following base URL:
https://api.flashi.ai
Complete endpoint URLs are formed by combining the base URL with the endpoint path. For example: https://api.flashi.ai/v2/scrape
All API requests require authentication using a Bearer token in the Authorization header.
Authorization: Bearer YOUR_API_KEY
Extract clean content from web pages
/v2/scrape
Retrieve cleaned Markdown content from a URL. Returns HTML by default.
Name | Type | Required | Description |
---|---|---|---|
url | string | Required | The URL of the website to scrape. |
format | string | Optional | Output format. Use "json" to receive a JSON object containing the markdown. |
scroll | boolean | Optional | Try scrolling to reveal content. Defaults to true. If false, the page will be loaded as is and quicker response time. |
GET /v2/scrape?url=https://example.com&format=json&should_click_elements=true Authorization: Bearer YOUR_API_KEY
Extract structured data using AI prompts
/v2/extract
Scrape content and extract structured data based on a prompt and JSON schema.
Name | Type | Required | Description |
---|---|---|---|
url | string | Required | The URL to scrape and parse. |
prompt | string | Required | Prompt to guide data extraction (e.g., 'Extract product details'). |
response_format | object | Required | JSON object describing the desired output structure (OpenAI function calling format). |
scroll | boolean | Optional | Try scrolling to reveal content. Defaults to true. If false, the page will be loaded as is and quicker response time. |
include_markdown | boolean | Optional | Include the cleaned Markdown content in the response. Defaults to false. |
POST /v2/extract Content-Type: application/json Authorization: Bearer YOUR_API_KEY { "url": "https://example-product-page.com", "prompt": "Extract the product name, price, and description.", "response_format": { "type": "object", "properties": { "product_name": { "type": "string", "description": "Name of the product" }, "price": { "type": "number", "description": "Price of the product" }, "description": { "type": "string", "description": "Product description" } }, "required": ["product_name", "price"] }, "should_click_elements": false, "include_markdown": true }
Schedule and manage automated scraping tasks
/v2/tasks
Get all scheduled tasks for the authenticated user.
GET /v2/tasks Authorization: Bearer YOUR_API_KEY
/v2/tasks
Create a new scheduled task to run a scraper.
Name | Type | Required | Description |
---|---|---|---|
scraper_id | string | Required | ID of the scraper to run. |
task_name | string | Required | A descriptive name for the task. |
cron_minute | string | Required | Cron expression: minute (0-59 or *). |
cron_hour | string | Required | Cron expression: hour (0-23 or *). |
cron_day_of_month | string | Required | Cron expression: day of month (1-31 or *). |
cron_month | string | Required | Cron expression: month (1-12 or *). |
cron_day_of_week | string | Required | Cron expression: day of week (0-6 or *, Sunday=0). |
cron_timezone | string | Optional | Timezone for the schedule (e.g., 'America/New_York'). Defaults to UTC. |
/v2/tasks/{task_id}
Get details of a specific task by its ID.
Name | Type | Required | Description |
---|---|---|---|
task_id | string | Required | The ID of the task to retrieve (path parameter). |
/v2/tasks/{task_id}
Delete a specific scheduled task.
Name | Type | Required | Description |
---|---|---|---|
task_id | string | Required | The ID of the task to delete (path parameter). |
/v2/tasks/{task_id}/run
Manually trigger a run of the specified task.
Name | Type | Required | Description |
---|---|---|---|
task_id | string | Required | The ID of the task to run (path parameter). |
/v2/tasks/{task_id}/runs
Get a list of all historical runs for a specific task.
Name | Type | Required | Description |
---|---|---|---|
task_id | string | Required | The ID of the task (path parameter). |
Manage scraper configurations
/v2/scrapers
Get all scraper configurations for the authenticated user.
/v2/scrapers
Create a new scraper configuration.
Name | Type | Required | Description |
---|---|---|---|
scraper_name | string | Required | A name for this scraper configuration. |
schema_id | string | Required | ID of the schema defining the desired output structure. |
scraped_url | string | Required | The target URL or URL pattern for the scraper. |
prompt | string | Required | The prompt guiding the data extraction process. |
/v2/scrapers/{scraper_id}
Get details of a specific scraper configuration.
Name | Type | Required | Description |
---|---|---|---|
scraper_id | string | Required | The ID of the scraper to retrieve (path parameter). |
/v2/scrapers/{scraper_id}/run
Manually trigger a run of a specific scraper configuration.
Name | Type | Required | Description |
---|---|---|---|
scraper_id | string | Required | The ID of the scraper to run (path parameter). |
Define data extraction schemas
/v2/schemas
Get all extraction schemas for the authenticated user.
/v2/schemas
Create a new extraction schema.
Name | Type | Required | Description |
---|---|---|---|
schema_name | string | Required | A descriptive name for the schema. |
schema_json | object | Required | JSON object defining the extraction structure (OpenAI function call format). |
/v2/schemas/{schema_id}
Get details of a specific extraction schema.
Name | Type | Required | Description |
---|---|---|---|
schema_id | string | Required | The ID of the schema to retrieve (path parameter). |
Model Context Protocol server for LLM-optimized content extraction
/mcp
MCP server tool for extracting clean HTML/markdown content from web pages, optimized for LLM ingestion.
Name | Type | Required | Description |
---|---|---|---|
url | string | Required | The website URL to scrape and extract content from. |
scroll | boolean | Optional | Enable scrolling to reveal dynamically loaded content. Defaults to true. |
MCP Client Configuration: { "mcpServers": { "flashcrawl": { "command": "python", "args": ["-m", "mcp", "install", "https://api.flashi.ai/mcp"], "headers": { "Authorization": "Bearer your-api-key-here" } } } }