Groups
A group is the level-1 organisational unit: every scraper belongs to exactly one group, and the group’s dataTable field names the MongoDB collection where its scrapers’ rows land. Use groups to:
- Apply uniform scheduling to a set of scrapers (startType=MANUAL, DAILY, WEEKLY, NONE).
- Share a coherent dataset between customers (Shared groups).
- Trigger downstream operations (product-catalogue matching, data merge) at group granularity.
Endpoint summary
| Method | Path | Operation ID | Auth scope |
|---|---|---|---|
| GET | /api/scraper/group/list | scrapewise_get_scraper_group_list | bearer |
| PUT | /api/scraper/group | scrapewise_create_scraper_group | bearer + idempotency-key |
| PUT | /api/scraper/group/{id}/start-type/{startType} | scrapewise_update_scraper_group_start_type | bearer + plan feature + idempotency-key |
| PUT | /api/scraper/group/{id}/match | scrapewise_update_scraper_group_match | bearer + TEXT_MATCHING + idempotency-key |
| POST | /api/scraper/group/{id}/preview-delete | scrapewise_delete_scraper_group_preview | bearer |
| DELETE | /api/scraper/group/{id} | scrapewise_delete_scraper_group | bearer + idempotency-key |
List groups — GET /api/scraper/group/list
curl -H "Authorization: Bearer $KEY" \
https://portal.scrapewise.ai/api/scraper-api/api/scraper/group/list

Returns the groups you OWN. For groups shared TO you, use GET /api/scraper/shared/group/list.
Response (200) — List<GroupDTO> with id, name, dataTable name, startType, schedule metadata, member-scraper count. Empty list if you have no groups.
Errors — 400 (N/A) / 401 / 403 (N/A) / 404 (N/A) / 429 / 500.
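As a rough illustration of the list call, the sketch below builds the request with Python's standard library and decodes the response body. The helper names (`build_list_groups_request`, `parse_groups`) are hypothetical, and the base URL is taken from the curl example above; treat it as a starting point rather than an official client.

```python
import json
import urllib.request

# Base URL as it appears in the curl example above.
BASE_URL = "https://portal.scrapewise.ai/api/scraper-api"


def build_list_groups_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the groups you own."""
    return urllib.request.Request(
        BASE_URL + "/api/scraper/group/list",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def parse_groups(body: bytes) -> list:
    """Decode the JSON array of GroupDTO objects; an empty list is valid."""
    groups = json.loads(body)
    if not isinstance(groups, list):
        raise ValueError("expected a JSON array of groups")
    return groups


# Sending is left to the caller, for example:
# with urllib.request.urlopen(build_list_groups_request(key)) as resp:
#     groups = parse_groups(resp.read())
```

Keeping request construction separate from sending makes the call easy to inspect and to retry on 429 without rebuilding it.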
Create or update a group — PUT /api/scraper/group
PUT /api/scraper/group
Authorization: Bearer <key>
Idempotency-Key: <uuid>
Content-Type: application/json
{
"id": null,
"name": "competitor-prices",
"dataTable": "competitor_prices",
"startType": "DAILY",
"shouldMatchProducts": false
}

Upsert semantics: omit id to create; set id to update. Creating a new group is plan-feature gated by MAX_GROUPS; updating is not.
Response (200) — the persisted GroupDTO including the assigned id and any server-stamped fields.
Errors — 400 (validation) / 401 / 402 (MAX_GROUPS hit on create) / 403 (N/A) / 404 (N/A) / 429 / 500.
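The upsert behaviour can be sketched as a single request builder that includes `id` only when updating. The function name and its parameters are hypothetical. Generating a UUID as a fallback for the idempotency key is an assumption about typical usage; in practice a caller would likely generate the key once per logical operation and reuse it on retries.

```python
import json
import urllib.request
import uuid

# Base URL as it appears in the curl example for the list endpoint.
BASE_URL = "https://portal.scrapewise.ai/api/scraper-api"


def build_upsert_group_request(api_key, name, data_table, start_type="DAILY",
                               should_match_products=False, group_id=None,
                               idempotency_key=None):
    """Build (but do not send) the PUT request that creates or updates a group."""
    payload = {
        "name": name,
        "dataTable": data_table,
        "startType": start_type,
        "shouldMatchProducts": should_match_products,
    }
    if group_id is not None:
        payload["id"] = group_id  # id present: update; id absent: create
    return urllib.request.Request(
        BASE_URL + "/api/scraper/group",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: a fresh UUID is used when the caller supplies no key.
            "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
            "Content-Type": "application/json",
        },
        method="PUT",
    )
```

A 402 on this call indicates the MAX_GROUPS limit was hit, which can only happen on the create path.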
Set group-wide start type — PUT /api/scraper/group/{id}/start-type/{startType}
PUT /api/scraper/group/5f9a.../start-type/DAILY
Authorization: Bearer <key>
Idempotency-Key: <uuid>

Updates EVERY scraper in the group to share the same StartType. Values: MANUAL, DAILY, WEEKLY, NONE (paused).
Each non-NONE value is plan-feature gated:
- MANUAL → MANUAL_RUN
- DAILY → DAILY_SCHEDULER
- WEEKLY → WEEKLY_SCHEDULER
Response (200) — List<StartTypeForGroupDTO> — per-scraper records reflecting the new state.
Errors — 400 (group not found) / 401 / 402 (plan lacks the chosen feature) / 403 (N/A) / 404 (N/A — 400 used instead) / 429 / 500.
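The mapping between start types and plan features lends itself to a small lookup table, which also lets a client reject unknown values before making the call. The sketch below is illustrative only: `REQUIRED_FEATURE` and `build_start_type_request` are hypothetical names, and treating NONE as ungated is inferred from the statement that each non-NONE value is gated.

```python
import urllib.request
import uuid
from urllib.parse import quote

# Base URL as it appears in the curl example for the list endpoint.
BASE_URL = "https://portal.scrapewise.ai/api/scraper-api"

# Plan feature required for each start type; NONE is assumed to be ungated.
REQUIRED_FEATURE = {
    "MANUAL": "MANUAL_RUN",
    "DAILY": "DAILY_SCHEDULER",
    "WEEKLY": "WEEKLY_SCHEDULER",
    "NONE": None,
}


def build_start_type_request(api_key, group_id, start_type, idempotency_key=None):
    """Build (but do not send) the PUT request that sets a group-wide start type."""
    if start_type not in REQUIRED_FEATURE:
        raise ValueError(f"unknown startType: {start_type!r}")
    path = f"/api/scraper/group/{quote(group_id, safe='')}/start-type/{start_type}"
    return urllib.request.Request(
        BASE_URL + path,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
        },
        method="PUT",
    )
```

Looking up `REQUIRED_FEATURE[start_type]` before sending gives a readable hint about which feature a 402 response is likely to refer to.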
Trigger product matching — PUT /api/scraper/group/{id}/match
PUT /api/scraper/group/5f9a.../match
Authorization: Bearer <key>
Idempotency-Key: <uuid>

Schedules an asynchronous match of the group’s most-recent scraped products against the Bebo master catalogue. Matched products get linked via a master-data id (used downstream for price comparison / catalogue enrichment).
Requires the TEXT_MATCHING plan feature AND shouldMatchProducts=true on the group. Returns 204 No Content once queued — actual matching happens async.
Errors
| Code | Meaning |
|---|---|
| 400 | Group doesn’t exist / not configured for matching (shouldMatchProducts=false) / a previous match is still PENDING or RUNNING |
| 401 | Missing/invalid bearer |
| 402 | Plan lacks TEXT_MATCHING |
| 403 | N/A |
| 404 | N/A |
| 429 | Rate-limited |
| 500 | Scheduling failure |
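Because a 400 on this endpoint covers three different conditions, a client may want to check the group's `shouldMatchProducts` flag locally before calling and translate status codes into messages afterwards. The helpers below are a hypothetical sketch with meanings copied from the table above; they do not send anything.

```python
from typing import Tuple

# Status meanings copied from the error table for the match endpoint.
MATCH_STATUS = {
    204: "queued; matching runs asynchronously",
    400: ("group missing, shouldMatchProducts=false, "
          "or a previous match is still PENDING or RUNNING"),
    401: "missing or invalid bearer",
    402: "plan lacks TEXT_MATCHING",
    429: "rate-limited",
    500: "scheduling failure",
}


def can_request_match(group: dict) -> bool:
    """Local pre-check: the group must have shouldMatchProducts=true."""
    return bool(group.get("shouldMatchProducts"))


def interpret_match_response(status: int) -> Tuple[bool, str]:
    """Return (queued, explanation) for a status code from the match call."""
    return status == 204, MATCH_STATUS.get(status, f"unexpected status {status}")
```

The pre-check only rules out one of the three 400 causes; a missing group or an in-flight match can still be detected only from the server response.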
Delete a group (destructive — two-call protocol) — DELETE /api/scraper/group/{id}
Destructive operation. Deleting a group cascades to delete every scraper inside it. Optionally also drops the entire MongoDB data collection (withData=true) — all historical scraped rows are lost.
ADR-012 two-call pattern: first preview, then commit within 5 minutes.
Steps
1. POST /api/scraper/group/{id}/preview-delete[?withData=true]: mints a 5-minute token and a preview summary.
2. DELETE /api/scraper/group/{id}[?withData=true]: commits the delete (idempotency-key required).
Skipping the preview step deletes rows without confirmation.
Step 1 — Preview
POST /api/scraper/group/5f9a.../preview-delete?withData=true
Authorization: Bearer <key>

Response (200)
{
"token": "9b2d-...",
"opName": "scrapewise_delete_scraper_group",
"targetEntityId": "5f9a...",
"previewSummary": {
"entityName": "5f9a...",
"entityType": "scraper_group",
"cascadeCounts": {},
"warnings": [
"withData=true: the group's entire MongoDB data collection will be dropped (all historical scraped rows lost, irreversible)."
]
}
}

When withData=true, the warnings array surfaces the irreversible-data-loss notice; the user must see this before committing.
Step 2 — Commit
DELETE /api/scraper/group/5f9a...?withData=true
Authorization: Bearer <key>
Idempotency-Key: <uuid>

withData query param semantics:
- false (default) → drops the group and all its scrapers; preserves the MongoDB data collection.
- true → also drops the data collection (irreversible).
Response — 204 No Content.
Errors (both steps)
| Code | Meaning |
|---|---|
| 400 | Group doesn’t exist for this customer |
| 401 | Missing/invalid bearer |
| 403 | N/A |
| 404 | N/A (400 used instead) |
| 429 | Rate-limited |
| 500 | Persistence failure mid-delete |
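The two-call flow can be sketched as one function that previews, shows the warnings to the user, and commits only after confirmation and within the 5-minute window. Everything here is illustrative: `send` is a hypothetical transport that adds the bearer header and returns `(status, parsed_json_or_None)`, and `confirm` is a hypothetical user prompt. This section does not show how the preview token is passed to the DELETE call, so the sketch does not attach it and does not guess at a header or parameter.

```python
import time
import uuid

PREVIEW_TTL_SECONDS = 5 * 60  # the preview token is valid for 5 minutes


def delete_group(send, group_id, with_data, confirm, clock=time.monotonic):
    """Preview, ask for confirmation, then commit. Returns True if deleted.

    send(method, path, headers) -> (status, json_or_None)  (hypothetical)
    confirm(warnings) -> bool                              (hypothetical)
    """
    query = "?withData=true" if with_data else ""
    base = f"/api/scraper/group/{group_id}"

    started = clock()
    status, preview = send("POST", base + "/preview-delete" + query, {})
    if status != 200:
        raise RuntimeError(f"preview failed with status {status}")

    # The user must see the warnings (e.g. irreversible data loss) first.
    warnings = preview["previewSummary"]["warnings"]
    if not confirm(warnings):
        return False
    if clock() - started > PREVIEW_TTL_SECONDS:
        raise TimeoutError("preview older than 5 minutes; request a new one")

    # Token transport is unspecified in this section, so it is not sent here.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    status, _ = send("DELETE", base + query, headers)
    if status != 204:
        raise RuntimeError(f"delete failed with status {status}")
    return True
```

Passing the same `with_data` value to both calls keeps the preview warnings consistent with what the commit will actually drop.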
See also
- Scrapers — scrapers belong to a group
- Scraper jobs — load-history + data merge are group-scoped
- Shared groups — share a group’s read access cross-customer
- Scraped data — read group data