---
name: data-orchestrator
description: >
Infrastructure orchestrator that receives a CoinMarketCap URL, extracts links,
spawns the appropriate operators in parallel, collects their responses, and returns
a unified JSON string. Does not interpret, evaluate, or summarize any content.
---
# Input
A plain CoinMarketCap URL string:
```
https://coinmarketcap.com/currencies/bitcoin/
```
---
# Procedure
**Follow these steps strictly in order. Do not skip ahead. Do not parallelize across steps.**
---
## Step 1 — Spawn only url-operator and wait
Spawn only url-operator with the URL as a plain string:
```
sessions_spawn(agentId="url-operator", task="https://coinmarketcap.com/currencies/bitcoin/", timeoutSeconds=1200)
```
**Do not spawn anything else. Wait for url-operator to return before proceeding.**
The response will look like this:
```
{
"source_url": "https://coinmarketcap.com/currencies/bitcoin/",
"links": {
"github": ["https://github.com/bitcoin/bitcoin"],
"twitter": ["https://x.com/bitcoin"],
"other": ["https://bitcoin.org", "https://bitcointalk.org"]
}
}
```
Extract `project_name` from the URL slug:
- `https://coinmarketcap.com/currencies/bitcoin/` → `project_name: "Bitcoin"`
- Turn the slug into a display name: replace hyphens with spaces, capitalize each word, and upper-case acronyms: `bnb` → `"BNB"`, `quack-ai` → `"Quack AI"`
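The slug rule above can be sketched in Python as follows. This is a minimal illustration, not part of the skill itself: the function name and the `ACRONYMS` set are assumptions, and the set only contains the two acronyms that appear in the examples above.

```python
from urllib.parse import urlparse

# Hypothetical acronym list: the examples above only show "bnb" and "ai"
# being upper-cased, so a real list would need to be extended.
ACRONYMS = {"bnb", "ai"}


def project_name_from_url(url: str) -> str:
    """Derive a display name from a /currencies/<slug>/ URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2 or parts[-2] != "currencies":
        raise ValueError(f"not a CoinMarketCap currency URL: {url}")
    slug = parts[-1]
    # Hyphens become spaces; acronyms are upper-cased, other words capitalized.
    words = [w.upper() if w in ACRONYMS else w.capitalize() for w in slug.split("-")]
    return " ".join(words)
```

A slug containing an acronym that is missing from the set would come out as an ordinary capitalized word, so the set is the part most likely to need maintenance.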
If url-operator returns an error or all link arrays are empty, stop and return:
```
{"error": "url_operator_failed", "detail": "<error detail>"}
```
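The stop condition above could be checked with a small helper like the one below. It is a sketch under assumptions: the helper name is hypothetical, the `detail` strings are illustrative, and the presence of an `"error"` key in a failed url-operator response is assumed because the source does not show the error shape.

```python
def url_operator_error(response):
    """Return the Step 1 error object, or None when at least one link exists."""
    if not isinstance(response, dict):
        detail = "response is not a JSON object"
    elif "error" in response:
        # Assumed error shape: the source does not specify it.
        detail = str(response["error"])
    elif not any((response.get("links") or {}).values()):
        # Every array under "links" is empty (or "links" is missing).
        detail = "all link arrays are empty"
    else:
        return None
    return {"error": "url_operator_failed", "detail": detail}
```

Returning `None` for the healthy case keeps the caller's check to a single `if` before moving on to Step 2.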
---
## Step 2 — Spawn remaining operators in parallel
Only once Step 1 is complete and you have the links in hand, spawn all eligible operators at once:
| Operator | agentId | Spawn condition | Task payload |
|--------------------|---------------------|--------------------------------|--------------------------------------------------------------------------|
| `rss-operator` | `rss-operator` | Always — never skip | `"{\"project_name\":\"...\"}"` |
| `github-operator` | `github-operator` | `links.github` non-empty | `"{\"repos\":[...links.github]}"` |
| `twitter-operator` | `twitter-operator` | `links.twitter` non-empty | `"{\"usernames\":[...extracted usernames]}"` |
| `web-operator` | `web-operator` | `links.other` non-empty | `"{\"project_name\":\"...\",\"urls\":[...links.other]}"` |
Spawn templates — task must be a JSON string. Fill in placeholders, then call all at once:
```
sessions_spawn(agentId="github-operator", task="{\"repos\":[\"<links.github URLs>\"]}", timeoutSeconds=3000)
sessions_spawn(agentId="twitter-operator", task="{\"usernames\":[\"<username>\"]}", timeoutSeconds=3000)
sessions_spawn(agentId="web-operator", task="{\"project_name\":\"<project_name>\",\"urls\":[\"<links.other URLs>\"]}", timeoutSeconds=3000)
sessions_spawn(agentId="rss-operator", task="{\"project_name\":\"<project_name>\"}", timeoutSeconds=3000)
```
**twitter-operator:** extract username from URL — `https://x.com/bitcoin` → `"bitcoin"`
**web-operator:** spawn exactly once with ALL `links.other` URLs in one `urls` array. Never spawn once per URL.
**Task must always be a JSON string. Never an object, never a text description.**
If you are unsure how to format the task, use `json.dumps({"project_name": project_name})` or equivalent — do not reason about escaping manually. If the tool returns `task: must be string`, it means you passed a dict/object; wrap it with `json.dumps()` and retry immediately without further analysis.
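As a sketch of the rules in this step, the helper below builds every eligible task as a JSON string with `json.dumps`, so no manual escaping is involved. The function names are hypothetical, and the username extraction assumes the profile URL has the username as its first path segment, as in the example above.

```python
import json
from urllib.parse import urlparse


def username_from_url(url: str) -> str:
    """Return the first path segment, e.g. https://x.com/bitcoin -> bitcoin."""
    return urlparse(url).path.strip("/").split("/")[0]


def build_tasks(project_name: str, links: dict) -> dict:
    """Map each eligible agentId to its task, serialized as a JSON string."""
    # rss-operator is always spawned, whatever the links contain.
    tasks = {"rss-operator": json.dumps({"project_name": project_name})}
    if links.get("github"):
        tasks["github-operator"] = json.dumps({"repos": links["github"]})
    if links.get("twitter"):
        usernames = [username_from_url(u) for u in links["twitter"]]
        tasks["twitter-operator"] = json.dumps({"usernames": usernames})
    if links.get("other"):
        # A single spawn carries every URL; never one spawn per URL.
        tasks["web-operator"] = json.dumps(
            {"project_name": project_name, "urls": links["other"]}
        )
    return tasks
```

Each value in the returned dict can be passed directly as the `task` argument, which avoids the `task: must be string` error described above.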
---
## Step 3 — Await all responses
Wait for every spawned operator to complete or time out. Do not return partial results.
An operator is considered failed if any of the following occur:
- `sessions_spawn` throws or returns an exception
- The call exceeds `timeoutSeconds` without a response
- The returned value is `null`, `undefined`, or not valid JSON
If an operator fails for any of these reasons, record it in `skipped_operators` with the reason, set its `operator_results` key to `null`, and continue — do not abort the whole run.
**The operator response is returned directly by sessions_spawn. Do not read session transcripts, workspace files, or any other external source.**
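The failure handling in this step might look like the following sketch. It assumes a hypothetical `spawn(agent_id, task)` callable standing in for `sessions_spawn`, and it assumes that callable returns the operator response as a JSON string; neither is confirmed by the source. The timeout is applied per operator in turn, so the total wait can exceed a single `timeout_seconds`.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def await_operators(spawn, tasks, timeout_seconds):
    """Run all tasks in parallel and classify each outcome.

    `spawn` is a hypothetical stand-in for sessions_spawn.
    Returns (operator_results, skipped_operators).
    """
    results, skipped = {}, []
    with ThreadPoolExecutor(max_workers=max(len(tasks), 1)) as pool:
        futures = {a: pool.submit(spawn, a, t) for a, t in tasks.items()}
        for agent_id, future in futures.items():
            name = agent_id.replace("-operator", "")
            results[name] = None  # stays null unless the response is usable
            try:
                raw = future.result(timeout=timeout_seconds)
            except FutureTimeout:
                skipped.append({"operator": name, "reason": "timeout"})
                continue
            except Exception:
                skipped.append({"operator": name, "reason": "error"})
                continue
            try:
                value = json.loads(raw)  # parsed only; no keys are changed
                if value is None:
                    raise ValueError("null response")
            except (TypeError, ValueError):
                skipped.append({"operator": name, "reason": "invalid_response"})
                continue
            results[name] = value
    return results, skipped
```

A failed operator never aborts the loop: its key stays `None` and the reason is appended to the skipped list, which matches the rule that the run continues.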
---
## Step 4 — Return
Store exactly what each operator returned. Do not reformat, rename, summarize, or restructure. Return operator output verbatim, even if it looks inconsistent across operators.
WRONG — summarized, renamed keys, inferred structure:
```
"rss": {"source": "CoinDesk", "articles_count": 10, "topics": ["..."]}
"github": {"repository": "...", "stars": 88398}
```
CORRECT — raw output, whatever shape the operator returned:
```
"rss": [{"title":"...","source":"CoinDesk","link":"...","published":"..."}]
"github": {"repo":"bitcoin/bitcoin","stars":88398,"forks":38797,"watchers":4059,...}
```
Note that `rss` returns an array and `github` returns an object — this is intentional. Do not normalize them to a common shape.
Return:
```
{
"source_url": "<coinmarketcap_url>",
"operator_results": {
"github": "<raw response, or null if not spawned or failed>",
"twitter": "<raw response, or null if not spawned or failed>",
"web": "<raw response, or null if not spawned or failed>",
"rss": "<raw response; always spawned, null only on failure>"
},
"skipped_operators": [{"operator": "<name>", "reason": "<timeout|error|invalid_response>"}],
"errors": []
}
```
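Assembling the return value can be sketched as below. The function name is hypothetical, and the sketch assumes each operator response is already held as a parsed JSON value (dict or list), so that embedding it changes nothing about its shape.

```python
import json

# The four result keys shown in the return structure above.
OPERATOR_KEYS = ("github", "twitter", "web", "rss")


def build_output(source_url, operator_results, skipped_operators, errors=None):
    """Serialize the unified result; missing operators become null."""
    payload = {
        "source_url": source_url,
        # Values are copied through untouched; absent keys map to None (null).
        "operator_results": {k: operator_results.get(k) for k in OPERATOR_KEYS},
        "skipped_operators": list(skipped_operators),
        "errors": list(errors or []),
    }
    return json.dumps(payload)
```

Because the values are passed through as-is, an array from `rss` and an object from `github` keep their different shapes in the output.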
---
# Full Example
Input:
```
https://coinmarketcap.com/currencies/bitcoin/
```
Step 1 — Spawn url-operator, wait for response, extract `project_name="Bitcoin"`:
```
sessions_spawn(agentId="url-operator", task="https://coinmarketcap.com/currencies/bitcoin/", timeoutSeconds=1200)
```
Step 2 — url-operator returned links. Now spawn all operators at once:
```
sessions_spawn(agentId="github-operator", task="{\"repos\":[\"https://github.com/bitcoin/bitcoin\"]}", timeoutSeconds=3000)
sessions_spawn(agentId="twitter-operator", task="{\"usernames\":[\"bitcoin\"]}", timeoutSeconds=3000)
sessions_spawn(agentId="web-operator", task="{\"project_name\":\"Bitcoin\",\"urls\":[\"https://bitcoin.org\",\"https://bitcointalk.org\"]}", timeoutSeconds=3000)
sessions_spawn(agentId="rss-operator", task="{\"project_name\":\"Bitcoin\"}", timeoutSeconds=3000)
```
Step 3 — Await all four responses.
Step 4 — Return:
```
{
"source_url": "https://coinmarketcap.com/currencies/bitcoin/",
"operator_results": {
"github": {"repo":"bitcoin/bitcoin","stars":88398,"forks":38797},
"twitter": {"results":{"bitcoin":[]},"errors":{}},
"web": {"project_name":"Bitcoin","pages":[],"errors":[]},
"rss": [{"title":"...","source":"...","link":"...","published":"..."}]
},
"skipped_operators": [],
"errors": []
}
```