MCP Server
The aodn_cloud_optimised package ships an optional MCP (Model Context Protocol)
server that exposes the AODN dataset catalog, schema definitions, and Jupyter
notebook templates to AI assistants such as Claude Desktop.
The AI can use the server to:
Discover datasets relevant to a user request (e.g. “mooring temperature near Sydney”).
Inspect schema variables, CF attributes, and S3 location for a specific dataset.
Retrieve the canonical Jupyter notebook template for that dataset.
Adapt the template notebook — adding location filters, date ranges, or custom plots — based on the user’s specific needs.
All catalog information is built from the local config/dataset/*.json files
shipped with the package. No S3 calls or credentials are required to start the
server.
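As a rough illustration of how a catalog can be assembled purely from local JSON files, the sketch below loads every *.json file in a directory and runs a naive keyword search over it. The function names build_catalog and search are hypothetical and are not part of the package; the real server's search is described as fuzzy and is likely more involved.

```python
import json
from pathlib import Path


def build_catalog(config_dir):
    """Map each config file's base name to its parsed JSON content."""
    catalog = {}
    for path in sorted(Path(config_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            catalog[path.stem] = json.load(fh)
    return catalog


def search(catalog, keyword):
    """Return dataset names whose name or serialised config mentions keyword.

    This is a plain case-insensitive substring match (an assumption made for
    illustration), not the fuzzy search the server itself provides.
    """
    needle = keyword.lower()
    return [
        name
        for name, cfg in catalog.items()
        if needle in name.lower() or needle in json.dumps(cfg).lower()
    ]
```

Because everything is read from disk, a catalog built this way needs no network access, which matches the statement that no S3 calls or credentials are required at start-up.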
MCP Server Installation
The MCP server requires the optional mcp extra:
make mcp # or make dev
Or, from the source tree:
pip install -e ".[mcp]"
Starting the Server
The server speaks the MCP protocol over stdio and is designed to be launched by an MCP client. Do not run it directly in a terminal — stdin becomes the JSON-RPC channel, so any keyboard input will appear as malformed JSON to the server.
Note
To verify the server is working you can use the MCP inspector:
npx @modelcontextprotocol/inspector aodn-mcp-server
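To make the warning about keyboard input concrete, the following sketch shows what a well-formed message on the stdio channel might look like, assuming the usual MCP stdio framing of one JSON-RPC 2.0 object per line. The helper names frame_request and is_valid_frame are hypothetical and exist only for this illustration.

```python
import json


def frame_request(request_id, method, params=None):
    """Serialise one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message, separators=(",", ":")) + "\n"


def is_valid_frame(line):
    """Return True if the line parses as a JSON-RPC 2.0 object."""
    try:
        payload = json.loads(line)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("jsonrpc") == "2.0"


# A request an MCP client could send to list the server's tools.
print(frame_request(1, "tools/list"), end="")
# Anything typed by hand in a terminal fails the same check.
print(is_valid_frame("hello\n"))
```

The first print emits {"jsonrpc":"2.0","id":1,"method":"tools/list"} and the second prints False, which is why typing into the server's stdin only produces parse errors.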
Gemini CLI (Linux / Ubuntu)
Gemini CLI reads MCP server
configuration from ~/.gemini/settings.json (user-wide) or
.gemini/settings.json in your project directory (project-specific, takes
precedence).
Important
Use the full absolute path to aodn-mcp-server in your MCP config.
AI CLI tools (Copilot CLI, Gemini CLI) spawn MCP servers in a bare
environment that does not inherit your shell’s PATH or conda
activation. If you use just "command": "aodn-mcp-server", the client
will fail with ENOENT (file not found).
Find the correct path with:
which aodn-mcp-server
# e.g. /home/<your-user>/miniforge3/envs/AodnCloudOptimised/bin/aodn-mcp-server
Create or edit ~/.gemini/settings.json:
{
  "mcpServers": {
    "aodn": {
      "command": "/home/<your-user>/miniforge3/envs/<env>/bin/aodn-mcp-server",
      "env": {
        "AODN_NOTEBOOKS_PATH": "/home/<your-user>/aodn_cloud_optimised/notebooks",
        "AODN_CONFIG_PATH": "/home/<your-user>/aodn_cloud_optimised/aodn_cloud_optimised/config/dataset"
      },
      "trust": true
    }
  }
}
Replace <your-user> and <env> with your username and conda environment
name. The trust: true flag skips confirmation dialogs for each tool call —
remove it if you prefer to approve each action.
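If you prefer to generate the settings rather than type them, a minimal sketch such as the one below resolves the executable to an absolute path first, so the ENOENT failure described above surfaces immediately. The function build_server_entry is hypothetical and not shipped with the package; the dictionary layout simply mirrors the JSON shown above.

```python
import json
import shutil


def build_server_entry(command, notebooks_path, config_path, trust=False):
    """Build an mcpServers block with the command resolved to an absolute path."""
    resolved = shutil.which(command)
    if resolved is None:
        # Fail early instead of letting the MCP client report ENOENT later.
        raise FileNotFoundError(f"cannot find executable: {command}")
    entry = {
        "command": resolved,
        "env": {
            "AODN_NOTEBOOKS_PATH": notebooks_path,
            "AODN_CONFIG_PATH": config_path,
        },
    }
    if trust:
        entry["trust"] = True  # skips per-call confirmation in Gemini CLI
    return {"mcpServers": {"aodn": entry}}


def to_settings_json(settings):
    """Render the settings dictionary as indented JSON text."""
    return json.dumps(settings, indent=2)
```

Run it from the activated conda environment, where aodn-mcp-server is on PATH, then write the output of to_settings_json to ~/.gemini/settings.json.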
Once saved, start Gemini CLI and use /mcp to verify the server is listed and
connected. You can then prompt it naturally, for example:
Give me a notebook for mooring temperature data near Sydney between 2020 and 2023.
GitHub Copilot CLI (Linux)
GitHub Copilot CLI
stores its MCP configuration in ~/.copilot/mcp-config.json (the directory
can be changed with the COPILOT_HOME environment variable).
Option A — interactive setup (recommended for first-time setup):
Start the CLI and run:
/mcp add
Fill in the server details using Tab to move between fields, then press Ctrl+S to save.
Option B — direct JSON editing:
Create or edit ~/.copilot/mcp-config.json:
{
  "mcpServers": {
    "aodn": {
      "type": "stdio",
      "command": "/home/<your-user>/miniforge3/envs/<env>/bin/aodn-mcp-server",
      "env": {
        "AODN_NOTEBOOKS_PATH": "/home/<your-user>/aodn_cloud_optimised/notebooks",
        "AODN_CONFIG_PATH": "/home/<your-user>/aodn_cloud_optimised/aodn_cloud_optimised/config/dataset"
      },
      "tools": ["*"]
    }
  }
}
"tools": ["*"] enables all tools. You can restrict it to a subset, for
example ["search_datasets", "get_dataset_info", "get_notebook_template"].
Once configured, restart the CLI. Use /mcp to confirm the aodn server
is listed. The server tools are available automatically in any session — just
prompt naturally:
Give me a notebook for mooring temperature data near Sydney between 2020 and 2023.
Note
Tool name prefixing (Copilot CLI v1.0.x):
Copilot CLI may call MCP tools as shell commands prefixed with the server
key, e.g. aodn-search_datasets "mooring temperature".
The package registers a standalone executable for every tool so these calls
succeed without any additional configuration:
aodn-search_datasets "wave buoy Tasmania"
aodn-list_datasets --format parquet
aodn-get_dataset_info argo.parquet
aodn-get_dataset_schema satellite_ghrsst_l3s_1d_nrt
aodn-check_dataset_coverage argo \
--lat-min -45 --lat-max -10 --lon-min 140 --lon-max 155 \
--date-start 2020-01-01 --date-end 2020-12-31
aodn-introspect_dataset_live argo.parquet
aodn-get_notebook_template argo.parquet
aodn-get_plot_guide argo.parquet
aodn-get_dataquery_reference
All executables accept --help for a usage summary.
GitHub Copilot in VS Code (Linux)
GitHub Copilot’s Agent Mode supports MCP servers from VS Code 1.99+. You need the GitHub Copilot extension and agent mode enabled.
Option A — Workspace config (repo-specific, checked into version control):
Create .vscode/mcp.json at the root of your project:
{
  "servers": {
    "aodn": {
      "type": "stdio",
      "command": "/home/<your-user>/miniforge3/envs/<env>/bin/aodn-mcp-server",
      "env": {
        "AODN_NOTEBOOKS_PATH": "${workspaceFolder}/notebooks",
        "AODN_CONFIG_PATH": "${workspaceFolder}/aodn_cloud_optimised/config/dataset"
      }
    }
  }
}
${workspaceFolder} expands to the repo root automatically — no hard-coded
paths needed when working from the cloned repository.
Option B — User/global config (applies to all workspaces):
Open VS Code user settings (Ctrl+, → “Open Settings JSON”) and add:
"mcp.servers": {
  "aodn": {
    "type": "stdio",
    "command": "/home/<your-user>/miniforge3/envs/<env>/bin/aodn-mcp-server",
    "env": {
      "AODN_NOTEBOOKS_PATH": "/home/<your-user>/aodn_cloud_optimised/notebooks",
      "AODN_CONFIG_PATH": "/home/<your-user>/aodn_cloud_optimised/aodn_cloud_optimised/config/dataset"
    }
  }
}
Option C — CLI one-liner:
code --add-mcp '{"name":"aodn","type":"stdio","command":"aodn-mcp-server","env":{"AODN_NOTEBOOKS_PATH":"/home/<your-user>/aodn_cloud_optimised/notebooks","AODN_CONFIG_PATH":"/home/<your-user>/aodn_cloud_optimised/aodn_cloud_optimised/config/dataset"}}'
After configuration, open the Copilot Chat panel, switch to Agent Mode
(@workspace → Agent), then press Ctrl+Shift+P and run
MCP: List Servers to confirm aodn is listed and started.
Note
Ensure agent mode is enabled in VS Code settings:
"chat.agent.enabled": true
Claude Desktop Configuration (macOS / Windows)
Edit the Claude Desktop configuration file:
macOS:
~/Library/Application Support/Claude/claude_desktop_config.json
Windows:
%APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "aodn": {
      "command": "/home/<your-user>/miniforge3/envs/<env>/bin/aodn-mcp-server",
      "env": {
        "AODN_NOTEBOOKS_PATH": "/path/to/aodn_cloud_optimised/notebooks",
        "AODN_CONFIG_PATH": "/path/to/aodn_cloud_optimised/aodn_cloud_optimised/config/dataset"
      }
    }
  }
}
Replace the paths with the absolute paths in your cloned repository. If you installed from a wheel that includes notebooks, you can omit the env variables.
Environment Variables
- AODN_NOTEBOOKS_PATH
Absolute path to the directory containing AODN Jupyter notebooks (*.ipynb). If not set, the server attempts to auto-detect the notebooks/ directory relative to the package source tree. Set this variable explicitly when running the server from a non-standard install location.
- AODN_CONFIG_PATH
Absolute path to the directory containing AODN dataset JSON config files (*.json). These files define the schema, CF variable attributes, partitioning strategy, and S3 source paths for each dataset, and share the same base filename as their corresponding notebook (e.g. mooring_temperature_logger_delayed_qc.json ↔ mooring_temperature_logger_delayed_qc.ipynb). If not set, the server loads configs from the installed package via importlib.resources. Set this variable to use configs from a local clone or a custom location.
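The "environment variable first, built-in default otherwise" behaviour described for both variables can be sketched as follows. This is only an illustration under stated assumptions: resolve_dir is a hypothetical helper, and the server's actual auto-detection and importlib.resources lookup are not reproduced here; the fallback is simply passed in by the caller.

```python
import os
from pathlib import Path


def resolve_dir(env_var, fallback, environ=None):
    """Prefer the directory named by env_var; otherwise use the fallback.

    environ defaults to os.environ and can be replaced with a plain dict,
    which keeps the function easy to test.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(env_var)
    if value:
        return Path(value).expanduser()
    return Path(fallback)
```

For example, resolve_dir("AODN_CONFIG_PATH", "/pkg/config", environ={}) returns the fallback, while setting the variable in the mapping overrides it.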
Available MCP Tools
Once connected, an AI assistant has access to the following tools:
| Tool | Description |
|---|---|
| list_datasets | List all available AODN datasets. Supports optional filters for format (… |
| search_datasets | Fuzzy keyword search across dataset names, AWS registry descriptions, and CF variable attributes (… |
| get_dataset_info | Full metadata for a specific dataset: description, S3 ARN, all schema variables with CF attributes, and partitioning strategy. |
| get_dataset_schema | Authoritative variable listing; call this before writing any notebook code. Returns every schema variable with its exact column name, CF role (… |
| get_dataset_summary | Single-call dataset profile: returns everything an AI needs to USE a dataset, namely name, format, data type classification, full AWS description, coordinate and data variable tables, matching notebook path, and ready-to-use code patterns. This replaces calling … |
| check_dataset_coverage | Live S3 coverage query: makes anonymous S3 requests to determine the dataset’s actual temporal extent (first/last timestamp), spatial bounding box, and key global metadata attributes (title, institution, summary, licence, etc.). Accepts optional … |
| introspect_dataset_live | Real variable introspection from the live S3 store. Unlike … |
| validate_notebook | Run a notebook cell by cell and report errors. Uses … |
| … | Interactive Python REPL for per-cell testing. Runs a code snippet in a persistent, named session (… |
| … | Full raw JSON config for a specific dataset (complete … |
| get_notebook_template | Returns the canonical Jupyter notebook for a dataset as readable text. Falls back to a generic template if no dataset-specific notebook exists. |
| get_plot_guide | Returns ready-to-paste plotting code snippets for a specific dataset. Automatically selects Parquet (non-gridded) or Zarr (gridded) patterns, injects real variable names from the schema (including the full variable table), and adds radar-specific vector plots when relevant. |
| get_dataquery_reference | Public API reference for … |
| start_notebook | Start building a validated notebook. Initialises a draft with a title and output path, auto-adds and executes the DataQuery setup cell. Returns a … |
| add_notebook_cell | Add a validated cell to a notebook draft. Code cells are executed in the persistent session BEFORE being committed; if execution fails, the cell is rejected with the traceback and the AI must fix and retry. Markdown cells are added unconditionally. |
| save_notebook | Save and validate a notebook. Writes cells to … |
| replace_notebook_cell | Fix a cell in an existing draft. Replaces a cell by index, with the same execute-then-commit validation as … |
| … | Rescue an existing broken notebook. Validates the … |
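The coverage check in the table above reports a temporal extent and a spatial bounding box. One way to reason about whether a user request falls inside such an extent is a simple interval intersection, sketched below. The Extent class and overlaps function are hypothetical names used for illustration only; the sketch does not query S3 and ignores bounding boxes that cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Extent:
    """A latitude/longitude box plus an inclusive date range."""

    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float
    date_start: date
    date_end: date


def overlaps(dataset, request):
    """True when the request box and date range both intersect the dataset extent."""
    return (
        dataset.lat_min <= request.lat_max
        and request.lat_min <= dataset.lat_max
        and dataset.lon_min <= request.lon_max
        and request.lon_min <= dataset.lon_max
        and dataset.date_start <= request.date_end
        and request.date_start <= dataset.date_end
    )
```

A request only counts as covered when all three axes (latitude, longitude, time) intersect; a dataset that covers the right region in the wrong years returns False.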
Available MCP Resources
| Resource URI | Description |
|---|---|
| … | Machine-readable JSON array of all datasets with name, format, description, S3 ARN, catalogue URL, and variable list. |
Example AI Prompts
The following prompts work well with an MCP-enabled AI assistant:
“Give me a notebook to access mooring temperature data near Sydney between 2020 and 2023.” ← the AI will call check_dataset_coverage to confirm the dataset actually covers the Sydney area and that time range.
“Show me all satellite sea surface temperature datasets available as Zarr.”
“What variables are in the Argo float dataset? Give me a notebook that plots temperature profiles.” ← the AI will call get_dataset_schema and find that the time axis is JULD, not TIME.
“I need ocean chlorophyll-a data from MODIS Aqua for the Coral Sea in 2022. Can you prepare a notebook for that?”
“List all radar datasets covering South Australian waters.”
“Does the SOOP-BA dataset have data in the Bass Strait between 2018 and 2021?” ← directly exercises check_dataset_coverage with lat/lon and date filters.
Notebook Builder Workflow
The recommended workflow for generating validated Jupyter notebooks uses the builder pattern — a sequence that guarantees every code cell has been executed successfully before the notebook is delivered:
┌─────────────────────┐
│ 1. start_notebook │──▶ session_id
└────────┬────────────┘
│
▼ (repeat for each cell)
┌─────────────────────────────┐
│ 2. add_notebook_cell │
│ code → execute → commit │
│ if fails → ❌ reject │
└────────┬────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ 3. save_notebook │
│ write .ipynb → re-execute in fresh │
│ kernel → if ❌ → keep draft open │
└────────┬─────────────────────────────────┘
│ (if validation fails)
▼
┌──────────────────────────────────────────┐
│ 4. replace_notebook_cell(cell_index, …) │
│ fix broken cells → go to step 3 │
└──────────────────────────────────────────┘
Key properties:
start_notebook creates a draft session with a DataQuery setup cell (imports GetAodn, plot_ts_diagram) that is auto-executed on creation. The setup cell adds the notebooks directory to sys.path so imports work in any kernel.
add_notebook_cell executes code cells in the persistent session before committing them. Variables persist across cells (just like a Jupyter kernel). If a cell raises an exception, it is rejected, and the AI must fix the code and retry.
save_notebook writes cells to the .ipynb file, then re-executes the entire notebook in a fresh Jupyter kernel (via validate_notebook). If any cell fails, the draft stays alive and the error report is returned.
replace_notebook_cell replaces a broken cell by index (with the same execute-then-commit validation), then the AI calls save_notebook again.
This architecture is designed so that a notebook with a failing cell is never delivered:
save_notebook will not succeed until every cell passes full-kernel
validation, including setup imports, data queries, and plots.
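The execute-then-commit idea behind the builder can be illustrated with a toy draft object. This is a deliberately simplified sketch, not the server's implementation: the NotebookDraft class and its methods are hypothetical, code runs with exec in the current process rather than in a Jupyter kernel, and nothing is written to an .ipynb file.

```python
import traceback


class NotebookDraft:
    """Toy draft: code cells run in one shared namespace before being kept."""

    def __init__(self):
        self.cells = []        # committed (cell_type, source) pairs
        self._namespace = {}   # variables persist across cells

    def _run(self, source):
        """Execute source; return an empty string on success or the traceback."""
        try:
            exec(source, self._namespace)
        except Exception:
            return traceback.format_exc()
        return ""

    def add_cell(self, source, cell_type="code"):
        """Return (accepted, error_text); failed code cells are not committed."""
        error = self._run(source) if cell_type == "code" else ""
        if error:
            return False, error
        self.cells.append((cell_type, source))
        return True, ""

    def replace_cell(self, index, source, cell_type="code"):
        """Swap in a fixed cell, applying the same execute-then-commit rule."""
        error = self._run(source) if cell_type == "code" else ""
        if error:
            return False, error
        self.cells[index] = (cell_type, source)
        return True, ""
```

A rejected cell leaves the committed list untouched, so the caller can retry with corrected code. Note that in this toy version a cell that fails halfway may still have modified the shared namespace, which is one reason a final re-execution in a fresh kernel, as save_notebook is described as doing, is a useful second check.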
Typical sequence for an oceanographic analysis notebook:
1. search_datasets("wave buoy Tasmania"): find relevant datasets.
2. get_dataset_summary("wave_buoys_realtime_nonqc.parquet"): understand type, variables, code patterns.
3. check_dataset_coverage("wave_buoys_realtime_nonqc.parquet", ...): confirm data exists in the user’s region and time window.
4. start_notebook(title="Wave Buoy Analysis — Tasmania", output_path="wave_buoy_tasmania.ipynb")
5. add_notebook_cell(session_id, "# Introduction\n\nWave buoy analysis...", cell_type="markdown")
6. add_notebook_cell(session_id, "ds = GetAodn('wave_buoys_realtime_nonqc.parquet')\ndf = ds.get_data(...)")
7. add_notebook_cell(session_id, "df.plot(...)"): creates a plot cell.
8. save_notebook(session_id): writes the validated notebook.
Known Code Pitfalls Avoided by the Server
The server instructions and get_plot_guide tool explicitly guard against
these recurring Python errors in oceanographic notebooks:
- 1. Day-of-month overflow (ValueError: Day out of range “2015-04-31”).
Never add 1 to the last day returned by calendar.monthrange() to create an exclusive upper bound: the result is always one day past the end of the month (for example 2015-04-31, since April has only 30 days) and is not a valid date. Use the safe helper instead:
import numpy as np
import pandas as pd

def _next_month_start(yr, m):
    # First day of the month after (yr, m); safe as an exclusive upper bound.
    ts = pd.Timestamp(year=yr, month=m, day=1) + pd.DateOffset(months=1)
    return np.datetime64(ts.strftime('%Y-%m-%d'))
- 2. numpy datetime64 f-string format spec (ValueError: Invalid format specifier '%Y-%m-%d').
The format spec {arr[0]:%Y-%m-%d} fails for numpy.datetime64 values. Always convert first:
pd.Timestamp(arr[0]).strftime('%Y-%m-%d')
- 3. DataQuery standalone functions called as class methods (AttributeError: 'ParquetDataSource' has no attribute 'plot_ts_diagram').
plot_ts_diagram, plot_timeseries, and similar helpers are module-level functions, not methods of any dataset class. Import and call them directly:
from DataQuery import plot_ts_diagram
plot_ts_diagram(df, temp_col='TEMP', psal_col='PSAL', z_col='DEPTH')
- 4. xarray NotImplementedError (slice + method='nearest').
Xarray refuses to combine a range slice and a nearest-neighbour lookup in one .sel() call. Always chain two separate calls:
ds.sel(time=slice(t0, t1)).sel(lat=y, lon=x, method='nearest')
- 5. pandas duplicate-column ValueError.
Renaming a column to a name that already exists creates duplicate columns and breaks many pandas operations. Pass original column names as keyword arguments instead of renaming.
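The first pitfall in the list above can be reproduced with the standard library alone. The sketch below is a stdlib-only variant written for illustration (it uses datetime.date rather than the pandas and numpy helper, and next_month_start is a hypothetical name); it shows the overflow and one way to compute a safe exclusive upper bound.

```python
import calendar
from datetime import date


def next_month_start(year, month):
    """First day of the following month; safe as an exclusive upper bound."""
    if month == 12:
        return date(year + 1, 1, 1)
    return date(year, month + 1, 1)


last_day = calendar.monthrange(2015, 4)[1]  # 30 for April 2015
try:
    date(2015, 4, last_day + 1)  # the overflow described in pitfall 1
except ValueError as exc:
    print("rejected:", exc)

print(next_month_start(2015, 4))   # 2015-05-01
print(next_month_start(2015, 12))  # 2016-01-01
```

Filtering with start <= t < next_month_start(year, month) covers the whole month without ever constructing a day number that does not exist.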
Dataset–Notebook Mapping
Each dataset in config/dataset/ has a corresponding Jupyter notebook in
notebooks/ sharing the same base name. For example:
config/dataset/mooring_temperature_logger_delayed_qc.json
notebooks/mooring_temperature_logger_delayed_qc.ipynb
The notebooks use the standalone DataQuery.py library (see
Module Overview) which provides the GetAodn class and associated
methods for querying and visualising cloud-optimised data on S3.
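Because configs and notebooks share a base name, the mapping can be checked mechanically. The following sketch pairs the two directories by file stem; pair_configs_with_notebooks is a hypothetical helper written for this illustration and is not part of the package.

```python
from pathlib import Path


def pair_configs_with_notebooks(config_dir, notebooks_dir):
    """Map each dataset base name to its notebook path, or None if absent."""
    notebooks = {p.stem: p for p in Path(notebooks_dir).glob("*.ipynb")}
    return {
        cfg.stem: notebooks.get(cfg.stem)
        for cfg in sorted(Path(config_dir).glob("*.json"))
    }
```

Entries whose value is None point to datasets without a dedicated notebook, which is the case where the server is described as falling back to a generic template.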
Testing
A dedicated integration test suite validates all MCP tools, including live S3 coverage queries, notebook execution, and end-to-end user-prompt scenarios. See Testing the MCP Server for full instructions.