Installation

Choose your installation method based on your use case:

Quick Decision Tree:

“I just want to process datasets with existing configs” → Use make core (fastest, minimal dependencies)
“I want to read and analyze cloud-optimised data with Python/Jupyter” → Use make notebooks (adds DataQuery, Jupyter, plotting)
“I want to set up the MCP server for AI assistant integration” → Use make mcp (core + MCP server dependencies)
“I want to contribute code, run tests, and build docs” → Use make dev (all dependencies + development tools)
“I just want to build the documentation locally” → Use make docs (Sphinx + documentation tools only)

Requirements

Python >= 3.11
miniforge3 or Poetry (see installation options below)
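
Before installing, the Python requirement above can be checked with a few lines. This is only a minimal sketch: the helper name `meets_minimum` is illustrative and not part of the project.

```python
import sys

# Minimum version taken from the Requirements list above.
MINIMUM = (3, 11)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True when (major, minor) is at least the required minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

Comparing tuples avoids the pitfalls of comparing version strings, where "3.9" would sort after "3.11".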

Recommended: Data Processing (Core Only)

The fastest way to start processing data is with the core package:

git clone https://github.com/aodn/aodn_cloud_optimised.git
cd aodn_cloud_optimised
make core

This installs the minimal dependency set: zarr, parquet, xarray, dask, and the other packages needed for processing.

Then run the processing command with your dataset configuration:

generic_cloud_optimised_creation --config your_dataset.json

See Quick Start for an example.

Jupyter Notebooks (Data Analysis)

To use the DataQuery API and create analysis notebooks:

git clone https://github.com/aodn/aodn_cloud_optimised.git
cd aodn_cloud_optimised
make notebooks

This adds Jupyter, matplotlib, cartopy, and visualization tools. See Notebooks for examples.

Contributing to the Project

For full development setup (recommended for contributors):

1. Clone the repository

gh repo clone aodn/aodn_cloud_optimised
cd aodn_cloud_optimised

2. Install using Makefile (Poetry venv)

The Makefile is the recommended workflow for contributors:

make dev

This creates a Poetry-managed virtual environment with all tools: testing, documentation, linting, and debugging.

After setup, install pre-commit hooks:

poetry run pre-commit install

Note

Run make dev once after cloning, and again after every git pull that changes poetry.lock or pyproject.toml.
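
One way to decide whether a pull calls for re-running make dev is to look at which files it changed. The sketch below is an illustration, not project tooling: the function names are hypothetical, and it assumes the pull moved HEAD so that git recorded ORIG_HEAD.

```python
import subprocess

# Files named in the note above; a change to either means `make dev` should be re-run.
WATCHED = {"poetry.lock", "pyproject.toml"}


def needs_make_dev(changed_files):
    """Return True if any changed path is one of the watched dependency files."""
    return any(path.strip() in WATCHED for path in changed_files)


def files_changed_by_last_pull():
    """List paths that differ between ORIG_HEAD and HEAD.

    Assumption: the last pull updated HEAD, so ORIG_HEAD points at the
    previous commit. Run this from inside the repository.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", "ORIG_HEAD", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

From the repository root, `needs_make_dev(files_changed_by_last_pull())` returns True when either file was touched by the pull.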

Alternative: Mamba/Conda named environments

For contributors who prefer conda/mamba named environments:

./setup_miniforge_venvs.sh dev        # AodnCloudOptimised_dev (recommended)
./setup_miniforge_venvs.sh notebooks  # AodnCloudOptimised_notebooks
./setup_miniforge_venvs.sh tests      # AodnCloudOptimised_tests
./setup_miniforge_venvs.sh docs       # AodnCloudOptimised_docs
./setup_miniforge_venvs.sh all        # Create all at once

# Activate the environment
mamba activate AodnCloudOptimised_dev
pre-commit install

Note

You may need to reactivate the mamba environment after installation so that all scripts are available on your $PATH.
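
To confirm that the installed scripts are visible after (re)activating the environment, a small check with the standard library is enough. This is a hedged sketch: `missing_commands` is a hypothetical helper, and the command names in the usage note are the ones mentioned elsewhere on this page.

```python
import shutil


def missing_commands(names):
    """Return the subset of command names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]
```

For example, `missing_commands(["generic_cloud_optimised_creation", "pre-commit"])` returns an empty list when both commands resolve; any name it returns suggests the environment needs to be reactivated.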

Dependency Extras

The project uses PEP 621 optional extras so you only install what you need:

Extra	Contents	Use case
(core)	zarr, parquet, xarray, dask, coiled	Data processing pipelines
`notebooks`	core + DataQuery + Jupyter, matplotlib, cartopy, seaborn	Jupyter notebooks, DataQuery API, data analysis
`tests`	core + pytest, coverage, moto	Running the test suite
`docs`	core + Sphinx and tools	Building documentation locally
`mcp`	core + MCP server dependencies	Running the Model Context Protocol server
`dev`	All of the above + poetry, pre-commit, ipdb	Full contributor setup (recommended)

Direct Installation (pip / Poetry)

If you’re not using the Makefile, you can install directly:

Using pip (in an existing environment):

pip install aodn-cloud-optimised                # core only
pip install "aodn-cloud-optimised[notebooks]"   # core + notebooks (quoted so shells such as zsh do not expand the brackets)
pip install "aodn-cloud-optimised[dev]"         # full dev setup

Using Poetry directly:

poetry install --with dev
poetry run pre-commit install

DataQuery Import Note

Note

DataQuery is not exported from the top-level package. Always import it directly:

from aodn_cloud_optimised.lib.DataQuery import GetAodn
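
Because DataQuery lives in a submodule, it can be useful to check that the module is importable in the current environment before running a notebook. The following is a minimal sketch using only the standard library; `module_available` is a hypothetical helper, not part of the package.

```python
from importlib.util import find_spec


def module_available(dotted_name):
    """Return True if the dotted module path can be located.

    find_spec imports parent packages while resolving a submodule, so a
    missing top-level package raises ModuleNotFoundError rather than
    returning None.
    """
    try:
        return find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        return False
```

Calling `module_available("aodn_cloud_optimised.lib.DataQuery")` returns False when the package is not installed in the active environment.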

What’s Next?

Process your first dataset: Quick Start
Configure a new dataset: Dataset Configuration
Write Jupyter notebooks: Notebooks
Set up the MCP server: MCP Server
Start contributing: Development