Contributing to Kedro-Dagster

Thank you for your interest in contributing to Kedro-Dagster! This document provides guidelines for contributing to the project.

Code of Conduct

We are committed to providing a welcoming and inclusive environment for all contributors. Please be respectful and considerate in all interactions.

Getting Started

Prerequisites

  • Python 3.11+
  • uv (recommended)
  • just (optional, for task automation)
  • Git

Development Setup

  1. Fork the repository on GitHub

  2. Clone your fork:

git clone https://github.com/YOUR_USERNAME/kedro-dagster.git
cd kedro-dagster
  3. Install dependencies:
uv sync --group dev
  4. Install pre-commit hooks:
uv run pre-commit install

Development Workflow

Making Changes

  1. Create a new branch:
git checkout -b feature/my-feature
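  2. Make your changes

  3. Run tests (choose one of the following, depending on whether you use just, nox, or uv directly):

just test
uvx nox -s test
uv run pytest
  4. Format and fix code (again, choose just, nox, or the direct uv commands):
just fix
uvx nox -s fix
uv run ruff format src tests
uv run ruff check src tests --fix
uv run ty check src
  5. Commit your changes: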
git add .
git commit -m "feat: add my feature"

We follow Conventional Commits for commit messages. The commit message format is enforced by commitizen pre-commit hooks, which will validate your commit messages automatically.

Valid commit message examples:

  • feat: add new feature
  • fix: resolve bug in calculation
  • docs: update installation guide
  • chore: update dependencies
  • test: add tests for new feature

Running Tests

Kedro-Dagster uses pytest with markers to categorize tests into different types:

  • Fast tests: Unit tests that run quickly without subprocess calls or heavy I/O
  • Slow tests: Tests marked with @pytest.mark.slow that take longer to execute
  • Integration tests: Tests marked with @pytest.mark.integration that run subprocesses or test multiple components together

Test Commands

Where several commands are listed together, they are alternatives for just, nox, and uv respectively; run whichever matches your setup.

Run fast tests only (recommended during development):

just test-fast
uvx nox -s test_fast
uv run pytest -m "not slow and not integration"

Run slow and integration tests:

just test-slow
uvx nox -s test_slow
uv run pytest -m "slow or integration"

Run all tests:

just test
uvx nox -s test
uv run pytest

Run tests with coverage:

just test-cov
uvx nox -s test_coverage
uv run pytest --cov=kedro_dagster --cov-report=html

Run tests across multiple Python versions:

uvx nox -s test

When to Mark Tests as Slow or Integration

Mark your tests appropriately to help maintain fast feedback during development:

  • Use @pytest.mark.slow for tests that:

    • Take more than a few seconds to run
    • Perform heavy computations
    • Make network requests
    • Access external resources
  • Use @pytest.mark.integration for tests that:

    • Run subprocess commands
    • Test multiple components working together
    • Require complex setup or teardown
    • Exercise end-to-end workflows

Example:

import pytest

@pytest.mark.slow
def test_large_computation():
    # Long-running test
    pass

@pytest.mark.integration
@pytest.mark.slow
def test_end_to_end_workflow():
    # Complex integration test
    pass

Test Organization

Follow these conventions when writing tests:

Class-based test structure: Group related tests into classes using the Test<Component><Scenario> naming pattern.
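
As an illustration of the naming pattern, a minimal sketch might look like the following. The helper `normalize_name` and both class names are hypothetical and exist only to give the tests something to exercise; they are not part of Kedro-Dagster.

```python
def normalize_name(name: str) -> str:
    """Hypothetical helper, defined here only so the tests have a target."""
    return name.strip().lower().replace(".", "__")


class TestNormalizeNameValidInput:
    """Component = NormalizeName, scenario = ValidInput."""

    def test_lowercases_name(self):
        assert normalize_name("Raw_Data") == "raw_data"

    def test_replaces_dots(self):
        assert normalize_name("ns.dataset") == "ns__dataset"


class TestNormalizeNameWhitespace:
    """Same component, different scenario: surrounding whitespace."""

    def test_strips_whitespace(self):
        assert normalize_name("  raw  ") == "raw"
```

Splitting scenarios into separate classes keeps each class small and makes it easy to select one scenario with `pytest -k`.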

Fixture usage: Prefer fixtures from conftest.py over module-level data. See tests/conftest.py for available factories.
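
A rough sketch of the factory-fixture idea is shown below. The names `build_catalog_config` and `catalog_config_factory` and the example values are assumptions made for illustration; the factories actually provided in tests/conftest.py may look different.

```python
import pytest


def build_catalog_config(**overrides):
    """Hypothetical factory: returns a fresh config dict on every call."""
    config = {"type": "pandas.CSVDataset", "filepath": "data/example.csv"}
    config.update(overrides)
    return config


@pytest.fixture
def catalog_config_factory():
    """Expose the factory as a fixture so tests do not share mutable module-level data."""
    return build_catalog_config


class TestCatalogConfigOverrides:
    def test_override_replaces_default(self, catalog_config_factory):
        config = catalog_config_factory(filepath="data/other.csv")
        assert config["filepath"] == "data/other.csv"
        assert config["type"] == "pandas.CSVDataset"
```

Because every call builds a new dict, one test cannot leak modifications into another, which is the main risk with module-level data.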

Property-based testing: Hypothesis is available for property-based testing of edge cases and invariants.
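
For readers new to Hypothesis, a small sketch of a property-based test follows. The helper `to_safe_name` is hypothetical and not taken from the Kedro-Dagster code base; it only serves to show how invariants can be stated over generated inputs.

```python
import re

from hypothesis import given, strategies as st


def to_safe_name(name: str) -> str:
    """Hypothetical helper: replace every character outside [A-Za-z0-9_] with '_'."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


@given(st.text())
def test_safe_name_only_contains_allowed_characters(name):
    # Invariant: the output never contains a disallowed character.
    assert re.fullmatch(r"[A-Za-z0-9_]*", to_safe_name(name))


@given(st.text())
def test_safe_name_is_idempotent(name):
    # Invariant: applying the helper twice changes nothing further.
    once = to_safe_name(name)
    assert to_safe_name(once) == once
```

Hypothesis generates many input strings per test, including empty and non-ASCII ones, which is where hand-written examples tend to miss edge cases.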

CI Test Strategy

The CI pipeline uses a two-tier testing strategy optimized for fast feedback:

  1. Fast tests (test-fast job): Runs on minimum and maximum Python versions (3.11, 3.14) only:

    • Draft PRs: Ubuntu only - Quick feedback in ~2-3 minutes
    • Ready PRs/Main: All OS - Ubuntu, Windows, macOS - Cross-platform validation
  2. Full test suite (test-full job): Runs all tests (fast + slow + integration) on Ubuntu across all Python versions (3.11-3.14) for non-draft PRs and on the main branch. This comprehensive validation includes coverage reporting on the minimum supported Python version.
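
To make the two tiers concrete, the sketch below enumerates the job combinations described above. It is a model of the prose, not the actual workflow configuration; the function name `ci_jobs` is hypothetical, and the intermediate versions 3.12 and 3.13 are assumed from the stated range 3.11-3.14.

```python
# Assumed expansion of the "3.11-3.14" range mentioned above.
PYTHON_VERSIONS = ["3.11", "3.12", "3.13", "3.14"]


def ci_jobs(is_draft: bool) -> list[tuple[str, str, str]]:
    """Return (job, os, python) combinations for a draft PR or a ready PR / main."""
    fast_pythons = [PYTHON_VERSIONS[0], PYTHON_VERSIONS[-1]]  # minimum and maximum only
    fast_os = ["ubuntu"] if is_draft else ["ubuntu", "windows", "macos"]
    jobs = [("test-fast", os_name, py) for os_name in fast_os for py in fast_pythons]
    if not is_draft:
        # The full suite runs on Ubuntu across every supported version.
        jobs += [("test-full", "ubuntu", py) for py in PYTHON_VERSIONS]
    return jobs
```

Under these assumptions a draft PR triggers two fast jobs, while a ready PR or a push to main triggers six fast jobs plus four full-suite jobs.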

Code Quality

Run linters and type checkers:

just lint
uvx nox -s lint
uv run ruff check src tests
uv run ty check src

Format code and fix issues:

just fix
uvx nox -s fix
uv run ruff format src tests
uv run ruff check src tests --fix
uv run ty check src

Run all quality checks:

just check
just fix && just test

Docstring Standards

All public functions, methods, and classes require NumPy-style docstrings. Coverage is enforced at 100% by interrogate.

Check docstring coverage:

uvx interrogate src

Required sections (as applicable):

  • Parameters - All function/method parameters with types and descriptions
  • Returns - Return value type and description
  • Raises - Exceptions raised
  • See Also - Related classes/functions
  • References - Academic references for algorithms or methods used
  • Notes - Implementation details, mathematical background
  • Examples - Usage examples (tested via pytest --doctest-modules)

See Also format:

Use standard numpydoc format with short backtick names. The mkdocs-autorefs plugin automatically links backtick references (e.g., `ClassName`) to the corresponding API pages in rendered documentation. This means plain backtick-wrapped names in docstrings become clickable links in the docs site without any special syntax.

For hyperlinks, always use Markdown syntax: [text](url).
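
As a reference point, a docstring covering the most common sections might look like the sketch below. The function `clamp` is a made-up example, not part of the project, and the See Also entries point at Python built-ins purely to show the backtick form.

```python
def clamp(value: float, lower: float, upper: float) -> float:
    """Restrict a value to a closed interval.

    Parameters
    ----------
    value : float
        The number to clamp.
    lower : float
        Lower bound of the interval.
    upper : float
        Upper bound of the interval.

    Returns
    -------
    float
        The input value if it lies within the interval, otherwise the nearest bound.

    Raises
    ------
    ValueError
        If the lower bound is greater than the upper bound.

    See Also
    --------
    `min` : Built-in used to apply the upper bound.
    `max` : Built-in used to apply the lower bound.

    Examples
    --------
    >>> clamp(5.0, 0.0, 1.0)
    1.0
    >>> clamp(-2.0, 0.0, 1.0)
    0.0
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return max(lower, min(value, upper))
```

The Examples section doubles as a test when collected with pytest --doctest-modules, so keep the shown outputs exact.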

Documentation

Build documentation:

just build
uvx nox -s build_docs
uv run mkdocs build

Serve documentation locally:

just serve
uvx nox -s serve_docs
uv run mkdocs serve

View all available commands:

just --list

Before You Open a PR

  • Run just test-fast - all fast tests pass
  • Run just fix - code is formatted and linted
  • Write or update tests for your changes
  • If you changed docs, run just serve and verify they render
  • Use conventional commit messages
  • Keep the PR focused on a single concern

Submitting Changes

  1. Push your changes to your fork:
git push origin feature/my-feature
  2. Open a Pull Request on GitHub

  3. Ensure all CI checks pass

  4. Wait for review and address any feedback

Pull Request Guidelines

  • Write clear, descriptive PR titles following Conventional Commits
  • Include a description of the changes
  • Add tests for new functionality
  • Update documentation as needed
  • Ensure all tests pass
  • Keep PRs focused and atomic

Commit Message Convention

We use Conventional Commits enforced by commitizen:

  • feat: - New features (triggers minor version bump)
  • fix: - Bug fixes (triggers patch version bump)
  • docs: - Documentation changes
  • style: - Code style changes (formatting, etc.)
  • refactor: - Code refactoring
  • test: - Adding or updating tests
  • chore: - Maintenance tasks
  • perf: - Performance improvements
  • ci: - CI/CD changes

Breaking changes: Add ! after the type or add BREAKING CHANGE: in the footer to trigger a major version bump.

Example with scope:

git commit -m "feat(api): add new endpoint for user data"

Example with breaking change:

git commit -m "feat!: redesign authentication system

BREAKING CHANGE: authentication now requires API keys instead of passwords"

The pre-commit hook will validate your commit messages and prevent commits that don't follow the convention.
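
The rules above can be sketched in a few lines of Python. This is only an illustration of the convention as described in this section: the regular expression and the function `version_bump` are assumptions, not the validation logic commitizen actually uses, and only the bumps stated above (feat, fix, breaking changes) are modelled.

```python
import re
from typing import Optional

# Types listed in this section; the pattern itself is an illustrative assumption.
TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore", "perf", "ci")
HEADER = re.compile(
    rf"^(?P<type>{'|'.join(TYPES)})(\((?P<scope>[^()]+)\))?(?P<breaking>!)?: (?P<subject>.+)$"
)


def version_bump(message: str) -> Optional[str]:
    """Return 'major', 'minor', 'patch', or None for a commit message.

    Raises ValueError when the first line does not match the header pattern.
    """
    header, _, body = message.partition("\n")
    match = HEADER.match(header)
    if match is None:
        raise ValueError(f"not a conventional commit header: {header!r}")
    # A '!' after the type or a BREAKING CHANGE: footer means a major bump.
    if match["breaking"] or "BREAKING CHANGE:" in body:
        return "major"
    # Only feat and fix are described as triggering a bump; other types return None.
    return {"feat": "minor", "fix": "patch"}.get(match["type"])
```

For example, `version_bump("feat(api): add new endpoint for user data")` yields `"minor"`, while a message such as `"added stuff"` raises `ValueError`.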

Release Process

Maintainers only

The release process is managed by project maintainers. Contributors do not need to create releases. Open PRs and a maintainer will handle versioning and publishing.

Releases are fully automated through GitHub Actions when a new tag is pushed, with a manual approval gate before publishing to PyPI to ensure quality control.

graph LR
    A[Push Tag<br/>v*.*.*] --> B[changelog.yml]
    B --> C[Generate<br/>CHANGELOG.md]
    B --> D[Build Package<br/>validation]
    C --> E[Create PR]
    E --> F[Review & Merge<br/>PR]
    F --> G[publish-release.yml]
    G --> H[Create GitHub<br/>Release]
    H --> I{Manual<br/>Approval}
    I -->|Approve| J[Publish to PyPI]
    style I fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style J fill:#10b981,stroke:#333,stroke-width:2px,color:#fff

How It Works

  1. Tag a release:

    git tag v0.2.0 -m "Release v0.2.0"
    git push origin v0.2.0

  2. Automated changelog workflow (changelog.yml):

    • Generates changelog from conventional commits using git-cliff
    • Creates a Pull Request with the updated CHANGELOG.md
    • Builds the package distributions (wheels and sdist) for immediate validation
    • Stores distributions as workflow artifacts (reused later to avoid rebuilding)
  3. Review and merge the changelog PR:

    • A maintainer reviews the generated changelog
    • Once approved, merge the PR to main
  4. Automated release workflow (publish-release.yml):

    • Creates a GitHub Release with generated release notes
    • Attaches distribution files to the release
    • Waits for manual approval before proceeding to PyPI
  5. Manual approval for PyPI publishing:

    • Designated reviewers receive a notification
    • Review the GitHub Release to verify everything is correct
    • Approve the deployment to publish to PyPI
    • Package is published using Trusted Publishing (OIDC, no tokens needed)
  6. Release notes generation:

    • All commits since the last tag are analyzed
    • Commits are grouped by type (Added, Fixed, Documentation, etc.)
    • Only commits following conventional format are included
    • Breaking changes are highlighted
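
The grouping step can be pictured with the following sketch. It is not git-cliff's configuration or implementation: the function `group_commits`, the mapping of feat, fix and docs to the section titles named above, the "Other" fallback, and the "[breaking]" prefix are all assumptions made for illustration.

```python
import re
from collections import defaultdict

HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^()]+\))?(?P<breaking>!)?: (?P<subject>.+)$")
KNOWN_TYPES = {"feat", "fix", "docs", "style", "refactor", "test", "chore", "perf", "ci"}
# Assumed mapping from commit type to the section titles mentioned above.
SECTION_TITLES = {"feat": "Added", "fix": "Fixed", "docs": "Documentation"}


def group_commits(headers: list[str]) -> dict[str, list[str]]:
    """Group conventional commit headers by section, skipping non-conforming ones."""
    sections: dict[str, list[str]] = defaultdict(list)
    for header in headers:
        match = HEADER.match(header)
        if match is None or match["type"] not in KNOWN_TYPES:
            continue  # only commits in conventional format are included
        title = SECTION_TITLES.get(match["type"], "Other")
        subject = match["subject"]
        if match["breaking"]:
            subject = f"[breaking] {subject}"  # highlight breaking changes
        sections[title].append(subject)
    return dict(sections)
```

A commit such as "wip stuff" is silently dropped here, which mirrors why non-conventional commit messages do not appear in the generated release notes.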

Version Numbering

This project uses Semantic Versioning:

  • Major (1.0.0): Breaking changes
  • Minor (0.1.0): New features (backward compatible)
  • Patch (0.0.1): Bug fixes (backward compatible)

Use conventional commits to communicate the type of change, and select the appropriate version number when tagging.
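
When choosing the next tag, the arithmetic is simple enough to sketch. The helper `bump_version` below is hypothetical and only illustrates the Semantic Versioning rules listed above; it ignores pre-release and build metadata suffixes.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next version for a 'major', 'minor', or 'patch' bump."""
    # Accept tags with or without the leading "v", e.g. "v0.2.0" or "0.2.0".
    major, minor, patch = (int(piece) for piece in version.removeprefix("v").split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```

Note that lower components reset to zero on a higher bump, so a minor bump of 0.1.3 gives 0.2.0 rather than 0.2.3.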

Questions?

If you have any questions, feel free to:

Thank you for contributing! 🎉