A web UI to help mirror GitHub repos to Gitea - including releases, issues, PR, and wikis
Jonas Rosland 9fc76bfaf5 Fix errors with config.json not existing in the repo
Signed-off-by: Jonas Rosland <jonas.rosland@gmail.com>
2025-04-04 01:01:56 -04:00

GitHub to Gitea Mirror

This tool sets up and manages pull mirrors from GitHub repositories to Gitea repositories, including the entire codebase, issues, PRs, releases, and wikis.

I've been eagerly awaiting Gitea's PR 20311 for over a year, but since it keeps getting pushed out for every release I figured I'd create something in the meantime.

Blunt disclaimer: This is a hobby project, and I hope the PR above will be merged and implemented soon. When it is, this project will have served its purpose. I created it from scratch using Cursor and Claude 3.7 Sonnet.

Main screenshot

Features

  • Web UI for managing mirrors and viewing logs
  • Set up GitHub repositories as pull mirrors in Gitea
  • Mirror the entire codebase, issues, and PRs (not just releases)
  • Mirror GitHub releases and release assets with full descriptions and attachments
  • Mirror GitHub wikis to separate Gitea repositories
  • Auto-discover mirrored repositories in Gitea
  • Support for both full GitHub URLs and owner/repo format
  • Comprehensive logging with direct links to logs from error messages
  • Support for large release assets with dynamic timeouts
  • Asset download and upload with size-based timeout calculation
  • Scheduled mirroring with configurable interval
  • Enhanced UI with checkboxes for configuration options
  • Dark mode support
  • Error handling and visibility

Quick Start

Get up and running in minutes:

# Clone the repository
git clone https://github.com/jonasrosland/gitmirror.git
cd gitmirror

# Copy and configure the example .env file
cp .env.example .env
# Edit the .env file with your tokens and Gitea URL

# Start the application
docker-compose up -d

# Access the web UI
# Open http://localhost:5000 in your browser

Prerequisites

  • Docker and Docker Compose (for running the application)
  • GitHub Personal Access Token with repo scope
  • Gitea Access Token with read:user, write:repository, and write:issue scopes
  • Access to both GitHub and Gitea repositories

Configuration

Create a .env file in the same directory as the docker-compose.yml with the following variables:

# GitHub Personal Access Token (create one at https://github.com/settings/tokens)
# Required scopes: repo (for private repositories)
# For public repositories, this is optional but recommended
GITHUB_TOKEN=your_github_token

# Gitea Access Token (create one in your Gitea instance under Settings > Applications)
# Required permissions: read:user, write:repository, write:issue
GITEA_TOKEN=your_gitea_token

# Your Gitea instance URL (no trailing slash)
GITEA_URL=https://your-gitea-instance.com

# Secret key for the web UI (OPTIONAL)
# This key is used to secure Flask sessions and flash messages
# If not provided, a random key will be automatically generated at container start
# SECRET_KEY=your_secret_key

# Log retention days (OPTIONAL, defaults to 30 days)
LOG_RETENTION_DAYS=30
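
To make the required and optional variables concrete, the following sketch shows one way a Python application could read them. The `Settings` class and `load_settings` function are hypothetical names used for illustration; the project's actual loading code may differ.

```python
import os
import secrets
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    github_token: Optional[str]  # optional for public repositories
    gitea_token: str
    gitea_url: str
    secret_key: str
    log_retention_days: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Hypothetical loader mirroring the variables documented above."""
    missing = [name for name in ("GITEA_TOKEN", "GITEA_URL") if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return Settings(
        github_token=env.get("GITHUB_TOKEN") or None,
        gitea_token=env["GITEA_TOKEN"],
        # Normalize the URL so a trailing slash never ends up in API paths
        gitea_url=env["GITEA_URL"].rstrip("/"),
        # Generate a random key when none is provided
        secret_key=env.get("SECRET_KEY") or secrets.token_hex(32),
        log_retention_days=int(env.get("LOG_RETENTION_DAYS") or 30),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests.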

Authentication for Private Repositories

If you want to mirror private GitHub repositories, you must provide a GitHub token with the repo scope. This token is used to authenticate with GitHub when creating the mirror.

For public repositories, the GitHub token is optional but recommended to avoid rate limiting issues.
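
The sketch below shows how an optional token typically translates into request headers for the GitHub REST API. The function name `github_headers` is hypothetical. At the time of writing, GitHub allows roughly 60 unauthenticated requests per hour per IP address versus 5,000 per hour for an authenticated user, which is why a token is recommended even for public repositories.

```python
from typing import Optional


def github_headers(token: Optional[str] = None) -> dict[str, str]:
    """Build headers for GitHub REST API calls (hypothetical helper)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Authenticated requests are required for private repositories
        # and get a much higher rate limit than anonymous ones.
        headers["Authorization"] = f"Bearer {token}"
    return headers
```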

Usage

For easier deployment, you can use Docker Compose:

  1. Start the web UI:
docker-compose up -d
  2. Run the mirror script (one-time execution):
docker-compose run --rm mirror
  3. Run the mirror script for a specific repository:
docker-compose run --rm mirror mirror owner/repo gitea_owner gitea_repo
  4. Run with specific flags:
# Enable mirroring metadata (issues, PRs, labels, milestones, wikis)
docker-compose run --rm mirror mirror --mirror-metadata

# Force recreation of empty repositories (required when an existing repository is empty but not a mirror)
docker-compose run --rm mirror mirror --force-recreate

# Combine flags for a specific repository
docker-compose run --rm mirror mirror owner/repo gitea_owner gitea_repo --mirror-metadata --force-recreate
  5. View logs:
docker-compose logs -f
  6. Stop the services:
docker-compose down
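
For readers who want to see how the arguments in the commands above fit together, here is a rough `argparse` sketch of an equivalent command line. It is an assumption about the shape of the interface, not the project's actual `cli.py`; the argument names `github_repo`, `gitea_owner`, and `gitea_repo` are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser matching the documented 'mirror' invocations."""
    parser = argparse.ArgumentParser(prog="mirror")
    # All three positionals are optional: with none given, the script
    # would run in auto-discovery mode.
    parser.add_argument("github_repo", nargs="?", help="owner/repo or full GitHub URL")
    parser.add_argument("gitea_owner", nargs="?", help="target owner in Gitea")
    parser.add_argument("gitea_repo", nargs="?", help="target repository name in Gitea")
    parser.add_argument("--mirror-metadata", action="store_true",
                        help="mirror issues, PRs, labels, milestones, wikis")
    parser.add_argument("--force-recreate", action="store_true",
                        help="delete and recreate an empty non-mirror repository")
    return parser
```

Parsing `["owner/repo", "gitea_owner", "gitea_repo", "--mirror-metadata"]` yields `github_repo == "owner/repo"` and `mirror_metadata is True`, while an empty argument list leaves all positionals as `None`.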

Using Bind Mounts for Logs

By default, logs are stored in a Docker volume for better permission handling. If you prefer to use bind mounts instead (to access logs directly on the host filesystem), you can modify the docker-compose.yml file:

  1. Change the volume configuration from:
volumes:
  - gitmirror_logs:/app/logs

to:

volumes:
  - ./logs:/app/logs
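  2. Create the logs directory with the correct permissions:
mkdir -p logs
chmod 755 logs
  3. Update the container user to match your host user's UID/GID. Docker Compose does not evaluate shell command substitution such as $(id -u) inside docker-compose.yml, so define PUID and PGID beforehand (in your shell, or as numeric values in the .env file) and reference them:
# In your shell, before running docker-compose:
export PUID=$(id -u)
export PGID=$(id -g)
# In docker-compose.yml:
user: "${PUID}:${PGID}"
  4. Remove the volumes section at the bottom of the file that defines gitmirror_logs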

This setup will:

  • Store logs directly in the logs directory on your host
  • Allow you to access logs without using Docker commands
  • Maintain proper permissions for the container to write logs
  • Keep the same log rotation and retention settings

Using Docker Directly

To run the application with Docker directly:

  1. Build the Docker image:
docker build -t github-gitea-mirror .
  2. Run the container:

    a. Run the web UI (default mode):

    docker run --rm -p 5000:5000 --env-file .env github-gitea-mirror

    b. Run the mirror script in auto-discovery mode:

    docker run --rm --env-file .env github-gitea-mirror mirror

    c. Run the mirror script for a specific repository:

    docker run --rm --env-file .env github-gitea-mirror mirror owner/repo gitea_owner gitea_repo

    d. Run with the force-recreate flag (for empty repositories):

    docker run --rm --env-file .env github-gitea-mirror mirror owner/repo gitea_owner gitea_repo --force-recreate

  3. For persistent storage of logs, mount a volume:

    docker run --rm -p 5000:5000 -v ./logs:/app/logs --env-file .env github-gitea-mirror

How Mirroring Works

When you set up a repository for mirroring, the script performs several types of synchronization:

  1. Code Mirroring: Uses Gitea's built-in pull mirror functionality to sync:

    • The entire codebase
    • All branches and tags

    NOTE: This is done automatically at creation of the mirror repo, and sometimes it takes a while for Gitea to finish the first code sync

  2. Release Mirroring: Uses custom code to sync:

    • Releases and release assets
    • Release descriptions and metadata
    • Release attachments with proper naming and descriptions
  3. Metadata Mirroring: Syncs additional GitHub data:

    • Issues and their comments
    • Pull requests and their comments
    • Labels and milestones
    • Wiki content (if enabled)
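
Release assets can be large, and the feature list mentions size-based timeouts for downloading and uploading them. The sketch below shows one plausible way to derive such a timeout. The function name, the base of 30 seconds, the assumed minimum throughput of 512 KiB/s, and the one-hour cap are all illustrative assumptions, not values taken from the project.

```python
def asset_timeout(size_bytes: int,
                  base_seconds: float = 30.0,
                  min_bytes_per_second: float = 512 * 1024,
                  max_seconds: float = 3600.0) -> float:
    """Hypothetical size-based timeout: a fixed base plus transfer time
    at an assumed worst-case throughput, capped at max_seconds."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return min(max_seconds, base_seconds + size_bytes / min_bytes_per_second)
```

With these defaults, a 30 MiB asset gets 30 + 60 = 90 seconds, while very large assets are capped at one hour.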

The process works as follows:

  1. The script checks if the repository exists in Gitea
  2. If it exists and is already a mirror:
    • It triggers a code sync
    • It only mirrors releases and metadata if those options are explicitly enabled in the repository configuration
  3. If it exists but is not a mirror:
    • If the target repository is empty, it requires explicit confirmation via the --force-recreate flag used with the CLI command (see below) before deleting and recreating it as a mirror
    • If the target repository has commits, it warns you that you need to delete it manually
  4. If it doesn't exist, it creates a new repository as a mirror
  5. After setting up the mirror, it triggers a code sync in Gitea
  6. It only mirrors releases, issues, PRs, and other metadata if those options are enabled in the repository configuration

By default, all mirroring options (metadata, releases, etc.) are disabled for safety. You can enable them through the web UI's repository configuration page or by using the appropriate command-line flags.
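
The decision process described above can be summarized as a small pure function. This is a sketch of the documented behavior, not the project's `mirror.py`; the `Action` enum and `decide_action` function are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    CREATE_MIRROR = "create_mirror"
    SYNC = "sync"
    RECREATE_AS_MIRROR = "recreate_as_mirror"
    NEEDS_FORCE_RECREATE = "needs_force_recreate"
    MANUAL_DELETE_REQUIRED = "manual_delete_required"


def decide_action(exists: bool, is_mirror: bool, is_empty: bool,
                  force_recreate: bool) -> Action:
    """Map the state of the target Gitea repository to the next step."""
    if not exists:
        return Action.CREATE_MIRROR
    if is_mirror:
        return Action.SYNC
    if not is_empty:
        # Repositories with commits are never deleted, even with the flag
        return Action.MANUAL_DELETE_REQUIRED
    if force_recreate:
        return Action.RECREATE_AS_MIRROR
    return Action.NEEDS_FORCE_RECREATE
```

Note that the non-empty check comes before the `force_recreate` check, so the flag can only ever affect empty repositories.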

Repository Safety

The tool includes safety measures to prevent accidental data loss:

  1. Empty Repository Protection: When an existing repository is empty but not configured as a mirror, the tool will not automatically delete and recreate it without explicit confirmation via the --force-recreate flag.

  2. Non-Empty Repository Protection: If a repository contains commits, the tool will never attempt to delete it, even with the --force-recreate flag. This ensures that repositories with actual content are never accidentally deleted.

  3. Explicit Confirmation: The --force-recreate flag serves as an explicit confirmation that you want to delete and recreate empty repositories as mirrors, providing an additional safety layer against accidental data loss.

  4. CLI-Only Operation: The --force-recreate flag is deliberately available only through the command-line interface and not through the web UI. This design choice prevents accidental repository deletion through misclicks in the web UI and ensures that repository recreation is a deliberate, intentional action that requires specific command knowledge.

This multi-layered approach to safety ensures that repositories are protected from accidental deletion while still providing the flexibility to recreate empty repositories when necessary.

Wiki Mirroring

When mirroring a GitHub repository with a wiki, the tool creates a separate repository for the wiki content. This is necessary because:

  1. Gitea's Limitations: Gitea's repository mirroring feature doesn't automatically mirror the wiki. GitHub stores each wiki as a separate Git repository (with a .wiki.git suffix), so it has to be cloned on its own.

  2. Read-Only Constraint: For mirrored repositories in Gitea, the wiki is read-only and cannot be directly pushed to, which prevents direct mirroring of wiki content.

The mirroring process for wikis works as follows:

  1. The tool checks if the GitHub repository has a wiki
  2. It verifies that git is installed in the container (this is handled automatically)
  3. If a wiki exists, it clones the GitHub wiki repository
  4. It creates a new repository in Gitea with the name {original-repo-name}-wiki
  5. It pushes the wiki content to this new repository
  6. It updates the main repository's description to include a link to the wiki repository

This approach ensures that all wiki content from GitHub is preserved and accessible in Gitea, even for mirrored repositories.
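
As a small illustration of the naming scheme above, this sketch derives the GitHub wiki clone URL and the target Gitea repository name from an owner and repository name. `WikiPlan` and `plan_wiki_mirror` are hypothetical names; the actual clone and push steps are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WikiPlan:
    source_clone_url: str   # GitHub wiki repository to clone
    target_repo_name: str   # separate Gitea repository for the wiki


def plan_wiki_mirror(owner: str, repo: str) -> WikiPlan:
    """Hypothetical helper: compute where the wiki comes from and goes to."""
    return WikiPlan(
        source_clone_url=f"https://github.com/{owner}/{repo}.wiki.git",
        target_repo_name=f"{repo}-wiki",
    )
```

For `jonasrosland/gitmirror`, this gives the clone URL `https://github.com/jonasrosland/gitmirror.wiki.git` and the target repository name `gitmirror-wiki`.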

Web UI

The web UI provides a user-friendly interface for managing mirrors and viewing logs:

  1. Access the web UI by navigating to http://localhost:5000 in your browser after starting the Docker container

  2. Use the web interface to:

    • View mirrored repositories in list or card view
    • Run mirrors manually
    • View logs with auto-refresh functionality (updates every 5 seconds)
    • Configure scheduled mirroring with a customizable interval
    • Configure repository-specific mirroring options
  3. The UI features:

    • Dark mode support
    • Checkboxes for configuration options
    • Direct links to logs from error messages
    • Color-coded status indicators
    • Responsive design for mobile and desktop

Repository List View

[Screenshot: Repository List View]

Adding a Repository

[Screenshot: Adding a Repository]

Repository Configuration

[Screenshot: Repository Configuration]

Log Viewer

[Screenshot: Log Viewer]

Repository Configuration

Each repository can be individually configured with the following options:

  1. Mirror Metadata: Enable/disable mirroring of metadata (issues, PRs, labels, etc.)

    • Mirror Issues: Sync GitHub issues to Gitea
    • Mirror Pull Requests: Sync GitHub PRs to Gitea
    • Mirror Labels: Sync GitHub labels to Gitea
    • Mirror Milestones: Sync GitHub milestones to Gitea
    • Mirror Wiki: Sync GitHub wiki to a separate Gitea repository
  2. Mirror Releases: Enable/disable mirroring of GitHub releases to Gitea

These options can be configured through the repository configuration page, accessible by clicking the "Configure" button for a repository in the repositories list.

Error Handling and Logging

The application provides comprehensive logging and error handling:

  1. Log Files: All mirror operations are logged to date-based log files in the logs directory
  2. Error Visibility: Errors and warnings are prominently displayed in the UI with appropriate color coding
  3. Direct Log Links: Error messages are clickable and link directly to the relevant log file
  4. Status Indicators: Repositories with errors or warnings are visually highlighted in both list and card views

When an error occurs during mirroring, you can click on the error message to view the detailed log, which helps in diagnosing and resolving issues.

Logs

Logs are stored in the logs directory and are:

  • Rotated daily at midnight
  • Retained for a configurable number of days (default: 30)
  • Separated by service (web and mirror)
  • Viewable through the web UI

Log files follow this naming convention:

  • Current log file: web.log or mirror.log
  • Rotated log files: web.log.2024-03-20, web.log.2024-03-19, etc.

The log retention period can be configured using the LOG_RETENTION_DAYS environment variable in your .env file.
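
The rotation behavior described above matches what Python's standard `TimedRotatingFileHandler` provides. The following is a minimal sketch under that assumption; the project's `utils/logging.py` may be implemented differently, and `make_log_handler` is a hypothetical name.

```python
import logging
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path


def make_log_handler(log_dir: str, service: str,
                     retention_days: int = 30) -> TimedRotatingFileHandler:
    """Create a handler for <service>.log that rotates daily at midnight."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    handler = TimedRotatingFileHandler(
        Path(log_dir) / f"{service}.log",
        when="midnight",             # roll over once per day
        backupCount=retention_days,  # keep this many rotated files
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    return handler
```

With `when="midnight"`, the handler appends a date suffix in `%Y-%m-%d` form to rotated files, which produces names such as `web.log.2024-03-20`. Since one file is created per day, `backupCount` effectively acts as the retention period in days.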

Development and Testing

Setting Up for Development

  1. Install test dependencies:

    pip install -r test-requirements.txt
    
  2. Run all tests:

    ./run-tests.sh
    
  3. Run specific test categories:

    # Run unit tests
    python -m pytest tests/unit -v
    
    # Run integration tests
    python -m pytest tests/integration -v
    
    # Run with coverage report
    python -m pytest --cov=gitmirror --cov-report=term-missing
    

Test Suite Structure

The test suite is organized into several categories:

  1. Unit Tests (tests/unit/): Tests individual components in isolation

    • test_github_api.py: Tests GitHub API functionality
    • test_gitea_repository.py: Tests Gitea repository operations
    • test_gitea_api.py: Tests Gitea API functionality
    • test_cli.py: Tests command-line interface
    • test_mirror.py: Tests core mirroring functionality
    • test_web.py: Tests web interface routes and functionality
    • test_imports_and_modules.py: Tests module imports and basic functionality
  2. Integration Tests (tests/integration/): Tests interactions between components

    • test_mirror_integration.py: Tests the integration of mirroring components
  3. Configuration Tests (tests/test_config.py): Tests configuration loading and saving

Test Coverage

All tests are now passing. The current test coverage is 27%, with most of the coverage in the core functionality:

  • GitHub API module: 86% coverage
  • CLI module: 84% coverage
  • Gitea repository module: 58% coverage
  • Config utilities: 54% coverage
  • Issue module: 42% coverage
  • Metadata module: 32% coverage

Areas with lower coverage include:

  • Web interface: 16% coverage
  • PR module: 2% coverage
  • Comment module: 24% coverage
  • Wiki module: 11% coverage

Mocking Strategy

The tests use extensive mocking to avoid external dependencies:

  1. API Requests: All HTTP requests are mocked using unittest.mock.patch to avoid actual API calls
  2. File System: File operations are mocked or use temporary directories
  3. Environment Variables: Environment variables are mocked to provide test values
  4. Configuration: Configuration loading and saving are mocked to avoid file system dependencies

Running Tests in Docker

You can also run the tests inside a Docker container:

docker-compose run --rm web python -m pytest

This ensures tests run in an environment similar to production.

Code Structure

The codebase has been structured as a modular package for better maintainability:

  • gitmirror/: Main package
    • github/: GitHub API interactions
    • gitea/: Gitea API interactions, organized into focused modules:
      • repository.py: Repository management functions
      • release.py: Release management functions
      • issue.py: Issue management functions
      • pr.py: Pull request management functions
      • comment.py: Comment management functions
      • wiki.py: Wiki management functions
      • metadata.py: Labels, milestones, and other metadata functions
    • utils/: Utility functions
      • logging.py: Logging setup and utilities, including log file management
      • config.py: Configuration management utilities
    • mirror.py: Main mirroring logic
    • cli.py: Command-line interface
    • web.py: Web UI

This modular organization improves code maintainability, makes it easier to locate specific functionality, and allows for more focused testing and development.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Known Limitations

  • Large Repositories: Very large repositories with many issues, PRs, or releases may take a long time to mirror initially.
  • Rate Limiting: GitHub API rate limits may affect mirroring performance for frequent updates or large repositories.
  • Authentication: The application currently only supports personal access token authentication.
  • Webhooks: The tool does not currently support automatic mirroring via webhooks; scheduled mirroring is used instead.
  • Bidirectional Sync: This is a one-way mirror from GitHub to Gitea; changes made in Gitea are not synced back to GitHub.
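
Regarding the rate limiting point above: GitHub reports the current limit state in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers. The sketch below shows how a client could decide how long to wait before the next call. It is an illustration only; `seconds_until_reset` is a hypothetical name, and the function assumes the headers are passed with exactly this capitalization (many HTTP libraries offer case-insensitive header mappings instead).

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Return how long to wait before the next GitHub API call (0 if none)."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # X-RateLimit-Reset is a Unix timestamp in seconds
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```

The optional `now` parameter exists only to make the function deterministic in tests.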

Contributing

Contributions are welcome! If you'd like to contribute to this project:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Please make sure to update tests as appropriate and follow the existing code style.