GitHub to Gitea Mirror
This tool sets up and manages pull mirrors from GitHub repositories to Gitea repositories, including the entire codebase, issues, PRs, releases, and wikis.
I've been eagerly awaiting Gitea's PR 20311 for over a year, but since it keeps getting pushed out for every release I figured I'd create something in the meantime.
Blunt disclaimer: This is a hobby project, and I hope the PR above will be merged and implemented soon. When it is, this project will have served its purpose. I created it from scratch using Cursor and Claude 3.7 Sonnet.
Features
- Web UI for managing mirrors and viewing logs
- Set up GitHub repositories as pull mirrors in Gitea
- Mirror the entire codebase, issues, and PRs (not just releases)
- Mirror GitHub releases and release assets with full descriptions and attachments
- Mirror GitHub wikis to separate Gitea repositories
- Auto-discover mirrored repositories in Gitea
- Support for both full GitHub URLs and owner/repo format
- Comprehensive logging with direct links to logs from error messages
- Support for large release assets with dynamic timeouts
- Asset download and upload with size-based timeout calculation
- Scheduled mirroring with configurable interval
- Enhanced UI with checkboxes for configuration options
- Dark mode support
- Error handling and visibility
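As an illustration of accepting both full GitHub URLs and the owner/repo format, here is a minimal sketch. The function name parse_github_repo is hypothetical and this is not necessarily how the project parses its input; it only shows one way both forms can be reduced to an (owner, repo) pair.

```python
from urllib.parse import urlparse


def parse_github_repo(spec: str) -> tuple[str, str]:
    """Return (owner, repo) from a full GitHub URL or an owner/repo string.

    Hypothetical helper for illustration; not the project's actual parser.
    """
    spec = spec.strip()
    # For a full URL keep only the path part, e.g. "/owner/repo.git"
    path = urlparse(spec).path if "://" in spec else spec
    parts = [p for p in path.strip("/").split("/") if p]
    if len(parts) != 2:
        raise ValueError(f"expected owner/repo, got {spec!r}")
    owner, repo = parts
    # Tolerate the trailing ".git" found in clone URLs
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo
```

Both parse_github_repo("https://github.com/jonasrosland/gitmirror.git") and parse_github_repo("jonasrosland/gitmirror") yield the same pair; anything that does not reduce to exactly two path segments raises ValueError.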
Quick Start
Get up and running in minutes:
# Clone the repository
git clone https://github.com/jonasrosland/gitmirror.git
cd gitmirror
# Copy and configure the example .env file
cp .env.example .env
# Edit the .env file with your tokens and Gitea URL
# Start the application
docker-compose up -d
# Access the web UI
# Open http://localhost:5000 in your browser
Prerequisites
- Docker and Docker Compose (for running the application)
- GitHub Personal Access Token with repo scope
- Gitea Access Token with read:user, write:repository, and write:issue scopes
- Access to both GitHub and Gitea repositories
Configuration
Create a .env file in the same directory as the docker-compose.yml with the following variables:
# GitHub Personal Access Token (create one at https://github.com/settings/tokens)
# Required scopes: repo (for private repositories)
# For public repositories, this is optional but recommended
GITHUB_TOKEN=your_github_token
# Gitea Access Token (create one in your Gitea instance under Settings > Applications)
# Required permissions: read:user, write:repository, write:issue
GITEA_TOKEN=your_gitea_token
# Your Gitea instance URL (no trailing slash)
GITEA_URL=https://your-gitea-instance.com
# Secret key for the web UI (OPTIONAL)
# This key is used to secure Flask sessions and flash messages
# If not provided, a random key will be automatically generated at container start
# SECRET_KEY=your_secret_key
# Log retention days (OPTIONAL, defaults to 30 days)
LOG_RETENTION_DAYS=30
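To make the defaults above concrete, here is a minimal sketch of how these variables could be read in Python. The function load_settings and the returned key names are assumptions for illustration, as is treating GITEA_URL and GITEA_TOKEN as mandatory; the application's real loading code may differ (for example, the README says the random key is generated at container start).

```python
import os
import secrets


def load_settings(env=os.environ) -> dict:
    """Read the variables described above from a mapping (os.environ by default).

    Hypothetical helper; key names in the returned dict are illustrative.
    """
    # The Gitea URL should have no trailing slash, so strip one if present
    gitea_url = env.get("GITEA_URL", "").rstrip("/")
    if not gitea_url or not env.get("GITEA_TOKEN"):
        raise ValueError("GITEA_URL and GITEA_TOKEN are required")
    return {
        # Optional for public repositories
        "github_token": env.get("GITHUB_TOKEN") or None,
        "gitea_token": env["GITEA_TOKEN"],
        "gitea_url": gitea_url,
        # Fall back to a random key when SECRET_KEY is not provided
        "secret_key": env.get("SECRET_KEY") or secrets.token_hex(32),
        # Defaults to 30 days when LOG_RETENTION_DAYS is unset
        "log_retention_days": int(env.get("LOG_RETENTION_DAYS", "30")),
    }
```

Passing a plain dict instead of os.environ keeps the sketch easy to exercise without touching the real environment.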
Authentication for Private Repositories
If you want to mirror private GitHub repositories, you must provide a GitHub token with the repo scope. This token is used to authenticate with GitHub when creating the mirror.
For public repositories, the GitHub token is optional but recommended to avoid rate limiting issues.
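The optional-token behavior can be pictured with a small sketch. The helper name github_headers is hypothetical; the header values follow GitHub's documented REST API conventions, but how the project actually builds its requests is an assumption here.

```python
def github_headers(token=None) -> dict:
    """Build request headers for the GitHub REST API.

    Hypothetical helper: without a token the request is anonymous, which
    works for public repositories but is subject to lower rate limits.
    """
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # A token with the repo scope is needed for private repositories
        headers["Authorization"] = f"Bearer {token}"
    return headers
```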
Usage
Using Docker Compose (Recommended)
For easier deployment, you can use Docker Compose:
- Start the web UI:
docker-compose up -d
- Run the mirror script (one-time execution):
docker-compose run --rm mirror
- Run the mirror script for a specific repository:
docker-compose run --rm mirror mirror owner/repo gitea_owner gitea_repo
- Run with specific flags:
# Enable mirroring metadata (issues, PRs, labels, milestones, wikis)
docker-compose run --rm mirror mirror --mirror-metadata
# Force recreation of empty repositories (required when an existing repository is empty but not a mirror)
docker-compose run --rm mirror mirror --force-recreate
# Combine flags for a specific repository
docker-compose run --rm mirror mirror owner/repo gitea_owner gitea_repo --mirror-metadata --force-recreate
- View logs:
docker-compose logs -f
- Stop the services:
docker-compose down
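The argument shapes used in the commands above (three optional positionals plus two flags) could be modeled with argparse roughly as follows. This is only a sketch of the command-line shape as documented; the parser and its destination names are assumptions, not the project's cli.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments of the mirror command."""
    parser = argparse.ArgumentParser(prog="mirror")
    # All three positionals are optional: with none given, the script
    # processes every mirror it discovers
    parser.add_argument("github_repo", nargs="?", help="owner/repo or full GitHub URL")
    parser.add_argument("gitea_owner", nargs="?", help="target owner in Gitea")
    parser.add_argument("gitea_repo", nargs="?", help="target repository name in Gitea")
    parser.add_argument("--mirror-metadata", action="store_true",
                        help="mirror issues, PRs, labels, milestones, wikis")
    parser.add_argument("--force-recreate", action="store_true",
                        help="allow deleting and recreating an empty non-mirror repository")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```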
Using Bind Mounts for Logs
By default, logs are stored in a Docker volume for better permission handling. If you prefer to use bind mounts instead (to access logs directly on the host filesystem), you can modify the docker-compose.yml file:
- Change the volume configuration from:
volumes:
- gitmirror_logs:/app/logs
to:
volumes:
- ./logs:/app/logs
- Create the logs directory with the correct permissions:
mkdir -p logs
chmod 755 logs
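- Update the container user to match your host user's UID/GID. Docker Compose does not run shell command substitution such as $(id -u), so set PUID and PGID in your .env file to the values printed by id -u and id -g, then reference them:
environment:
- PUID=${PUID}
- PGID=${PGID}
user: "${PUID}:${PGID}"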
- Remove the volumes section at the bottom of the file that defines gitmirror_logs
This setup will:
- Store logs directly in the logs directory on your host
- Allow you to access logs without using Docker commands
- Maintain proper permissions for the container to write logs
- Keep the same log rotation and retention settings
Using Docker Directly
To run the application with Docker directly:
- Build the Docker image:
docker build -t github-gitea-mirror .
- Run the container:
a. Run the web UI (default mode):
docker run --rm -p 5000:5000 --env-file .env github-gitea-mirror
b. Run the mirror script in auto-discovery mode:
docker run --rm --env-file .env github-gitea-mirror mirror
c. Run the mirror script for a specific repository:
docker run --rm --env-file .env github-gitea-mirror mirror owner/repo gitea_owner gitea_repo
d. Run with the force-recreate flag (for empty repositories):
docker run --rm --env-file .env github-gitea-mirror mirror owner/repo gitea_owner gitea_repo --force-recreate
- For persistent storage of logs, mount a volume:
docker run --rm -p 5000:5000 -v ./logs:/app/logs --env-file .env github-gitea-mirror
How Mirroring Works
When you set up a repository for mirroring, the script performs several types of synchronization:
- Code Mirroring: Uses Gitea's built-in pull mirror functionality to sync:
  - The entire codebase
  - All branches and tags
  NOTE: This is done automatically at creation of the mirror repo, and sometimes it takes a while for Gitea to finish the first code sync
- Release Mirroring: Uses custom code to sync:
  - Releases and release assets
  - Release descriptions and metadata
  - Release attachments with proper naming and descriptions
- Metadata Mirroring: Syncs additional GitHub data:
  - Issues and their comments
  - Pull requests and their comments
  - Labels and milestones
  - Wiki content (if enabled)
The process works as follows:
- The script checks if the repository exists in Gitea
- If it exists and is already a mirror:
  - It triggers a code sync
  - It only mirrors releases and metadata if those options are explicitly enabled in the repository configuration
- If it exists but is not a mirror:
  - If the target repository is empty, it requires explicit confirmation via the --force-recreate flag used with the CLI command (see below) before deleting and recreating it as a mirror
  - If the target repository has commits, it warns you that you need to delete it manually
- If it doesn't exist, it creates a new repository as a mirror
- After setting up the mirror, it triggers a code sync in Gitea
- It only mirrors releases, issues, PRs, and other metadata if those options are enabled in the repository configuration
By default, all mirroring options (metadata, releases, etc.) are disabled for safety. You can enable them through the web UI's repository configuration page or by using the appropriate command-line flags.
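The decision flow described in this section can be summarized as a small pure function. This is a sketch of the documented behavior only; the Action enum and plan_action are hypothetical names and the real code in mirror.py may be organized differently.

```python
from enum import Enum


class Action(Enum):
    """Possible outcomes of the existence check (illustrative names)."""
    CREATE_MIRROR = "create a new repository as a mirror"
    SYNC_EXISTING_MIRROR = "trigger a code sync on the existing mirror"
    RECREATE_AS_MIRROR = "delete the empty repository and recreate it as a mirror"
    NEEDS_FORCE_RECREATE = "empty non-mirror repository: rerun with --force-recreate"
    MANUAL_DELETE_REQUIRED = "repository has commits: delete it manually"


def plan_action(exists: bool, is_mirror: bool, is_empty: bool, force_recreate: bool) -> Action:
    """Map the state of the Gitea target repository to the next step."""
    if not exists:
        return Action.CREATE_MIRROR
    if is_mirror:
        return Action.SYNC_EXISTING_MIRROR
    if not is_empty:
        # Repositories with commits are never deleted, even with --force-recreate
        return Action.MANUAL_DELETE_REQUIRED
    return Action.RECREATE_AS_MIRROR if force_recreate else Action.NEEDS_FORCE_RECREATE
```

Keeping the decision separate from the API calls makes the safety rule easy to check: no combination of inputs with a non-empty, non-mirror repository leads to deletion.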
Repository Safety
The tool includes safety measures to prevent accidental data loss:
- Empty Repository Protection: When an existing repository is empty but not configured as a mirror, the tool will not automatically delete and recreate it without explicit confirmation via the --force-recreate flag.
- Non-Empty Repository Protection: If a repository contains commits, the tool will never attempt to delete it, even with the --force-recreate flag. This ensures that repositories with actual content are never accidentally deleted.
- Explicit Confirmation: The --force-recreate flag serves as an explicit confirmation that you want to delete and recreate empty repositories as mirrors, providing an additional safety layer against accidental data loss.
- CLI-Only Operation: The --force-recreate flag is deliberately available only through the command-line interface and not through the web UI. This design choice prevents accidental repository deletion through misclicks in the web UI and ensures that repository recreation is a deliberate, intentional action that requires specific command knowledge.
This multi-layered approach to safety ensures that repositories are protected from accidental deletion while still providing the flexibility to recreate empty repositories when necessary.
Wiki Mirroring
When mirroring a GitHub repository with a wiki, the tool creates a separate repository for the wiki content. This is necessary because:
- Gitea's Limitations: Gitea's repository mirroring feature doesn't automatically mirror the wiki repository. GitHub wikis are actually separate Git repositories (with a .wiki.git suffix).
- Read-Only Constraint: For mirrored repositories in Gitea, the wiki is read-only and cannot be directly pushed to, which prevents direct mirroring of wiki content.
The mirroring process for wikis works as follows:
- The tool checks if the GitHub repository has a wiki
- It verifies that git is installed in the container (this is handled automatically)
- If a wiki exists, it clones the GitHub wiki repository
- It creates a new repository in Gitea with the name {original-repo-name}-wiki
- It pushes the wiki content to this new repository
- It updates the main repository's description to include a link to the wiki repository
This approach ensures that all wiki content from GitHub is preserved and accessible in Gitea, even for mirrored repositories.
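The naming rules in the wiki steps above can be illustrated with a few helpers. The function names are hypothetical and the sketch only builds the clone URL, the target repository name, and a git command line; it does not perform the clone or push, and it is not the project's wiki.py.

```python
import shutil


def git_available() -> bool:
    """Check that a git executable is on PATH (needed to clone the wiki)."""
    return shutil.which("git") is not None


def github_wiki_clone_url(owner: str, repo: str) -> str:
    """GitHub serves a repository's wiki as a separate Git repository with a .wiki.git suffix."""
    return f"https://github.com/{owner}/{repo}.wiki.git"


def gitea_wiki_repo_name(repo: str) -> str:
    """Name of the separate Gitea repository that receives the wiki content."""
    return f"{repo}-wiki"


def wiki_clone_command(owner: str, repo: str, dest: str) -> list[str]:
    """Argument list suitable for subprocess.run(); building it has no side effects."""
    return ["git", "clone", github_wiki_clone_url(owner, repo), dest]
```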
Web UI
The web UI provides a user-friendly interface for managing mirrors and viewing logs:
- Access the web UI by navigating to http://localhost:5000 in your browser after starting the Docker container
- Use the web interface to:
  - View mirrored repositories in list or card view
  - Run mirrors manually
  - View logs with auto-refresh functionality (updates every 5 seconds)
  - Configure scheduled mirroring with a customizable interval
  - Configure repository-specific mirroring options
- The UI features:
  - Dark mode support
  - Checkboxes for configuration options
  - Direct links to logs from error messages
  - Color-coded status indicators
  - Responsive design for mobile and desktop
Screenshots: Repository List View, Adding a Repository, Repository Configuration, Log Viewer
Repository Configuration
Each repository can be individually configured with the following options:
- Mirror Metadata: Enable/disable mirroring of metadata (issues, PRs, labels, etc.)
  - Mirror Issues: Sync GitHub issues to Gitea
  - Mirror Pull Requests: Sync GitHub PRs to Gitea
  - Mirror Labels: Sync GitHub labels to Gitea
  - Mirror Milestones: Sync GitHub milestones to Gitea
  - Mirror Wiki: Sync GitHub wiki to a separate Gitea repository
- Mirror Releases: Enable/disable mirroring of GitHub releases to Gitea
These options can be configured through the repository configuration page, accessible by clicking the "Configure" button for a repository in the repositories list.
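Since every option is off unless enabled, per-repository settings can be thought of as overrides on top of all-false defaults. The sketch below shows that idea; the key names and the effective_options helper are assumptions for illustration and may not match the keys the application stores.

```python
# Hypothetical option keys, one per setting listed above; all disabled by default
DEFAULT_OPTIONS = {
    "mirror_metadata": False,
    "mirror_issues": False,
    "mirror_pull_requests": False,
    "mirror_labels": False,
    "mirror_milestones": False,
    "mirror_wiki": False,
    "mirror_releases": False,
}


def effective_options(saved=None) -> dict:
    """Overlay a repository's saved options on the all-disabled defaults."""
    options = dict(DEFAULT_OPTIONS)
    for key, value in (saved or {}).items():
        if key not in DEFAULT_OPTIONS:
            raise KeyError(f"unknown option: {key}")
        options[key] = bool(value)
    return options
```

A repository with no saved configuration therefore mirrors code only, and enabling releases for one repository leaves every other option untouched.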
Error Handling and Logging
The application provides comprehensive logging and error handling:
- Log Files: All mirror operations are logged to date-based log files in the logs directory
- Error Visibility: Errors and warnings are prominently displayed in the UI with appropriate color coding
- Direct Log Links: Error messages are clickable and link directly to the relevant log file
- Status Indicators: Repositories with errors or warnings are visually highlighted in both list and card views
When an error occurs during mirroring, you can click on the error message to view the detailed log, which helps in diagnosing and resolving issues.
Logs
Logs are stored in the logs directory and are:
- Rotated daily at midnight
- Retained for a configurable number of days (default: 30)
- Separated by service (web and mirror)
- Viewable through the web UI
Log files follow this naming convention:
- Current log file: web.log or mirror.log
- Rotated log files: web.log.2024-03-20, web.log.2024-03-19, etc.
The log retention period can be configured using the LOG_RETENTION_DAYS environment variable in your .env file.
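Daily rotation at midnight with date-suffixed files and a bounded number of old files matches what Python's standard TimedRotatingFileHandler provides. The sketch below shows such a setup; the function name build_log_handler and the log format string are assumptions, and the project's utils/logging.py may implement rotation differently.

```python
import logging
import os
from logging.handlers import TimedRotatingFileHandler


def build_log_handler(log_dir: str, service: str, retention_days: int = 30) -> TimedRotatingFileHandler:
    """Create a handler writing <service>.log, rotated at midnight.

    Rotated files get a date suffix such as web.log.2024-03-20, and at most
    retention_days rotated files are kept.
    """
    os.makedirs(log_dir, exist_ok=True)
    handler = TimedRotatingFileHandler(
        os.path.join(log_dir, f"{service}.log"),
        when="midnight",
        backupCount=retention_days,
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    return handler
```

In this sketch the retention value would come from LOG_RETENTION_DAYS, for example build_log_handler("logs", "web", int(os.environ.get("LOG_RETENTION_DAYS", "30"))).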
Development and Testing
Setting Up for Development
- Install test dependencies:
pip install -r test-requirements.txt
- Run all tests:
./run-tests.sh
- Run specific test categories:
# Run unit tests
python -m pytest tests/unit -v
# Run integration tests
python -m pytest tests/integration -v
# Run with coverage report
python -m pytest --cov=gitmirror --cov-report=term-missing
Test Suite Structure
The test suite is organized into several categories:
- Unit Tests (tests/unit/): Tests individual components in isolation
  - test_github_api.py: Tests GitHub API functionality
  - test_gitea_repository.py: Tests Gitea repository operations
  - test_gitea_api.py: Tests Gitea API functionality
  - test_cli.py: Tests command-line interface
  - test_mirror.py: Tests core mirroring functionality
  - test_web.py: Tests web interface routes and functionality
  - test_imports_and_modules.py: Tests module imports and basic functionality
- Integration Tests (tests/integration/): Tests interactions between components
  - test_mirror_integration.py: Tests the integration of mirroring components
- Configuration Tests (tests/test_config.py): Tests configuration loading and saving
Test Coverage
All tests are now passing. The current test coverage is 27%, with most of the coverage in the core functionality:
- GitHub API module: 86% coverage
- CLI module: 84% coverage
- Gitea repository module: 58% coverage
- Config utilities: 54% coverage
- Issue module: 42% coverage
- Metadata module: 32% coverage
Areas with lower coverage include:
- Web interface: 16% coverage
- PR module: 2% coverage
- Comment module: 24% coverage
- Wiki module: 11% coverage
Mocking Strategy
The tests use extensive mocking to avoid external dependencies:
- API Requests: All HTTP requests are mocked using unittest.mock.patch to avoid actual API calls
- File System: File operations are mocked or use temporary directories
- Environment Variables: Environment variables are mocked to provide test values
- Configuration: Configuration loading and saving are mocked to avoid file system dependencies
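As an illustration of this mocking style, the sketch below patches an HTTP call and an environment variable using only the standard library. The helpers fetch_default_branch and gitea_base_url are hypothetical stand-ins written for this example; they are not functions from the gitmirror package, and the project's tests may patch a different HTTP client.

```python
import json
import os
import urllib.request
from unittest.mock import MagicMock, patch


def fetch_default_branch(owner: str, repo: str) -> str:
    """Hypothetical code under test: ask the GitHub REST API for the default branch."""
    url = f"https://api.github.com/repos/{owner}/{repo}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)["default_branch"]


def gitea_base_url() -> str:
    """Hypothetical code under test: read the Gitea URL from the environment."""
    return os.environ["GITEA_URL"].rstrip("/")


def test_fetch_default_branch_without_network():
    # The fake response supports "with" and returns canned JSON from read()
    fake_response = MagicMock()
    fake_response.__enter__.return_value = fake_response
    fake_response.read.return_value = b'{"default_branch": "main"}'
    with patch("urllib.request.urlopen", return_value=fake_response) as mocked:
        assert fetch_default_branch("owner", "repo") == "main"
    mocked.assert_called_once_with("https://api.github.com/repos/owner/repo")


def test_gitea_base_url_uses_patched_environment():
    # patch.dict restores os.environ when the block exits
    with patch.dict(os.environ, {"GITEA_URL": "https://gitea.test/"}):
        assert gitea_base_url() == "https://gitea.test"
```

Both tests run under pytest without network access or a configured .env file.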
Running Tests in Docker
You can also run the tests inside a Docker container:
docker-compose run --rm web python -m pytest
This ensures tests run in an environment similar to production.
Code Structure
The codebase has been structured as a modular package for better maintainability:
- gitmirror/: Main package
  - github/: GitHub API interactions
  - gitea/: Gitea API interactions, organized into focused modules:
    - repository.py: Repository management functions
    - release.py: Release management functions
    - issue.py: Issue management functions
    - pr.py: Pull request management functions
    - comment.py: Comment management functions
    - wiki.py: Wiki management functions
    - metadata.py: Labels, milestones, and other metadata functions
  - utils/: Utility functions
    - logging.py: Logging setup and utilities, including log file management
    - config.py: Configuration management utilities
  - mirror.py: Main mirroring logic
  - cli.py: Command-line interface
  - web.py: Web UI
This modular organization improves code maintainability, makes it easier to locate specific functionality, and allows for more focused testing and development.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Known Limitations
- Large Repositories: Very large repositories with many issues, PRs, or releases may take a long time to mirror initially.
- Rate Limiting: GitHub API rate limits may affect mirroring performance for frequent updates or large repositories.
- Authentication: The application currently only supports personal access token authentication.
- Webhooks: The tool does not currently support automatic mirroring via webhooks; scheduled mirroring is used instead.
- Bidirectional Sync: This is a one-way mirror from GitHub to Gitea; changes made in Gitea are not synced back to GitHub.
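For the rate-limiting point above, GitHub reports the remaining quota and the reset time in response headers, so a caller can work out how long to pause. The sketch below is one way to do that; the function name is hypothetical and nothing here claims the tool currently waits this way.

```python
import time
from typing import Mapping, Optional


def seconds_until_rate_limit_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next GitHub API call (0 if quota remains).

    Reads the x-ratelimit-remaining and x-ratelimit-reset response headers;
    header names are matched case-insensitively.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    # The reset header holds a Unix timestamp in seconds
    reset_at = float(lowered.get("x-ratelimit-reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```

Passing now explicitly keeps the function deterministic in tests; in normal use it falls back to the current time.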
Contributing
Contributions are welcome! If you'd like to contribute to this project:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Please make sure to update tests as appropriate and follow the existing code style.