Browse GitHub/Gitea/Forgejo Actions artifacts https://codeberg.org/ThetaDev/artifactview/

Artifactview

View CI build artifacts from Forgejo/GitHub using your web browser!

Forgejo and GitHub's CI systems allow you to upload files and directories as artifacts. These can be downloaded as zip files. However there is no simple way to view individual files of an artifact.

That's why I developed Artifactview. It is a small web application that fetches these CI artifacts and serves their contents.

It is a valuable tool in open source software development: you can quickly look at test reports or coverage data or showcase your single page web applications to your teammates.

Features

  • 📦 Quickly view CI artifacts in your browser without messing with zip files
  • 📂 File listing for directories without index page
  • 🏠 Every artifact has a unique subdomain to support pages with absolute paths
  • 🌎 Full SPA support with 200.html and 404.html fallback pages
  • 👁️ Viewer for Markdown, syntax-highlighted code and JUnit test reports
  • 🐵 Greasemonkey userscript to automatically add a "View artifact" button to GitHub/Gitea/Forgejo
  • 🦀 Fast and efficient, only extracts files from zip archive if necessary

How to use

Open a GitHub/Gitea/Forgejo Actions run with artifacts and paste its URL into the input box on the main page. You can also pass the run URL with the ?url= parameter.
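As a minimal sketch of the ?url= variant, the link can be built by percent-encoding the run URL. The helper name artifactview_link is hypothetical; the instance and run URL are the ones used as examples elsewhere in this README.

```python
from urllib.parse import urlencode


def artifactview_link(base_url: str, run_url: str) -> str:
    """Return an Artifactview link that passes the CI run URL via ?url=."""
    # urlencode percent-encodes the run URL so it survives as a single query value.
    return f"{base_url.rstrip('/')}/?{urlencode({'url': run_url})}"


link = artifactview_link(
    "https://av.thetadev.de",
    "https://codeberg.org/ThetaDev/artifactview/actions/runs/31",
)
print(link)
```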

Artifactview will show you a selection page where you will be able to choose the artifact you want to browse.

If there is no index.html or fallback page present, a file listing will be shown so you can browse the contents of the artifact.

Artifact file listing

If you want to use Artifactview to showcase a static website, you can make use of fallback pages. If a file named 200.html is placed in the root directory, it will be served in case no file exists for the requested path. This allows serving single-page applications with custom routing. A custom 404 error page is defined using a file named 404.html in the root directory.

The behavior is the same as with other web hosts like surge.sh, so a lot of website build tools already follow that convention.
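The lookup order described above can be sketched roughly as follows. This is an illustration, not Artifactview's actual code: the function name resolve_path, the returned status codes and the handling of index.html inside subdirectories are assumptions.

```python
from typing import Optional, Set, Tuple


def resolve_path(files: Set[str], requested: str) -> Tuple[int, Optional[str]]:
    """Pick the file to serve for a request path, using 200.html/404.html fallbacks."""
    path = requested.strip("/")
    if path in files:
        return 200, path
    # Directory-style request: look for an index page inside it (assumption).
    index = f"{path}/index.html" if path else "index.html"
    if index in files:
        return 200, index
    # SPA fallback: 200.html in the root answers every unknown path.
    if "200.html" in files:
        return 200, "200.html"
    # Custom error page from the root directory.
    if "404.html" in files:
        return 404, "404.html"
    # Nothing matched; a real server would show a file listing or a default error.
    return 404, None


spa = {"index.html", "200.html", "assets/app.js"}
print(resolve_path(spa, "/users/42"))
```

With a 200.html present, client-side routes such as /users/42 resolve to the application shell instead of an error page.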

Artifactview includes different viewers to better display files of certain types that browsers cannot handle by default. There is a renderer for markdown files as well as a syntax highlighter for source code files. The viewers are only shown if the files are accessed with the ?viewer= URL parameter which is automatically set when opening a file from a directory listing. You can always download the raw version of the file via the link in the top right corner.

Code viewer

Artifactview even includes an interactive viewer for JUnit test reports (XML files with junit in their filename). The application has been designed to be easily extendable, so if you have suggestions on other viewers that should be added, feel free to create an issue or a PR.

JUnit report viewer

Accessing Artifactview by copying the CI run URL into its homepage may be a little bit tedious. That's why there are some convenient alternatives available.

You can install the Greasemonkey userscript from the link at the bottom of the homepage. The script adds a "View artifact" link with an eye icon next to every CI artifact on both GitHub and Forgejo.

If you want to give every collaborator on your project easy access to previews, you can use Artifactview to automatically create pull request comments with links to the artifacts.

Pull request comment

To accomplish that, simply add this step to your CI workflow (after uploading the artifacts). Note that the workflow URL has to be built differently on GitHub and Forgejo, so this solution is sadly not cross-forge compatible.

- name: 🔗 Artifactview PR comment (Forgejo)
  if: ${{ always() && github.event_name == 'pull_request' }}
  run: |
    curl -SsL --fail-with-body -w "\n" -X POST https://av.thetadev.de/.well-known/api/prComment -H "Content-Type: application/json" --data "{\"url\": \"$GITHUB_SERVER_URL/$GITHUB_REPOSITORY/actions/runs/$GITHUB_RUN_NUMBER\", \"pr\": ${{ github.event.number }}}"    

- name: 🔗 Artifactview PR comment (GitHub)
  if: ${{ always() && github.event_name == 'pull_request' }}
  run: |
    curl -SsL --fail-with-body -w "\n" -X POST https://av.thetadev.de/.well-known/api/prComment -H "Content-Type: application/json" --data "{\"url\": \"$GITHUB_SERVER_URL/$GITHUB_REPOSITORY/actions/runs/$GITHUB_RUN_ID\", \"pr\": ${{ github.event.number }}}"    
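The only difference between the two steps is which run variable ends up in the URL. A small sketch of that choice, assuming a hypothetical helper workflow_run_url and a forge argument that is not part of Artifactview itself:

```python
from typing import Mapping


def workflow_run_url(env: Mapping[str, str], forge: str) -> str:
    """Build the CI run URL the way the two workflow steps above do."""
    if forge == "forgejo":
        run = env["GITHUB_RUN_NUMBER"]  # Forgejo run URLs use the run number
    elif forge == "github":
        run = env["GITHUB_RUN_ID"]  # GitHub run URLs use the run ID
    else:
        raise ValueError(f"unknown forge: {forge}")
    return f"{env['GITHUB_SERVER_URL']}/{env['GITHUB_REPOSITORY']}/actions/runs/{run}"


# Made-up values for illustration; in CI these come from the environment.
example_env = {
    "GITHUB_SERVER_URL": "https://codeberg.org",
    "GITHUB_REPOSITORY": "ThetaDev/artifactview",
    "GITHUB_RUN_NUMBER": "31",
    "GITHUB_RUN_ID": "9000000001",
}
print(workflow_run_url(example_env, "forgejo"))
```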

API

Artifactview has an HTTP API to access data about the CI artifacts. To make the API available to every site without interfering with any paths from the artifacts, the endpoints are located within the reserved /.well-known/api directory.

Get list of artifacts of a CI run

GET /.well-known/api/artifacts?url=<RUN_URL>

GET <HOST>--<USER>--<REPO>--<RUN>-<ARTIFACT>.example.com/.well-known/api/artifacts

Response

Note: the difference between download_url and user_download_url is that the first one is used by the API client and the second one is shown to the user. user_download_url is only set for GitHub artifacts. Forgejo does not have different download URLs since it does not require authentication to download artifacts.

[
  {
    "id": 1,
    "name": "Example",
    "size": 1523222,
    "expired": false,
    "download_url": "https://codeberg.org/thetadev/artifactview/actions/runs/28/artifacts/Example",
    "user_download_url": null
  }
]
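A client could process this response along the following lines. The sketch works on the sample body above rather than a live request; the function name viewable_artifacts and the choice to skip expired artifacts are assumptions.

```python
import json

SAMPLE_RESPONSE = """
[
  {
    "id": 1,
    "name": "Example",
    "size": 1523222,
    "expired": false,
    "download_url": "https://codeberg.org/thetadev/artifactview/actions/runs/28/artifacts/Example",
    "user_download_url": null
  }
]
"""


def viewable_artifacts(body: str) -> list:
    """Return name, size and user-facing download link of each non-expired artifact."""
    result = []
    for artifact in json.loads(body):
        if artifact["expired"]:
            continue
        # user_download_url is only set for GitHub; fall back to download_url otherwise.
        link = artifact["user_download_url"] or artifact["download_url"]
        result.append({"name": artifact["name"], "size": artifact["size"], "link": link})
    return result


artifacts = viewable_artifacts(SAMPLE_RESPONSE)
print(artifacts)
```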

Get metadata of the current artifact

GET <HOST>--<USER>--<REPO>--<RUN>-<ARTIFACT>.example.com/.well-known/api/artifact

Response

{
  "id": 1,
  "name": "Example",
  "size": 1523222,
  "expired": false,
  "download_url": "https://codeberg.org/thetadev/artifactview/actions/runs/28/artifacts/Example",
  "user_download_url": null
}

Get all files from the artifact

GET <HOST>--<USER>--<REPO>--<RUN>-<ARTIFACT>.example.com/.well-known/api/files

Response

[
  { "name": "example.rs", "size": 406, "crc32": "2013120c" },
  { "name": "README.md", "size": 13060, "crc32": "61c692f0" }
]
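The crc32 field can be used to check a downloaded file against the listing. The sketch below assumes the value is the standard zip CRC-32 written as eight lowercase hex digits, which matches the sample but is not stated explicitly; the helper names are hypothetical.

```python
import zlib


def crc32_hex(data: bytes) -> str:
    """Format the CRC-32 of data as eight lowercase hex digits."""
    return f"{zlib.crc32(data) & 0xFFFFFFFF:08x}"


def matches_listing(entry: dict, data: bytes) -> bool:
    """Compare file contents against the size and crc32 of a listing entry."""
    return entry["size"] == len(data) and entry["crc32"] == crc32_hex(data)


content = b"fn main() {}\n"
entry = {"name": "example.rs", "size": len(content), "crc32": crc32_hex(content)}
print(matches_listing(entry, content))
```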

Create a pull request comment

POST /.well-known/api/prComment

Artifactview can create a comment under a pull request containing links to view the artifacts. This way everyone looking at a project can easily access the artifact previews.

To use this feature, you need to set up an access token with the permission to create comments for every code forge you want to use (more details in the section Access tokens).

To prevent abuse and spamming, this endpoint is rate-limited, and Artifactview will only create comments after it has verified that the workflow matches the given pull request and the workflow is still running.

| JSON parameter | Description |
| --- | --- |
| url (string) | CI workflow URL. Example: https://codeberg.org/ThetaDev/artifactview/actions/runs/31 |
| pr (int) | Pull request number |
| recreate (bool) | If set to true, the pull request comment will be deleted and recreated if it already exists. If set to false or omitted, the comment will be edited instead. |
| title (string) | Comment title (default: "Latest build artifacts") |
| artifact_titles (map) | Set custom titles for your artifacts. Example: {"Hello": "🏠 Hello World ;-)"} |
| artifact_paths (map) | Set custom paths for your artifacts if you want the links to point to a specific file (e.g. a test report). Example: {"Test": "/junit.xml?viewer=1"} |

Response

{ "status": 200, "msg": "created comment #2183634497" }
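Building the request body in a script instead of hand-escaping JSON inside curl could look like this. It is a sketch using only the parameters listed above; the function name pr_comment_payload and the pull request number 7 are made up for illustration.

```python
import json
from typing import Dict, Optional


def pr_comment_payload(
    url: str,
    pr: int,
    *,
    recreate: Optional[bool] = None,
    title: Optional[str] = None,
    artifact_titles: Optional[Dict[str, str]] = None,
    artifact_paths: Optional[Dict[str, str]] = None,
) -> str:
    """Serialize the JSON body for POST /.well-known/api/prComment."""
    payload = {"url": url, "pr": pr}
    optional = {
        "recreate": recreate,
        "title": title,
        "artifact_titles": artifact_titles,
        "artifact_paths": artifact_paths,
    }
    # Leave out unset parameters so the server-side defaults apply.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(payload, ensure_ascii=False)


body = pr_comment_payload(
    "https://codeberg.org/ThetaDev/artifactview/actions/runs/31",
    7,
    artifact_paths={"Test": "/junit.xml?viewer=1"},
)
print(body)
```

The resulting string can be sent with any HTTP client using the Content-Type: application/json header, as in the curl steps shown earlier.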

Setup

You can run artifactview using the docker image provided under thetadev256/artifactview:latest or bare-metal using the provided binaries.

Artifactview is designed to run behind a reverse proxy since it does not support HTTPS by itself. If you are using a reverse proxy, you have to set the REAL_IP_HEADER option to the name of the client IP address header provided by the proxy (usually x-forwarded-for). Otherwise Artifactview will assume it is being accessed by only one client (the proxy itself), and the rate limiter will count all users as one.

Docker Compose

Here is an example setup with docker-compose, using Traefik as a reverse proxy:

services:
  artifactview:
    image: thetadev256/artifactview:latest
    restart: unless-stopped
    networks:
      - proxy
    environment:
      ROOT_DOMAIN: av.thetadev.de
      REAL_IP_HEADER: x-forwarded-for
      GITHUB_TOKEN: github_pat_123456
      REPO_WHITELIST: github.com;codeberg.org;code.thetadev.de
      SITE_ALIASES: gh=>github.com;cb=>codeberg.org;th=>code.thetadev.de
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=proxy"
      - "traefik.http.routers.artifactview.entrypoints=websecure"
      - "traefik.http.routers.artifactview.rule=HostRegexp(`^[a-z0-9-]*.?av.thetadev.de$`)"

networks:
  proxy:
    external: true

Configuration

Artifactview is configured using environment variables.

Note that some variables contain lists and maps of values. Lists need to have their values separated with semicolons. Maps use an arrow => between key and value, with pairs separated by semicolons.

Example list: foo;bar, example map: foo=>f1;bar=>b1
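To illustrate the two formats, here is a small parser sketch. It is not taken from Artifactview's source; the function names are hypothetical, and trimming whitespace around items is an assumption based on examples such as gh => github.com.

```python
from typing import Dict, List


def parse_list(value: str) -> List[str]:
    """Split a semicolon-separated list, dropping empty items."""
    return [item.strip() for item in value.split(";") if item.strip()]


def parse_map(value: str) -> Dict[str, str]:
    """Parse key=>value pairs separated by semicolons."""
    result = {}
    for pair in parse_list(value):
        key, arrow, val = pair.partition("=>")
        if not arrow:
            raise ValueError(f"missing '=>' in map entry: {pair!r}")
        result[key.strip()] = val.strip()
    return result


print(parse_list("foo;bar"))
print(parse_map("foo=>f1;bar=>b1"))
```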

| Variable | Default | Description |
| --- | --- | --- |
| PORT | 3000 | HTTP port |
| CACHE_DIR | /tmp/artifactview | Temporary directory where to store the artifacts |
| ROOT_DOMAIN | localhost:3000 | Public hostname+port number under which artifactview is accessible. If this is configured incorrectly, artifactview will show the error message "host does not end with configured ROOT_DOMAIN" |
| RUST_LOG | info | Logging level |
| NO_HTTPS | false | Set to True if the website is served without HTTPS (used if testing artifactview without an ) |
| MAX_ARTIFACT_SIZE | 100000000 (100 MB) | Maximum size of the artifact zip file to be downloaded |
| MAX_FILE_SIZE | 100000000 (100 MB) | Maximum contained file size to be served |
| MAX_FILE_COUNT | 10000 | Maximum number of files within a zip file |
| MAX_AGE_H | 12 | Maximum age in hours after which cached artifacts are deleted |
| ZIP_TIMEOUT_MS | 1000 | Maximum time in milliseconds for reading the index of a zip file. If this takes too long, the zip file is most likely excessively large or malicious (zip bomb) |
| GITHUB_TOKEN | - | GitHub API token for downloading artifacts and creating PR comments. Using a fine-grained token with public read permissions is recommended |
| FORGEJO_TOKENS | - | Forgejo API tokens for creating PR comments. Example: codeberg.org=>fc010f65348468d05e570806275528c936ce93a4 |
| MEM_CACHE_SIZE | 50 | Artifactview keeps artifact metadata as well as the zip file indexes in memory to improve performance. The number of cached items is adjustable. |
| REAL_IP_HEADER | - | Get the client IP address from an HTTP request header. If Artifactview is exposed to the network directly, this option has to be unset. If you are using a reverse proxy, the proxy needs to be configured to send the actual client IP as a request header. For most proxies this header is x-forwarded-for. |
| LIMIT_ARTIFACTS_PER_MIN | 5 | Limit the number of downloaded artifacts per IP address and minute to prevent excessive resource usage. |
| LIMIT_PR_COMMENTS_PER_MIN | 5 | Limit the number of pull request comment requests per IP address and minute to prevent spamming. |
| REPO_BLACKLIST | - | List of sites/users/repos that can NOT be accessed. The blacklist takes precedence over the whitelist (repos included in both lists cannot be accessed). Example: github.com/evil-corp/world-destruction;codeberg.org/blackhat;example.org |
| REPO_WHITELIST | - | List of sites/users/repos that can ONLY be accessed. If the whitelist is empty, it will be ignored and any repository can be accessed. Uses the same syntax as REPO_BLACKLIST. |
| SITE_ALIASES | - | Aliases for sites to make URLs shorter. Example: gh => github.com;cb => codeberg.org |
| SUGGESTED_SITES | codeberg.org; github.com; gitea.com | List of suggested code forges (host only, without https://, separated by ;). If REPO_WHITELIST is empty, this value is used for the matched sites in the userscript. The first value is used in the placeholder URL on the home page. |
| VIEWER_MAX_SIZE | 500000 | Maximum file size to be displayed using the viewer |
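The interaction between REPO_BLACKLIST and REPO_WHITELIST can be sketched as below. This only illustrates the precedence rule described in the table; the function names, the segment-wise prefix matching and the case-insensitive comparison are assumptions, not Artifactview's actual implementation.

```python
from typing import List


def entry_matches(entry: str, repo: str) -> bool:
    """True if entry (site, site/user or site/user/repo) is a prefix of repo."""
    entry_parts = entry.lower().strip("/").split("/")
    repo_parts = repo.lower().strip("/").split("/")
    return repo_parts[: len(entry_parts)] == entry_parts


def is_allowed(repo: str, whitelist: List[str], blacklist: List[str]) -> bool:
    """Apply the blacklist first, then the whitelist (ignored when empty)."""
    if any(entry_matches(entry, repo) for entry in blacklist):
        return False
    if not whitelist:
        return True
    return any(entry_matches(entry, repo) for entry in whitelist)


blacklist = ["github.com/evil-corp/world-destruction", "codeberg.org/blackhat", "example.org"]
whitelist = ["github.com", "codeberg.org"]
print(is_allowed("codeberg.org/ThetaDev/artifactview", whitelist, blacklist))
```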

Access tokens

GitHub does not allow unauthenticated users to download artifacts, even for public repositories. So you need to set up an access token to use Artifactview with GitHub.

If you are not using the prComment feature, you can use a fine-grained access token with the "Public repositories (read-only)" permission. If you want to create pull request comments, you have to use a classic token with the "public_repo" scope enabled (the fine-grained tokens did not work in my test).

Forgejo does not require access tokens to download artifacts on public repositories, so you only need to create a token if you want to use the prComment API. In this case, the token needs the following permissions:

  • Repository and Organization Access: Public only
  • issue: Read and write
  • user: Read (for determining own user ID)

Note that if you are using Artifactview to create pull request comments, it is recommended to create a second bot account instead of using your main account.

Technical details

URL format

Artifactview uses URLs in the given format for accessing the individual artifacts: <HOST>--<USER>--<REPO>--<RUN>-<ARTIFACT>.hostname

Example: https://github-com--theta-dev--example-project--4-11.example.com

The reason for using subdomains instead of URL paths is that many websites expect to be served from a separate subdomain and access resources using absolute paths. Using URLs like example.com/github.com/theta-dev/example-project/4/11/path/to/file would make the application easier to host, but it would not be possible to preview a React/Vue/Svelte web project.

Since domains only allow letters, numbers and dashes but repository names allow dots and underscores, these escape sequences are used to access repositories with special characters in their names.

  • -0 -> .
  • -1 -> -
  • -2 -> _
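A rough decoding sketch for this format follows. It splits the subdomain on the double dashes and applies the three escape sequences to the user and repository parts. The function names are hypothetical, the host part is returned as-is (how it maps back to a hostname or alias is not covered here), and edge cases such as a literal dash followed by one of the digits 0 to 2 are ignored.

```python
import re

ESCAPES = {"0": ".", "1": "-", "2": "_"}


def unescape(segment: str) -> str:
    """Replace -0, -1 and -2 with '.', '-' and '_' in a single pass."""
    return re.sub(r"-([012])", lambda match: ESCAPES[match.group(1)], segment)


def split_subdomain(label: str) -> dict:
    """Split <HOST>--<USER>--<REPO>--<RUN>-<ARTIFACT> into its parts."""
    host, user, repo, tail = label.split("--")
    run, artifact = tail.split("-")
    return {
        "host": host,  # left undecoded in this sketch
        "user": unescape(user),
        "repo": unescape(repo),
        "run": int(run),
        "artifact": int(artifact),
    }


print(split_subdomain("github-com--theta-dev--example-project--4-11"))
```

A single regex pass is used instead of chained str.replace calls so that the output of one replacement cannot be picked up as the start of another escape sequence.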

Another issue with using subdomains is that they are limited to a maximum of 63 characters. Most user and repository names are short enough for this not to become a problem, but it could still happen that a CI run becomes inaccessible. Since the run ID is incremented on each new CI run, it might even happen that Artifactview works fine at the beginning of a project, but the subdomains exceed the length limit in the future.

That's why I added aliases for forge URLs. You can for example alias github.com as gh, shaving 8 characters from the subdomain. This makes the subdomains short enough that you will be unlikely to hit the limit even with longer user/project names.
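The effect of an alias on the label length is easy to check. The snippet below uses the example subdomain from above together with the gh alias; the helper fits_dns_label is a hypothetical name.

```python
MAX_LABEL_LENGTH = 63  # maximum length of a single DNS label


def fits_dns_label(label: str) -> bool:
    """Check whether a subdomain label stays within the 63 character limit."""
    return len(label) <= MAX_LABEL_LENGTH


full = "github-com--theta-dev--example-project--4-11"
aliased = "gh--theta-dev--example-project--4-11"

saved = len(full) - len(aliased)
print(len(full), len(aliased), saved)
print(fits_dns_label(full), fits_dns_label(aliased))
```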

Security considerations

It is recommended to use the whitelist feature to allow artifactview to access only trusted servers, users and organizations.

Since many well-known URIs are used to configure security-relevant properties of a website or attest ownership of a website (like .well-known/acme-challenge for issuing TLS certificates), Artifactview will serve no files from the .well-known folder.

There is a configurable limit for both the maximum downloaded artifact size and the maximum size of individual files to be served (100 MB by default). Additionally there is a configurable timeout for the zip file indexing operation. These measures should protect the server against denial-of-service attacks like overfilling the server drive or uploading zip bombs.
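The idea of checking limits against the zip index before extracting anything can be sketched with Python's zipfile module. Artifactview itself is written in Rust, so this is only an illustration of the approach; the function name check_zip_index is hypothetical, and the limits mirror the MAX_FILE_SIZE and MAX_FILE_COUNT defaults.

```python
import io
import zipfile

MAX_FILE_SIZE = 100_000_000
MAX_FILE_COUNT = 10_000


def check_zip_index(data: bytes, max_file_size: int = MAX_FILE_SIZE,
                    max_file_count: int = MAX_FILE_COUNT) -> list:
    """Read only the zip index and reject archives that exceed the limits."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        entries = archive.infolist()  # parsed from the central directory, no extraction
    if len(entries) > max_file_count:
        raise ValueError(f"too many files: {len(entries)}")
    for entry in entries:
        # file_size is the uncompressed size declared in the index.
        if entry.file_size > max_file_size:
            raise ValueError(f"file too large: {entry.filename}")
    return [(entry.filename, entry.file_size) for entry in entries]


# Build a tiny archive in memory to exercise the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("index.html", "<h1>Hello</h1>")
    archive.writestr("README.md", "# Example")

print(check_zip_index(buffer.getvalue()))
```

Because the declared sizes come from the index, a complete implementation would also need to enforce the limit while decompressing, since a malicious archive can declare sizes that do not match its contents.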