improve docs

slatinsky 2023-03-25 01:51:24 +01:00
parent ab8a2fadcc
commit ce781c194d
8 changed files with 364 additions and 261 deletions

docs/Architecture.md Normal file

@ -0,0 +1,83 @@
# Architecture choices
The project is divided into two main parts: the server and the client.
## Used port numbers
- 21011 - nginx (reverse proxy)
- 21013 - http-server (static files)
- 27017 - mongodb (database)
- 58000 - fastapi (backend api)
All of the ports listed above must be free.
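
As a rough illustration of the requirement above, the following Python sketch checks whether anything is already listening on the listed ports. The port numbers come from this document; the function names and the connect-based probing strategy are assumptions made for this example and are not taken from the project's code.

```python
import socket

# Ports listed in this document; adjust if your setup differs.
REQUIRED_PORTS = {
    21011: "nginx",
    21013: "http-server",
    27017: "mongodb",
    58000: "fastapi",
}


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection
        return sock.connect_ex((host, port)) != 0


def busy_ports(ports=REQUIRED_PORTS):
    """Return the subset of ports that are already in use."""
    return {port: name for port, name in ports.items() if not is_port_free(port)}


if __name__ == "__main__":
    for port, name in busy_ports().items():
        print(f"port {port} ({name}) is already in use")
```

Run it before starting the services; no output means all four ports look free on localhost.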
## Server (Backend)
### Preprocess
The preprocess step imports data exported by [DiscordChatExporter](https://github.com/Tyrrrz/DiscordChatExporter) into the mongodb database.
It doesn't matter where inside the `exports` directory the exported files are located. The preprocess step searches that directory recursively and processes every file it finds.
Specifically, the process does the following:
- processes all JSON channel exports
- while processing, it also processes all referenced assets and converts URLs to local paths
- downloads the ggsans font from the Discord CDN.
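
The recursive search described above can be sketched in a few lines of Python. This is only an illustration: the function name `find_export_files` and the filter on the `.json` extension are assumptions, not the preprocessor's actual implementation.

```python
from pathlib import Path


def find_export_files(root: str = "exports") -> list[Path]:
    """Return every .json file below root, however deeply it is nested."""
    # rglob walks all subdirectories, so the folder layout does not matter
    return sorted(p for p in Path(root).rglob("*.json") if p.is_file())


if __name__ == "__main__":
    for path in find_export_files():
        print(path)
```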
### mongodb
Database used to store data. The database is divided into multiple collections:
- assets - precomputed assets with local paths and dimensions
- authors - used only for searching
- channels - used for searching and channel list. It also includes threads and forum posts, because they are treated by discord as channels too.
- emojis - used only for searching
- guilds - used for guild list.
- jsons - stores sha256 hashes of processed files. This way, the preprocessor can skip already processed files.
- messages - stores all messages, including authors, emojis, attachments, etc.
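
The hash-based skipping described for the `jsons` collection could look roughly like this. It is a sketch under assumptions: `known_hashes` is a plain Python set standing in for the collection, and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large exports do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_process(paths, known_hashes: set[str]):
    """Yield (path, hash) for files whose hash is not in known_hashes.

    known_hashes stands in for the hashes stored in the jsons collection.
    """
    for path in paths:
        file_hash = sha256_of_file(path)
        if file_hash in known_hashes:
            continue  # unchanged file that was already processed
        yield path, file_hash
```

Because the hash is computed from the file contents, a renamed or moved export is still recognized as already processed, while an edited one is picked up again.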
### fastapi
Acts as the middleman between the client and the mongodb database. It provides a JSON API for the client (search, guild list, channel list, and messages).
The search and channel endpoints return only message ids. The client then fetches the corresponding messages when it needs them.
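
The "ids first, messages on demand" pattern can be illustrated with a small Python sketch. Everything here is hypothetical: the real client is a Svelte app, and the class name, the `fetch_batch` callable, and the caching behaviour are assumptions chosen only to show the idea.

```python
class LazyMessageStore:
    """Fetch messages by id only when they are requested."""

    def __init__(self, fetch_batch):
        # fetch_batch is a hypothetical callable: list of ids -> {id: message}
        self._fetch_batch = fetch_batch
        self._cache = {}

    def get(self, ids):
        """Return messages for ids, fetching only those not seen before."""
        missing = [i for i in ids if i not in self._cache]
        if missing:
            self._cache.update(self._fetch_batch(missing))
        # ids the backend did not return are silently skipped
        return [self._cache[i] for i in ids if i in self._cache]
```

A search result with thousands of ids stays cheap this way, because only the messages that are actually displayed are ever requested.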
### http-server
http-server is used to serve static files for the frontend. Nginx could be used instead, but sometimes paths exceed the maximum path length of 260 characters on Windows and nginx fails to serve the files.
http-server is used as a workaround for that bug in nginx. You also need to apply a registry patch to increase the maximum path length (use `change_260_character_path_limit_to_32767.reg`).
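
To see whether an export is affected by the path length limit, a sketch like the following lists files whose absolute path reaches it. The function name is an assumption, and the script is a diagnostic aid for this document, not part of the project.

```python
import os

# Windows MAX_PATH; the limit counts the terminating null character,
# so a path of 260 or more characters does not fit.
MAX_PATH = 260


def paths_over_limit(root: str, limit: int = MAX_PATH) -> list[str]:
    """Return absolute file paths under root with at least limit characters."""
    too_long = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if len(full) >= limit:
                too_long.append(full)
    return too_long


if __name__ == "__main__":
    for path in paths_over_limit("exports"):
        print(len(path), path)
```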
### windows runner (Windows only)
Windows runner is the main entry point of the program on Windows. This script is compiled into `dcef.exe` in the release.
- writes logs from other services to `logs.txt` file for easier debugging.
- checks if all required ports are free on startup
- enforces single instance of the program (opens another window if already running)
- hides console window if the main window is open
- cleans up all services on window close
### nginx
Nginx combines multiple services into one. See `nginx-prod.conf`.
1. First, it serves the frontend files needed for the client to load (paths `/_app/`, `/css/`, `/js/`, `/fonts/`, `/`).
2. Then it proxies static files from http-server on port 21013 (path `/input/`).
3. Then it serves static files created by preprocess (path `/data/`), which is now used only for the ggsans font.
4. Finally, it proxies API requests to fastapi on port 58000 (path `/api/`).
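
The routing above can be summarised as a prefix table. The Python sketch below mirrors the list, assuming plain prefix locations where the longest matching prefix wins; it does not reproduce the contents of `nginx-prod.conf`, and the handler descriptions are just labels.

```python
# Path prefixes taken from the list above; the values are descriptive labels.
ROUTES = {
    "/_app/": "frontend files",
    "/css/": "frontend files",
    "/js/": "frontend files",
    "/fonts/": "frontend files",
    "/input/": "proxy to http-server on port 21013",
    "/data/": "static files created by preprocess",
    "/api/": "proxy to fastapi on port 58000",
    "/": "frontend files",
}


def resolve(path: str) -> str:
    """Return the handler for path, choosing the longest matching prefix."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    prefix = max((p for p in ROUTES if path.startswith(p)), key=len)
    return ROUTES[prefix]
```

For example, `resolve("/api/search")` lands on the fastapi proxy, while any path without a more specific prefix falls back to the frontend files under `/`.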
## Client (Frontend)
Frontend is written in Svelte. It is a statically compiled sveltekit app.

docs/Compile.md Normal file

@ -0,0 +1,23 @@
# Compile the Windows release from source code
## Easy way to compile
Fork the repository and enable GitHub Actions. Then, go to the Actions tab and click on the latest workflow. Let it run and download the artifacts from the bottom of the page.
## Manual compilation (Windows)
### Prerequisites
Follow the instructions in [Development environment](docs/Development-env.md) to install all the dependencies.
Then install an additional dependency needed to compile http-server:
```bash
npm install -g pkg
```
### Compile
Run `BUILD_RELEASE.bat`. The result will be the contents of the `release` folder.

docs/Development-env.md Normal file

@ -0,0 +1,42 @@
# Setting up a development environment on Windows
## Prerequisites
- Python 3.11
- Node.js 16.16.0
- pyinstaller 5.5 (installed globally from `pip`)
- nodemon (installed globally from `npm`)
- wt (windows terminal)
## Install dependencies
Install frontend dependencies:
```bash
cd frontend
npm install
cd ..
```
Install backend dependencies:
```bash
cd backend/preprocess
py -m pip install -r requirements.txt
cd ../..
```
```bash
cd backend/fastapi
py -m pip install -r requirements.txt
pip install uvicorn==0.20.0
cd ../..
```
```bash
cd backend/windows-runner
py -m pip install -r requirements.txt
cd ../..
```
## Start the development script
Run `DEV.bat`.


@ -0,0 +1,48 @@
# Helper script to export forum posts in a channel
Viewing forums is supported by this viewer, but exporting them manually is time-consuming. You can use this helper script to generate a command that downloads all forum posts in a forum channel automatically:
### Steps
1. Open Discord in a browser (the browser needs to be Chromium-based: Chrome, Edge, Opera, Brave, Vivaldi, etc. The script does not work in Firefox)
2. Navigate to the channel with the forum post list
3. Press F12 and paste this script into the console:
```js
len = 0
ids = []
previousScrollTop = 0
// scroll the forum post list to the given vertical offset
function scrollToPosition(offset) {
    scrollDiv = document.querySelector('div[class*="chat-"] > div > div > div[class*="scrollerBase-"]')
    scrollDiv.scroll(0, offset)
}
// collect the ids of the posts that are currently rendered and drop duplicates
function captureIds() {
    document.querySelectorAll('div[data-item-id]').forEach((e) => ids.push(e.dataset.itemId))
    ids = [...new Set(ids)]
    if (ids.length > len) {
        len = ids.length
        console.log('Found', len, 'IDs')
    }
}
// print the DiscordChatExporter command for all collected ids
function printIds() {
    console.log('DiscordChatExporter.Cli.exe export --token TOKEN --output "exports/forums" --format Json --media --reuse-media --markdown false --channel',ids.join(' '))
}
scrollToPosition(0)
// scroll down step by step and stop once the scroll position no longer changes
interval = setInterval(() => {
    scrollToPosition(scrollDiv.scrollTop + window.innerHeight / 3)
    setTimeout(() => {
        captureIds()
        if (previousScrollTop === scrollDiv.scrollTop) {
            clearInterval(interval)
            printIds()
        }
        previousScrollTop = scrollDiv.scrollTop
    }, 1000)
}, 1542)
```
4. The script will scroll the page. At the end, it will print a command to the console that allows you to export all forum posts in the forum channel
5. Edit the command printed in the console (`--format`, `--output` and `--token`) and run the export with the CLI version of DiscordChatExporter

docs/Exporting-threads.md Normal file

@ -0,0 +1,106 @@
# Helper script to export archived threads in a channel
Viewing threads is supported by this viewer, but exporting them manually is time-consuming. You can use this helper script to generate a command that downloads all archived threads in a channel automatically:
### Steps
1. Open Discord in a browser (the browser needs to be Chromium-based: Chrome, Edge, Opera, Brave, Vivaldi, etc. The script does not work in Firefox)
2. Navigate to the channel with threads. Do not open the thread list (if you opened it, refresh the page)
3. Press F12 and paste this script into the console:
```js
len = 0
ids = []
previousScrollTop = 0
// interceptor https://stackoverflow.com/a/66564476
// wraps XMLHttpRequest.open so that thread ids can be read from API responses
if (window.oldXHROpen === undefined) {
    window.oldXHROpen = window.XMLHttpRequest.prototype.open;
    window.XMLHttpRequest.prototype.open = function(method, url, async, user, password) {
        this.addEventListener('load', function() {
            try {
                var json = JSON.parse(this.responseText);
                if (json.hasOwnProperty('threads')) {
                    // get ids
                    for (const thread of json.threads) {
                        ids.push(thread.id)
                    }
                }
            }
            catch (e) {
                console.log(e)
            }
        });
        return window.oldXHROpen.apply(this, arguments);
    }
}
else {
    console.log('OK, interceptor already exists')
}
// look up the Threads toolbar icon or the scroller of the thread list popout
function getSelector(name) {
    if (name === 'thread-icon')
        return document.querySelectorAll('div[class*="toolbar-"] > div[aria-label="Threads"]')
    if (name === 'scroller-base')
        return document.querySelectorAll('div[id*="popout_"] > div > div > div[class*="scrollerBase-"]')
}
// open the thread list if it is not open yet, then start scrolling
function clickThreadsIcon() {
    if (getSelector('scroller-base').length > 0) {
        console.log('OK, found scroller base (1)')
        mainFunc()
    }
    else if (getSelector('scroller-base').length == 0 && getSelector('thread-icon').length > 0) {
        getSelector('thread-icon')[0].click()
        console.log('OK, clicked Threads Icon')
        // give the popout one second to render before looking for the scroller
        setTimeout(() => {
            if (getSelector('scroller-base').length == 0) {
                throw new Error('ERROR, could not find scroller-base')
            }
            else {
                console.log('OK, found scroller base (2)')
                mainFunc()
            }
        }, 1000)
    }
    else {
        throw new Error('ERROR, could not find threads icon')
    }
}
clickThreadsIcon()
// scroll the thread list to the given vertical offset
function scrollToPosition(offset) {
    scrollDiv = getSelector('scroller-base')[0]
    scrollDiv.scroll(0, offset)
}
// drop duplicate ids and report progress
function captureIds() {
    ids = [...new Set(ids)]
    if (ids.length > len) {
        len = ids.length
        console.log('Found', len, 'IDs')
    }
}
// print the DiscordChatExporter command for all collected ids
function printIds() {
    console.log('DiscordChatExporter.Cli.exe export --token TOKEN --output "exports/threads" --format Json --media --reuse-media --markdown false --channel',ids.join(' '))
}
// scroll down step by step and stop once the scroll position no longer changes
function mainFunc() {
    scrollToPosition(0)
    interval = setInterval(() => {
        scrollToPosition(scrollDiv.scrollTop + window.innerHeight / 3)
        setTimeout(() => {
            captureIds()
            if (previousScrollTop === scrollDiv.scrollTop) {
                clearInterval(interval)
                printIds()
            }
            previousScrollTop = scrollDiv.scrollTop
        }, 500)
    }, 742)
}
```
4. The script will scroll the thread list. At the end, it will print a command to the console that allows you to export all archived threads in the channel
5. Edit the command printed in the console (`--format`, `--output` and `--token`) and run the export with the CLI version of DiscordChatExporter

docs/Server-hosting.md Normal file

@ -0,0 +1,45 @@
# Hosting on a Linux server
## Disclaimer
The client-server architecture allows you to host the viewer on a public server, but that doesn't mean you should. Please be considerate of other users and don't share sensitive messages with a broader audience (or the whole internet) than they were originally shared with.
Intended use cases:
- An old private Discord server was hacked and a new one was created. You want to share a backup with the same people who were in the old server.
- You host your own backup on your own server so that you can access it from anywhere.
- ... use common sense
## Protecting the viewer with password authentication
It is recommended to put the server behind another reverse proxy, such as nginx. The reverse proxy should be configured to require authentication (for example using [basic auth](https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-http-basic-authentication/)).
Create firewall rules that open only TCP ports 22 (SSH) and 80 (HTTP), then enable the firewall:
```
ufw allow 22/tcp
ufw allow 80/tcp
ufw enable
```
Bind port 21011 of the docker container only to the loopback interface (localhost, `127.0.0.1`):
```bash
docker run ... -p 127.0.0.1:21011:21011 -it dcef
```
Create the `.htpasswd` file. `htpasswd` prompts for the password, shown below as `<password>`:
```bash
sudo apt install apache2-utils
htpasswd -c /etc/nginx/.htpasswd <username>
<password>
```
Change the nginx site config to require authentication (add the `auth_basic` and `auth_basic_user_file` lines):
```
server {
    listen 80;
    location / {
        proxy_pass http://localhost:21011/;
        auth_basic "Auth only";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}
```
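
To check that the proxy really asks for credentials, a small Python sketch like the one below can help. The helper names and the example URL are assumptions. The header format (`Basic` followed by base64 of `username:password`) is standard HTTP basic authentication.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the Authorization header for HTTP basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def fetch_status(url: str, username: str, password: str) -> int:
    """Request url with credentials and return the HTTP status code.

    urlopen raises urllib.error.HTTPError (for example 401) when the
    credentials are rejected.
    """
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(username, password)}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `fetch_status("http://your-server/", "<username>", "<password>")` (a placeholder URL) should succeed with valid credentials and raise a 401 error without them.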


@ -0,0 +1,3 @@
# Supporting other exporters
Are you creating a new Discord exporter and want to view the exported data with this viewer? Just convert the data from your format to the same format that DiscordChatExporter uses :).