From 995eea9d67af38a62579da900704f15e9b0d035e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=BB=91=E5=A2=A8=E6=B0=B4=E9=B1=BC?=
 <heimoshuiyu@gmail.com>
Date: Sun, 18 Feb 2024 17:28:16 +0800
Subject: [PATCH] Update README.md

---
 README.md | 182 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 100 insertions(+), 82 deletions(-)
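
The README text in this patch says that upstreams are tried in the order they are defined, that the allowlist is checked before the denylist, and that an upstream may carry its own authorization value. The following is a minimal Python sketch of that selection logic, not the project's actual code. The `Upstream` class and the functions `model_permitted`, `auth_permitted`, and `candidates` are hypothetical names. It assumes that an empty allowlist means no restriction, and that an upstream without its own `authorization` accepts any value that passed the global check.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Upstream:
    # Hypothetical in-memory form of one `upstreams:` entry from config.yaml.
    sk: str
    allow: List[str] = field(default_factory=list)
    deny: List[str] = field(default_factory=list)
    authorization: Optional[str] = None  # comma-separated, like the global value


def model_permitted(up: Upstream, model: str) -> bool:
    # Allowlist first. Assumption: it restricts only when it is non-empty.
    if up.allow and model not in up.allow:
        return False
    # Denylist second.
    return model not in up.deny


def auth_permitted(up: Upstream, token: str) -> bool:
    # Assumption: no per-upstream authorization means no extra restriction.
    if up.authorization is None:
        return True
    return token in [t.strip() for t in up.authorization.split(",")]


def candidates(upstreams: List[Upstream], model: str, token: str) -> List[Upstream]:
    # Definition order is preserved, so a caller can fall through to the
    # next upstream when one fails or times out.
    return [
        u for u in upstreams
        if model_permitted(u, model) and auth_permitted(u, token)
    ]


UPSTREAMS = [
    Upstream(sk="secret_key_1", allow=["gpt-3.5-turbo"]),
    Upstream(sk="secret_key_2", deny=["gpt-4"]),
    Upstream(sk="key", authorization="woshimima"),
]
```

A caller would walk the list returned by `candidates(UPSTREAMS, model, token)` from first to last and stop at the first upstream that answers successfully.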

diff --git a/README.md b/README.md
index 16913d4..bfe8d67 100644
--- a/README.md
+++ b/README.md
@@ -7,16 +7,110 @@
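 - Custom Authorization header for authentication
 - Supports all endpoint types (`/v1/*`)
 - Exposes a Prometheus metrics endpoint (`/v1/metrics`)
-- Requests OpenAI upstreams in the order they are defined
-- Detects ChatCompletions stream requests and applies a 5-second timeout to them. See the [Timeout strategy](#超时策略) section for details
-- Logs the full request body, the upstream used, the IP address, the response time, and the GPT reply text
-- Sends a Feishu or Matrix notification when a request fails
-- Supports models hosted on Replicate
+- Requests OpenAI upstreams in the order they are defined; on error or timeout, automatically tries the next one in order
+- Detects ChatCompletions stream requests and applies a shorter timeout to them. See the [Timeout strategy](#超时策略) section for details
+- Selectively logs the request body, request headers, the upstream used, the IP address, the response time, and the response. See the [Logging strategy](#记录策略) section for details
+- Sends a notification through Feishu or Matrix when a request fails
+- Supports the mistral model on Replicate (beta)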
 
 This document describes in detail the methods and endpoints for using the load-balancing and capability API.
 
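+## Configuration file
+
+By default the program reads `config.yaml` from the current directory. Use the `-config your-config.yaml` flag to specify a different configuration file path.
+
+Below is a sample configuration file; the same content is available in `config.sample.yaml`.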
+
+```yaml
+authorization: woshimima
+
+# Default timeouts in seconds: 120 by default, 10 for streaming requests
+timeout: 120
+stream_timeout: 10
+
+# Use sqlite as the database for storing request records
+dbtype: sqlite
+dbaddr: ./db.sqlite
+
+# Use postgres as the database for storing request records
+# dbtype: postgres
+# dbaddr: "host=127.0.0.1 port=5432 user=postgres dbname=openai_api_route sslmode=disable password=woshimima"
+
+upstreams:
+  - sk: hahaha
+    endpoint: "https://localhost:8888/v1"
+    allow:
+      # Non-JSON APIs such as whisper carry no detectable model, so the URL path is used as the model name
+      - /v1/audio/transcriptions
+
+  - sk: "secret_key_1"
+    endpoint: "https://api.openai.com/v2"
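+    timeout: 120  # request timeout in seconds, default 120
+    stream_timeout: 10  # used instead when stream: true is detected
+    allow:  # optional model allowlist
+      - gpt-3.5-turbo
+      - gpt-3.5-turbo-0613
+
+  # You can configure any number of upstreams; the program tries them in order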
+  - sk: "secret_key_2"
+    endpoint: "https://api.openai.com/v1"
+    timeout: 30
+    deny:
+      - gpt-4
+
+  - sk: "key_for_replicate"
+    type: replicate
+    allow:
+      - mistralai/mixtral-8x7b-instruct-v0.1
+```
+
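+### Configuring multiple authorization values
+
+Separate multiple authorization values with a comma (`,`). Each value is valid, and the program records which one each request used.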
+
+```yaml
+authorization: woshimima,iampassword
+```
+
+You can also set an authorization value for an individual upstream:
+
+```yaml
+authorization: woshimima,iampassword
+upstreams:
+  - sk: key
+    authorization: woshimima
+```
+
+With this configuration, only users who present the `woshimima` authorization value can use that upstream.
+
 ## Deployment
 
+There are two recommended ways to deploy:
+
+1. Use the prebuilt container image `docker.io/heimoshuiyu/openai-api-route:latest`
+2. Build it yourself
+
+### Running in a container
+
+> Note: if you use a sqlite database, you may also need to edit the configuration file so that the SQLite database file is placed in a data volume.
+
+```bash
+docker run -d --name openai-api-route -v /path/to/config.yaml:/config.yaml docker.io/heimoshuiyu/openai-api-route:latest
+```
+
+Using Docker Compose:
+
+```yaml
+version: '3'
+services:
+  openai-api-route:
+    image: docker.io/heimoshuiyu/openai-api-route:latest
+    ports:
+      - 8888:8888
+    volumes:
+      - ./config.yaml:/config.yaml
+```
+
 ### Building
 
 The steps below build and run the load-balancing API:
@@ -41,78 +135,6 @@
    ./openai-api-route
    ```
 
-   By default, the API listens on local port 8888.
-
-   To use a different listen address, specify it with the `-addr` flag, for example:
-
-   ```
-   ./openai-api-route -addr 0.0.0.0:8080
-   ```
-
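-   This sets the listen address to 0.0.0.0:8080.
-
-6. If the database does not exist, a database file named `db.sqlite` is created automatically.
-
-   To use a different database location, specify it with the `-database` flag, for example: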
-
-   ```
-   ./openai-api-route -database /path/to/database.db
-   ```
-
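-   This sets the database location to `/path/to/database.db`.
-
-7. You have now built and started the load-balancing and capability API. You can add and manage upstreams as needed and use the API.
-
-### Running
-
-The command-line usage is: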
-
-```
-Usage of ./openai-api-route:
-  -addr string
-        Listen address (default ":8888")
-  -upstreams string
-        Upstream configuration file (default "./upstreams.yaml")
-  -dbtype
-        Database type (sqlite or postgres, default sqlite)
-  -database string
-        Database address (default "./db.sqlite")
-        If the database is postgres, this value must be a PostgreSQL DSN
-        For example "host=127.0.0.1 port=5432 user=postgres dbname=openai_api_route sslmode=disable password=woshimima"
-  -list
-        List all upstreams
-  -noauth
-        Do not check the incoming authorization header
-```
-
-Below is a sample `./upstreams.yaml` configuration file:
-
-```yaml
-authorization: woshimima
-
-# Use sqlite as the database for storing request records
-dbtype: sqlite
-dbaddr: ./db.sqlite
-
-# Use postgres as the database for storing request records
-# dbtype: postgres
-# dbaddr: "host=127.0.0.1 port=5432 user=postgres dbname=openai_api_route sslmode=disable password=woshimima"
-
-upstreams:
-  - sk: "key_for_replicate"
-    type: replicate
-    allow: ["mistralai/mixtral-8x7b-instruct-v0.1"]
-  - sk: "secret_key_1"
-    endpoint: "https://api.openai.com/v2"
-  - sk: "secret_key_2"
-    endpoint: "https://api.openai.com/v1"
-    timeout: 30
-```
-
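-Note that the program adjusts the timeout value depending on the situation.
-
-You can run `./openai-api-route` directly; if the database does not exist, it is created automatically.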
-
 ## Model allow and deny lists
 
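 If an upstream has an allow or deny list, the load balancer only permits (allow) or blocks (deny) the listed models for that upstream. The allowlist is checked first, then the denylist.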
@@ -137,8 +159,4 @@ upstreams:
 
 1. **Default timeout**: when no special condition applies, the service uses the default timeout of 60 seconds.
 
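-2. **Streaming requests**: if the request body is identified as streaming (`requestBody.Stream` is `true`) and the request-body check (`requestBodyOK`) finds no problem, the timeout is set to 5 seconds. This applies to streaming requests that are expected to respond quickly.
-
-3. **Large request bodies**: if the request body exceeds 128 KB (`len(inBody) > 1024*128`), the timeout is set to 20 seconds, since processing large payloads can take longer.
-
-4. **Upstream timeout setting**: if the upstream specifies a timeout in the configuration (`upstream.Timeout` greater than 0), the service uses that value, in seconds, as the timeout.
+2. **Streaming requests**: if the request body is identified as streaming (`requestBody.Stream` is `true`) and the request-body check (`requestBodyOK`) finds no problem, the timeout is set to 5 seconds. This applies to streaming requests that are expected to respond quickly.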
\ No newline at end of file