Compare commits

...

39 Commits

Author SHA1 Message Date
Saboteur7 de2c031797 docs: update readme 2024-07-19 15:46:19 +08:00
Saboteur7 3aa571aa1b Merge pull request #2163 from 6vision/wechatcom_app
Ensure compatibility for /wxcomapp URL with trailing slash
2024-07-19 15:38:20 +08:00
Saboteur7 3e4969efe6 Merge branch 'master' into wechatcom_app 2024-07-19 15:38:08 +08:00
Saboteur7 446e94df76 Merge pull request #2164 from 6vision/mini_bot
Support gpt-4o-mini model
2024-07-19 15:37:30 +08:00
Saboteur7 5b26066a4c Merge pull request #2154 from distiny-cool/ali_api
增加了使用阿里云进行语音识别的引擎
2024-07-19 15:37:05 +08:00
Saboteur7 8a80de5c3f Merge pull request #2141 from Yanyutin753/new
PictureChange插件功能升级
2024-07-19 15:36:02 +08:00
6vision 52a490c87e Support gpt-4o-mini model 2024-07-19 11:04:45 +08:00
6vision 29490741fd Ensure compatibility for /wxcomapp URL with trailing slash 2024-07-18 23:21:45 +08:00
kody f0e416455f 增加了使用阿里云进行语音识别的引擎 2024-07-15 22:03:31 +08:00
vision f7a2c97943 Merge pull request #2153 from 6vision/update_linkaibot
support more file types.
2024-07-15 19:09:05 +08:00
6vision 993853757b Linkai bot supports more file types. 2024-07-15 18:57:58 +08:00
6vision a3abfb987d update 2024-07-15 18:50:38 +08:00
Saboteur7 2711fa1b1b Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat 2024-07-08 19:00:03 +08:00
Saboteur7 1f7afaba07 fix: client cmd config bug 2024-07-08 18:57:27 +08:00
Clivia e02c8bff81 PictureChange插件功能升级 2024-07-08 17:58:59 +08:00
Saboteur7 22391ba1a5 Update README.md 2024-07-05 15:45:54 +08:00
Saboteur7 a05781ec19 Merge pull request #2103 from 6vision/claude-3.5-sonnet
feat: support claude-3.5-sonnet model
2024-07-05 14:39:17 +08:00
Saboteur7 f898ed6a2a Merge branch 'master' into claude-3.5-sonnet 2024-07-05 14:32:45 +08:00
Saboteur7 e6d0a15b54 Merge pull request #2110 from He0607/新增高铁(火车)票查询插件
新增高铁(火车)票查询插件
2024-07-05 14:31:15 +08:00
Saboteur7 49cff026e2 Merge pull request #2113 from 6vision/update-0626
Update parameter descriptions for clarity
2024-07-05 14:26:33 +08:00
Saboteur7 08f0023cfd Merge pull request #2124 from 6vision/update_gemini_model
Update gemini 1.5model
2024-07-05 14:26:13 +08:00
Saboteur7 e311466ee6 Merge pull request #2128 from Maroon9/fix-docker-compose
fix:在docker-compose.yml文件中增加时区设置
2024-07-05 14:25:56 +08:00
wanxiangze 56789e68d7 fix:在docker-compose.yml文件中增加时区设置 2024-07-05 10:18:21 +08:00
6vision 87525bb383 update gemini model 2024-07-04 01:44:53 +08:00
6vision bb2880191a update gemini model 2024-07-04 01:22:55 +08:00
6vision 4f1acf26d6 Merge branch 'update-0626' of https://github.com/6vision/chatgpt-on-wechat into update-0626 2024-06-27 21:11:14 +08:00
6vision fc2d6b21ac update 2024-06-27 21:09:54 +08:00
zhayujie b9e84fefbd Merge pull request #2114 from 6vision/fix_dingtalk_group_chat
fix: dingtalk channel group chat bug
2024-06-27 10:29:51 +08:00
6vision 91f5ffb2d9 Correct the log information 2024-06-26 22:34:35 +08:00
6vision 70ff2341cb fix:dingtalk channel group chat bug 2024-06-26 22:10:58 +08:00
vision 74eed93497 Merge branch 'zhayujie:master' into update-0626 2024-06-26 15:15:32 +08:00
6vision d02e26c014 Update parameter descriptions for clarity 2024-06-26 15:14:29 +08:00
Wu_Cool 523cade7c3 新增高铁(火车)票查询插件 2024-06-26 09:13:40 +08:00
Wu_Cool e22c183ca9 新增高铁(火车)票查询插件 2024-06-26 09:11:04 +08:00
vision 3afd99da30 Merge pull request #2106 from 6vision/fix_sensitive
Fix TypeError in config drag_sensitive function
2024-06-24 22:04:56 +08:00
6vision f44979f983 Fix TypeError in config drag_sensitive function 2024-06-24 21:57:58 +08:00
6vision 095f9cc108 feat: support claude-3.5-sonnet model 2024-06-24 11:20:50 +08:00
zhayujie 1089076fce Merge pull request #2044 from Wang-zhechao/add-plugins-solitaire
添加微信接龙插件
2024-06-20 20:41:37 +08:00
Wang Zhechao 70344dd214 添加微信接龙插件 2024-06-04 22:39:59 +08:00
19 changed files with 144 additions and 29 deletions
+10 -4
View File
@@ -5,7 +5,7 @@
最新版本支持的功能如下:
-**多端部署:** 有多种部署方式可选择且功能完备,目前已支持微信公众号、企业微信应用、飞书、钉钉等部署方式
-**基础对话:** 私聊及群聊的消息智能回复,支持多轮会话上下文记忆,支持 GPT-3.5, GPT-4, GPT-4o, Claude-3, Gemini, 文心一言, 讯飞星火, 通义千问,ChatGLM-4Kimi(月之暗面), MiniMax
-**基础对话:** 私聊及群聊的消息智能回复,支持多轮会话上下文记忆,支持 GPT-3.5, GPT-4o-mini, GPT-4o, GPT-4, Claude-3.5, Gemini, 文心一言, 讯飞星火, 通义千问,ChatGLM-4Kimi(月之暗面), MiniMax
-**语音能力:** 可识别语音消息,通过文字或语音回复,支持 azure, baidu, google, openai(whisper/tts) 等多种语音模型
-**图像能力:** 支持图片生成、图片识别、图生图(如照片修复),可选择 Dall-E-3, stable diffusion, replicate, midjourney, CogView-3, vision模型
-**丰富插件:** 支持个性化插件扩展,已实现多角色切换、文字冒险、敏感词过滤、聊天记录总结、文档总结和对话、联网搜索等插件
@@ -18,6 +18,10 @@
3. 本项目主要接入协同办公平台,推荐使用公众号、企微自建应用、钉钉、飞书等接入通道,其他通道为历史产物已不维护
4. 任何个人、团队和企业,无论以何种方式使用该项目、对何对象提供服务,所产生的一切后果,本项目均不承担任何责任
## 演示
DEMO视频:https://cdn.link-ai.tech/doc/cow_demo.mp4
## 社区
添加小助手微信加入开源项目交流群:
@@ -42,8 +46,10 @@
# 🏷 更新日志
>**2024.06.20** [1.6.7版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.7),MiniMax模型、工作流图片输入、模型列表完善
>
>**2024.07.19** [1.6.9版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.9) 新增 gpt-4o-mini 模型、阿里语音识别、企微应用渠道路由优化
>**2024.07.05** [1.6.8版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.8) 和 [1.6.7版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.7)Claude3.5, Gemini 1.5 Pro, MiniMax模型、工作流图片输入、模型列表完善
>**2024.06.04** [1.6.6版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.6) 和 [1.6.5版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.5),gpt-4o模型、钉钉流式卡片、讯飞语音识别/合成
>**2024.04.26** [1.6.0版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.0),新增 Kimi 接入、gpt-4-turbo版本升级、文件总结和语音识别问题修复
@@ -169,7 +175,7 @@ pip3 install -r requirements-optional.txt
**4.其他配置**
+ `model`: 模型名称,目前支持 `gpt-3.5-turbo`, `gpt-4o`, `gpt-4-turbo`, `gpt-4`, `wenxin` , `claude` , `gemini`, `glm-4`, `xunfei`, `moonshot`等,全部模型名称参考[common/const.py](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py)文件
+ `model`: 模型名称,目前支持 `gpt-3.5-turbo`, `gpt-4o-mini`, `gpt-4o`, `gpt-4`, `wenxin` , `claude` , `gemini`, `glm-4`, `xunfei`, `moonshot`等,全部模型名称参考[common/const.py](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py)文件
+ `temperature`,`frequency_penalty`,`presence_penalty`: Chat API接口参数,详情参考[OpenAI官方文档。](https://platform.openai.com/docs/api-reference/chat)
+ `proxy`:由于目前 `openai` 接口国内无法访问,需配置代理客户端的地址,详情参考 [#351](https://github.com/zhayujie/chatgpt-on-wechat/issues/351)
+ 对于图像生成,在满足个人或群组触发条件外,还需要额外的关键词前缀来触发,对应配置 `image_create_prefix `
+1 -1
View File
@@ -67,7 +67,7 @@ def num_tokens_from_messages(messages, model):
elif model in ["gpt-4-0314", "gpt-4-0613", "gpt-4-32k", "gpt-4-32k-0613", "gpt-3.5-turbo-0613",
"gpt-3.5-turbo-16k", "gpt-3.5-turbo-16k-0613", "gpt-35-turbo-16k", "gpt-4-turbo-preview",
"gpt-4-1106-preview", const.GPT4_TURBO_PREVIEW, const.GPT4_VISION_PREVIEW, const.GPT4_TURBO_01_25,
const.GPT_4o, const.LINKAI_4o, const.LINKAI_4_TURBO]:
const.GPT_4o, const.GPT_4o_MINI, const.LINKAI_4o, const.LINKAI_4_TURBO]:
return num_tokens_from_messages(messages, model="gpt-4")
elif model.startswith("claude-3"):
return num_tokens_from_messages(messages, model="gpt-3.5-turbo")
+2
View File
@@ -130,4 +130,6 @@ class ClaudeAPIBot(Bot, OpenAIImage):
return "claude-3-sonnet-20240229"
elif model == "claude-3-haiku":
return "claude-3-haiku-20240307"
elif model == "claude-3.5-sonnet":
return "claude-3-5-sonnet-20240620"
return model
+4 -2
View File
@@ -24,7 +24,9 @@ class GoogleGeminiBot(Bot):
self.api_key = conf().get("gemini_api_key")
# 复用文心的token计算方式
self.sessions = SessionManager(BaiduWenxinSession, model=conf().get("model") or "gpt-3.5-turbo")
self.model = conf().get("model") or "gemini-pro"
if self.model == "gemini":
self.model = "gemini-pro"
def reply(self, query, context: Context = None) -> Reply:
try:
if context.type != ContextType.TEXT:
@@ -35,7 +37,7 @@ class GoogleGeminiBot(Bot):
session = self.sessions.session_query(query, session_id)
gemini_messages = self._convert_to_gemini_messages(self.filter_messages(session.messages))
genai.configure(api_key=self.api_key)
model = genai.GenerativeModel('gemini-pro')
model = genai.GenerativeModel(self.model)
response = model.generate_content(gemini_messages)
reply_text = response.text
self.sessions.session_reply(reply_text, session_id)
+2 -1
View File
@@ -399,6 +399,7 @@ class LinkAIBot(Bot):
return
max_send_num = conf().get("max_media_send_count")
send_interval = conf().get("media_send_interval")
file_type = (".pdf", ".doc", ".docx", ".csv", ".xls", ".xlsx", ".txt", ".rtf", ".ppt", ".pptx")
try:
i = 0
for url in image_urls:
@@ -407,7 +408,7 @@ class LinkAIBot(Bot):
i += 1
if url.endswith(".mp4"):
reply_type = ReplyType.VIDEO_URL
elif url.endswith(".pdf") or url.endswith(".doc") or url.endswith(".docx") or url.endswith(".csv"):
elif url.endswith(file_type):
reply_type = ReplyType.FILE
url = _download_file(url)
if not url:
+1 -1
View File
@@ -36,7 +36,7 @@ class Bridge(object):
self.btype["chat"] = const.QWEN
if model_type in [const.QWEN_TURBO, const.QWEN_PLUS, const.QWEN_MAX]:
self.btype["chat"] = const.QWEN_DASHSCOPE
if model_type in [const.GEMINI]:
if model_type and model_type.startswith("gemini"):
self.btype["chat"] = const.GEMINI
if model_type in [const.ZHIPU_AI]:
self.btype["chat"] = const.ZHIPU_AI
+1
View File
@@ -117,6 +117,7 @@ class ChatChannel(Channel):
logger.info("[chat_channel]receive group at")
if not conf().get("group_at_off", False):
flag = True
self.name = self.name if self.name is not None else "" # 部分渠道self.name可能没有赋值
pattern = f"@{re.escape(self.name)}(\u2005|\u0020)"
subtract_res = re.sub(pattern, r"", content)
if isinstance(context["msg"].at_list, list):
+1 -1
View File
@@ -163,7 +163,7 @@ class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
elif cmsg.ctype == ContextType.PATPAT:
logger.debug("[DingTalk]receive patpat msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.TEXT:
logger.debug("[DingTalk]receive patpat msg: {}".format(cmsg.content))
logger.debug("[DingTalk]receive text msg: {}".format(cmsg.content))
else:
logger.debug("[DingTalk]receive other msg: {}".format(cmsg.content))
context = self._compose_context(cmsg.ctype, cmsg.content, isgroup=True, msg=cmsg)
+1
View File
@@ -49,6 +49,7 @@ class DingTalkMessage(ChatMessage):
if self.is_group:
self.from_user_id = event.conversation_id
self.actual_user_id = event.sender_id
self.is_at = True
else:
self.from_user_id = event.sender_id
self.actual_user_id = event.sender_id
+1 -1
View File
@@ -44,7 +44,7 @@ class WechatComAppChannel(ChatChannel):
def startup(self):
# start message listener
urls = ("/wxcomapp", "channel.wechatcom.wechatcomapp_channel.Query")
urls = ("/wxcomapp/?", "channel.wechatcom.wechatcomapp_channel.Query")
app = web.application(urls, globals(), autoreload=False)
port = conf().get("wechatcomapp_port", 9898)
web.httpserver.runsimple(app.wsgifunc(), ("0.0.0.0", port))
+9 -5
View File
@@ -11,7 +11,7 @@ QWEN = "qwen" # 旧版通义模型
QWEN_DASHSCOPE = "dashscope" # 通义新版sdk和api key
GEMINI = "gemini"
GEMINI = "gemini" # gemini-1.0-pro
ZHIPU_AI = "glm-4"
MOONSHOT = "moonshot"
MiniMax = "minimax"
@@ -32,6 +32,7 @@ GPT4_TURBO_11_06 = "gpt-4-1106-preview"
GPT4_VISION_PREVIEW = "gpt-4-vision-preview"
GPT4 = "gpt-4"
GPT_4o_MINI = "gpt-4o-mini"
GPT4_32k = "gpt-4-32k"
GPT4_06_13 = "gpt-4-0613"
GPT4_32k_06_13 = "gpt-4-32k-0613"
@@ -51,16 +52,19 @@ LINKAI_35 = "linkai-3.5"
LINKAI_4_TURBO = "linkai-4-turbo"
LINKAI_4o = "linkai-4o"
GEMINI_PRO = "gemini-1.0-pro"
GEMINI_15_flash = "gemini-1.5-flash"
GEMINI_15_PRO = "gemini-1.5-pro"
MODEL_LIST = [
GPT35, GPT35_0125, GPT35_1106, "gpt-3.5-turbo-16k",
GPT_4o, GPT4_TURBO, GPT4_TURBO_PREVIEW, GPT4_TURBO_01_25, GPT4_TURBO_11_06, GPT4, GPT4_32k, GPT4_06_13, GPT4_32k_06_13,
GPT_4o, GPT_4o_MINI, GPT4_TURBO, GPT4_TURBO_PREVIEW, GPT4_TURBO_01_25, GPT4_TURBO_11_06, GPT4, GPT4_32k, GPT4_06_13, GPT4_32k_06_13,
WEN_XIN, WEN_XIN_4,
XUNFEI, GEMINI, ZHIPU_AI, MOONSHOT,
"claude", "claude-3-haiku", "claude-3-sonnet", "claude-3-opus", "claude-3-opus-20240229",
XUNFEI, ZHIPU_AI, MOONSHOT, MiniMax,
GEMINI, GEMINI_PRO, GEMINI_15_flash, GEMINI_15_PRO,
"claude", "claude-3-haiku", "claude-3-sonnet", "claude-3-opus", "claude-3-opus-20240229", "claude-3.5-sonnet",
"moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k",
QWEN, QWEN_TURBO, QWEN_PLUS, QWEN_MAX,
MiniMax,
LINKAI_35, LINKAI_4_TURBO, LINKAI_4o
]
+5 -2
View File
@@ -45,8 +45,11 @@ class ChatClient(LinkAIClient):
elif reply_voice_mode == "always_reply_voice":
local_config["always_reply_voice"] = True
if config.get("admin_password") and plugin_config.get("Godcmd"):
plugin_config["Godcmd"]["password"] = config.get("admin_password")
if config.get("admin_password"):
if not plugin_config.get("Godcmd"):
plugin_config["Godcmd"] = {"password": config.get("admin_password"), "admin_users": []}
else:
plugin_config["Godcmd"]["password"] = config.get("admin_password")
PluginManager().instances["GODCMD"].reload()
if config.get("group_app_map") and pconf("linkai"):
+5 -5
View File
@@ -17,7 +17,7 @@ available_setting = {
"open_ai_api_base": "https://api.openai.com/v1",
"proxy": "", # openai使用的代理
# chatgpt模型, 当use_azure_chatgpt为true时,其名称为Azure上model deployment名称
"model": "gpt-3.5-turbo", # 支持ChatGPT、Claude、Gemini、文心一言、通义千问、Kimi、讯飞星火、智谱、LinkAI等模型,模型具体名称详见common/const.py文件列出的模型
"model": "gpt-3.5-turbo", # 可选择: gpt-4o, pt-4o-mini, gpt-4-turbo, claude-3-sonnet, wenxin, moonshot, qwen-turbo, xunfei, glm-4, minimax, gemini等模型,全部可选模型详见common/const.py文件
"bot_type": "", # 可选配置,使用兼容openai格式的三方服务时候,需填"chatGPT"。bot具体名称详见common/const.py文件列出的bot_type,如不填根据model名称判断,
"use_azure_chatgpt": False, # 是否使用azure的chatgpt
"azure_deployment_id": "", # azure 模型部署名称
@@ -95,8 +95,8 @@ available_setting = {
"group_speech_recognition": False, # 是否开启群组语音识别
"voice_reply_voice": False, # 是否使用语音回复语音,需要设置对应语音合成引擎的api key
"always_reply_voice": False, # 是否一直使用语音回复
"voice_to_text": "openai", # 语音识别引擎,支持openai,baidu,google,azure
"text_to_voice": "openai", # 语音合成引擎,支持openai,baidu,google,pytts(offline),azure,elevenlabs,edge(online)
"voice_to_text": "openai", # 语音识别引擎,支持openai,baidu,google,azure,xunfei,ali
"text_to_voice": "openai", # 语音合成引擎,支持openai,baidu,google,azure,xunfei,ali,pytts(offline),elevenlabs,edge(online)
"text_to_voice_model": "tts-1",
"tts_voice_id": "alloy",
# baidu 语音api配置, 使用百度语音识别和语音合成时需要
@@ -242,7 +242,7 @@ def drag_sensitive(config):
conf_dict_copy = copy.deepcopy(conf_dict)
for key in conf_dict_copy:
if "key" in key or "secret" in key:
if isinstance(key, str):
if isinstance(conf_dict_copy[key], str):
conf_dict_copy[key] = conf_dict_copy[key][0:3] + "*" * 5 + conf_dict_copy[key][-3:]
return json.dumps(conf_dict_copy, indent=4)
@@ -250,7 +250,7 @@ def drag_sensitive(config):
config_copy = copy.deepcopy(config)
for key in config:
if "key" in key or "secret" in key:
if isinstance(key, str):
if isinstance(config_copy[key], str):
config_copy[key] = config_copy[key][0:3] + "*" * 5 + config_copy[key][-3:]
return config_copy
except Exception as e:
+1
View File
@@ -6,6 +6,7 @@ services:
security_opt:
- seccomp:unconfined
environment:
TZ: 'Asia/Shanghai'
OPEN_AI_API_KEY: 'YOUR API KEY'
MODEL: 'gpt-3.5-turbo'
PROXY: ''
+1
View File
@@ -18,6 +18,7 @@ class Plugin:
if not plugin_conf:
# 全局配置不存在,则获取插件目录下的配置
plugin_config_path = os.path.join(self.path, "config.json")
logger.debug(f"loading plugin config, plugin_config_path={plugin_config_path}, exist={os.path.exists(plugin_config_path)}")
if os.path.exists(plugin_config_path):
with open(plugin_config_path, "r", encoding="utf-8") as f:
plugin_conf = json.load(f)
+9 -1
View File
@@ -22,7 +22,7 @@
},
"pictureChange": {
"url": "https://github.com/Yanyutin753/pictureChange.git",
"desc": "利用stable-diffusion和百度Ai进行图生图或者画图的插件"
"desc": "1. 支持百度AI和Stable Diffusion WebUI进行图像处理,提供多种模型选择,支持图生图、文生图自定义模板。2. 支持Suno音乐AI可将图像和文字转为音乐。3. 支持自定义模型进行文件、图片总结功能。4. 支持管理员控制群聊内容与参数和功能改变。"
},
"Blackroom": {
"url": "https://github.com/dividduang/blackroom.git",
@@ -31,6 +31,14 @@
"midjourney": {
"url": "https://github.com/baojingyu/midjourney.git",
"desc": "利用midjourney实现ai绘图的的插件"
},
"solitaire": {
"url": "https://github.com/Wang-zhechao/solitaire.git",
"desc": "机器人微信接龙插件"
},
"HighSpeedTicket": {
"url": "https://github.com/He0607/HighSpeedTicket.git",
"desc": "高铁(火车)票查询插件"
}
}
}
+64
View File
@@ -8,6 +8,7 @@ Description:
"""
import http.client
import json
import time
import requests
@@ -61,6 +62,69 @@ def text_to_speech_aliyun(url, text, appkey, token):
return output_file
def speech_to_text_aliyun(url, audioContent, appkey, token):
"""
使用阿里云的语音识别服务识别音频文件中的语音。
参数:
- url (str): 阿里云语音识别服务的端点URL。
- audioContent (byte): pcm音频数据。
- appkey (str): 您的阿里云appkey。
- token (str): 阿里云API的认证令牌。
返回值:
- str: 成功时输出识别到的文本,否则为None。
"""
format = 'pcm'
sample_rate = 16000
enablePunctuationPrediction = True
enableInverseTextNormalization = True
enableVoiceDetection = False
# 设置RESTful请求参数
request = url + '?appkey=' + appkey
request = request + '&format=' + format
request = request + '&sample_rate=' + str(sample_rate)
if enablePunctuationPrediction :
request = request + '&enable_punctuation_prediction=' + 'true'
if enableInverseTextNormalization :
request = request + '&enable_inverse_text_normalization=' + 'true'
if enableVoiceDetection :
request = request + '&enable_voice_detection=' + 'true'
host = 'nls-gateway-cn-shanghai.aliyuncs.com'
# 设置HTTPS请求头部
httpHeaders = {
'X-NLS-Token': token,
'Content-type': 'application/octet-stream',
'Content-Length': len(audioContent)
}
conn = http.client.HTTPSConnection(host)
conn.request(method='POST', url=request, body=audioContent, headers=httpHeaders)
response = conn.getresponse()
body = response.read()
try:
body = json.loads(body)
status = body['status']
if status == 20000000 :
result = body['result']
if result :
logger.info(f"阿里云语音识别到了:{result}")
conn.close()
return result
else :
logger.error(f"语音识别失败,状态码: {status}")
except ValueError:
logger.error(f"语音识别失败,收到非JSON格式的数据: {body}")
conn.close()
return None
class AliyunTokenGenerator:
"""
+24 -4
View File
@@ -15,9 +15,9 @@ import time
from bridge.reply import Reply, ReplyType
from common.log import logger
from voice.audio_convert import get_pcm_from_wav
from voice.voice import Voice
from voice.ali.ali_api import AliyunTokenGenerator
from voice.ali.ali_api import text_to_speech_aliyun
from voice.ali.ali_api import AliyunTokenGenerator, speech_to_text_aliyun, text_to_speech_aliyun
from config import conf
@@ -34,7 +34,8 @@ class AliVoice(Voice):
self.token = None
self.token_expire_time = 0
# 默认复用阿里云千问的 access_key 和 access_secret
self.api_url = config.get("api_url")
self.api_url_voice_to_text = config.get("api_url_voice_to_text")
self.api_url_text_to_voice = config.get("api_url_text_to_voice")
self.app_key = config.get("app_key")
self.access_key_id = conf().get("qwen_access_key_id") or config.get("access_key_id")
self.access_key_secret = conf().get("qwen_access_key_secret") or config.get("access_key_secret")
@@ -53,7 +54,7 @@ class AliVoice(Voice):
r'äöüÄÖÜáéíóúÁÉÍÓÚàèìòùÀÈÌÒÙâêîôûÂÊÎÔÛçÇñÑ,。!?,.]', '', text)
# 提取有效的token
token_id = self.get_valid_token()
fileName = text_to_speech_aliyun(self.api_url, text, self.app_key, token_id)
fileName = text_to_speech_aliyun(self.api_url_text_to_voice, text, self.app_key, token_id)
if fileName:
logger.info("[Ali] textToVoice text={} voice file name={}".format(text, fileName))
reply = Reply(ReplyType.VOICE, fileName)
@@ -61,6 +62,25 @@ class AliVoice(Voice):
reply = Reply(ReplyType.ERROR, "抱歉,语音合成失败")
return reply
def voiceToText(self, voice_file):
"""
将语音文件转换为文本。
:param voice_file: 要转换的语音文件。
:return: 返回一个Reply对象,其中包含转换得到的文本或错误信息。
"""
# 提取有效的token
token_id = self.get_valid_token()
logger.debug("[Ali] voice file name={}".format(voice_file))
pcm = get_pcm_from_wav(voice_file)
text = speech_to_text_aliyun(self.api_url_voice_to_text, pcm, self.app_key, token_id)
if text:
logger.info("[Ali] VoicetoText = {}".format(text))
reply = Reply(ReplyType.TEXT, text)
else:
reply = Reply(ReplyType.ERROR, "抱歉,语音识别失败")
return reply
def get_valid_token(self):
"""
获取有效的阿里云token。
+2 -1
View File
@@ -1,5 +1,6 @@
{
"api_url": "https://nls-gateway-cn-shanghai.aliyuncs.com/stream/v1/tts",
"api_url_text_to_voice": "https://nls-gateway-cn-shanghai.aliyuncs.com/stream/v1/tts",
"api_url_voice_to_text": "https://nls-gateway.cn-shanghai.aliyuncs.com/stream/v1/asr",
"app_key": "",
"access_key_id": "",
"access_key_secret": ""