Releases: modelscope/ms-swift
v1.5.0
English Version
New features:
- Support multi-line input for inference
- Support multi-node training
- Add benchmarks
- Support UI training, started with swift web-ui
- Support VLLM inference
- Support RLHF(DPO) training
New tuners:
- SCEdit, adopted by TongYi Lab, uses less memory than LoRA while producing better results, and can replace ControlNet in scenarios such as Pose control, In-Painting, Out-Painting, and Label-removing.
New models:
- SUS series models
- Mixtral-MoE series models
- deepseek series models
- phi2-3b
- cogagent-chat/cogagent-vqa
- codegeex2-6b
New datasets:
Datasets used in RLHF:
- hh-rlhf
- stack-exchange-paired
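For readers new to DPO, the following is a minimal sketch of the per-pair DPO loss that paired preference datasets such as hh-rlhf are typically used with. It is not SWIFT's implementation: the function names, the argument names, and the default beta value are assumptions chosen for illustration, and the inputs are assumed to be summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model.

```python
import math


def log_sigmoid(x: float) -> float:
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    # How much more the policy prefers the chosen response than the rejected one.
    policy_margin = policy_chosen_logp - policy_rejected_logp
    # The same preference margin under the frozen reference model.
    ref_margin = ref_chosen_logp - ref_rejected_logp
    # The loss shrinks as the policy's margin grows beyond the reference's margin.
    return -log_sigmoid(beta * (policy_margin - ref_margin))


# Hypothetical log-probabilities for one (chosen, rejected) pair.
print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

When the policy and the reference agree exactly, the loss equals log 2; it falls below that once the policy favors the chosen response more strongly than the reference does.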
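Chinese Version
This month's new version of SWIFT has been released!
New features:
- Support multi-line input for inference
- Support multi-GPU training
- Add model training benchmarks
- Support UI training and inference, started with swift web-ui
- Support VLLM inference
- Support RLHF (DPO) training
New tuners:
- SCEdit: an excellent U-Net fine-tuning framework developed in-house by Tongyi Lab. It uses far less memory than LoRA, gives better results than LoRA, and can replace ControlNet to provide capabilities such as In-Painting, Out-Painting, label removal, and Pose control.
New models:
- SUS series models
- Mixtral-MoE series models
- deepseek series models
- phi2-3b
- cogagent-chat/cogagent-vqa
- codegeex2-6b
New datasets:
Datasets used in RLHF:
- hh-rlhf
- stack-exchange-paired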
What's Changed
- update multi-line input (infer) by @Jintao-Huang in #196
- Fix model saving in new format by @tastelikefeet in #198
- Fix compatible error by @tastelikefeet in #201
- Fix bug 1206 by @Jintao-Huang in #202
- fix fp16 & full bug by @Jintao-Huang in #203
- Fix qwen-audio inference bug by @Jintao-Huang in #204
- Support multi node by @Jintao-Huang in #205
- fix typo bug by @Jintao-Huang in #206
- Support sus by @Jintao-Huang in #207
- Support cpu by @Jintao-Huang in #208
- Add Feat: Freeze Parameters, disable_tqdm by @Jintao-Huang in #210
- update dataset by @Jintao-Huang in #212
- Support lazy_tokenize, preprocess_num_proc by @Jintao-Huang in #211
- Support Mixtral MoE by @tastelikefeet in #217
- Add benchmark by @Jintao-Huang in #213
- support ui training by @tastelikefeet in #219
- Fix transformers 4.36 by @Jintao-Huang in #218
- Update mixtral-7b-moe by @Jintao-Huang in #221
- Compatible with peft>=0.7.0 by @tastelikefeet in #220
- fix dtype='fp16' sft bug by @Jintao-Huang in #227
- fix generation_config warning by @Jintao-Huang in #224
- Fix merge_lora & model_cache_dir bug by @Jintao-Huang in #229
- fix lazy_tokenize bug by @Jintao-Huang in #228
- Add inference UI and refactor mechanism by @tastelikefeet in #230
- Support deepseek by @Jintao-Huang in #223
- relax version restriction by @tastelikefeet in #232
- fix bug 1218 by @Jintao-Huang in #235
- support deployment by @Jintao-Huang in #231
- update docs by @Jintao-Huang in #238
- Refactor some code by @tastelikefeet in #237
- fix typo bug by @Jintao-Huang in #239
- update readme & phi2-3b by @Jintao-Huang in #241
- Fix argument 1220 by @Jintao-Huang in #242
- Support CogAgent by @tastelikefeet in #243
- fix infer by @tastelikefeet in #244
- Support more peft tuners by @tastelikefeet in #245
- Fix copying additional files by @tastelikefeet in #247
- Add sft for codegeex2 by @tastelikefeet in #248
- fix issue #249 by @tastelikefeet in #250
- Feat/scedit by @jiangzeyinzi in #253
- Update 1228 by @Jintao-Huang in #254
- fix unicode error by @tastelikefeet in #259
- Update readme for SCEdit by @tastelikefeet in #258
- DPO by @tastelikefeet in #255
- update self-cognition by @Jintao-Huang in #261
- Fix/1229 by @tastelikefeet in #260
- fix trainer init by @tastelikefeet in #262
- fix bugs by @tastelikefeet in #263
- fix import by @tastelikefeet in #265
- Fix import by @tastelikefeet in #266
- update perf by @Jintao-Huang in #264
- fix bug by @tastelikefeet in #267
- Support win32 by @tastelikefeet in #268
Full Changelog: v1.4.0...v1.5.0
v1.4.0
English Version
New features:
- Support for self-cognition fine-tuning.
- Support for fine-tuning and inference of the AnimateDiff model for AIGC.
- Support for flash attention in more models: qwen series, qwen-vl series, llama series, openbuddy series, mistral series, yi series, ziya series. Enable it with the use_flash_attn parameter.
- Support for applying multiple LoRAs simultaneously.
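As an illustration of what multiple LoRAs taking effect simultaneously means mathematically, here is a minimal sketch that adds several low-rank updates to one base weight matrix. It shows the general LoRA idea only and is not SWIFT's code: the helper names matmul and apply_loras, the (B, A, scale) tuple layout, and the toy matrices are all assumptions.

```python
def matmul(a, b):
    # Plain-Python matrix product of a (m x k) and b (k x n).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def apply_loras(weight, adapters):
    # adapters: list of (B, A, scale), with B of shape out x r and A of shape r x in.
    # Each adapter contributes scale * (B @ A) on top of the base weight.
    merged = [row[:] for row in weight]  # copy so the base weight is untouched
    for b, a, scale in adapters:
        delta = matmul(b, a)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged


# Two hypothetical rank-1 adapters applied to a 2 x 2 identity weight.
base = [[1.0, 0.0], [0.0, 1.0]]
adapter_one = ([[1.0], [0.0]], [[0.0, 1.0]], 0.5)
adapter_two = ([[0.0], [1.0]], [[1.0, 0.0]], 2.0)
print(apply_loras(base, [adapter_one, adapter_two]))
```

Because the updates are summed, the order of the adapters does not matter in this sketch; each adapter's own scale controls how strongly it contributes.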
New tuners:
- NEFTune
- ROME supports more models: chatglm
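For context on NEFTune, the sketch below shows the core idea as described in the NEFTune paper: during training, uniform noise scaled by alpha / sqrt(seq_len * dim) is added to the token embeddings. This is an illustrative sketch under stated assumptions, not SWIFT's tuner: the function name neftune_noise, the list-of-lists embedding layout, and the default alpha are hypothetical.

```python
import math
import random


def neftune_noise(embeddings, alpha=5.0, rng=None):
    # embeddings: seq_len rows, each a list of dim floats (one sequence).
    if rng is None:
        rng = random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    # Noise magnitude shrinks as the sequence or the embedding gets larger.
    scale = alpha / math.sqrt(seq_len * dim)
    # Add independent Uniform(-1, 1) noise, scaled, to every embedding entry.
    return [[value + rng.uniform(-1.0, 1.0) * scale for value in row]
            for row in embeddings]


# Hypothetical 4-token sequence with 4-dimensional zero embeddings.
clean = [[0.0] * 4 for _ in range(4)]
noisy = neftune_noise(clean, alpha=5.0, rng=random.Random(0))
print(noisy[0])
```

The noise is applied only at training time; at inference the embeddings are used unchanged.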
New models:
- AnimateDiff
- zephyr-7b-beta-chat, openbuddy-zephyr-7b-chat
- qwen-1_8b, qwen-1_8b-chat, qwen-1_8b-chat-int4, qwen-1_8b-chat-int8
- qwen-72b, qwen-72b-chat, qwen-72b-chat-int4, qwen-72b-chat-int8
- qwen-audio, qwen-audio-chat
- yi-34b-chat, codefuse-codellama-34b-chat
- tongyi-finance-14b, tongyi-finance-14b-chat, tongyi-finance-14b-chat-int4
- bluelm-7b, bluelm-7b-chat, bluelm-7b-32k, bluelm-7b-chat-32k
New datasets:
- hc3-zh, hc3-en
- codefuse-python-en, codefuse-eval-instruction-zh
- aishell1-zh, aishell1-mini-zh
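Chinese Version
New features:
- Support self-cognition fine-tuning.
- Support fine-tuning and inference of the AnimateDiff model for AIGC.
- Support flash attention in more models: qwen series, qwen-vl series, llama series, openbuddy series, mistral series, yi series, ziya series. Enable it with the use_flash_attn parameter.
- Support multiple LoRAs taking effect simultaneously.
New tuners:
- NEFTune
- ROME supports more models: chatglm
New models: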
- AnimateDiff
- zephyr-7b-beta-chat, openbuddy-zephyr-7b-chat
- qwen-1_8b, qwen-1_8b-chat, qwen-1_8b-chat-int4, qwen-1_8b-chat-int8
- qwen-72b, qwen-72b-chat, qwen-72b-chat-int4, qwen-72b-chat-int8
- qwen-audio, qwen-audio-chat
- yi-34b-chat, codefuse-codellama-34b-chat
- tongyi-finance-14b, tongyi-finance-14b-chat, tongyi-finance-14b-chat-int4
- bluelm-7b, bluelm-7b-chat, bluelm-7b-32k, bluelm-7b-chat-32k
New datasets:
- hc3-zh, hc3-en
- codefuse-python-en, codefuse-eval-instruction-zh
- aishell1-zh, aishell1-mini-zh
What's Changed
- Support Yi-6b sft by @tastelikefeet in #134
- fix CLI by @tastelikefeet in #135
- update readme by @tastelikefeet in #137
- Support xverse 65b sft by @tastelikefeet in #138
- Support bluelm by @Jintao-Huang in #140
- fix doc by @tastelikefeet in #143
- Add neftune by @tastelikefeet in #145
- Update sh by @Jintao-Huang in #144
- Add compatibility test and fix some problems with peft>=0.6.0 by @tastelikefeet in #146
- fix compatible with transformers>=4.35 by @Jintao-Huang in #148
- Update sh 1115 by @Jintao-Huang in #150
- Update doc by @tastelikefeet in #151
- support flash_attn by @Jintao-Huang in #152
- Fix bug: not work on peft<=0.5.0 by @tastelikefeet in #155
- fix register model bug by @Jintao-Huang in #154
- Support tongyi finance 14b by @Jintao-Huang in #157
- add check_model args and fix check_dataset by @Jintao-Huang in #159
- fix load_from_ckpt_dir bug by @Jintao-Huang in #161
- Update arguments by @Jintao-Huang in #162
- new feature: save_infer_result_to_jsonl by @Jintao-Huang in #163
- Feat 1121 by @Jintao-Huang in #165
- update readme and fix bug by @Jintao-Huang in #167
- Add cli merge lora by @Jintao-Huang in #168
- update code by @Jintao-Huang in #169
- support yi-34b-chat by @Jintao-Huang in #164
- Add animate diff by @tastelikefeet in #174
- update readme by @Jintao-Huang in #175
- Refine LoRA to peft by @tastelikefeet in #176
- support qwen-72b qwen-1_8b qwen-audio by @Jintao-Huang in #180
- Update wechat by @Jintao-Huang in #186
- Fix the slow inference speed bug in qwen AutoGPTQ by @Jintao-Huang in #187
- Support self cognition by @Jintao-Huang in #188
- update dataset model by @Jintao-Huang in #190
Full Changelog: v1.3.0...v1.4.0
v1.3.0 Release
English Version
New Features:
- Serving supported: models trained with LoRA or full-parameter training can be deployed with vllm/chatglm.cpp/xinference. For details, build the documentation with make docs or see the docs/source/GetStarted/Deployment.md file.
- Support training and inference with CLI and inference with Web-UI.
New Adapters:
- QALoRA
- Long-LoRA
- ROME
New Models:
- xverse-65b
- yi-6b
- ziya2-13b
- ziya2-13b-chat
- mistral-7b
- openbuddy-mistral-7b-chat
- mistral-7b-chat
- chatglm3-6b-base
- chatglm3-6b
- chatglm3-6b-32k
New Quantized Models:
- qwen-7b-chat-int4
- qwen-14b-chat-int4
- qwen-vl-chat-int4
- baichuan2-7b-chat-int4
- baichuan2-13b-chat-int4
- qwen-7b-chat-int8
- qwen-14b-chat-int8
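Chinese Version
New features:
- Deployment supported: models from full-parameter training and LoRA training can be deployed with vllm/chatglm.cpp/xinference. Generate the official documentation with make docs or see the docs/source/GetStarted/Deployment.md file.
- Support training and inference via CLI, and inference via Web-UI.
New Adapters: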
- QALoRA
- Long-LoRA
- ROME
New models supported for training and inference:
- xverse-65b
- yi-6b
- ziya2-13b
- ziya2-13b-chat
- mistral-7b
- openbuddy-mistral-7b-chat
- mistral-7b-chat
- chatglm3-6b-base
- chatglm3-6b
- chatglm3-6b-32k
New quantized models supported for training and inference:
- qwen-7b-chat-int4
- qwen-14b-chat-int4
- qwen-vl-chat-int4
- baichuan2-7b-chat-int4
- baichuan2-13b-chat-int4
- qwen-7b-chat-int8
- qwen-14b-chat-int8
Feature Commits
- add lint script by @tastelikefeet in #94
- add document by @Jintao-Huang in #103
- Update framework.txt by @zzclynn in #105
- Feat/deepspeed by @Jintao-Huang in #39
- update sh by @Jintao-Huang in #106
- Add baichuan2 13b sh by @Jintao-Huang in #108
- Support mistral 7b by @Jintao-Huang in #112
- Support int4 by @Jintao-Huang in #116
- Support qwen int8 by @Jintao-Huang in #117
- support qa_lora by @tastelikefeet in #104
- Add longlora for llama by @tastelikefeet in #115
- Support ROME by @tastelikefeet in #121
- Feat 1018 by @Jintao-Huang in #119
- Add script for ROME by @tastelikefeet in #123
- Update doc by @tastelikefeet in #125
- Feat 1028 by @Jintao-Huang in #122
- update web ui and swift cli by @Jintao-Huang in #126
- ResTuning readme by @jiangzeyinzi in #129
- Update doc by @tastelikefeet in #131
Bug Fix:
- Fix qwen bug by @Jintao-Huang in #98
- fix ci by @tastelikefeet in #109
- fix resume from checkpointing bug by @Jintao-Huang in #110
- fix openbuddy template by @Jintao-Huang in #113
- update ziya2 by @Jintao-Huang in #114
- Fix bug in ROME by @tastelikefeet in #128
- fix rome by @tastelikefeet in #133
New Contributors
- @zzclynn made their first contribution in #105
Full Changelog: v1.2.0...v1.3.0
v1.1.1 release
Features:
- Add make docs command to build docs
- Add notebook examples for stable diffusion model
- Fix some bugs
v1.1.0
What's Changed
Features
- Support qwen sft by @tastelikefeet
- Add quantization by @Jintao-Huang
- Add dataset and README_CN by @Jintao-Huang
- Add polylm-13b, openbuddy-llama2-13b by @Jintao-Huang
- Add models and update sh, readme by @Jintao-Huang
- llm add chat by @Jintao-Huang
- llm support multimodal by @Jintao-Huang
- Feat/multi round chat by @Jintao-Huang
- Add prompt feature by @jiangzeyinzi
- Add openbuddy llama2 by @Jintao-Huang
- Add feat: only save model by @Jintao-Huang
- Add baichuan2 by @Jintao-Huang
- Add bloom by @Jintao-Huang
- Add internlm by @Jintao-Huang
- Support restuning/side/lora-embedding/qlora, etc by @tastelikefeet
Improvements
- Update README.md by @yingdachen
- Fix sample by @tastelikefeet
- Add comment by @Jintao-Huang
- Update sh2 by @Jintao-Huang
Fixes
- Fix not saving the last ckpt bug by @Jintao-Huang
- Fix ddp rank0 memory bug by @Jintao-Huang
- Fix template bug by @Jintao-Huang
- Fix template bug2 by @Jintao-Huang
- Fix ddp bug by @Jintao-Huang
- Fix ddp+gradient_checkpointing bug by @Jintao-Huang
- Fix httperror by @weedwardzhao1
New Contributors
- @yingdachen made their first contribution in PR #2
- @tastelikefeet made their first contribution in PR #3
- @Jintao-Huang made their first contribution in PR #7
- @jiangzeyinzi made their first contribution in PR #38
- @weedwardzhao1 made their first contribution in PR #71
Full Changelog: https://github.com/modelscope/swift/commits/v1.1.0