Support for peft 0.10.x version, with the default value of the tuner_backend parameter changed to peft. The interface of peft has been dynamically patched to support parameters like lora_dtype.
Support for vllm+lora inference.
Refactored and updated the README file.
Added English versions of the documentation. Currently, all documents have both English and Chinese versions.
Support for training 70B models using FSDP+QLoRA on dual 24GB GPUs. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/sft.sh
Support for training agents and using the ModelScopeAgent framework. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source/LLM/Agent%E5%BE%AE%E8%B0%83%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.md
Support for model evaluation and benchmark. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E8%AF%84%E6%B5%8B%E6%96%87%E6%A1%A3.md
Support for multi-task experiment management. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E5%AE%9E%E9%AA%8C%E6%96%87%E6%A1%A3.md
Support for GaLore training.
Support for training and inference of AQLM and AWQ quantized models.

New Models

MAMBA series models. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/mamba-1.4b/lora/sft.sh
DeepSeek VL series models. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source_en/Multi-Modal/deepseek-vl-best-practice.md
LLAVA series models. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source/Multi-Modal/llava%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.md
TeleChat models. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/telechat_12b/lora/sft.sh
Grok-1 models. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source_en/LLM/Grok-1-best-practice.md
Qwen 1.5 MoE series models for training and inference.
dbrx models for training and inference. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/dbrx-instruct/lora_mp/sft.sh
Mengzi3 models for training and inference. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/mengzi3_13b_base/lora_ddp_ds/sft.sh
Xverse MoE models for training and inference. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/xverse_moe_a4_2b/lora/sft.sh
c4ai-command-r series models for training and inference.
MiniCPM series models for training and inference. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/minicpm_moe_8x2b/lora_ddp/sft.sh
Mixtral-8x22B-v0.1 models for training and inference. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/mixtral_moe_8x22b_v1/lora_ddp_ds/sft.sh

New Datasets

Support for the Ruozhiba dataset: https://github.com/modelscope/swift/blob/main/docs/source_en/LLM/Supported-models-datasets.md

What's Changed

Fix RsLoRA by @tastelikefeet in #567
Fix yi-vl merge lora by @Jintao-Huang in #568
Add doc for tuner module by @tastelikefeet in #571
update agent documentation by @tastelikefeet in #572
Update agent doc to fix some conflicts by @tastelikefeet in #573
support vllm lora by @Jintao-Huang in #565
Support llava by @Jintao-Huang in #577
fix app-ui max_length is None by @Jintao-Huang in #580
support train_dataset_mix_ds using custom_local_path by @Jintao-Huang in #582
Fix LRScheduler by @tastelikefeet in #586
compat with transformers==4.39 by @Jintao-Huang in #584
Fix weight saving by @tastelikefeet in #589
fix mix_dataset_sample float by @Jintao-Huang in #594
Refactor all docs by @tastelikefeet in #599
fix tiny bugs in docs by @tastelikefeet in #600
fix issue template and add a pr one by @tastelikefeet in #601
Fix/security template by @tastelikefeet in #603
update docs by @Jintao-Huang in #604
support Mistral-7b-v0.2 by @hjh0119 in #605
fix deploy safe_response by @Jintao-Huang in #614
Fix Adalora with devicemap by @tastelikefeet in #619
update ui by @tastelikefeet in #621
support TeleChat-12b by @hjh0119 in #607
fix save dir (additional_files) by @Jintao-Huang in #622
fix Telechat model by @hjh0119 in #623
Add Grok model by @tastelikefeet in #629
add missing files by @tastelikefeet in #631
support qwen1.5-moe model by @hjh0119 in #627
support Telechat-7b model by @hjh0119 in #630
support model Dbrx by @hjh0119 in #643
fix ui by @tastelikefeet in #648
fix typing hint by @Jintao-Huang in #649
support Mengzi-13b-base model by @hjh0119 in #646
support Qwen1.5-32b models by @hjh0119 in #655
fix plot error by @tastelikefeet in #651
Support FSDP + QLoRA by @tastelikefeet in #659
move fsdp config path by @tastelikefeet in #662
change the default value of ddp_backend by @tastelikefeet in #667
fix ui log by @tastelikefeet in #669
support Xverse-MoE model by @hjh0119 in #668
Support longlora for transformers 4.38 by @tastelikefeet in #456
add ruozhiba datasets by @tastelikefeet in #670
compatible with old versions of modelscope by @tastelikefeet in #671
Fix data_collator by @tastelikefeet in #674
[TorchAcc][Experimental] Integrate TorchAcc. by @baoleai in #647
update Agent best practice with Modelscope-Agent by @hjh0119 in #676
support c4ai-command-r model by @hjh0119 in #684
Support Eval by @tastelikefeet in #494
fix anchor by @tastelikefeet in #687
Fix/0412 by @tastelikefeet in #690
support minicpm and mixtral-moe model by @hjh0119 in #692
fix device_map 4 (qwen-vl) by @Jintao-Huang in #695
fix multimodal model image_mode = 'CMYK' (fix issue#677) by @Jintao-Huang in #697
feat(model): support minicpm-v-2(#699 ) by @YuzaChongyi in #699

New Contributors

@hjh0119 made their first contribution in #6...

Contributors

YuzaChongyi, Jintao-Huang, and 3 other contributors

Assets 2

09 Mar 07:54

Jintao-Huang

v1.7.0

0855df7

v1.7.0

New Features:

Added support for swift export, enabling awq-int4 quantization and gpt-int2,3,4,8 quantization. Models can be pushed to the Modelscope Hub. You can view the documentation here.
Enabled fine-tuning of awq quantized models.
Enabled fine-tuning of aqlm quantized models.
Added support for deploying LLM with infer_backend='pt'.
Added web-ui with task management and visualization of training loss, eval loss, etc. Inference is accelerated using VLLM.

New Tuners:

Lora+.
LlamaPro.

New Models:

qwen1.5 awq series.
gemma series.
yi-9b.
deepseek-math series.
internlm2-1_8b series.
openbuddy-mixtral-moe-7b-chat.
llama2 aqlm series.

New Datasets:

ms-bench-mini.
hh-rlhf-cn series.
disc-law-sft-zh, disc-med-sft-zh.
pileval.

What's Changed

Fix vllm==0.3 deploy bug by @Jintao-Huang in #412
Support deepseek math by @Jintao-Huang in #413
update support_vllm by @Jintao-Huang in #415
fix zero3 & swift lora by @Jintao-Huang in #416
Support peft0.8.0 by @tastelikefeet in #423
update readme by @Jintao-Huang in #426
fix pai open with 'a' by @Jintao-Huang in #430
default load_best_model_at_end=False by @Jintao-Huang in #432
support openbuddy mixtral by @Jintao-Huang in #437
support gemma by @Jintao-Huang in #441
Support ms bench mini by @Jintao-Huang in #442
Add roadmap and contributing doc by @tastelikefeet in #431
support peft format by @tastelikefeet in #438
update contributing.md by @Jintao-Huang in #446
fix link by @tastelikefeet in #447
Fix rlhf dataset by @tastelikefeet in #451
Add task management for webui by @tastelikefeet in #457
Support swift export by @Jintao-Huang in #455
Fix llm quantization docs by @Jintao-Huang in #458
fix get_vllm_engine bug by @Jintao-Huang in #463
use cpu export by @Jintao-Huang in #462
Fix llama2 generation config by @Jintao-Huang in #468
Support editing model_id_or_path by @tastelikefeet in #469
Support pt deploy by @Jintao-Huang in #467
Fix swift deploy bug by @Jintao-Huang in #470
fix deploy dep by @Jintao-Huang in #471
Support LLaMAPRO and LoRA+ by @tastelikefeet in #472
Support internlm2 1.8b by @Jintao-Huang in #473
fix deepseek moe device_map by @Jintao-Huang in #476
fix peft compatible bug by @tastelikefeet in #482
Fix deepspeed init bug by @Jintao-Huang in #481
fix scripts docs by @Jintao-Huang in #483
Update swift export and update docs by @Jintao-Huang in #484
support gptq export by @Jintao-Huang in #485
fix docs & readme by @Jintao-Huang in #486
fix app-ui bug by @Jintao-Huang in #488
Support peft0.9 by @tastelikefeet in #490
support torchrun_args for dpo cli and support web_ui model deployment by @slin000111 in #496
Support transformers 4.33.0 by @tastelikefeet in #498
Update deepspeed config by @Jintao-Huang in #500
move docs to classroom by @tastelikefeet in #503
Support yi 9b by @Jintao-Huang in #504
Update yi sh by @Jintao-Huang in #506

Full Changelog: v1.6.0...v1.7.0

Contributors

Jintao-Huang, tastelikefeet, and slin000111

Assets 2

21 Feb 08:09

Jintao-Huang

v1.6.1

dcf0a13

v1.6.1

New Models:

deepseek-math series

New Datasets:

sharegpt-gpt4-mini
disc-law-sft-zh
disc-med-sft-zh

Bug Fix

Fix vllm==0.3 & swift deploy bug.
Fix zero3 & swift lora bug.

Full Changelog: v1.6.0...v1.6.1

Assets 2

07 Feb 10:16

Jintao-Huang

v1.6.0

8e33c58

v1.6.0

New Features:

Agent Training
AIGC support: controlnet, controlnet_sdxl, dreambooth, text_to_image, text_to_image_sdxl
Compatibility with vllm==0.3.*

New Models:

qwen1.5 series
openbmb series

What's Changed

update openbmb sh by @Jintao-Huang in #361
Fix openbmb model name by @tastelikefeet in #362
support dpo cli and add examples controlnet and dreambooth by @slin000111 in #344
support openbmb minicpm by @Jintao-Huang in #364
Support agent training, etc. by @tastelikefeet in #352
fix tuner by @tastelikefeet in #365
Fix agent doc by @tastelikefeet in #366
Fix data format in readme by @tastelikefeet in #367
fix lazy_tokenize bug by @Jintao-Huang in #369
Fix length penalty by @Jintao-Huang in #371
fix loss by @tastelikefeet in #372
update compute loss by @Jintao-Huang in #375
fix system='' bug by @Jintao-Huang in #374
fix system='' bug by @Jintao-Huang in #378
Support PAI compat by @Jintao-Huang in #373
fix doc by @tastelikefeet in #376
Fix the conflict between agent and CT by @tastelikefeet in #379
fix cogagent_18b_chat sh typo error by @Jintao-Huang in #381
Fix loss scale by @tastelikefeet in #383
Feat/qwen1.5 by @tastelikefeet in #385
fix template name by @tastelikefeet in #389
update readme by @Jintao-Huang in #386
update readme by @Jintao-Huang in #390
Support max model len by @Jintao-Huang in #392
Support vllm max model len by @Jintao-Huang in #394
fix arguments bug by @Jintao-Huang in #395
support vllm 0.3 by @Jintao-Huang in #396
fix deepspeed_config_path bug by @Jintao-Huang in #398
fix file name by @slin000111 in #397
Add qwen1.5 scripts by @tastelikefeet in #393
fix many bugs by @Jintao-Huang in #399
Fix baichuan2 int4 bug by @Jintao-Huang in #400
Fix qwen1half deploy bug by @Jintao-Huang in #402
fix readme and test_llm by @tastelikefeet in #404
update readme by @Jintao-Huang in #405

Full Changelog: v1.5.4...v1.6.0

Contributors

Jintao-Huang, tastelikefeet, and slin000111

Assets 2

31 Jan 17:20

Jintao-Huang

v1.5.4

5eb8694

v1.5.4

New Features:

Default zero3.json file
Enhanced support for multi-modal models

New Models:

Orion series
Codefuse series
Internlm2-math series
Internlm2 series
Qwen2 series
Yi-vl series
Internlm-xcomposer2

What's Changed

Update orion 14b by @Jintao-Huang in #341
update codefuse series by @Jintao-Huang in #343
Support yi vl by @Jintao-Huang in #345
fix yi-vl finetune error by @Yimi81 in #347
Fix template encode bug by @Jintao-Huang in #348
Update internlm2 math by @Jintao-Huang in #349
Removing eos_token when doing inference. by @Jintao-Huang in #351
Support zero3 by @Jintao-Huang in #353
Support internlm xcomposer2 by @Jintao-Huang in #354
update qwen2 by @Jintao-Huang in #355
fix baichuan2 bug by @Jintao-Huang in #357
fix template_encode bug by @Jintao-Huang in #358
Fix issue 342 by @Jintao-Huang in #359
fix test_run.py by @Jintao-Huang in #360

New Contributors

@Yimi81 made their first contribution in #347

Full Changelog: v1.5.3...v1.5.4

Contributors

Jintao-Huang and Yimi81

Assets 2

09 Jan 13:42

tastelikefeet

v1.5.2

e1a6187

v1.5.2

English Version

Support show log in text box of web-ui
Support share=True in web-ui, only need to set WEBUI_SHARE=1 in environment variable
Support deactivate all adapters
Support more SFT arguments
Add longlora/qalora script
Support custom models in web-ui
ModelScope SWIFT studio released: https://www.modelscope.cn/studios/damo/Scalable-lightWeight-Infrastructure-for-Fine-Tuning/summary
Fix some bugs

中文版本

支持在web-ui中直接显示日志
支持share=True 仅需要在环境变量中设置WEBUI_SHARE=1
支持失活所有adapters
添加了更多SFT参数
添加了longlora/qalora的训练脚本
web-ui支持了自己注册的自定义模型
SWFT魔搭创空间上线了: https://www.modelscope.cn/studios/damo/Scalable-lightWeight-Infrastructure-for-Fine-Tuning/summary
修复了一些bug

What's Changed

fix chatglm3 template bug by @Jintao-Huang in #298
Support studio by @tastelikefeet in #300
fix text label by @tastelikefeet in #301
fix a bug may cause module on gpu throws error by @tastelikefeet in #302
fix_ziya_template_bug by @Jintao-Huang in #303

Full Changelog: v1.5.1...v1.5.2

Contributors

Jintao-Huang and tastelikefeet

Assets 2

07 Jan 12:36

tastelikefeet

v1.5.1

d9756cd

v1.5.1

English version

New Features

Support dtype settings in LoRA
Support deactivated tuners offloading
Support deployment with OpenAI format restful API
Make LongLoRA supports the latest llama2 code

新feature

支持LoRA设置dtype类型
支持将不使用的tuners offloading到cpu和meta设备上
支持OpenAI restful API方式的部署
LongLoRA支持最新的llama2代码

What's Changed

update docs by @tastelikefeet in #269
Update benchmark 0101 by @Jintao-Huang in #271
update benchmark by @Jintao-Huang in #276
Support dtype in lora by @tastelikefeet in #278
Support deploy by @Jintao-Huang in #275
update deploy client by @Jintao-Huang in #279
Support offload by @tastelikefeet in #281
fix tuner bug by @Jintao-Huang in #285
Support lora_modules_to_save by @Jintao-Huang in #284
update template by @Jintao-Huang in #286
support ModuleToSave original module offloading by @tastelikefeet in #282
Fix offload by @tastelikefeet in #288
fix scedit bug by @tastelikefeet in #290
fix bnb qwen bug by @Jintao-Huang in #289
update readme by @Jintao-Huang in #291
update readme by @Jintao-Huang in #292
Fix/longlora by @tastelikefeet in #294
support additional_trainable_parameters by @Jintao-Huang in #295
fix webui by @tastelikefeet in #296
fix trainer push to hub by @Jintao-Huang in #297

Full Changelog: v1.5.0...v1.5.1

Contributors

Jintao-Huang and tastelikefeet

Assets 2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New Features

New Models

New Datasets

What's Changed

New Contributors

Contributors

New Features:

New Tuners:

New Models:

New Datasets:

What's Changed

Contributors

New Models:

New Datasets:

Bug Fix

New Features:

New Models:

What's Changed

Contributors

New Features:

New Models:

What's Changed

New Contributors

Contributors

What's Changed

Contributors

English version

New Features

新feature

What's Changed

Contributors

Releases: modelscope/ms-swift

v2.0.5

v2.0.4

v2.0.3

v2.0.0

New Features

New Models

New Datasets

What's Changed

New Contributors

Contributors

v1.7.0

New Features:

New Tuners:

New Models:

New Datasets:

What's Changed

Contributors

v1.6.1

New Models:

New Datasets:

Bug Fix

v1.6.0

New Features:

New Models:

What's Changed

Contributors

v1.5.4

New Features:

New Models:

What's Changed

New Contributors

Contributors

v1.5.2

What's Changed

Contributors

v1.5.1

English version

New Features

新feature

What's Changed

Contributors