purple mountain lab
GCG adversarial attack on large language models. Supports attacking vicuna-7b-v1.5, vicuna-13b-v1.5, Llama-2-7b-chat-hf, and Qwen-7B-Chat, and includes a guide for adapting additional models.
Archive of our own offline experiment code for various AI algorithms, modified from open-source implementations.
Backdoor attack and defense methods.
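Source code for LiBRe, an adversarial example detection method. LiBRe is a lightweight Bayesian adversarial detection method: it converts the last few layers of a pretrained DNN into Bayesian submodules (FADE variational), initializes them from the pretrained parameters, and fine-tunes them so that a wide range of adversarial attacks can be detected. Combined with an uncertainty correction strategy, it needs no training on adversarial examples and detects them efficiently while preserving the model's performance.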
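The idea behind uncertainty-based detection can be illustrated with a toy sketch. This is not the repository's code: the helper names (`make_candidates`, `feature_uncertainty`, `flag_input`), the Gaussian perturbation standing in for fine-tuning, and the plain linear layer are all assumptions made for illustration. It builds a small set of last-layer weight candidates from "pretrained" weights, measures how much their output features disagree on an input, and flags the input when that disagreement exceeds a threshold.

```python
import random


def make_candidates(pretrained, num_candidates=4, noise=0.05, seed=0):
    """Build several last-layer weight candidates around the pretrained weights.

    Hypothetical stand-in for a Bayesian last layer: a real method would
    fine-tune the candidates, here they are just randomly perturbed copies.
    """
    rng = random.Random(seed)
    return [
        [[w + rng.gauss(0.0, noise) for w in row] for row in pretrained]
        for _ in range(num_candidates)
    ]


def forward(weights, x):
    """Apply one linear layer (a list of weight rows) to the input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def feature_uncertainty(candidates, x):
    """Mean squared deviation of the candidates' features from their average."""
    feats = [forward(c, x) for c in candidates]
    dim = len(feats[0])
    mean = [sum(f[j] for f in feats) / len(feats) for j in range(dim)]
    total = sum((f[j] - mean[j]) ** 2 for f in feats for j in range(dim))
    return total / len(feats)


def flag_input(candidates, x, threshold):
    """Flag x as suspicious when the feature disagreement exceeds the threshold."""
    return feature_uncertainty(candidates, x) > threshold


if __name__ == "__main__":
    pretrained = [[1.0, 2.0], [0.5, -1.0]]  # made-up weights for the demo
    candidates = make_candidates(pretrained)
    for x in ([0.1, 0.1], [5.0, -5.0]):
        print(x, feature_uncertainty(candidates, x))
```

In this sketch the threshold is left to the caller; one plausible choice, again an assumption, is to set it from the uncertainty values observed on clean validation inputs.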