# gpt2-ml
**Repository Path**: gapyanpeng/gpt2-ml
## Basic Information
- **Project Name**: gpt2-ml
- **Description**: GPT2 for Multiple Languages
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 2
- **Created**: 2020-02-29
- **Last Updated**: 2021-11-11
## Categories & Tags
**Categories**: machine-learning
**Tags**: None
## README
# **GPT2** for Multiple Languages
[**Colab Demo**](https://colab.research.google.com/github/imcaspar/gpt2-ml/blob/master/pretrained_model_demo.ipynb) | [**GitHub Repo**](https://github.com/imcaspar/gpt2-ml) | [**Releases**](https://github.com/imcaspar/gpt2-ml/releases) | [**Issues**](https://github.com/imcaspar/gpt2-ml/issues)
[**中文说明**](./README_CN.md) | [**English**](./README.md)
- [x] Simplified and cleaned-up GPT-2 training code (based on Grover, supports TPUs)
- [x] Ported the BERT tokenizer with multilingual support
- [x] 1.5B-parameter GPT-2 Chinese pretrained model (15 GB corpus, 100k training steps)
- [x] Out-of-the-box text generation demo [#](https://github.com/imcaspar/gpt2-ml#google-colab)
- [ ] 1.5B-parameter GPT-2 Chinese pretrained model (50 GB corpus, 1M training steps, **expected early December**)
## Pretrained Model
1.5B-parameter Chinese pretrained model [**[Google Drive download]**](https://drive.google.com/file/d/1IzWpQ6I2IgfV7CldZvFJnZ9byNDZdO4n)
SHA256: 4a6e5124df8db7ac2bdd902e6191b807a6983a7f5d09fb10ce011f9a073b183e
The training corpus comes from [THUCNews](http://thuctc.thunlp.org/#%E4%B8%AD%E6%96%87%E6%96%87%E6%9C%AC%E5%88%86%E7%B1%BB%E6%95%B0%E6%8D%AE%E9%9B%86THUCNews) and [nlp_chinese_corpus](https://github.com/brightmart/nlp_chinese_corpus), about 15 GB of text after cleaning.
Trained for 100k steps on a [Cloud TPU Pod v3-256](https://cloud.google.com/tpu/docs/types-zones#types).
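After downloading, it is worth verifying the checkpoint against the SHA256 above before unpacking. A minimal sketch in Python (the local filename `gpt2_cn_1.5b.tar.gz` is a hypothetical placeholder; use whatever name you saved the download as):

```python
import hashlib

EXPECTED_SHA256 = "4a6e5124df8db7ac2bdd902e6191b807a6983a7f5d09fb10ce011f9a073b183e"

def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in 1 MiB chunks so a multi-GB checkpoint never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    # Hypothetical filename for the Google Drive download.
    if sha256_of("gpt2_cn_1.5b.tar.gz") != EXPECTED_SHA256:
        raise SystemExit("Checksum mismatch: the download may be corrupt or incomplete.")
    print("Checksum OK")
```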

## Google Colab
Try text generation with the 1.5B-parameter Chinese pretrained model in just two mouse clicks (not counting the Colab authorization flow):
[**[Colab Notebook]**](https://colab.research.google.com/github/imcaspar/gpt2-ml/blob/master/pretrained_model_demo.ipynb)
## Training
## Disclaimer
The contents of this project are provided for technical research reference only and should not be taken as the basis for any conclusive claims.
## Citation
```
@misc{GPT2-ML,
author = {Zhibo Zhang},
title = {GPT2-ML: GPT-2 for Multiple Languages},
year = {2019},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/imcaspar/gpt2-ml}},
}
```
## Reference
- https://github.com/google-research/bert
- https://github.com/rowanz/grover

Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC)