diff --git a/README.md b/README.md index cb2f04a3bbb0892aec22d2c4a65827f43a2c18a9..bd380e7d99ecf6d42439041d8ab081915f3c7338 100644 --- a/README.md +++ b/README.md @@ -1,11 +1,20 @@ +[](README_en.md) [](README.md) + # DeepSpark开源社区 +
+ Homepage + + +
+
+ 在万物皆算的时代,各领域应用层出不穷,算力必须支撑实际应用,通用性和未来可扩展性是评估算力的重要指标。天数智芯作为国内头部通用GPU高端芯片及超级算力系统提供商,截止2024年12月,已成功支持 400+ AI算法模型,覆盖训练和推理,与 400+ 家客户和生态伙伴建立合作,共同促进国内通用算力的发展,产品服务于智慧城市、数字个人、医疗、教育、通信、能源等多个领域。 天数智芯本着“平台共建、生态共享、产业共赢”的原则,致力于和行业伙伴一起打造[DeepSpark开源社区](https://www.deepspark.org.cn/),以来自开源回馈开源的方式,汇聚社区力量,助力客户加速应用落地和收获算力赋能,促进产业生态的完善和发展。 -DeepSpark开源社区目前主要致力于[百大应用开放平台](#百大应用开放平台)的打造和推广。除此之外DeepSpark社区于2023年3月开源上线了适用于国产通用GPU[天垓100](https://www.iluvatar.com/productDetails?fullCode=cpjs-yj-xlxl-tg100)的CUDA应用程序调试工 -具[ixGDB](https://gitee.com/deep-spark/ixgdb)。将来会有更多相关的项目和成果通过DeepSpark社区开源。 +DeepSpark开源社区目前主要致力于[百大应用开放平台](#百大应用开放平台)的打造和推广。除此之外DeepSpark社区于2023年3月开源上线了适用于国产通用GPU[天垓100](https://www.iluvatar.com/productDetails?fullCode=cpjs-yj-xlxl-tg100)的CUDA应用程序调试工具[ixGDB](https://gitee.com/deep-spark/ixgdb)。将来会有更多相关的项目和成果通过DeepSpark社区开源。 2023年8月,DeepSpark开源社区与[上海白玉兰开源开放研究院](http://baiyulan.org.cn/)签署了战略合作协议,旨在进一步促进人工智能开源事业共建共享,推动产业生态的完善和发展。2023年11月,DeepSpark社区与[启智社区](https://openi.pcl.ac.cn/)开展合作,社区用户可通过启智云脑提供的[天垓100算力](https://openi.pcl.ac.cn/iluvatar/TianGai100)训练来自DeepSparkHub的模型。 @@ -23,15 +32,15 @@ DeepSpark开源社区目前主要致力于[百大应用开放平台](#百大应 [DeepSparkInference](https://gitee.com/deep-spark/deepsparkinference)精选基于国产推理引擎IGIE和IxRT的推理模型示例和指导文档,部分模型提供了基于国产通用GPU[智铠100](https://www.iluvatar.com/productDetails?fullCode=cpjs-yj-tlxltt-zk100)的评测结果。 -### 天数智算软件栈 IXUCA +### 天数智芯智算平台 IXUCA -天数智算软件栈兼容主流GPU通用计算模型,提供支持主流GPU通用计算模型的等效组件、特性、API和算法,可助力用户便捷地实现系统或应用的无痛迁移。天数智算软件栈包括人工智能深度学习应用、主流框架、函数库、编译器及工具、运行时库及驱动。 +IXUCA兼容主流GPU通用计算模型,提供支持主流GPU通用计算模型的等效组件、特性、API和算法,可助力用户便捷地实现系统或应用的无痛迁移。天数智算软件栈包括人工智能深度学习应用、主流框架、函数库、编译器及工具、运行时库及驱动。 -- 天数智算软件栈集成了TensorFlow、PyTorch、百度飞桨PaddlePaddle等国内外主流的深度学习框架,提供与官方开源框架一致的算子,并针对天数智芯加速卡持续优化性能。 +- IXUCA集成了TensorFlow、PyTorch、百度飞桨PaddlePaddle等国内外主流的深度学习框架,提供与官方开源框架一致的算子,并针对天数智芯加速卡持续优化性能。 -- 天数智算软件栈提供IGIE推理框架和IxRT推理引擎,支持在天数智芯加速卡上实现最优推理性能。 +- IXUCA提供IGIE推理框架和IxRT推理引擎,支持在天数智芯加速卡上实现最优推理性能。 -- 天数智算软件栈的函数库不仅支持通用计算还提供了深度学习应用开发所需的基础算子,开发者可以便捷地调用这些算子灵活地构造各类深度神经网络模型以及其他机器学习领域的算法。 +- 
IXUCA的函数库不仅支持通用计算还提供了深度学习应用开发所需的基础算子,开发者可以便捷地调用这些算子灵活地构造各类深度神经网络模型以及其他机器学习领域的算法。 您可前往天数智芯官方网站的[资源中心](https://support.iluvatar.com/#/ProductLine?id=2)获取天数智算软件栈。 @@ -79,7 +88,7 @@ DeepSpark开源社区目前主要致力于[百大应用开放平台](#百大应 | 显存占用📊 | 模型稳定训练时实际消耗的GPU平均显存占用量 | GPU实时状态检测工具 | 取多次的显存占用量的平均值 | | 稳定度🔧 | 多次完整训练(均达到收敛值)的收敛值的稳定程度 | DeepSpark模型训练脚本输出 | 采用5次达到标准收敛值的完整训练,取收敛值的中值做为基准值,其它值对比基准值的差值百分比有1次不在(-0.01,+0.01)范围内,稳定度则递减20% | - 参考信息:[硬件评测结果](#硬件评测方法和结果) +参考信息:[硬件评测结果](#硬件评测方法和结果) - 1️⃣键式部署:全自动✅ 、数据可复现🔁、场景可寻源🔎 @@ -119,8 +128,6 @@ DeepSpark开源社区目前主要致力于[百大应用开放平台](#百大应 | 语音语义 | Tacotron2 | score(MOS):4.460 | sdk2.2,bs:128,8x,amp | 77 | 4.46 | 128\*8 | 0.96 | 18.4\*8 | 1 | | 新兴模型 | Wave-MLP | 80.1 | sdk2.2,bs:256,8x,fp32 | 1026 | 83.1 | 198\*8 | 0.98 | 29.4\*8 | 1 | -各维度说明,请见[评测体系](#评测体系)。 - -------- ## 社区 diff --git a/README_en.md b/README_en.md new file mode 100644 index 0000000000000000000000000000000000000000..6e1f743206eeb3c54746e33b6a58c4a06f096566 --- /dev/null +++ b/README_en.md @@ -0,0 +1,148 @@ +[](README_en.md) [](README.md) + +# DeepSpark Open Source Community + +
+ Homepage + + +
+
+ +In an era when everything is computable, applications are emerging rapidly across every field. Computing power must support practical applications, and versatility and future scalability are key metrics for evaluating it. As a leading domestic provider of high-end GPGPU chips and supercomputing systems, Iluvatar CoreX had supported 400+ AI algorithm models, covering both training and inference, as of December 2024. We have established collaborations with 400+ customers and ecosystem partners to jointly promote the development of domestic general-purpose computing power. Our products serve multiple fields including smart cities, digital individuals, healthcare, education, telecommunications, and energy. + +Guided by the principles of "co-building platforms, sharing ecosystems, and winning together in the industry," Iluvatar CoreX is committed to working with industry partners to build the [DeepSpark Open Source Community](https://www.deepspark.org.cn/). By drawing on open source and giving back to it, we aim to pool the community's strengths, help customers deploy applications faster and benefit from this computing power, and promote the improvement and development of the industry ecosystem. + +Currently, the DeepSpark Open Source Community is primarily focused on building and promoting the [Hundreds of Applications Open Platform](#hundreds-of-applications-open-platform). In addition, in March 2023 the DeepSpark community open-sourced [ixGDB](https://gitee.com/deep-spark/ixgdb), a CUDA application debugging tool for the domestic GPGPU [TianGai 100](https://www.iluvatar.com/productDetails?fullCode=cpjs-yj-xlxl-tg100). More related projects and results will be open-sourced through the DeepSpark community in the future. 
+ +In August 2023, the DeepSpark Open Source Community signed a strategic cooperation agreement with the [Shanghai Baiyulan Open Source Research Institute](http://baiyulan.org.cn/) to further promote the joint development and sharing of open-source AI and to drive the improvement and development of the industry ecosystem. In November 2023, the DeepSpark community began collaborating with the [OpenI Community](https://openi.pcl.ac.cn/), enabling community users to train models from DeepSparkHub using the [TianGai 100 computing power](https://openi.pcl.ac.cn/iluvatar/TianGai100) provided by OpenI's cloud brain. + +We welcome industry partners, community users, and developers to contribute to the DeepSpark Open Source Community in any form, and we look forward to your participation. + +-------- + +## Hundreds of Applications Open Platform + +As a leading domestic platform for developing and evaluating AI and general-purpose computing applications, the Hundreds of Applications Open Platform carefully selects hundreds of open-source algorithms and models that are deeply integrated with industry applications. It supports mainstream application frameworks and builds a multi-dimensional evaluation system tailored to industry needs, covering a wide range of deployment scenarios. + +### Application Algorithms and Models + +[DeepSparkHub](https://gitee.com/deep-spark/deepsparkhub) offers a curated selection of hundreds of open-source application algorithms and models, covering various fields of AI and general-purpose computing. It supports mainstream intelligent computing scenarios in the market, including smart cities, digital individuals, healthcare, education, telecommunications, and energy. + +[DeepSparkInference](https://gitee.com/deep-spark/deepsparkinference) offers a curated selection of inference model examples and guides based on the domestic inference engines IGIE and IxRT. 
Some models provide evaluation results based on the self-developed GPGPU [ZhiKai 100](https://www.iluvatar.com/productDetails?fullCode=cpjs-yj-tlxltt-zk100). + +### IXUCA (Iluvatar CoreX Unified Compute Architecture) + +IXUCA is compatible with mainstream GPGPU computing models, providing equivalent components, features, APIs, and algorithms that support mainstream GPU computing. It enables seamless migration of systems or applications with minimal effort. The IXUCA stack includes AI deep learning applications, mainstream frameworks, libraries, compilers and tools, as well as runtime libraries and drivers. + +- IXUCA integrates mainstream deep learning frameworks such as TensorFlow, PyTorch, and PaddlePaddle, delivering operators consistent with official open-source frameworks while continuously optimizing performance for Iluvatar CoreX acceleration cards. + +- IXUCA provides the IGIE inference framework and IxRT inference engine, enabling optimal inference performance on Iluvatar CoreX acceleration cards. + +- The libraries in IXUCA not only support general-purpose computing but also provide fundamental operators required for deep learning application development. Developers can conveniently utilize these operators to flexibly construct various deep neural network models and other machine learning algorithms. + +You can visit the [Resource Center](https://support.iluvatar.com/#/ProductLine?id=2) on Iluvatar CoreX's official website to obtain the IXUCA software stack. + +![IXUCA](resources/Iluvatar_stack.png) + +### Application Frameworks + +The Hundreds of Applications Open Platform supports mainstream application frameworks and toolkits both domestically and internationally. + + + + + + + + + + + + + + + + + + + + + + +
+ +### Multi-Dimensional Benchmark Standards + +The benchmark standards are widely applicable to hardware platforms, featuring a comprehensive system and simple deployment. + +- Supports 6️⃣ dimensions + +| Dimension | Description | Data Source | Calculation Method | +|-------------|-----------------------------------------------------------------------|----------------------------------|--------------------------------------------------------------------------------------| +| Speed🚀 | Computing power per second for stable model training samples | DeepSpark training script output | Remove highest/lowest of 5 iterations, take mean of middle 3 values | +| Accuracy🎯 | Model convergence accuracy value | DeepSpark training script output | Record accuracy value at convergence | +| Linearity📈 | Linear scaling performance for cluster training (card/node linearity) | DeepSpark training script output | Multi-card/node speed divided by card/node count, compared to single-card/node speed | +| Power🔌 | Average GPU power consumption during stable training | GPU monitoring tool | Average of multiple power measurements | +| Memory📊 | Average GPU memory usage during stable training | GPU monitoring tool | Average of multiple memory measurements | +| Stability🔧 | Convergence value stability across multiple full training runs | DeepSpark training script output | 5 full training runs that reach the standard convergence value; the median is the baseline, and stability drops by 20% for each run whose deviation from the baseline falls outside (-0.01, +0.01) | + +Reference: [Hardware Benchmark Results](#hardware-evaluation-methods-and-results) + +- 1️⃣-click deployment: Fully automated ✅, reproducible data 🔁, traceable scenarios 🔎 + +- 0️⃣ platform dependencies: No framework restrictions, no source language restrictions, no hardware restrictions + +### Multi-Dimensional Benchmark System + +The [Multi-Dimensional Benchmark System](https://mdb.deepspark.org.cn:8086) is an online evaluation tool built on the [Multi-Dimensional Benchmark Standards](#multi-dimensional-benchmark-standards). 
It evaluates model training on BI-V100 and NV-V100 accelerator cards under equal conditions across six dimensions (speed, accuracy, linearity, power, memory, and stability), collects the metrics, and displays them in six-dimensional radar charts so that users can compare the overall capabilities of GPU accelerators. The currently supported models are listed below: + +![training model list](evaluation/Iluvatar/assets/mdb_model_list_1.png) + +For usage details, please refer to the [Multi-Dimensional Benchmark System User Guide](evaluation/Iluvatar/Mdims-benchmark.md). + +-------- + +### Hardware Evaluation Methods and Results + +#### TianGai 100 GPGPU + +For evaluation methods, please refer to the [TianGai 100 Six-Dimension Benchmark Guide](evaluation/Iluvatar/six_dimension_howto.md). + +The results are as follows: + +| Task | Model | Convergence Metric | Configuration(x->gpus) | Speed | Accuracy | Power(W) | Linearity | Memory Usage(GB) | Stability | +|-----------------------|------------|--------------------|------------------------|--------|----------|----------|-----------|------------------|-----------| +| NLP | BERT-large | 0.72 | sdk2.2,bs:32,8x,amp | 214 | 0.72 | 152*8 | 0.96 | 20.3*8 | 1 | +| Recommendation System | DLRM | AUC:0.75 | sdk2.2,bs:2048,8x,amp | 793486 | 0.75 | 60*8 | 0.97 | 3.7*8 | 1 | +| Image Classification | ResNet50 | top1 75.9% | sdk2.2,bs:512,8x,amp | 5221 | 76.43% | 128*8 | 0.97 | 29.1*8 | 1 | +| Image Segmentation | 3D U-Net | 0.908 | sdk2.2,bs:4,8x,fp32 | 12 | 0.908 | 152*8 | 0.85 | 19.6*8 | 1 | +| Object Detection | YOLOv5 | mAP:0.5 | sdk2.2,bs:128,8x,amp | 1228 | 0.56 | 140*8 | 0.92 | 27.3*8 | 1 | +| Text Detection | SATRN | 0.841 | sdk2.2,bs:128,8x,fp32 | 630 | 88.4 | 166*8 | 0.98 | 28.5*8 | 1 | +| Speech Recognition | Conformer | 3.72 | sdk2.2,bs:32,8x,fp32 | 380 | 4.79 | 113*8 | 0.82 | 21.5*8 | 1 | +| 3D Reconstruction | ngp-nerf | 0.0046 | sdk2.2,bs:1,8x,amp | 10 | 19.6 | 
82*8 | 0.90 | 28.1*8 | 1 | +| Object Tracking | FairMOT | MOTA:69.8 | sdk2.2,bs:64,8x,fp32 | 52 | 69.8 | 132*8 | 0.97 | 19.1*8 | 1 | +| Large Model | CPM | 0.91 | sdk2.2,bs:128,8x,amp | 357 | 0.91 | 156*8 | 0.93 | 20.6*8 | 1 | +| Speech Synthesis | Tacotron2 | score(MOS):4.460 | sdk2.2,bs:128,8x,amp | 77 | 4.46 | 128*8 | 0.96 | 18.4*8 | 1 | +| New Model | Wave-MLP | 80.1 | sdk2.2,bs:256,8x,fp32 | 1026 | 83.1 | 198*8 | 0.98 | 29.4*8 | 1 | + +-------- + +## Community + +### Code of Conduct + +See [Code of Conduct](CODE_OF_CONDUCT.md). + +### Contact + +Contact . + +### Contribution + +Refer to each project's Contributing Guidelines. + +### License + +[Apache License 2.0](LICENSE). diff --git a/evaluation/Iluvatar/assets/mdb_model_list_1.png b/evaluation/Iluvatar/assets/mdb_model_list_1.png index 81e9c523487652498679d49c6f3abc98254b21ce..703340b1663ad36e22bd4985bc549e603e440090 100644 Binary files a/evaluation/Iluvatar/assets/mdb_model_list_1.png and b/evaluation/Iluvatar/assets/mdb_model_list_1.png differ diff --git a/resources/Iluvatar_stack.png b/resources/Iluvatar_stack.png index 0a4e443f1ee7240d1c6d3d299ef14c255877c37a..5995ce5089eb7d8b6e66094574297093f42646b0 100644 Binary files a/resources/Iluvatar_stack.png and b/resources/Iluvatar_stack.png differ