2024, Vol. 35, No. 02, pp. 185-196
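Research on Hardware Acceleration Methods for Elliptic Curve Operations in Zero-Knowledge Proofs
Foundation: Open Project of the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications (No.SKLNST-2023-1-13)
Email: lzq722@jiangnan.edu.cn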
DOI:
Abstract:

Aims: Pure-software deployments of zero-knowledge proof protocols struggle to meet low-latency, low-power service requirements, while dedicated hardware acceleration chips suffer from poor protocol adaptability and long development cycles. To address these problems, a streaming computing architecture for elliptic curve point addition in zero-knowledge proofs is proposed. Methods: Point addition was implemented in hardware. A low-latency, scalable hardware computing unit was designed for large-bit-width modular arithmetic, and the data flow between the computing stages of point addition was planned to form a pipeline. Using OpenCL and HLS on an FPGA-based heterogeneous computing platform, point multiplication and multi-scalar multiplication tasks of different scales were accelerated through software-hardware co-design. Results: On the AMD Xilinx Alveo U50 data center accelerator card, multi-scalar multiplication achieved speedups of 41.5 times and 3 times over single-core and 12-core execution, respectively, on an AMD Ryzen 9 5900X (3.7 GHz) CPU. The hardware acceleration module achieved up to a 12.42-fold improvement in energy efficiency over the pure-software implementation. Conclusions: The architecture effectively improves hardware resource utilization and reduces the latency and power overhead of elliptic curve operations.
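For readers unfamiliar with the operations named in the abstract, the following is a minimal software sketch of elliptic curve point addition, point multiplication, and multi-scalar multiplication. It is not the hardware design described in the paper. The abstract does not state the curve, the coordinate system, or the multi-scalar multiplication algorithm, so everything concrete here is an assumption chosen for illustration: a toy textbook curve y^2 = x^3 + 2x + 2 over F_17 with base point (5, 1), affine coordinates, double-and-add, and a naive sum of scalar multiples. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Toy parameters (assumed for illustration, far too small for real use):
# curve y^2 = x^3 + A*x + B over the prime field F_P, base point G.
P = 17
A = 2
B = 2
G = (5, 1)

# A point is an (x, y) tuple; None stands for the point at infinity.
Point = Optional[Tuple[int, int]]


def point_add(p1: Point, p2: Point) -> Point:
    """Add two points in affine coordinates (handles doubling and infinity)."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    # P + (-P) is the point at infinity (also covers doubling a point with y = 0).
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        # Tangent slope for doubling; pow(v, -1, P) is the modular inverse (Python 3.8+).
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        # Chord slope for adding two distinct points.
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mul(k: int, p: Point) -> Point:
    """Point multiplication k*p by double-and-add, scanning k from the low bit."""
    result: Point = None
    addend = p
    while k > 0:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def msm(scalars: List[int], points: List[Point]) -> Point:
    """Naive multi-scalar multiplication: sum of k_i * P_i."""
    acc: Point = None
    for k, p in zip(scalars, points):
        acc = point_add(acc, scalar_mul(k, p))
    return acc
```

Every step bottoms out in `point_add`, which in turn is a short sequence of modular multiplications, subtractions, and one inversion. This is why point addition and the modular arithmetic underneath it are the natural units to pipeline when such a computation is moved to hardware.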

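Basic information:

CLC number: TP309.7

Citation:

[1] DING Dong (丁冬), LI Zhengquan (李正权). Research on hardware acceleration methods for elliptic curve operations in zero-knowledge proofs[J]. Journal of China University of Metrology, 2024, 35(02): 185-196.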
