Software Optimization

About

As deep learning inference applications become more common on embedded devices, these devices increasingly include neural processing units (NPUs) in addition to a multi-core CPU and a GPU. To support fast and efficient development of deep learning applications, we provide our software optimization techniques as an SDK for high-performance inference that delivers low latency and high throughput. We propose several optimization parameters for accelerating a deep learning application on heterogeneous processors: multi-threading, pipelining, buffer assignment, and network duplication. Because the design space of allocating layers to diverse processing elements and tuning the other parameters is huge, we devise a parameter optimization methodology that consists of a heuristic for balancing pipeline stages among heterogeneous processors, followed by a fine-tuning process that optimizes the remaining parameters.
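
To make the idea of balancing pipeline stages concrete, the following is a minimal sketch, not the methodology from the publications below. It assumes that layers are split into contiguous stages, that each stage runs on one processor in a fixed order, and that throughput is limited by the slowest stage. The function name `balance_pipeline` and the per-layer latencies are hypothetical, and an exhaustive search over cut points stands in for the heuristic, which is only practical for small networks.

```python
from itertools import combinations


def balance_pipeline(layer_times, processors):
    """Split layers into one contiguous stage per processor.

    layer_times[p][i] is the (hypothetical) latency in ms of layer i on
    processor p. Returns the stage boundaries and the bottleneck stage time.
    """
    n_layers = len(layer_times[processors[0]])
    n_stages = len(processors)
    best_bounds, best_bottleneck = None, float("inf")

    # Try every way of placing n_stages - 1 cut points between layers.
    for cuts in combinations(range(1, n_layers), n_stages - 1):
        bounds = (0, *cuts, n_layers)
        stage_times = [
            sum(layer_times[p][bounds[s]:bounds[s + 1]])
            for s, p in enumerate(processors)
        ]
        # The slowest stage determines the pipeline throughput.
        bottleneck = max(stage_times)
        if bottleneck < best_bottleneck:
            best_bounds, best_bottleneck = bounds, bottleneck

    return best_bounds, best_bottleneck


# Made-up latencies for a six-layer network on three processors.
layer_times = {
    "NPU": [2, 2, 3, 4, 6, 8],
    "GPU": [3, 3, 4, 4, 5, 6],
    "CPU": [9, 9, 10, 10, 4, 3],
}
processors = ["NPU", "GPU", "CPU"]

bounds, bottleneck = balance_pipeline(layer_times, processors)
print(bounds, bottleneck)
```

With these made-up numbers, the search assigns layers 0 to 2 to the NPU, layer 3 to the GPU, and layers 4 to 5 to the CPU. A real flow would then tune the other parameters (multi-threading, buffer assignment, network duplication) around such a starting allocation.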

Publications

허정원, 김영진, 이지섭, 하순회, "3D Object Detection Using the N-Dolphin Embedded NPU", Korea Computer Congress 2024 (KCC2024), Jun, 2024.

Choonghoon Park, Soonhoi Ha, "A Novel Throughput Enhancement Method for Deep Learning Applications on Mobile Devices With Heterogeneous Processors", IEEE Access, Vol. 12, pp. 38773-38785, Mar, 2024.

박충훈, 김장률, 하순회, "Improving Deep Learning Application Throughput through Pipelining on Mobile Devices with Heterogeneous Processors", KIISE Transactions on Computing Practices, Vol. 29, No. 7, pp. 350-355, Jul, 2023.

Jangryul Kim, Jaewoo Son, Soonhoi Ha, "A Novel Technique to Support Deep Learning Applications in a Model-Based Embedded Software Design Methodology", IEEE Access, Vol. 11, pp. 54869-54880, Jun, 2023.

Jangryul Kim, Soonhoi Ha, "Energy-Aware Scenario-based Mapping of Deep Learning Applications onto Heterogeneous Processors under Real-time Constraints", IEEE Transactions on Computers, Vol. 72, Issue 6, Nov, 2022.

EunJin Jeong, Jangryul Kim, Soonhoi Ha, "TensorRT-based Framework and Optimization Methodology for Deep Learning Inference on Jetson Boards", ACM Transactions on Embedded Computing Systems, Vol. 21, Issue 5, Article No. 51, pp. 1-26, Sep, 2022.

박충훈, 김장률, 하순회, "Improving Deep Learning Application Throughput through Pipelining on Mobile Devices", Korea Computer Congress 2022 (KCC2022), Jun, 2022.

EunJin Jeong, Jangryul Kim, Samnieng Tan, Jaeseong Lee, Soonhoi Ha, "Deep Learning Inference Parallelization on Heterogeneous Processors with TensorRT", IEEE Embedded Systems Letters, Vol. 14, Issue 1, pp. 15-18, Mar, 2022.

Samnieng Tan, Eunjin Jeong, Jangryul Kim, Jaeseong Lee, Soonhoi Ha, "Accelerating a Deep Learning Application by Parallelization and Pipelining on Heterogeneous Processors", KIISE Transactions on Computing Practices, Vol. 27, No. 10, pp. 497-502, Oct, 2021.

강두석, "Hardware-Aware Software Optimization Techniques for Convolutional Neural Networks on Embedded Systems", Seoul National University, Feb, 2021.

Samnieng Tan, Eunjin Jeong, Jangryul Kim, Jaeseong Lee, Soonhoi Ha, "Acceleration of Deep Learning Applications by Pipelining on NVIDIA Jetson AGX Xavier", Korea Software Congress 2020 (KSC2020), Dec, 2020.

오진우, 하순회, "A Scheduling Technique for Multiple Deep Learning Applications in a Heterogeneous Processor Environment", KIISE Transactions on Computing Practices, Vol. 26, No. 7, pp. 303-311, Jul, 2020.

Duseok Kang, Jinwoo Oh, Jongwoo Choi, Youngmin Yi, Soonhoi Ha, "Scheduling of Deep Learning Applications onto Heterogeneous Processors in an Embedded Device", IEEE Access, Vol. 8, Mar, 2020.

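한동식, 정승재, 홍혜선, 하순회, 장병탁, "Real-Time Scene Description Using Heterogeneous Deep Neural Networks and a Task Graph Interface", 2018 Korea Institute of Military Science and Technology Fall Conference, Nov, 2018.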

Duseok Kang, Euiseok Kim, Inpyo Bae, Bernhard Egger, Soonhoi Ha, "C-GOOD: C-code Generation Framework for Optimized On-device Deep Learning", International Conference on Computer-Aided Design, Nov, 2018.

Duseok Kang, Jintaek Kang, Donghyun Kang, Sungjoo Yoo, Soonhoi Ha, "Joint Optimization of Speed, Accuracy, and Energy for Embedded Image Recognition Systems", Design Automation and Test in Europe, Mar, 2018.