Category Ranking

98%

Total Visits

921

Avg Visit Duration

2 minutes

Citations

20

Article Abstract

Upon completing the design and training phases, deploying a deep learning model to specific hardware becomes necessary prior to its implementation in practical applications. To enhance the performance of the model, the developers must optimize it to decrease inference latency. Auto-scheduling, an automated approach that generates optimization schemes, offers a feasible option for large-scale auto-deployment. Nevertheless, the low-level code generated by auto-scheduling closely resembles hardware coding and may present challenges for human comprehension, thereby hindering future manual optimization efforts. In this study, we introduce ASight, a visual analytics system to assist engineers in identifying performance bottlenecks, comprehending the auto-generated low-level code, and obtaining insights from auto-scheduling optimizations. We develop a subgraph matching algorithm capable of identifying graph isomorphism among Intermediate Representations to track performance bottlenecks from low-level metrics to high-level computational graphs. To address the substantial profiling metrics involved in auto-scheduling and derive optimization design principles by summarizing commonalities among auto-scheduling optimizations, we propose an enhanced visualization for the large search space of auto-scheduling. We validate the effectiveness of ASight through two case studies, one focused on a local machine and the other on a data center, along with a quantitative experiment exploring optimization design principles.

Download full-text PDF

Source
http://dx.doi.org/10.1109/TVCG.2025.3574194DOI Listing

Publication Analysis

Top Keywords

auto-scheduling optimizations
12
visual analytics
8
low-level code
8
performance bottlenecks
8
optimization design
8
design principles
8
auto-scheduling
7
asight fine-tuning
4
fine-tuning auto-scheduling
4
optimizations model
4

Similar Publications

Upon completing the design and training phases, deploying a deep learning model to specific hardware becomes necessary prior to its implementation in practical applications. To enhance the performance of the model, the developers must optimize it to decrease inference latency. Auto-scheduling, an automated approach that generates optimization schemes, offers a feasible option for large-scale auto-deployment.

View Article and Find Full Text PDF