Dissecting RISC-V Performance: Practical PMU Profiling and Hardware-Agnostic Roofline Analysis on Emerging Platforms

Abstract

As RISC-V architectures proliferate across embedded and high-performance domains, developers face persistent challenges in performance optimization due to fragmented tooling, immature hardware features, and platform-specific defects. This paper delivers a pragmatic methodology for extracting actionable performance insights on RISC-V systems, even under constrained or unreliable hardware conditions. We present a workaround to circumvent hardware bugs in one of the popular RISC-V implementations, enabling robust event sampling. For memory-compute bottleneck analysis, we introduce compiler-driven Roofline tooling that operates without hardware PMU dependencies, leveraging LLVM-based instrumentation to derive operational intensity and throughput metrics directly from application IR. Our open source toolchain automates these workarounds, unifying PMU data correction and compiler-guided Roofline construction into a single workflow.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…