Transformers in Uniform TC$^0$

David Chiang

Transformers in Uniform TC0

Abstract

Previous work has shown that the languages recognized by average-hard attention transformers (AHATs) and softmax-attention transformers (SMATs) are within the circuit complexity class TC0. However, these results assume limited-precision arithmetic: using floating-point numbers with O(log n) bits (where n is the length of the input string), Strobl showed that AHATs can be approximated in L-uniform TC0, and Merrill and Sabharwal showed that SMATs can be approximated in DLOGTIME-uniform TC0. Here, we improve these results, showing that AHATs with no approximation, SMATs with O(poly(n)) bits of floating-point precision, and SMATs with at most 2-O(poly(n)) absolute error are all in DLOGTIME-uniform TC0.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Or compile a full topic from this idea

Discussion (0)

Sign in to join the discussion.

Loading comments…