Looking into Black Box Code Language Models

arXiv:2407.04868v1 Announce Type: new
Abstract: Language Models (LMs) have proven useful for code-related tasks, and several code LMs have been proposed recently. Most studies in this direction focus only on improving LM performance on different benchmarks and treat the LMs as black boxes. A handful of works attempt to understand the role of attention layers in code LMs. Nonetheless, feed-forward layers remain under-explored, even though they account for two-thirds of a typical transformer model’s parameters.
In this work, we attempt to gain insights into the inner workings of code language models by examining the feed-forward layers. To conduct our investigations, we use two state-of-the-art code LMs, CodeGen-Mono and PolyCoder, and three widely used programming languages: Java, Go, and Python. We examine how stored concepts are organized, whether these concepts can be edited, and what roles different layers and input context sizes play in output generation. Our empirical findings demonstrate that lower layers capture syntactic patterns while higher layers encode abstract concepts and semantics. We show that concepts of interest can be edited within feed-forward layers without compromising code LM performance. Additionally, we observe that initial layers serve as “thinking” layers, while later layers are crucial for predicting subsequent code tokens. Furthermore, we find that earlier layers can accurately predict the next token for smaller contexts, whereas larger contexts require critical contributions from later layers. We anticipate these findings will facilitate better understanding, debugging, and testing of code LMs.
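
The two-thirds figure for feed-forward parameters can be checked with a back-of-the-envelope count. The sketch below is illustrative only and is not taken from the paper: it assumes a standard transformer block whose feed-forward hidden width is four times the model width, and it ignores biases, layer norms, and embeddings. The function name `block_param_counts` and the width 1024 are hypothetical choices made for this example.

```python
def block_param_counts(d_model: int, ffn_multiplier: int = 4) -> dict:
    """Approximate weight counts for one transformer block.

    Assumptions (not from the paper): d_ff = ffn_multiplier * d_model,
    and biases, layer norms, and embeddings are ignored.
    """
    d_ff = ffn_multiplier * d_model
    # Attention: query, key, value, and output projections, each d_model x d_model.
    attention = 4 * d_model * d_model
    # Feed-forward: one up-projection (d_model x d_ff) and one down-projection (d_ff x d_model).
    feed_forward = 2 * d_model * d_ff
    total = attention + feed_forward
    return {
        "attention": attention,
        "feed_forward": feed_forward,
        "ffn_share": feed_forward / total,
    }


if __name__ == "__main__":
    counts = block_param_counts(d_model=1024)
    print(f"feed-forward share of block weights: {counts['ffn_share']:.3f}")
```

Under these assumptions the share is 8d² / (4d² + 8d²) = 2/3 regardless of the model width; models with a different feed-forward multiplier or gated feed-forward layers will give a different ratio.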
