Looking into Black Box Code Language Models

arXiv:2407.04868v1 Announce Type: new
Abstract: Language Models (LMs) have proven useful for code-related tasks, and several code LMs have been proposed recently. Most studies in this direction focus only on improving LM performance on different benchmarks and treat the LMs as black boxes. A handful of works attempt to understand the role of attention layers in code LMs. Nonetheless, feed-forward layers, which make up two-thirds of a typical transformer model’s parameters, remain under-explored.
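The two-thirds figure can be checked with simple arithmetic. The sketch below assumes a standard transformer block in which the feed-forward hidden size is four times the model width, and it ignores biases, layer norms, and embeddings; the function name `block_param_counts` and the `ffn_mult` parameter are hypothetical names chosen for illustration, not something the abstract specifies.

```python
def block_param_counts(d_model: int, ffn_mult: int = 4) -> dict:
    """Count weights in one transformer block (assumed layout, no biases or norms)."""
    # Attention: query, key, value, and output projections, each d_model x d_model.
    attention = 4 * d_model * d_model
    # Feed-forward: an up-projection and a down-projection of size d_model x (ffn_mult * d_model).
    feed_forward = 2 * d_model * (ffn_mult * d_model)
    total = attention + feed_forward
    return {
        "attention": attention,
        "feed_forward": feed_forward,
        "ffn_share": feed_forward / total,
    }


counts = block_param_counts(d_model=1024)
print(counts["ffn_share"])  # 8 d^2 / 12 d^2, i.e. two-thirds under these assumptions
```

Under these assumptions the share does not depend on `d_model`, because both terms scale with the square of the model width.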
In this work, we attempt to gain insights into the inner workings of code language models by examining the feed-forward layers. To conduct our investigations, we use two state-of-the-art code LMs, Codegen-Mono and Polycoder, and three widely used programming languages: Java, Go, and Python. We examine how stored concepts are organized, whether these concepts can be edited, and how different layers and input context sizes contribute to output generation. Our empirical findings demonstrate that lower layers capture syntactic patterns, while higher layers encode abstract concepts and semantics. We show that concepts of interest can be edited within feed-forward layers without compromising code LM performance. Additionally, we observe that initial layers serve as “thinking” layers, while later layers are crucial for predicting subsequent code tokens. Furthermore, we find that earlier layers alone can accurately predict the next tokens for smaller contexts, whereas larger contexts depend critically on the contributions of later layers. We anticipate these findings will facilitate better understanding, debugging, and testing of code LMs.
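To make the idea of editing a concept inside a feed-forward layer concrete, here is a minimal toy sketch. It assumes the key-value memory view of feed-forward layers, where each hidden unit pairs a "key" vector (which detects an input pattern) with a "value" vector (which is written to the output). The abstract does not state the editing procedure, so the suppression step, the function `ffn`, and the tiny hand-picked weights are all illustrative assumptions rather than the authors' method.

```python
def relu(vector):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in vector]


def ffn(x, keys, values, suppressed=()):
    """Toy feed-forward layer: sum_i relu(x . k_i) * v_i.

    `suppressed` lists key indices whose activation is forced to zero,
    a hypothetical stand-in for editing out one stored concept.
    """
    coeffs = relu([sum(a * b for a, b in zip(x, key)) for key in keys])
    for index in suppressed:
        coeffs[index] = 0.0
    output = [0.0] * len(values[0])
    for coeff, value in zip(coeffs, values):
        for j, component in enumerate(value):
            output[j] += coeff * component
    return output


# Hand-picked toy weights (assumptions for illustration only).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
x = [1.0, 2.0]

original = ffn(x, keys, values)
edited = ffn(x, keys, values, suppressed=[1])
print(original, edited)
```

In this toy setting, suppressing key 1 removes only the contribution of its value vector, and the contributions of the other keys are untouched. That is the intuition behind editing one concept while leaving the rest of the layer's behavior intact.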
