A breakdown and guide to the Commodore .PRG file format, and creating and running prg files on the VIC-20.
Commodore's .prg filetype is an executable file format used in the PET,
VIC-20, C64, and C128. It is a simple (BASIC?-Ha!) format, but the details of how a
.prg file is implemented on a particular Commodore system as well as
what the file format can and cannot do require some elaboration.
HEADER + PROGRAM
The gross anatomy of a Commodore .prg file is a two-byte header followed by
the sequence of bytes that comprise the body of the program.
HEADER: The two-byte header designates the memory location into which the program is to be loaded. This is usually the start of the user's BASIC program RAM space. On the unexpanded VIC-20 that address is 0x1001, and on the C64 it is 0x0801. (The system's user RAM starts at 0x1000 or 0x0800, but the first byte of that area must remain zero, so the program begins one byte later.) When loading the program into main memory, the system skips the header and loads only the program byte sequence, starting at the designated memory location.
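As a minimal sketch of the header handling described above, the following Python splits a .prg image into its load address and body. The function name split_prg and the sample bytes are illustrative assumptions, not part of any Commodore tooling.

```python
import struct

def split_prg(data: bytes):
    """Split raw .prg bytes into (load_address, body)."""
    if len(data) < 2:
        raise ValueError("a .prg file needs at least the two-byte header")
    # "<H" = one unsigned 16-bit value, low byte first (little-endian).
    (load_address,) = struct.unpack_from("<H", data, 0)
    return load_address, data[2:]

# Hypothetical sample: VIC-20 header (01 10) followed by an empty BASIC program (00 00).
load_address, body = split_prg(bytes.fromhex("01100000"))
print(hex(load_address), len(body))  # 0x1001 2
```

The header bytes are dropped from the returned body, mirroring the fact that they are never stored in the machine's memory.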
PROGRAM: By default, the program itself in the .prg is assumed to be a tokenized BASIC
program, but by using the SYS command the program can contain and run 6502 machine code. Thus a
.prg executable can be either a BASIC program, a 6502 assembly program[1], or a mix[2].
Once loaded, the .prg program can be run by typing RUN or
SYS [ADDR] into the system BASIC interpreter.
[1] Either: With a one-line BASIC Launch program, using RUN to launch;
or without, using SYS [START ADDR] instead of RUN to launch.
[2] Mixing assembly and BASIC requires a more nuanced understanding of the BASIC interpreter
and VIC-20 KERNAL than is necessary to get into here.
Here is a breakdown of the hello-world.prg program produced by the simple Hello World program on this site.
NOTE: All two-byte values (e.g., addresses) are stored in little-endian format, low byte first.
HEX DUMP
Hex dump of our simple hello-world.prg
00000000: 0110 0d10 0a00 9e28 3431 3131 2900 0000 .......(4111)...
00000010: 4c20 1048 454c 4c4f 2057 4f52 4c44 110d L .HELLO WORLD..
00000020: 00a2 008a 48bd 1210 f009 20d2 ff68 aae8 ....H..... ..h..
00000030: 4c22 10ea L"..
Header: The first two bytes, 01 10, are the header, designating the memory address into which
the program will be loaded. Read in little-endian order, they give 0x1001.
BASIC Launch Program: The next 14 bytes comprise a short BASIC program (tokenized) that launches
the machine language portion of the program. Adding these bytes to the beginning of a 6502 assembly
source code file effectively inserts this program at the start of your .prg file.
This 1-line BASIC program breaks down as follows:
BASIC: 10 SYS 4111
BYTES: 0d10 0a00 9e28 3431 3131 2900 0000

Bytes          | Meaning
0x0d10         | Pointer to addr of next BASIC line (little-endian, so 0x100D)
0x0a00         | Line number of current BASIC line (little-endian, so 10)
0x9e           | SYS command token
0x283431313129 | PETSCII characters for BASIC command argument, here: (4111)
0x00           | Zero byte indicates end of current BASIC line
0x0000         | Two consecutive zero bytes indicate end of BASIC program
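To make the field breakdown above concrete, here is a rough Python sketch that walks the tokenized BASIC lines in a program body by following each next-line pointer. The function name basic_lines is hypothetical, and the sketch trusts the pointers in the file without validating them.

```python
import struct

SYS_TOKEN = 0x9E  # tokenized SYS keyword, as in the breakdown above

def basic_lines(body: bytes, load_address: int):
    """Yield (line_number, content) for each tokenized BASIC line in body."""
    offset = 0
    while True:
        (next_addr,) = struct.unpack_from("<H", body, offset)
        if next_addr == 0:                 # two zero bytes: end of program
            return
        (line_number,) = struct.unpack_from("<H", body, offset + 2)
        end = body.index(0, offset + 4)    # zero byte: end of this line
        yield line_number, body[offset + 4:end]
        offset = next_addr - load_address  # follow the link to the next line

# The 14 bytes of the BASIC launch program from the hex dump (header removed).
stub = bytes.fromhex("0d100a009e283431313129000000")
lines = list(basic_lines(stub, 0x1001))
number, content = lines[0]
print(number, hex(content[0]), content[1:].decode("ascii"))  # 10 0x9e (4111)
```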
10 SYS 4111 tells the VIC-20 to begin executing
6502 machine code starting at the address 4111, which is 0x100F. Consider that this BASIC
program begins at the address 0x1001 and comprises 14 bytes, ending at address 0x100E. This
means that the 6502 machine code begins at the next byte, 0x100F, which corresponds exactly
to our SYS 4111 call. In this way, we have successfully switched from executing a
BASIC program to a 6502 machine language program.
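The address arithmetic above can be checked with a short Python sketch that builds the launch program for a given load address and works out the SYS target as the first byte after it. The helper name sys_stub and its defaults are assumptions made for illustration. It writes the argument in parentheses, matching the bytes in the hex dump.

```python
import struct

SYS_TOKEN = 0x9E

def sys_stub(load_address: int, line_number: int = 10):
    """Return (stub_bytes, target) for a one-line SYS(target) launcher,
    where target is the address of the first byte after the stub."""
    for digits in range(1, 6):  # a 16-bit address has at most 5 decimal digits
        # link (2) + line number (2) + token (1) + "(digits)" + line end (1) + program end (2)
        length = 2 + 2 + 1 + (digits + 2) + 1 + 2
        target = load_address + length
        if len(str(target)) == digits:
            break
    else:
        raise ValueError("no consistent stub length found")
    next_line = load_address + length - 2  # address of the final two zero bytes
    stub = (struct.pack("<HH", next_line, line_number)
            + bytes([SYS_TOKEN])
            + f"({target})".encode("ascii")
            + b"\x00" + b"\x00\x00")
    return stub, target

stub, target = sys_stub(0x1001)
print(stub.hex(), target, hex(target))  # 0d100a009e283431313129000000 4111 0x100f
```

The loop is needed because the stub's length depends on how many decimal digits the target address has, and the target address depends on the stub's length.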
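Machine Language Program: The machine language program is broken into two parts: data and instructions. In the hex dump, the three bytes at 0x100F (4c 20 10) are a JMP $1020, which skips over the data and lands on MAIN.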
;== DATA =======================================================================
; Hello World + cursor down + carr rtrn + NUL term
DATA    .BYTE $48,$45,$4C,$4C,$4F,$20,$57,$4F,$52,$4C,$44,$11,$0D,$00
;== MAIN =======================================================================
MAIN
        LDX #$00        ; use X as offset
LOOP
        TXA
        PHA             ; push X to stack
        LDA DATA,X      ; load A w/ char
        BEQ DONE        ; if byte in A is zero, we've reached end of string
        JSR $FFD2       ; addr of KERNAL CHROUT routine
        PLA
        TAX             ; pull X off stack
        INX             ; increment X (offset into char data)
        JMP LOOP
DONE
        NOP
        JMP DONE        ; loop to keep msg on screen; RUN/STOP + RESTORE to quit
NOTE: The two-section layout here is merely a design choice, and it has pros and cons. Modifying the DATA section, for example by adding or removing a byte, shifts every hardcoded address in the assembly code that follows it, and keeping data far from the instructions that use it can mean crossing page boundaries more often. Then again, in 6502 assembly almost any change can shift hardcoded addresses, and data scattered throughout the file is hard to maintain, so there is no real winning. I like the consistency in sources of error that comes from sticking to one particular schema.
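To illustrate how a change in the data shifts the addresses after it, here is a small Python sketch of the layout arithmetic. It assumes the arrangement visible in the hex dump (14-byte BASIC launch program, a 3-byte JMP, the data, then MAIN). The function name layout is hypothetical.

```python
def layout(data_len: int, load_address: int = 0x1001, stub_len: int = 14):
    """Addresses for the layout: BASIC stub, JMP to MAIN, DATA, then MAIN."""
    entry = load_address + stub_len  # SYS target; holds a 3-byte JMP to MAIN
    data = entry + 3                 # DATA starts right after the JMP
    main = data + data_len           # MAIN starts right after the data
    return entry, data, main

print([hex(a) for a in layout(14)])  # ['0x100f', '0x1012', '0x1020']
print([hex(a) for a in layout(15)])  # ['0x100f', '0x1012', '0x1021']
```

With the 14 data bytes of the example, the results match the operands in the hex dump (4c 20 10 and bd 12 10). Adding one byte of data moves MAIN, and everything after it, up by one.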
The program itself is fairly straightforward: it loads the elements of an array (the PETSCII codes for the characters in the string), using the X register as an offset (addressing mode: Absolute,X), and passes each one to the KERNAL CHROUT routine at $FFD2. X is pushed to the stack at the top of each loop iteration and pulled back at the bottom to preserve its value across the call to CHROUT.
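For readers more at home in a high-level language, the loop can be mirrored in Python as a rough sketch. The chrout argument is a stand-in for the KERNAL routine, and the name print_until_nul is made up for this example. The push and pull of X has no equivalent here, because a Python callee cannot clobber the caller's local variable.

```python
# Same bytes as the DATA section: "HELLO WORLD", cursor down, carriage return, NUL.
DATA = bytes([0x48, 0x45, 0x4C, 0x4C, 0x4F, 0x20, 0x57, 0x4F,
              0x52, 0x4C, 0x44, 0x11, 0x0D, 0x00])

def print_until_nul(data: bytes, chrout) -> int:
    """Mirror the LOOP section: index with X and stop at the zero byte."""
    x = 0                # LDX #$00
    while True:
        a = data[x]      # LDA DATA,X
        if a == 0:       # BEQ DONE
            return x
        chrout(a)        # JSR $FFD2
        x += 1           # INX

sent = []
count = print_until_nul(DATA, sent.append)
print(count, bytes(sent[:11]).decode("ascii"))  # 13 HELLO WORLD
```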
Lastly, the program ends in an infinite loop to keep the message on the screen. Press RUN/STOP + RESTORE to exit/reset. The reason for not simply ending with an RTS is that when control returns to the BASIC console after the RTS, the screen is cleared (at least that is what my machine does).
Last updated Dec 2025