Linker Scripts and Embedded Systems
I realized that every time I talk about linker scripts, I say that most programmers won’t ever see one unless they work with embedded systems, but I never go deeper than that. I did it again in Making of Doppelgänger and Crypto/Reverse or how to write your first crypter, and it is time to finally give some details about such a statement. In this article, I’ll dive a little bit into this topic so you’ll understand why linker scripts are important in embedded systems, and hopefully you’ll learn some other things along the way.
I have to say that I’m not an embedded systems engineer myself, and most of my experience comes from hobbyist platforms like Arduino or other cheap microcontrollers. So, I apologize in advance to the Gods of Embedded Systems for any foolish things I may say. However, I think that precisely because my knowledge is limited, I may do a good job explaining the basics without going too deep… just because I can’t.
That said, let’s go!
ATmega328p
The ATmega328p is arguably one of the most well-known microcontrollers out there. The reason is that it was selected for the Arduino project many years ago for its easy-to-use boards. Any electronics hobbyist will likely have a couple of Arduino UNOs lying around or even some 328p chips on a breadboard with a clock and a push button to reset it. This microcontroller, even when pretty modest, has demonstrated its capabilities in thousands of projects worldwide, and what it does, it does very well.
For those of you who don’t know, a microcontroller is like a mini computer in a chip. It includes the CPU, the memory, and a whole bunch of peripherals all in one single chip. These devices usually have a Harvard architecture, which means they have separate address spaces for code and data. Compare this with the Von Neumann architecture used for your regular computer, in which a single address space is used for both data and code. Each architecture has its pros and cons, but that is beyond the scope of this article. We just need to know that microcontrollers usually have a Harvard architecture, because this will be relevant a bit later.
For reference, these are the main features of the ATmega328p:
- AVR CPU at up to 20 MHz (16 MHz on the Arduino UNO; you can usually lower the clock to reduce, for example, power consumption)
- 32 KB of Flash memory for code
- 2 KB of RAM
- 1 KB of EEPROM
- Lots of peripherals: timers, GPIO (General Purpose Input Output), ADC (Analog to Digital Converter), UART (Universal Asynchronous Receiver-Transmitter, or serial), PWM (Pulse Width Modulation)…
Yes, those are kilobytes. Compare the 2 KB of RAM of this little guy with the GBs of memory of your computer. We’re working at a completely different scale here, many orders of magnitude below.
Code, Data, and Firmware
Let’s take a closer look at the memory in this chip.
First, we have 32 KB of Flash. Flash is a special type of non-volatile memory; once written, it keeps its information even when the power is off. This is where the code of our program will end up. You might think, “That’s great, why don’t I use this for data, too?” Well, Flash memory has a drawback: reading is okay, but writing takes a while. So, it’s fine to flash (now you know where the word comes from) your microcontroller with code, but it’s not feasible to go through that process every time you store data. Actually, depending on the microcontroller, you may even need special hardware known as a programmer to flash the device. Also, some of those memories have to be written in one go, or at least in big blocks, so we need another type of memory to store data like, for instance, variables.
For that, we have RAM, which is the same as what you have in your computer. Well, actually, this is static RAM, and your computer uses dynamic RAM, but that difference doesn’t matter for our current discussion. What’s important is that RAM is on a separate bus, which means it’s impossible to execute the contents of the RAM as code. Think about it as a different chip inside the microcontroller.
Finally, there’s EEPROM (Electrically Erasable Programmable Read-Only Memory… what a name, eh?). This is something in between Flash and RAM. It’s not as difficult to write to as Flash, but it’s too slow to be used as regular RAM. EEPROM is normally used to store configuration values or information that may change from time to time, but not very often, and that needs to survive power cycling. Usually, you can’t just write to an EEPROM address; you have to call some kind of function. Before EEPROMs, there were EPROMs: the same thing, but they couldn’t be erased with electricity, and UV light was used to erase the chip. Those chips usually had a small window in the package for that purpose. You can clearly see the advantage of using EEPROMs instead.
So, what is the firmware you’ve heard about many times? Basically, it’s the code and data that will end up in Flash memory, and that’s why people usually refer to the process of writing the firmware to a Flash memory as “flashing the device.” The concept is the same for your router, phone, and other devices. The hardware will be different, but in the end, you are writing to some kind of Flash memory in the device.
But What Is in the Firmware?
According to what we’ve said so far, the firmware will contain the code, right? Well, indeed, the code will end up there together with… some data. The read-only data, like messages or numeric constants, shall end up in Flash as well as the constants used to initialize data. Those initialization values are copied over from Flash to RAM during the startup process.
This may be a bit confusing, so let’s see an example. Imagine that your program has a global variable like:
int foo = 42;
The value 42 is a constant used to initialize a variable, so it needs to be stored somewhere. That’s the Flash memory. However, the variable foo will live in RAM while the application is executed. Whenever the microcontroller executes an instruction to read some data into a register, that data will be retrieved from RAM, not from Flash, so the variable has to live there. Remember the Harvard architecture. Also, if we want to write some value into the variable, that write will definitely go to RAM, because the processor can only write data there.
However, as RAM doesn’t keep the values when the microcontroller is off, when we need a default value for a variable, we have to store it in Flash and have some startup code to copy those default values into RAM before the actual program starts.
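To make that startup step a bit more concrete, here is a minimal host-side simulation in C. It is only a sketch under stated assumptions: the names `image_layout` and `startup_copy` are made up for this example, and the two memories are modelled as plain byte arrays. The real startup routine is provided by the toolchain and reads the Flash with special instructions rather than with memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLASH_SIZE 32768u  /* 32 KB of Flash on the ATmega328p */
#define RAM_SIZE    2048u  /* 2 KB of RAM */

/* Hypothetical description of where things were placed (names are made up). */
struct image_layout {
    size_t data_load;   /* offset in Flash where the initial values are stored */
    size_t data_start;  /* offset in RAM where the initialized variables live */
    size_t data_size;   /* number of bytes of initialized data */
    size_t bss_size;    /* zero-initialized bytes placed right after the data */
};

/* Simulated startup: copy the initial values Flash -> RAM, then clear .bss. */
static void startup_copy(const uint8_t *flash, uint8_t *ram,
                         const struct image_layout *l)
{
    assert(l->data_load + l->data_size <= FLASH_SIZE);
    assert(l->data_start + l->data_size + l->bss_size <= RAM_SIZE);

    memcpy(ram + l->data_start, flash + l->data_load, l->data_size);
    memset(ram + l->data_start + l->data_size, 0, l->bss_size);
}
```

With the two bytes of a 16-bit `int foo = 42;` stored somewhere in the simulated Flash, calling `startup_copy` leaves 42 in the RAM array before the "program" starts, which is the job the startup code does on the real chip.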
Remember that I told you that the ATmega328p has a Harvard architecture and you cannot mix data and code? You may think that I’m fooling you now, but no, things in reality are always a little bit more complicated. That code and data separation is still true, but the microcontroller has some special instructions to read the Flash content (e.g. LPM), which are different from the ones the program usually uses to access variables (e.g. LD).
So, for a simple embedded system like an ATmega328p, the firmware is composed of the code and the initialization/constant values needed by the application, and that is what gets flashed into the device.
Using Regular Tools for Embedded Systems
The Arduino IDE uses avr-gcc to compile our code for our Arduino boards (or any other of the hundreds of boards supported by the platform). avr-gcc is a port of GCC for AVR microcontrollers and comes with its own toolchain, including all the usual tools: gcc, objdump, ld, etc. And it uses the ELF file format, just like your Linux box. However, an ELF binary cannot be directly executed by the microcontroller, so there are some extra steps to perform before getting a flashable firmware.
And here’s where the linker script gets into the scene. Sorry for the long introduction, but I really think it’s needed.
Depending on the version of the Arduino IDE you have, the location of the tools may change, but usually, it deploys all the required programs under the home folder of your user, in a folder named .arduinoXX or something similar. There you’ll find all the required tools to work with your embedded device, including avr-gcc and avrdude for the AVR-based boards.
Navigating deep down into that folder, you’ll find a folder named ldscripts that contains the linker scripts for different types of boards. For the ATmega328p, it uses a file named avr5.x. This linker script will lay out all the compiled C code in structures that can later be used to flash the device.
Let’s take a look at the main elements.
AVR5.x Linker Script
The file starts with several constants used throughout the script, defining values that are needed often. I’ll skip those, as the names of the constants are self-explanatory and you’ll know what they mean whenever you see them.
Then you find the memory layout, something you usually don’t need when linking code for your regular computer. In a sense, you can see this memory layout as something like the segments you find in a regular ELF. Let’s take a look:
__DATA_REGION_ORIGIN__ = DEFINED(__DATA_REGION_ORIGIN__) ? __DATA_REGION_ORIGIN__ : 0x800060;
MEMORY
{
text (rx) : ORIGIN = 0, LENGTH = __TEXT_REGION_LENGTH__
data (rw!x) : ORIGIN = __DATA_REGION_ORIGIN__, LENGTH = __DATA_REGION_LENGTH__
eeprom (rw!x) : ORIGIN = 0x810000, LENGTH = __EEPROM_REGION_LENGTH__
fuse (rw!x) : ORIGIN = 0x820000, LENGTH = __FUSE_REGION_LENGTH__
lock (rw!x) : ORIGIN = 0x830000, LENGTH = __LOCK_REGION_LENGTH__
signature (rw!x) : ORIGIN = 0x840000, LENGTH = __SIGNATURE_REGION_LENGTH__
user_signatures (rw!x) : ORIGIN = 0x850000, LENGTH = __USER_SIGNATURE_REGION_LENGTH__
}
You can identify most of the memories we’ve already mentioned. Yes, text is the flash, the one containing the code. You can also see a few more blocks for specific AVR stuff; however, even though avr-gcc provides ways to define them, things like the fuses or locks are programmed separately and not from the output of the linker. I’ve kept one of the constants, __DATA_REGION_ORIGIN__, as it’s relevant.
So, the text or flash memory starts at address 0. The next block is data, which starts at 0x800060 by default; however, that’s a virtual address that helps the linker keep the different memory blocks (which are separate in the device) in one single linear address space while it works out the different elements of the program. If you check the datasheet for the ATmega328p, there’s a section named “AVR Memories,” which talks about the three types of memory we’ve introduced. In the data address space, the lowest addresses are reserved for the registers: 0x60 bytes on the smaller AVRs, and that’s why the default address for .data is 0x800060. The ATmega328p also has extended I/O registers, so its RAM actually starts at 0x100, and the toolchain overrides the default so that .data starts at 0x800100 for this chip (you can see that value in the readelf output further down). Either way, the 0x800000 will go away, and only the low part (0x60 or 0x100) will remain when the final binary is produced.
For the rest of the memory blocks, the linker script just keeps them separated (each one at its own virtual base address), and within its own memory each of them starts at offset 0.
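As a small illustration of how those virtual addresses map back to the real memories, here is a sketch in C. The names `decode_address`, `struct decoded` and the `REGION_*` constants are invented for this example; the base addresses are the ORIGIN values from the MEMORY block above.

```c
#include <assert.h>
#include <stdint.h>

/* Region names invented for this example; bases come from the MEMORY block. */
enum region { REGION_TEXT, REGION_DATA, REGION_EEPROM, REGION_OTHER };

struct decoded {
    enum region region;  /* which memory the address belongs to */
    uint32_t offset;     /* address inside that memory */
};

/* Strip the "virtual" base the linker script adds to tell memories apart. */
static struct decoded decode_address(uint32_t vaddr)
{
    struct decoded d;

    /* The script defines nothing beyond user_signatures at 0x850000. */
    assert(vaddr < 0x860000u);

    if (vaddr < 0x800000u) {            /* text: Flash, starts at 0 */
        d.region = REGION_TEXT;
        d.offset = vaddr;
    } else if (vaddr < 0x810000u) {     /* data: RAM address space */
        d.region = REGION_DATA;
        d.offset = vaddr - 0x800000u;
    } else if (vaddr < 0x820000u) {     /* eeprom */
        d.region = REGION_EEPROM;
        d.offset = vaddr - 0x810000u;
    } else {                            /* fuse, lock, signature, ... */
        d.region = REGION_OTHER;
        d.offset = vaddr & 0xFFFFu;
    }
    return d;
}
```

For instance, `decode_address(0x800060)` yields the data region with offset 0x60, and `decode_address(0x810004)` yields the EEPROM with offset 4.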
Sections
After the memory layout come the sections. Here you’ll find many of the sections you also find in regular programs (after all, avr-gcc is gcc for AVR). There’s a lot of information there, including special symbols that are present in your programs but that you’re usually not aware of. Anyway, we’ll just focus on the last part of the section declarations.
For example, if you look at the .text section, it looks like this:
.text :
{
/* Lots of stuff in here */
} > text
The > text at the end means that all the contents between the curly brackets will end up in the text memory region, as defined earlier in the file (see previous section). This is pretty straightforward; we already know that the code will end up in the flash. The linker will start adding things and keep count of the last address used, so we can always build the different memory blocks incrementally. We’ll see this in a sec.
Now, let’s take a look at the data section.
.data :
{
PROVIDE (__data_start = .) ;
*(.data)
*(.data*)
*(.gnu.linkonce.d*)
*(.rodata) /* We need to include .rodata here if gcc is used */
*(.rodata*) /* with -fdata-sections. */
*(.gnu.linkonce.r*)
. = ALIGN(2);
_edata = . ;
PROVIDE (__data_end = .) ;
} > data AT> text
We can see how the .data and .rodata sections end up in the data memory block, but what’s more important is the AT> text at the very end.
Virtual Address vs. Physical Address
If you have ever looked at the program headers of an ELF file, you may have noticed the VirtAddr and PhysAddr columns, which on a regular computer always have the same value. And if you try to get information about that, you end up with some cryptic sentence saying that on systems where physical addresses are relevant… blah, blah.
NOTE
Here, virtual address is used as opposed to physical address, in the sense that these addresses do not really exist in the device. Do not confuse them with the virtual addresses of a regular computer, which have a different meaning within the greater concept of OS memory management.
Well, there you go: the AT> in the declaration of the .data section above sets the physical address in the ELF file (what the linker documentation calls the load address, or LMA), while the > data sets the virtual address (VMA) used by the linker to build the program. All the sections between the curly brackets will end up at virtual address 0x800060 (or wherever the data region starts) and above; however, the physical address associated with them is at the end of the text memory block. That is, after the code that was placed there first. Remember I said that the linker keeps track of the size of each block so we can append stuff at the end of any of them.
To illustrate this, I compiled a small program for the ATmega328p using the Arduino IDE and then passed the resulting .elf through readelf, getting the following:
Entry point 0x0
There are 3 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000094 0x00000000 0x00000000 0x001ec 0x001ec R E 0x2
LOAD 0x000280 0x00800100 0x000001ec 0x00006 0x00006 RW 0x1
LOAD 0x000286 0x00800106 0x00800106 0x00000 0x00029 RW 0x1
Section to Segment mapping:
Segment Sections...
00 .text
01 .data
02 .bss
Note how the .data section ends up in the second program header (memory block, if you prefer), which has a virtual address of 0x00800100, but its physical address (0x000001ec) is just after the code in the .text section, which ends up in the first program header. The start-up code will copy whatever data in that segment needs to end up in RAM when the program starts, but firmware-wise, all that data has to, somehow, end up in the flash memory, which is the one that won’t be erased after power off. Think of it as the hard drive or persistent storage unit.
The .bss doesn’t really contain data (its FileSiz is 0), so the VirtAddr and PhysAddr are the same…
There you go, a case where physical addresses are relevant :).
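The numbers in those program headers are enough to work out how much Flash and RAM the program needs. The following C sketch does that arithmetic; `struct phdr`, `flash_bytes` and `ram_bytes` are names made up for this example, and the table simply copies the values from the readelf output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the program header fields this example needs (names are made up). */
struct phdr { uint32_t vaddr, paddr, filesz, memsz; };

/* Values copied from the readelf output above: .text, .data, .bss. */
static const struct phdr example_headers[3] = {
    { 0x00000000u, 0x00000000u, 0x1ecu, 0x1ecu },
    { 0x00800100u, 0x000001ecu, 0x006u, 0x006u },
    { 0x00800106u, 0x00800106u, 0x000u, 0x029u },
};

/* Bytes that must be stored in Flash: everything that has file contents. */
static uint32_t flash_bytes(const struct phdr *h, size_t n)
{
    uint32_t total = 0;
    assert(h != NULL);
    for (size_t i = 0; i < n; i++)
        total += h[i].filesz;
    return total;
}

/* Bytes occupied in RAM: segments whose virtual address is in the data region. */
static uint32_t ram_bytes(const struct phdr *h, size_t n)
{
    uint32_t total = 0;
    assert(h != NULL);
    for (size_t i = 0; i < n; i++)
        if (h[i].vaddr >= 0x800000u)
            total += h[i].memsz;
    return total;
}
```

For the headers above this gives 0x1ec + 0x6 = 498 bytes of Flash (code plus the initial values of .data) and 0x6 + 0x29 = 47 bytes of RAM (.data plus .bss). Note also that the physical address of .data, 0x1ec, is exactly the size of .text.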
Why all this mess?
You may be asking yourself this. I did. So I dug a bit deeper to understand why all these different addresses are involved. Wouldn’t it be easier to just use the real addresses and that’s it? Well, it turns out that this scheme makes it much easier to reuse all the already existing tools in many different ways. It also allows us to use the same set of tools for many different architectures, leaving the specific details aside.
Overall, using an existing toolchain brings us all the benefits of a proven and solid set of tools that is known to work efficiently. However, these toolchains don’t work out of the box with our microcontroller. They are designed to work in a unified address space where memory positions are laid out linearly, one after the other. In this scheme, the toolchain can perform serious optimizations, relocate code, and move different sections around very easily (that is what it does for regular programs).
To give you some context, this is a simplified representation of the memory map on a regular computer (Von Neumann architecture) vs. a microcontroller (Harvard architecture).
+------------+ *
| Stack | *
~ ~ *
+------------+ * +------------+
| Free Space | * | Bootloader |
+------------+ * | |
~ ~ * ~ ~ +------------+
| Data | * +------------+ | Stack |
+------------+ * | rodata | ~ ~
~ ~ * | | | HEAP (opt) |
| Code | * | Code | | Data |
+------------+ * +------------+ +------------+
RAM FLASH RAM
So instead of a linear memory space, we have two parallel memory spaces whose addresses overlap. The address 0x100 may mean either Flash memory or RAM, and the two have to be accessed in different ways.
NOTE
You can have a heap implementing malloc and free, but with 2 KB of RAM it is a bad idea to use it. The metadata required by each block (even for the simplest memory allocator you could write) can easily eat a lot of your scarce memory. We’ll talk about the bootloader in a sec. Just note that it is located at the very end of the Flash memory. Also note that at the beginning of the Flash memory there is also some special data, not just code.
So this is roughly how the program is built using standard tools.
- The compiler produces the right code to access the right memory areas. It’ll generate the right instructions to access SRAM or Flash depending on the data declaration. Sometimes it is necessary to be explicit about which memory region will be used (see PROGMEM or pgm_read_byte).
- The compiler will produce a relocation when needed. When it cannot yet generate final code like “load variable1 into register r24”, it’ll just emit a relocation saying: “Hey linker, whenever it is time, fill in the position of variable1 in here.” This way, variables can be moved around freely at the linker stage.
- Then the linker kicks in. It knows that variable1 lives, for example, at virtual address 0x800100 (because all global variables ended up in the .data section, which is mapped at that address by the linker script), and it knows that such an address represents physical RAM address 0x100, so it will patch the code with the correct value using the relocation information produced by the compiler.
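The first step above, where the declaration decides which memory is accessed, can be sketched in C as follows. PROGMEM and pgm_read_byte come from avr-libc’s avr/pgmspace.h; the fallback definitions for a regular computer, the `greeting` string and the helper `copy_from_flash` are assumptions added for this example so that it also builds outside avr-gcc.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __AVR__
#include <avr/pgmspace.h>  /* real PROGMEM and pgm_read_byte from avr-libc */
#else
/* Host fallback (an assumption of this sketch): one flat address space. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const unsigned char *)(addr))
#endif

/* On AVR this string stays in Flash only; no copy of it is made in RAM. */
static const char greeting[] PROGMEM = "hello";

/* A regular global: lives in RAM, initial value copied from Flash at startup. */
int foo = 42;

/* Hypothetical helper: copy a Flash string into a RAM buffer, byte by byte.
 * Returns the number of characters copied, not counting the terminator. */
static size_t copy_from_flash(char *dst, const char *src_flash, size_t max)
{
    size_t i = 0;

    assert(max > 0);
    while (i + 1 < max) {
        char c = (char)pgm_read_byte(src_flash + i);  /* Flash read on AVR */
        if (c == '\0')
            break;
        dst[i++] = c;
    }
    dst[i] = '\0';
    return i;
}
```

Reading `foo` is a plain variable access, while reading `greeting` has to go through pgm_read_byte; that is the kind of explicit hint about the memory region that the first bullet refers to.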
I hope you get the point of using standard tools and why all this virtual vs. physical address business is needed. Let’s continue.
Flashing the Device
When it comes down to flashing, the virtual addresses are not really needed, and the physical addresses are the ones that rule the flashing process. From the generated ELF, a raw binary file can be created by extracting the relevant parts of the ELF file (for example, using objcopy).
Depending on the device and development environment, a bootloader may be added to the firmware. For the ATmega328p, this is usually placed at the end of the flash (see the “Boot Loader Support” chapter in the datasheet), so you get a 32 KB file (to fill the whole flash memory of the device) with your code and data (as per what we explained above) at the beginning, the bootloader at the end, and a lot of 0xff in between, which means… erase the flash memory in those positions.
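That layout is easy to picture with a few lines of C. This is just a sketch of the idea, not what any particular tool does: the function `build_image` and its parameters are invented for this example, and the bootloader size is whatever you pass in (on the real chip it is selected through the fuses).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLASH_SIZE 32768u  /* 32 KB of Flash on the ATmega328p */

/* Build a full Flash image: application at the start, bootloader at the end,
 * and 0xFF (erased Flash) everywhere in between. */
static void build_image(uint8_t image[FLASH_SIZE],
                        const uint8_t *app, size_t app_size,
                        const uint8_t *boot, size_t boot_size)
{
    /* The two pieces must fit without overlapping. */
    assert(app_size + boot_size <= FLASH_SIZE);

    memset(image, 0xFF, FLASH_SIZE);
    memcpy(image, app, app_size);
    memcpy(image + (FLASH_SIZE - boot_size), boot, boot_size);
}
```

With a 498-byte application and, say, a 512-byte bootloader (a size picked only for illustration), bytes 0 to 497 hold the application, the last 512 bytes hold the bootloader, and everything else is 0xFF.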
The bootloader is a small program that hobbyist boards include for easy reprogramming. In the general case, once the flash is programmed, the microcontroller will run whatever code it was flashed with. If you want to change the program, you need to reflash it. The standard way to flash those microcontrollers requires the use of some special hardware and wiring (connecting specific pins in the microcontroller to specific pins in the programmer, or to ground or Vcc).
The bootloader is a program that is executed at boot and waits for some signal (data on a serial port or some GPIO activation). If that is detected, it goes into a self-programming mode that allows the user to update the flash using, e.g., the serial port. There is nothing magical about the serial port; it’s just the bootloader that reads data from it and writes it into the flash memory using another special instruction (SPM). Remember the Harvard architecture. These bootloaders are very convenient, adding a lot of flexibility to the development cycle in exchange for some memory that becomes unavailable (it’s used by the bootloader). You can find detailed information on how to do all this in the ATmega328p datasheet.
The flashing is usually done by a different program, which, in the case of the ATmega328p (and AVR microcontrollers in general), is avrdude. This program also allows us to set the fuses, program the EEPROM, set lock bits, or install bootloaders, in addition to flashing the flash. avrdude can also talk to the bootloader and is used by the Arduino IDE for programming the device; it’s just the wiring that’s different.
In case you don’t know, the fuses and locks are special memory areas on the microcontroller that control its configuration and allow locking parts of the device, respectively. Fuses allow us to configure things like the clock speed the microcontroller will work with, the size of the bootloader (which defines where in the flash memory it will end up), the way the device will boot (bootloader or main app), the watchdog, debugging settings, etc. The locks allow us to set protection (read/write) on the program memory and the bootloader so people cannot easily read the code. These are critical values that, when misconfigured, can brick your device.
So, even though the compiler and the linker script provide capabilities to define all these elements (take a look at the original memory map at the beginning of the article), they are programmed separately using avrdude, at least for this device. Other devices will have other specifics and work slightly differently; however, I hope you’ve now got an overall view of how all this stuff works and why embedded development is so cool.
SUMMARY
Well, I hope I managed to clarify a little bit (honestly, I’m not completely sure about that) how embedded systems development tools work and how these devices can be programmed using standard toolchains, but with special linker scripts that exactly match the memory map of the target device. Feel free to add any comment or correction, or to ask any further question, in the comments. As I said, I’m not really an expert on this topic, but I’ll try my best to answer.
Original post by pico, from the 0x00sec forum.