BeagleBone: What was that whole GCC thing about?
Hi ECE 231 students! Remember how running code in earlier coding classes was as easy as typing python code.py? All of a sudden, we have this mysterious gcc command that we feed our code to instead, and we have to provide extra arguments just to get a separate program file that we can finally run! Let's dive into what that command actually does behind the scenes, and why we might want to use C in the first place.
Let's play as GCC!
Consider the following code segment, which you might run on your BeagleBone:
#include <stdio.h>

int main() {
    /* Initialize some variables for led state, and file pointer. */
    char led_state = '1';
    FILE *led_pin = (FILE *)0x0;

    /* Open the file, and return if the file cannot be opened. */
    if ((led_pin = fopen("/sys/class/leds/beaglebone:green:usr3/brightness", "r+")) == (FILE *)0x0) return 1;

    /* Write the led state to the file we opened. */
    fwrite(&led_state, sizeof(led_state), 1, led_pin);

    /* Close the file when we are done, as is good practice. */
    fclose(led_pin);

    /* Exit. */
    return 0;
}
This code is relatively easy to understand, especially with the comments. It uses the UNIX-style filesystem provided by the operating system running on the BeagleBone to control a light on the board itself: writing to a special file changes the state of the LED. The only problem is that feeding this code directly to the processor on the BeagleBone will not work! So, let's see what GCC does with it instead.
GCC's output
.LC0:
    .string "r+"
.LC1:
    .string "/sys/class/leds/beaglebone:green:usr3/brightness"
main:
    stp x29, x30, [sp, -32]!
    mov x29, sp
    mov w0, 49
    strb w0, [sp, 23]
    str xzr, [sp, 24]
    adrp x0, .LC0
    add x1, x0, :lo12:.LC0
    adrp x0, .LC1
    add x0, x0, :lo12:.LC1
    bl fopen
    str x0, [sp, 24]
    ldr x0, [sp, 24]
    cmp x0, 0
    cset w0, eq
    and w0, w0, 255
    and w0, w0, 1
    cmp w0, 0
    beq .L2
    mov w0, 1
    b .L4
.L2:
    add x0, sp, 23
    ldr x3, [sp, 24]
    mov x2, 1
    mov x1, 1
    bl fwrite
    ldr x0, [sp, 24]
    bl fclose
    mov w0, 0
.L4:
    ldp x29, x30, [sp], 32
    ret
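Wow, things got hard to understand real quick! GCC has taken our human-readable C code and translated it into something much closer to "machine readable", which we call assembly. Each instruction line in it corresponds to one small operation the processor can perform. In this case, it is assembly for the ARM64 (AArch64) architecture, which is what 64-bit boards such as the BeagleBone AI-64 use. A 32-bit board such as the BeagleBone Black would get different, 32-bit ARM assembly from the same C code.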
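One way to get comfortable with the listing is to write the same program back in C, but in the order the assembly does things. The sketch below is only an illustration: the function name led_main_as_steps and the path parameter are made up here (the original program hard-codes the path), and the comments pair each C statement with the instructions that appear to implement it.

```c
#include <stdio.h>

/* Illustrative rewrite of main() that follows the assembly's control flow.
 * `path` is a hypothetical parameter so the code can be tried on any file. */
int led_main_as_steps(const char *path) {
    int result;                          /* plays the role of w0 at "ret"      */
    char led_state = 49;                 /* mov w0, 49 / strb w0, [sp, 23]     */
    FILE *led_pin = NULL;                /* str xzr, [sp, 24]                  */

    led_pin = fopen(path, "r+");         /* bl fopen / str x0, [sp, 24]        */
    if (led_pin != NULL) goto L2;        /* cmp x0, 0 ... beq .L2              */
    result = 1;                          /* mov w0, 1                          */
    goto L4;                             /* b .L4                              */
L2:
    fwrite(&led_state, 1, 1, led_pin);   /* x0..x3 hold the four arguments,
                                            then bl fwrite                     */
    fclose(led_pin);                     /* ldr x0, [sp, 24] / bl fclose       */
    result = 0;                          /* mov w0, 0                          */
L4:
    return result;                       /* ldp x29, x30, [sp], 32 / ret       */
}
```

Notice that the tidy if statement has turned into a comparison plus jumps to labels, which is the only kind of decision-making the processor has.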
Why did GCC do this? Well, our C code is just a string of ASCII characters, which is stored as a sequence of bytes (zeros and ones). That format is readable to us, but it takes a lot of bytes to stay readable, and the processor cannot execute text directly. The processor is designed to be as efficient as possible, so GCC converts our code into a format similar to a baking recipe, where each step is small but there are many of them. The advantage is that each instruction in our assembly code can be represented by a short string of bytes that the processor can actually interpret.
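To see how much space text takes, here is a small sketch that lists the ASCII bytes behind a line of text. The helper name dump_ascii_bytes is invented for this post and is not part of any library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the bytes of `text` into `out` as
 * space-separated hex pairs, e.g. "6d 6f 76".
 * Returns the number of bytes in `text`, or 0 if `text` is empty or
 * `out` is too small to hold the result. */
size_t dump_ascii_bytes(const char *text, char *out, size_t out_size) {
    size_t n = strlen(text);
    if (n == 0 || out_size < n * 3 + 1) {
        if (out_size > 0) out[0] = '\0';
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        /* Each ASCII character occupies exactly one byte. */
        snprintf(out + i * 3, 4, "%02x ", (unsigned)(unsigned char)text[i]);
    }
    out[n * 3 - 1] = '\0';  /* Replace the trailing space with the terminator. */
    return n;
}
```

Called on the text "mov w0, 49" from the listing above, it reports 10 bytes (6d 6f 76 20 77 30 2c 20 34 39), while every AArch64 instruction occupies only 4 bytes once it is encoded.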
For example, consider the line:
mov w0, 49
This line loads the value 49 (the ASCII code for the character '1' that we stored in led_state) into the register w0. It will ultimately be encoded as the hex sequence 0x52800620, which is the binary 0b01010010100000000000011000100000, and that is what would actually be run on our processor! This representation takes only four bytes, which is pretty good!
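If you are curious where that number comes from, the sketch below rebuilds it by hand. It assumes the assembler turns this mov into the 32-bit MOVZ instruction, whose bit layout is 0 10 100101 00 imm16 Rd (the 16-bit value sits in bits 5 to 20 and the register number in bits 0 to 4). The function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: encode "mov w<rd>, #imm16" as the 32-bit MOVZ form
 * with no shift. 0x52800000 is the fixed part of that encoding. */
uint32_t encode_mov_w_imm16(unsigned rd, uint16_t imm16) {
    const uint32_t movz_w_base = 0x52800000u;
    return movz_w_base | ((uint32_t)imm16 << 5) | (rd & 0x1Fu);
}

/* Hypothetical helper: write the 32 bits of `word` as '0'/'1' characters,
 * most significant bit first. `out` must have room for 33 characters. */
void to_binary_string(uint32_t word, char out[33]) {
    for (int bit = 31; bit >= 0; bit--) {
        out[31 - bit] = ((word >> bit) & 1u) ? '1' : '0';
    }
    out[32] = '\0';
}
```

With rd = 0 and imm16 = 49, encode_mov_w_imm16 gives 0x52800620, and to_binary_string turns that into the same 32 bits shown above.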
Pretty cool, huh?