Back to Basics: Software Language Beyond Bytecode
February 07, 2024
In our last article, we looked at two of the most basic concepts in software: variables and operators. Without these two integral parts of software “language” it’s nearly impossible to do anything. Now, we’re going to build on those simple concepts to see how they’re used to execute more complex functions and operations.
Welcome to Back to Basics, a series where we’re going to be reviewing basic engineering concepts that may require a more complex explanation than a quick Google search could provide.
But before you start plugging numbers into your variable boxes and making them do things, there’s an important layer to consider: abstraction.
I Promise it’s Art! It’s Just Abstract!
Previously, we focused mainly on base-level machine code (binary), the language that computers actually read. But that’s extraordinarily difficult for humans to read and write. To make it easier for ourselves, we abstract that language into something we can actually understand.
Let’s look at an example of some code. This code could be as simple as the instructions “add 2+3”. At the most basic level, it might look a little something like this:
01100101 01110101 00001110 11001011 10010111 01100101 01110101 00001110 11001011 10010111 01100101 01110101 00001110 11001011
When a computer looks at software, all it sees are the zeroes and ones of binary code. As software developers, though, we don’t really want to be writing code in binary. Miss a single one or zero, and EVERYTHING breaks. To avoid that, we abstract the language by one level. Take a look at this:
MOV 00000010, R1
MOV 00000011, R2
ADC R1, R2, R2
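Slightly better, right? What happened here is that we added a layer between the code we write and the code the computer sees. We’re calling the example above bytecode, although short mnemonic instructions like these are more commonly known as assembly language. Either way, the idea is the same: a translator program (an interpreter for bytecode, an assembler for assembly) turns “words” like the instructions “MOV” and “ADC” into the corresponding binary.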
With that, we’ve abstracted! Coding got slightly easier, but we lost something along the way; there are always trade-offs. If we were writing just machine code, our software would be very, very tight. It would need no extra memory or time for translation, because the computer understands it natively. With the abstraction to bytecode, we now have to spend time and space on an interpreter.
Bytecode isn’t the furthest we can abstract… Nowadays, nobody really wants to be telling a computer what to do using mystical short lines of commands. If we want the computer to add two numbers, we want to just have to write 2+3.
To get there, we’ll have to abstract further:
#include <stdio.h>

int main(void)
{
    int x = 2;
    int y = 3;
    x = x + y;          /* x now holds 5 */
    printf("%d", x);    /* print x as a decimal number */
    return 0;
}
Now that is fairly easy to understand, and we can clearly see what’s going on, right? Even if you’ve never coded a single line, you can likely guess what that piece of software would do.
This is C. C relies on another layer of abstraction: a program called a compiler takes our very human-readable code and runs it through several stages to translate it into machine code.
Again, there’s a trade-off. That code is now readable and much easier to write, but we depend on a compiler, and the machine code it produces can be larger and slower than what we could write by hand at the bytecode level. Now, this isn’t to say that abstraction is bad. The trade-off does exist, but the difference is often so small that we can’t even perceive it.
Through abstraction, we gain a richer set of things that we can tell the computer to do, one that is easier to write and understand, allowing us to move beyond using computers as simple calculators.
Taking Control of Your Calculator
When writing in a high-level programming language, we can use control structures that allow us to do more than just simple operations in order. Computers are, at their core, calculators, and they run instructions in the sequence provided to them.
Control structures allow you, as a programmer, to break the bounds of this one-after-the-other routine. The two main types of control structures are loops and conditionals, such as “if” statements and switches.
Loops allow an instruction or set of instructions to be run multiple times before going to the next line of code. An example of a loop would be one that keeps adding x to y until y is at least 15.
while (y < 15) {
    y = x + y;
}
Switches and “if” statements allow us to test whether something is true or false, or whether something matches, and then do something different depending on the result. For example, an “if” statement could test whether x equals 5, and if it does, the computer adds x to y. Otherwise, it will just move on to the next line of code.
if (x == 5) {
    y = x + y;
}
By using combinations of these control structures, we can build complicated logical trees that tackle difficult and equally complex problems.
Congratulations! We’ve abstracted our way into modern high-level coding! Jackson Pollock could never. Now that we’re armed with easier-to-understand language and some control structures, we can move on in our next article to syntax and what makes languages different from each other.
This is part two of a series on Software Language Essentials, as part of our regular Back to Basics Feature. Click here for part one: Software Language Essentials.