This is only a preview of the May 2025 issue of Practical Electronics.
MAX’S COOL BEANS
By Max the Magnificent
WEIRD & WONDERFUL
ARDUINO PROJECTS
Part 5: creating & optimising display algorithms
I’m a big fan of the black-and-white
movies of yesteryear. Having said
that, I can’t imagine modern films
without colour. The light-emitting
diode (LED) array on the retro games
console we are currently constructing
already looks spectacular when displaying black (off) and white (all elements
on), but it will look even more awesome when we add colour to the mix.
Each pixel (‘picture element’) in our
array has red, green and blue (RGB)
components. The brightness of each
of these components is specified by
an eight-bit (one byte) field, which can
represent 2^8 = 256 different combinations. We use these combinations to
represent values from 0 (fully off) to
255 (fully on) in decimal.
Those numbers can also be written
as 0b00000000 to 0b11111111 in binary
or 0x00 to 0xFF in hexadecimal.
When we combine our three 8-bit
RGB elements, we get 24-bit colour
depth, providing 16,777,216 possible
hues, shades, tints and tones. However,
in our console, we will limit ourselves
to just 12 colours (besides black and
white). These will be our primary colours and their corresponding secondary and tertiary counterparts.
Primary colours are any three (or
more) colours that can be mixed to
provide a range of hues. In our case,
the three primary colours we’ll use
are red, green and blue. Our three secondary colours—yellow, cyan and magenta—are obtained by mixing pairs of
our primary colours in equal amounts.
Our six tertiary colours (flush orange, chartreuse, spring green, azure, electric indigo and rose) are obtained by mixing pairs of our primary colours in unequal amounts (100% of one and 50% of the other), as shown in the Fig.1 colour wheel.
[Fig.1 lists each colour by wheel angle, array index, type and hexadecimal RGB value:]
0°, index 0: Red (primary), FF 00 00
30°, index 1: Flush Orange (tertiary), FF 80 00
60°, index 2: Yellow (secondary), FF FF 00
90°, index 3: Chartreuse (tertiary), 80 FF 00
120°, index 4: Green (primary), 00 FF 00
150°, index 5: Spring Green (tertiary), 00 FF 80
180°, index 6: Cyan (secondary), 00 FF FF
210°, index 7: Azure (tertiary), 00 80 FF
240°, index 8: Blue (primary), 00 00 FF
270°, index 9: Electric Indigo (tertiary), 80 00 FF
300°, index 10: Magenta (secondary), FF 00 FF
330°, index 11: Rose (tertiary), FF 00 80
Fig.1: a 24-bit RGB colour wheel showing some easy-to-generate colours.
It shows the colour values in hexadecimal (without the “0x” prefix), so
full-off values are 00, half-on values are
80 (128 in decimal) and full-on values
are FF (255 in decimal). As discussed
in my previous column, to keep our
power consumption at a reasonable
level (and to avoid making our eyes
water), we are limiting ourselves to
driving our pixels at only 12.5% of
their full potential.
This means that our programs will actually use values of 0x00, 0x10 and 0x20
(decimal 0, 16 and 32) for our full-off,
half-on and full-on values, respectively.
Reinventing the wheel
We will now modify one of our existing programs to display these colours on different columns. You can find a downloadable copy of this new program, named CB-May25-Code-01.txt, as part of the May 2025 download package from the PE website: https://pemag.au/link/ac4y
First, we define our colours, as shown
in Listing 1(a). We are using the convention that the listing number (1 in
this case) corresponds to the numerical part of the matching code file name
(“01” here).
From previous discussions, we know
that our Arduino compiler can only
work with 8-bit, 16-bit and 32-bit integer values. However, we are using
24-bit values in our definitions. For
example, we are defining WHITE to
be 0x202020U, where each digit represents four bits.
When we use these definitions in
the program, the compiler will automatically pad the most-significant eight
bits with zeroes to boost them to the
required 32-bit width.
Practical Electronics | May | 2025
Listing 1(b): the colour wheel array definition.
Listing 1(a): our colour definitions.
The U characters at the end of our
definitions are to ensure that the compiler understands we want to treat
these as unsigned integer values. If we
omit these U (or u) characters, 99% of
the time, there won’t be a problem because the compiler will figure things
out correctly. However, it’s always best
to add these characters, nipping any
strange behaviours and unexpected
bugs in the bud.
NUM_COLORS (line 34) is defined
as 12 because we are counting only the
primary, secondary and tertiary colours
(with black and white as special cases).
Also observe that MAX_COLOR (line
35) is defined as (NUM_COLORS - 1).
We could have defined this as 11, but
this makes it easier should we decide
to change the number of colours later.
Doing this incurs no overhead to our program’s runtime speed (at least, in this example) because the C/C++ preprocessor simply substitutes the text (12 - 1) wherever MAX_COLOR appears, and the compiler then evaluates this constant expression to 11 at compile time. To be honest, I’m not sure if we are going to need this definition (if not, we can always remove it later), but I thought I’d include it ‘just in case’.
Whenever we create an equation-based definition like the one shown on line 35, it’s critical to wrap the equation part in parentheses (round brackets), ie, ‘(’ and ‘)’. Otherwise, because the preprocessor substitutes the text verbatim, operator precedence at the point of use can change the meaning of the expression, which can cause our programs to misbehave.
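To see why the parentheses matter, here is a minimal sketch in plain C++ (it should build on a desktop compiler as well as inside an Arduino sketch). BAD_MAX_COLOR and the two helper functions are hypothetical names introduced purely for illustration; they are not part of the column's code files.

```cpp
#define NUM_COLORS 12
#define MAX_COLOR (NUM_COLORS - 1)    // parenthesised, as described in the column
#define BAD_MAX_COLOR NUM_COLORS - 1  // hypothetical: parentheses omitted

// Expands to 2 * (12 - 1), which the compiler folds to 22.
int DoubledGood() { return 2 * MAX_COLOR; }

// Expands to 2 * 12 - 1, which is 23 because the multiplication binds first.
int DoubledBad() { return 2 * BAD_MAX_COLOR; }
```

Both functions compile without complaint, which is what makes the unparenthesised form so easy to miss.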
Our next task is to declare an array of
our colour values, as shown in Listing
1(b). These act as the code equivalent of
our colour wheel. Numbers 0 through
11 in the centre of Fig.1 are the same
as the index values in our array. So,
ColorWheel[0] is
RED, ColorWheel[1]
is FLUSH_ORANGE,
ColorWheel[2] is
YELLOW etc.
Our colour wheel
array will allow us
to select colours
algorithmically, like
ColorWheel[i], where
the value of i has been
determined by an algorithm (one we have
yet to write).
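As the printed listings are not reproduced here, the following is only a sketch of what such definitions and the colour wheel array might look like. It assumes the colours are packed as 0x00RRGGBB and uses the 12.5% values described earlier (0x20 for full-on, 0x10 for half-on). The names WHITE, RED, FLUSH_ORANGE, YELLOW, NUM_COLORS, MAX_COLOR and ColorWheel come from the text; every other name, including the three byte-extraction helpers, is an assumption made for illustration. The downloadable code file remains the authoritative version.

```cpp
#include <stdint.h>

// Assumed packing: 0x00RRGGBB. Full-on = 0x20, half-on = 0x10, off = 0x00.
#define BLACK            0x000000U
#define WHITE            0x202020U
#define RED              0x200000U
#define FLUSH_ORANGE     0x201000U
#define YELLOW           0x202000U
#define CHARTREUSE       0x102000U
#define GREEN            0x002000U
#define SPRING_GREEN     0x002010U
#define CYAN             0x002020U
#define AZURE            0x001020U
#define BLUE             0x000020U
#define ELECTRIC_INDIGO  0x100020U
#define MAGENTA          0x200020U
#define ROSE             0x200010U

#define NUM_COLORS 12
#define MAX_COLOR (NUM_COLORS - 1)

// Index order follows the numbers 0 to 11 in the centre of Fig.1.
const uint32_t ColorWheel[NUM_COLORS] = {
    RED, FLUSH_ORANGE, YELLOW, CHARTREUSE, GREEN, SPRING_GREEN,
    CYAN, AZURE, BLUE, ELECTRIC_INDIGO, MAGENTA, ROSE
};

// Hypothetical helpers that pull the 8-bit components out of a packed colour.
uint8_t RedOf(uint32_t c)   { return static_cast<uint8_t>((c >> 16) & 0xFF); }
uint8_t GreenOf(uint32_t c) { return static_cast<uint8_t>((c >> 8) & 0xFF); }
uint8_t BlueOf(uint32_t c)  { return static_cast<uint8_t>(c & 0xFF); }
```

Keeping the wheel in index order means any arithmetic on the index corresponds directly to a rotation around Fig.1.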
Let there be colour!
In the loop() function shown in Listing 1(c), we employ the SetColColor() utility function we created last month. This function accepts two arguments: the number of the column we wish to change, and the colour we wish to change it to.
The for() loop that begins on line 67 uses the col variable to cycle through the columns from 0 to 11. Line 69 employs the col variable for two different tasks: to specify the column of interest, and to index into our ColorWheel[] array.
Starting with the left-hand column, lines 67 through 72 light the first 12 columns (0 to 11) with the colours from our array, with a delay of ON_TIME between each column. We leave column 12 untouched (black), then lines 75 and 76 light the right-hand column (13) with white.
On line 77, we pause for a delay of GAP_TIME. Then, on lines 80 through 85, we cycle through each of the columns, switching them off again. This all looks a lot cooler in the real world than you might imagine.
Listing 1(c): lighting each column with a different colour.
Just for giggles and grins, I created a modified version of this program to illuminate diagonal lines with different colours (see the file named CB-May25-Code-02.txt). From my previous column, we know that we have 23 diagonals, but we have only 12 colours, which means we end up repeating them.
The only change from our previous program is the body of our loop() function, as seen in Listing 2(a). This should be reasonably understandable. The for() loop commencing on line 67 uses the diag variable to cycle through our 23 diagonals (0 to 22).
On line 69, we employ the diag variable to specify the diagonal of interest and to index into our ColorWheel[] array. We also employ the % (modulo) operator to restrict the index to the allowable set of 0 to 11 values.
The modulo operator returns the remainder from an integer division, and we’ve defined NUM_COLORS to be 12. This means that when diag is 0 to 11, diag % NUM_COLORS will return 0 to 11. When diag is 12, diag % NUM_COLORS will return 0; when diag is 13, diag % NUM_COLORS will return 1 and so forth.
I just posted a video on YouTube of this program in action on my array (https://youtu.be/74LTs6bOiAg).
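The modulo wrap-around used to spread 12 colours across 23 diagonals can be sketched as follows. This is plain C++ rather than the column's actual loop() code; ColorIndexForDiag() and NUM_DIAGS are hypothetical names chosen for this illustration.

```cpp
#include <stdint.h>

#define NUM_COLORS 12
#define NUM_DIAGS  23   // assumed name; the text states there are 23 diagonals

// Maps a diagonal number (0 to 22) onto a colour wheel index (0 to 11).
// Diagonals 12 and up wrap around and reuse the colours from the start.
uint8_t ColorIndexForDiag(uint8_t diag) {
    return static_cast<uint8_t>(diag % NUM_COLORS);
}
```

With 23 diagonals, the last one (22) lands on index 10, so only the final colour on the wheel is used once rather than twice.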
Listing 2(a): lighting diagonals with a different colour, repeating colours as required.
That’s very complementary
We covered colour theory in the November 2020 issue, but that was five years ago, so it’s worth refreshing our minds. We may end up using this knowledge as part of our future game programs.
Every colour on our wheel has a complementary counterpart that’s located 180° around the wheel. Consider the colour cyan, for example. Its complementary colour is red, and vice versa, as illustrated in Fig.2(a).
A little earlier, we noted that the main reason for creating our colour wheel array is that it allows us to select colours algorithmically. As an example, suppose we have an index value of i = 6, which equates to cyan on our wheel. We can determine the index value of its complementary colour as iCmp = (i + (NUM_COLORS / 2)) % NUM_COLORS.
Once again, we are employing the % (modulo) operator to restrict the new (calculated) index to the allowable set of 0 to 11 values. We know that NUM_COLORS is 12, meaning that NUM_COLORS / 2 = 6, i + 6 = 12 and 12 % 12 = 0.
The high contrast of complementary colours creates a lively look, but they can be jarring to the eye, which obliges us to use them judiciously.
Since we are here, let’s briefly consider some additional colour combinations, starting with the concept of split complementary, illustrated in Fig.2(b). Besides the base colour (cyan in this case), we are interested in the two colours either side of the base colour’s complement (rose and flush orange here).
We can determine the index values of our two split complementary colours as (i + (NUM_COLORS / 2) – 1) % NUM_COLORS and (i + (NUM_COLORS / 2) + 1) % NUM_COLORS.
We generate the index value for the complementary colour as before, then we add or subtract one to get the index values of the split complementary colours on either side.
Fig.2: interesting colour combinations: (a) complementary, (b) split complementary, (c) triadic, (d) analogous.
Tricky dicky
As is usually the case, although all
of this may appear simple at first, there
are traps for the unwary.
The term ‘edge case’ refers to a scenario that occurs at an input constraint
boundary. This is where the behaviour
of the system might change, or unexpected problems could arise. These
cases are important in software development, testing and system design because they often reveal hidden bugs.
Edge cases are different from more complicated ‘corner cases’, which refer to situations in which a system, algorithm, or program behaves unexpectedly or fails because several input values are simultaneously at the extreme ends (or ‘corners’) of their allowable ranges.
Consider the case of the analogous
colours illustrated in Fig.2(d). Assuming a starting index value of i,
it’s tempting to think that we could
generate the index values of our two
analogous colours using (i – 1) and
(i + 1). While this will work for mainstream i values from 1 to 10, it will
fail for edge case i values of 0 and 11,
which occur either side of the boundary between 0° and 360°.
With an i value of 11, (i – 1) = 10,
which is what we want, but (i + 1) =
12, which is outside the bounds of our
colour wheel array. Whenever one of
our programs reads (or, worse, writes)
outside the bounds of an array, we will
not be happy with the result.
We can address this (i + 1) problem
case by adding our trusty % (modulo)
operator into our equations, which will
give us (i – 1) % NUM_COLORS and (i
+ 1) % NUM_COLORS. Now, when i =
11, (i + 1) = 12 and 12 % NUM_COLORS
= 0, which is what we want.
The other edge case for us to consider is when i = 0. In this case, (i + 1)
= 1, which is what we want, but (i – 1)
= -1, which we do not want at all. Unfortunately, our % (modulo) operator
won’t help us here, because (with the
Arduino’s C/C++) it will return a remainder of –1, which is still outside
the bounds of our colour wheel array.
The solution here may appear a
little unintuitive at first, but it makes
perfect sense when you think about
it. Before we subtract anything, we
first need to travel all the way (360°)
around our colour wheel in a clockwise
direction by adding NUM_COLORS
into the mix.
This means we will calculate the
index values of our two analogous
colours as (i + NUM_COLORS – 1) %
NUM_COLORS and (i + NUM_COLORS
+ 1) % NUM_COLORS. Now, when i
= 0, (i + NUM_COLORS – 1) = 0 + 12
– 1 = 11, and 11 % 12 = 11, which is
what we want. Meanwhile, (i + NUM_
COLORS + 1) = 0 + 12 + 1 = 13, and 13
% 12 = 1, which is also what we want.
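The index equations discussed so far (complementary, split complementary and analogous) can be collected into a few small helpers. This is a sketch in plain C++; the function names are hypothetical and do not come from the column's code files. NaivePrev() is included only to show the i = 0 edge case going wrong.

```cpp
#include <stdint.h>

#define NUM_COLORS 12

// Complement: half-way (180 degrees) around the wheel.
uint8_t ComplementIndex(uint8_t i) {
    return static_cast<uint8_t>((i + (NUM_COLORS / 2)) % NUM_COLORS);
}

// Split complementary: one step either side of the complement.
uint8_t SplitComplementLow(uint8_t i) {
    return static_cast<uint8_t>((i + (NUM_COLORS / 2) - 1) % NUM_COLORS);
}
uint8_t SplitComplementHigh(uint8_t i) {
    return static_cast<uint8_t>((i + (NUM_COLORS / 2) + 1) % NUM_COLORS);
}

// Analogous: the two neighbours. Adding NUM_COLORS first keeps the
// intermediate value non-negative, so the modulo always lands in 0 to 11.
uint8_t AnalogousPrev(uint8_t i) {
    return static_cast<uint8_t>((i + NUM_COLORS - 1) % NUM_COLORS);
}
uint8_t AnalogousNext(uint8_t i) {
    return static_cast<uint8_t>((i + NUM_COLORS + 1) % NUM_COLORS);
}

// The tempting but flawed version: yields -1 when i is 0.
int NaivePrev(int i) {
    return (i - 1) % NUM_COLORS;
}
```

For cyan (index 6), these return 0 (red) for the complement, and 11 (rose) and 1 (flush orange) for the split complementary pair, matching the text.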
You think this is all obvious? Good.
Just to make sure, why don’t you jot
down the equations to generate the triadic colours as illustrated in Fig.2(c).
You can find my solutions at the end
of this column.
The time has come
The time has come to talk about
something that is unexciting, although
it is more interesting than cabbages
and kings. This is another topic we
covered deep in the mists of time
(the September 2020 column), but
it’s worth mentioning again. There
are various ways to declare integer
variables in C/C++. We can summarise them as follows:
• signed int
• signed short int
• signed long int
• unsigned int
• unsigned short int
• unsigned long int
Signed integers can represent both positive and negative values, while unsigned integers can represent only zero and positive values. As we’ve already discussed, an
8-bit field can represent 256 different
combinations. So, an 8-bit signed integer can represent values in the range
-128 to +127 (including 0), while an
8-bit unsigned integer can represent
values in the range 0 to +255.
The reason the upper limits are 127
and 255, not 128 and 256, is because
we need to represent zero too, so we
have to sacrifice one of the non-zero
numbers at the extremes to keep within
our limit of 256 different values.
When we see a number like 42, we
assume it to be positive. This means
we don’t need to write +42, although
we can include the ‘+’ if we want to.
We do need to use a minus sign if the
number is negative, like -42.
If we don’t declare an integer as being signed or unsigned, the compiler will assume it’s signed. This means that we don’t need to use the signed keyword, although we can include it for clarity if we want to. We must use the unsigned keyword if we want the compiler to treat an integer as being unsigned.
Fig.3: four ways to specify horizontal and vertical lines. Panels (a) to (d) each show a line between endpoints (x1, y1) and (x2, y2) relative to the origin (0, 0).
Next, there are three sizes of integer: int, short int and long int (there are
larger ones, but we don’t need them
just yet!). If we specify short or long,
we can omit the int part if we wish,
because the compiler will assume we
are talking about integers.
One thing that can prove tricky is
that the C/C++ standards don’t explicitly define the size of an int, a
short, or a long. All these standards
say is that an int must be a minimum
of two bytes (16 bits), a short must
be a minimum of two bytes (16 bits)
and a long must be a minimum of
four bytes (32 bits).
Apart from this, all bets are off,
which can cause problems when porting a program from one computer to
another.
With the Arduino Uno R3, both the
int and short are 16 bits, while the long
is 32 bits. The Arduino compiler also
understands the type byte, which it
treats as an 8-bit unsigned integer, but
this typically won’t be accepted or understood by other compilers.
Over time, fixed-width integer types
were added to the standards. The
signed types include int8_t, int16_t
and int32_t, while the unsigned types
include uint8_t, uint16_t and uint32_t.
Many beginners don’t like the “look”
of these types, preferring to use things
like int, long and unsigned long. As we
will soon see, however, the fixed-width
types can prove extremely useful. In
fact, you may remember that we used
the uint32_t type to define the colour
parameters in the utility functions we
created in our last two columns.
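A short sketch may help show what the fixed-width types buy us. It is written in plain C++ using only <stdint.h>, so it should build on a desktop compiler and in an Arduino sketch alike. The two function names are hypothetical, chosen for this illustration.

```cpp
#include <stdint.h>

// Fixed-width types are the same size wherever they are available.
static_assert(sizeof(int8_t) == 1 && sizeof(uint8_t) == 1, "8-bit types are one byte");
static_assert(sizeof(int16_t) == 2 && sizeof(uint16_t) == 2, "16-bit types are two bytes");
static_assert(sizeof(int32_t) == 4 && sizeof(uint32_t) == 4, "32-bit types are four bytes");

// Width of a plain int in bits. This is 16 on the Arduino Uno R3 but is
// typically 32 on a desktop machine, which is the portability trap.
int BitsInPlainInt() {
    return static_cast<int>(sizeof(int) * 8);
}

// An unsigned 8-bit value wraps from 255 back to 0 when incremented.
uint8_t WrapAfterMax() {
    uint8_t value = UINT8_MAX;                 // 255
    value = static_cast<uint8_t>(value + 1);   // wraps around
    return value;
}
```

The range limits quoted above are also available as the constants INT8_MIN (-128), INT8_MAX (127) and UINT8_MAX (255).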
Graphics utilities (lines)
Our existing SetRowColor() and
SetColColor() utility functions are essentially special cases of line-drawing
routines. We could replace both these
functions with something a little more
generic, like a DrawLine() function that
can draw both horizontal and vertical
lines with arbitrary lengths.
A line can be defined by two distinct
(x, y) endpoints that we might call (x1,
y1) and (x2, y2). Unfortunately, even
though we are restraining ourselves to
consider only horizontal and vertical
lines, we still have four possibilities,
as illustrated in Fig.3.
Remember that I’m visualising (and
manipulating) my array under the assumption that its origin, (0, 0), is in
the lower left-hand corner. The way
I’m planning to create my function is
that I will draw my lines from (x1, y1)
to (x2, y2). Since I wish to keep things
as simple as possible, I’m planning
on incrementing my x and y values
as required.
This limits me to drawing horizontal
lines, as in Fig.3(a), and vertical lines,
like in Fig.3(c).
What about the lines shown in
Fig.3(b) and Fig.3(d)? If we were desperate, then there are various approaches we could use to address these. Let’s
consider the horizontal case as an example. We could start by testing to
see if x1 < x2, in which case we have
a Fig.3(a) situation that we can draw
by incrementing from x1 to x2.
However, if x1 > x2, then we have
a Fig.3(b) scenario that we can draw
by decrementing from x1 to x2 (or we
could swap our x1 and x2 values and
then return to incrementing from x1
to x2).
The problem with doing things this
way is that we are increasing the complexity of our function and the number
of tasks it needs to perform, which (not
surprisingly) will increase the time to
perform them.
An alternative approach, which is what I’m going to do, is to say that our function will only handle horizontal lines where x1 ≤ x2 and vertical lines where y1 ≤ y2. The advantage of using ≤ rather than < is that, as you will see, our routine will also draw zero-length (1-pixel) lines in which x1 = x2 and y1 = y2.
Just to make things official (and cover
our posteriors), we will use a common
programmer’s ‘get out of jail free’ card,
meaning that if x1 > x2 and/or y1 > y2,
the output from and actions performed
by this function will be undefined.
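One way such a restricted line routine might be written is sketched below. It is not necessarily how the column's DrawLine() is implemented. It assumes 8-bit coordinates and a 32-bit colour, and it replaces the real LED driver with a hypothetical in-memory Frame array so the logic can be tried on a desktop compiler. The DrawPixel() shown here is likewise only a stand-in.

```cpp
#include <stdint.h>

#define NUM_COLS 14
#define NUM_ROWS 10

// Hypothetical stand-in for the LED array: Frame[y][x], origin at (0, 0).
uint32_t Frame[NUM_ROWS][NUM_COLS] = {};

// Stand-in for the column's DrawPixel(); ignores out-of-range coordinates.
void DrawPixel(uint8_t x, uint8_t y, uint32_t color) {
    if (x < NUM_COLS && y < NUM_ROWS) {
        Frame[y][x] = color;
    }
}

// Draws a horizontal line (y1 == y2, x1 <= x2) or a vertical line
// (x1 == x2, y1 <= y2) by incrementing only. When both endpoints are the
// same, exactly one pixel is drawn. Other inputs are outside the contract.
void DrawLine(uint8_t x1, uint8_t y1, uint8_t x2, uint8_t y2, uint32_t color) {
    for (uint8_t y = y1; y <= y2; y++) {
        for (uint8_t x = x1; x <= x2; x++) {
            DrawPixel(x, y, color);
        }
    }
}
```

If x1 > x2 or y1 > y2, the loops simply never execute, so nothing is drawn in this particular sketch.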
This is going to be a lot less scary
than you might imagine. I just created
a little test program (in the file named
CB-May25-Code-03.txt). Our DrawLine()
function appears between lines 83 and
92. If we exclude the function declaration and all the { } squiggly brackets,
there are only three lines of code, as shown in Listing 3(a).
Listing 3(a): our new DrawLine() function, along with the renamed DrawPixel().
This function has five parameters. The first four represent the ends of the line in terms of (x1, y1) and (x2, y2) values, while the fifth is the colour we wish to use. Observe that we are making use of our new fixed-width data types (the reason for this will become obvious later in this column).
Also observe the call to the DrawPixel() function on line 89. It’s just a new incarnation of our old SetPixelColorXY() function. I renamed this little scamp to make it consistent with the collection of graphics utility functions we are creating.
For my first test, I changed the loop() function to repeatedly draw and erase a red line between (3, 4) and (8, 4), as shown in Listing 3(b).
Listing 3(b): our first line drawing test code.
After triumphantly watching my first line flash for about an hour (I’m easily amused), I experimented with a vertical line and a few edge conditions, just to make sure everything was tickety-boo. That all went well, so I’m happy.
Still, before we proceed to the next topic, remember that I’ve implemented my DrawLine() function assuming that (0, 0) is in the bottom-left corner of the array. If you’ve created your array with the origin in the top-left corner, for example, you will have to implement your DrawLine() function accordingly.
Graphics utilities (rectangles)
Fig.4: four rectangle specification possibilities. Panels (a) to (d) each show a rectangle defined by corner points (x1, y1) and (x2, y2) relative to the origin (0, 0).
Fig.5: starting with a quick visual. The 14 × 10 grid has columns 0 to 13 (NUM_COLS = 14, so 13 = NUM_COLS – 1 and 11 = NUM_COLS – 3) and rows 0 to 9 (NUM_ROWS = 10, so 9 = NUM_ROWS – 1 and 7 = NUM_ROWS – 3).
Another utility function that might
come in handy is the ability to draw
rectangles, which encompass squares
as a subset. We can define the outer
limits of a rectangle with two distinct
(x, y) points that we might call (x1, y1)
and (x2, y2). As for a line, there are
four possibilities illustrated in Fig.4.
The same arguments we discussed
for drawing lines apply to drawing rectangles. To cut a long story short (the
opposite of how I like to do things),
our function is going to implement
only the option reflected in Fig.4(a).
Actually, we are going to create a
small suite of rectangle drawing routines. We’ll start with one we’ll call
DrawRectOutline(), which will draw
only the outline of the rectangle. There
are lots of ways we could do this.
The way I chose to go is captured in
a test program (in the file named CBMay25-Code-04.txt) and illustrated in
Listing 4(a).
Like our line-drawing function, this
has five parameters. The first four represent the corners of our rectangle,
while the fifth is the colour we wish
to use. We won’t dissect this function
here, except to note it places four calls
to our DrawLine() function to draw the
edges to our rectangle.
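The four-call structure described above can be sketched as follows. As before, this is a host-side illustration rather than the column's listing: Frame, DrawPixel() and this DrawLine() are hypothetical stand-ins, and the order in which the four edges are drawn is an arbitrary choice.

```cpp
#include <stdint.h>

#define NUM_COLS 14
#define NUM_ROWS 10

// Hypothetical stand-in for the LED array: Frame[y][x], origin bottom-left.
uint32_t Frame[NUM_ROWS][NUM_COLS] = {};

void DrawPixel(uint8_t x, uint8_t y, uint32_t color) {
    if (x < NUM_COLS && y < NUM_ROWS) {
        Frame[y][x] = color;
    }
}

// Horizontal or vertical line, assuming x1 <= x2 and y1 <= y2.
void DrawLine(uint8_t x1, uint8_t y1, uint8_t x2, uint8_t y2, uint32_t color) {
    for (uint8_t y = y1; y <= y2; y++) {
        for (uint8_t x = x1; x <= x2; x++) {
            DrawPixel(x, y, color);
        }
    }
}

// Outline only, for the Fig.4(a) case: (x1, y1) is the bottom-left corner
// and (x2, y2) is the top-right corner.
void DrawRectOutline(uint8_t x1, uint8_t y1, uint8_t x2, uint8_t y2, uint32_t color) {
    DrawLine(x1, y1, x2, y1, color);  // bottom edge
    DrawLine(x1, y2, x2, y2, color);  // top edge
    DrawLine(x1, y1, x1, y2, color);  // left edge
    DrawLine(x2, y1, x2, y2, color);  // right edge
}
```

The corner pixels are each written twice, which is harmless here because both writes use the same colour.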
For my first test, I modified the loop()
function to repeatedly draw and erase
a red rectangle outline between (3, 3)
and (9, 7), as shown in Listing 4(b).
Once I’d verified that this worked as planned, I tweaked the corners of the rectangle to check that our algorithm handles edge conditions.
These tests included extreme cases like
a one-line thick rectangle, and even a
single-pixel rectangle.
Random rectangles
Let’s step things up a bit. We will
now repeatedly draw (and then clear)
rectangles with random sizes and colours. They can appear anywhere on
our array. The only rule to enforce is
that there must be at least one blank
pixel inside the rectangle’s outline. I
find that sketching out a quick visual representation helps me wrap my brain around what I need to do (Fig.5).
Listing 4(a): our DrawRectOutline() function is pretty straightforward.
Listing 4(b): the rectangle outline drawing test.
We will start by generating our (x1,
y1) values, followed by our (x2, y2)
values. While this may seem the obvious choice, there’s nothing to stop us
from doing things the other way round.
As our array is 14 × 10 (width ×
height) pixels in size, we’ve defined
NUM_ROWS and NUM_COLS in our
programs to be 10 and 14, respectively.
The green and blue squares in Fig.5
represent a couple of worst-case (well,
boundary) conditions. From these,
knowing the size of our array, we can
say that our x1 value can be any integer
between 0 and 11. Once we’ve generated our x1 value, we know that our
x2 value can be any integer between
(x1 + 2) and 13.
Similarly, we can say that our y1
value can be any integer between 0 and
7. Once we’ve generated our y1 value,
we know our y2 value can be any integer between (y1 + 2) and 9.
Rather than using hard-coded values
in our code, we will future-proof things
by basing any equations on our NUM_
ROWS and NUM_COLS definitions on
the off-chance we decide to change the
size of our array one day.
Bearing all this in mind, I created a new version of our program with a modified version of the loop() function (in the file named CB-May25-Code-05.txt), shown in Listing 5(a).
We are using the Arduino’s random() function to generate our random values. In the form we use here, this function accepts two arguments that we might call min and max.
Observe line 68. Why are we using
a max value of (NUM_COLS – 2) instead of the (NUM_COLS – 3) value
depicted in Fig.5 (and similarly for
line 69)? Similarly, on line 71, we are
using a max value of NUM_COLS instead of the (NUM_COLS – 1) value
shown in Fig.5.
Well, with the Arduino’s random()
function, the min value is inclusive, while
the max value is exclusive. This means
the function will generate a random
integer between min and (max – 1).
There is a reason for this, although it
doesn’t particularly suit us here.
So, when we pass a max value of NUM_COLS, the largest value the random() function can return is (NUM_COLS – 1), which is the value we wish to use. Similarly, passing a max value of (NUM_COLS – 2) makes the largest possible result (NUM_COLS – 3).
One final point to consider is that
if max is equal to (min + 1), we are essentially asking the function to generate a random value between min and
min, which leaves it no option but to
return a value of min.
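The coordinate ranges worked out above can be sketched on a desktop compiler as follows. RandomRange() is a hypothetical stand-in that mimics the inclusive-min, exclusive-max behaviour of the Arduino's random(min, max) using the C library's rand(). The Rect structure and RandomRect() are likewise names invented for this illustration.

```cpp
#include <stdint.h>
#include <stdlib.h>

#define NUM_COLS 14
#define NUM_ROWS 10

// Stand-in for random(min, max): min is inclusive, max is exclusive.
// If there is no room between them, min is returned.
long RandomRange(long min, long max) {
    if (min >= max) {
        return min;
    }
    return min + (rand() % (max - min));
}

struct Rect {
    uint8_t x1, y1, x2, y2;
};

// Generates a rectangle with at least one blank pixel inside its outline.
Rect RandomRect() {
    Rect r;
    r.x1 = static_cast<uint8_t>(RandomRange(0, NUM_COLS - 2));    // 0 to 11
    r.y1 = static_cast<uint8_t>(RandomRange(0, NUM_ROWS - 2));    // 0 to 7
    r.x2 = static_cast<uint8_t>(RandomRange(r.x1 + 2, NUM_COLS)); // x1 + 2 to 13
    r.y2 = static_cast<uint8_t>(RandomRange(r.y1 + 2, NUM_ROWS)); // y1 + 2 to 9
    return r;
}
```

When x1 happens to be 11, the call for x2 becomes RandomRange(13, 14), which can only return 13, as noted in the final point above.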
I just posted a video on YouTube
of this program running on my array
(https://pemag.au/link/ac4w).
I’m sorry for the poor quality
of this video. Even though I’m
running each active pixel at only
12.5% of its full-on brightness,
that’s still enough to saturate my
camera. The colours look much
more vibrant in real life.
Listing 5(a): drawing random rectangles.
Yet more rectangles
Now let’s add a DrawRectSolid() function, which will draw a rectangle filled with the same colour as its outline. This is getting easier as we go. The new version of our program containing the DrawRectSolid() function is named CB-May25-Code-06.txt and the new function is shown in Listing 6(a).
Listing 6(a): our DrawRectSolid() function.
All we are doing is using our DrawLine() routine to draw a series of horizontal lines. In the loop() function, we exchange the two calls to the DrawRectOutline() function for calls to our new DrawRectSolid() function.
Let’s also create a DrawRectFilled() function, which will draw a rectangle filled with a different colour from its outline. The code with this added is named CB-May25-Code-07.txt and the new function is illustrated in Listing 7(a).
Listing 7(a): our DrawRectFilled() function.
We now have two colour parameters, called outlineColor and fillColor. This function makes calls to our existing DrawRectOutline() and DrawRectSolid() functions. In turn, they call our DrawLine() function, which calls our DrawPixel() function. At the conceptual ‘bottom of the pile’, our DrawPixel() function calls our GetNeoNum() function (we’ll see why this is important shortly).
Listing 7(b): our updated loop() function for drawing random boxes on the 'screen'.
Fig.6: trade-offs of limited MCU resources (a triangle whose vertices are clock cycles, data memory (SRAM) and program memory (flash)).
We are basing each new function
on one or more existing functions that
have already been tested and verified,
thereby making our lives a lot easier.
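The layering described above might look something like the sketch below. It is a host-side illustration, not the column's listings: Frame and DrawPixel() are hypothetical stand-ins for the real LED driver, and drawing the fill first and then the outline on top is just one plausible way to combine the two existing functions.

```cpp
#include <stdint.h>

#define NUM_COLS 14
#define NUM_ROWS 10

// Hypothetical stand-in for the LED array: Frame[y][x], origin bottom-left.
uint32_t Frame[NUM_ROWS][NUM_COLS] = {};

void DrawPixel(uint8_t x, uint8_t y, uint32_t color) {
    if (x < NUM_COLS && y < NUM_ROWS) {
        Frame[y][x] = color;
    }
}

// Horizontal or vertical line, assuming x1 <= x2 and y1 <= y2.
void DrawLine(uint8_t x1, uint8_t y1, uint8_t x2, uint8_t y2, uint32_t color) {
    for (uint8_t y = y1; y <= y2; y++) {
        for (uint8_t x = x1; x <= x2; x++) {
            DrawPixel(x, y, color);
        }
    }
}

void DrawRectOutline(uint8_t x1, uint8_t y1, uint8_t x2, uint8_t y2, uint32_t color) {
    DrawLine(x1, y1, x2, y1, color);  // bottom edge
    DrawLine(x1, y2, x2, y2, color);  // top edge
    DrawLine(x1, y1, x1, y2, color);  // left edge
    DrawLine(x2, y1, x2, y2, color);  // right edge
}

// Same colour throughout: a stack of horizontal lines, one per row.
void DrawRectSolid(uint8_t x1, uint8_t y1, uint8_t x2, uint8_t y2, uint32_t color) {
    for (uint8_t y = y1; y <= y2; y++) {
        DrawLine(x1, y, x2, y, color);
    }
}

// Different outline and fill: paint the fill, then the outline over it.
void DrawRectFilled(uint8_t x1, uint8_t y1, uint8_t x2, uint8_t y2,
                    uint32_t outlineColor, uint32_t fillColor) {
    DrawRectSolid(x1, y1, x2, y2, fillColor);
    DrawRectOutline(x1, y1, x2, y2, outlineColor);
}
```

Each function adds only a line or two on top of routines that already work, which is the point being made about building on tested code.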
Our updated loop() function is illustrated in Listing 7(b). Observe lines
82 and 86, where we’ve swapped out
our previous calls to the DrawRectSolid() function with calls to our new
DrawRectFilled() function and added
the second colour argument to each.
I decided to use complementary
colour pairs for the outline and fill
colours. On line 76, we generate a
random index value between 0 and
11, then on line 77, we generate the
index value for the complementary
colour to this using the equation we
described earlier. All I can say is that
the result looks “awesome” (and you
can quote me on that)!
Resource limitations
The rest of this column is going to be a bit heavy, so we will take things step-by-step and simplify things a little, in that we will consider only mainstream microcontroller and compiler implementations. Otherwise, we will find ourselves going down a rabbit hole.
A microcontroller unit (MCU) is driven by a system clock signal. This may range from a few kilohertz (thousands of hertz) to hundreds of megahertz (hundreds of millions of hertz), where 1Hz (hertz) is one cycle per second.
Generally, we want our programs
to run as fast as they can. Sometimes,
however, we need to make trade-offs,
because doing something fast might
consume more memory.
One way to visualise this is as a triangle with the vertices representing
clock cycles, data memory (SRAM) and
program memory (flash), as in Fig.6.
All these are finite resources—we can’t
exceed the maximum of any of them.
All MCUs contain an arithmetic logic
unit (ALU) that can perform simple
operations on integers. These include
logical operations like AND, OR and
XOR, mathematical operations like
ADD and SUB (subtract), and other
operations like CMP (compare).
CMP may be thought of as a hybrid
instruction that combines logical and
mathematical attributes. It subtracts
two values, which is a mathematical operation, but it doesn’t store the
result. Instead, it updates the processor’s status flags, like the zero, carry,
negative and overflow flags.
The processor subsequently employs
logical operations in conjunction with
these flags to make determinations (is
A equal to B) and decisions (if A is
equal to B, then…).
Some microcontrollers have 8-bit
(one byte) data paths and ALUs that
operate on such values. Others have
16-bit (two byte) data paths and ALUs.
Some have 32-bit (four byte) data paths
and ALUs, while a few go beyond that.
If your integers are x bits wide and
you are running on an x-bit machine,
performing an operation like AND or
ADD will consume only a single clock
cycle. However, if your integers are 16
bits wide and you are running on an 8-bit
machine, then performing an operation
like AND or ADD will usually consume
two clock cycles (one for each byte).
As previously discussed, the Arduino
Uno R3 has an int of 16 bits (two bytes)
and a long of 32 bits (four bytes), but
the CPU is an 8-bit model, so operations like AND or ADD on these types
will take two and four clock cycles,
respectively.
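To picture why a 16-bit addition costs two steps on an 8-bit machine, the sketch below performs one using only byte-sized pieces: a low-byte add followed by a high-byte add that includes the carry (on the AVR, these correspond to the ADD and ADC instructions). Add16ViaBytes() is a hypothetical name, and this is an illustration of the idea rather than what the compiler literally emits.

```cpp
#include <stdint.h>

// Adds two 16-bit values one byte at a time, as an 8-bit ALU must.
uint16_t Add16ViaBytes(uint16_t a, uint16_t b) {
    uint8_t aLo = static_cast<uint8_t>(a & 0xFF);
    uint8_t aHi = static_cast<uint8_t>(a >> 8);
    uint8_t bLo = static_cast<uint8_t>(b & 0xFF);
    uint8_t bHi = static_cast<uint8_t>(b >> 8);

    // Step 1: add the low bytes and note whether a carry was produced.
    uint16_t lowSum = static_cast<uint16_t>(aLo + bLo);
    uint8_t carry = static_cast<uint8_t>(lowSum >> 8);

    // Step 2: add the high bytes plus the carry from step 1.
    uint8_t hi = static_cast<uint8_t>(aHi + bHi + carry);

    return static_cast<uint16_t>((static_cast<uint16_t>(hi) << 8) | (lowSum & 0xFF));
}
```

A 32-bit addition follows the same pattern with four byte-sized steps, with the carry rippling from each byte to the next.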
Instructions like MUL (multiply)
and DIV (divide) can be implemented
using lots of simpler operations, like
shift, add and subtract. To speed things
up, some MCUs contain a hardware
multiplier and some contain both a
multiplier and a divider. Others have
to resort to software routines to do all
the work (which is slower).
The ATmega328P microcontroller used in the Arduino Uno includes an 8-bit hardware multiplier. This allows it to perform an 8-bit by 8-bit multiplication and store the 16-bit result in two 8-bit registers, all in two clock cycles.
Multiplying two 16-bit integers on
the ATmega328P requires breaking the
operation into multiple 8-bit by 8-bit
multiplications. The complete 16-bit
multiplication typically takes around
10–12 clock cycles.
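The decomposition can be sketched as shown below. When only the low 16 bits of the result are kept, three 8-bit by 8-bit products are enough, because the product of the two high bytes lies entirely above bit 15. Mul16ViaBytes() is a hypothetical name, and the compiler's actual instruction sequence may differ.

```cpp
#include <stdint.h>

// Multiplies two 16-bit values using only 8-bit by 8-bit products,
// keeping the low 16 bits of the result (as a uint16_t multiply does).
uint16_t Mul16ViaBytes(uint16_t a, uint16_t b) {
    uint8_t aLo = static_cast<uint8_t>(a & 0xFF);
    uint8_t aHi = static_cast<uint8_t>(a >> 8);
    uint8_t bLo = static_cast<uint8_t>(b & 0xFF);
    uint8_t bHi = static_cast<uint8_t>(b >> 8);

    // Each of these is one 8-bit by 8-bit multiply giving a 16-bit result.
    uint16_t lowProduct = static_cast<uint16_t>(static_cast<uint16_t>(aLo) * bLo);
    uint16_t cross1 = static_cast<uint16_t>(static_cast<uint16_t>(aHi) * bLo);
    uint16_t cross2 = static_cast<uint16_t>(static_cast<uint16_t>(aLo) * bHi);

    // The cross products are weighted by 256; aHi * bHi is weighted by
    // 65536, so it cannot affect a 16-bit result and is omitted.
    uint16_t crossSum = static_cast<uint16_t>(cross1 + cross2);
    return static_cast<uint16_t>(lowProduct + static_cast<uint16_t>(crossSum << 8));
}
```

Three multiplies at two cycles each, plus the shifts and additions needed to combine them, is consistent with the figure of roughly 10 to 12 cycles quoted above.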
The ATmega328P does not have a
hardware divider, meaning division
must be implemented in software using
repeated subtraction, bit-shifting, or
some other algorithm. As a result, an
8-bit ÷ 8-bit division can consume
40–80 clock cycles, while a 16-bit ÷
16-bit division can suck up 250–600
clock cycles (depending on the operand values etc).
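To give a feel for where those cycles go, here is a sketch of one classic software approach, shift-and-subtract division, for 8-bit values. It is not the routine the Arduino toolchain actually uses; Div8ShiftSubtract() is a hypothetical name, and the divisor is assumed to be non-zero.

```cpp
#include <stdint.h>

// Divides two 8-bit values using only shifts, compares and subtractions.
// Assumes divisor != 0. Returns the quotient; the remainder is discarded.
uint8_t Div8ShiftSubtract(uint8_t dividend, uint8_t divisor) {
    uint8_t quotient = 0;
    uint16_t remainder = 0;

    // Work through the dividend one bit at a time, most significant first.
    for (int8_t bit = 7; bit >= 0; bit--) {
        // Bring the next dividend bit down into the running remainder.
        remainder = static_cast<uint16_t>((remainder << 1) | ((dividend >> bit) & 1));

        // If the divisor fits, subtract it and set this quotient bit.
        if (remainder >= divisor) {
            remainder = static_cast<uint16_t>(remainder - divisor);
            quotient = static_cast<uint8_t>(quotient | (1 << bit));
        }
    }
    return quotient;
}
```

Eight passes through a loop containing a shift, a compare, a branch and sometimes a subtraction add up quickly, which is why division is so much more expensive than addition or multiplication on this device.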
Worst of all are modulo operations. Since performing a modulo requires a division operation plus a multiplication and subtraction, it takes a bit longer than division.
Rooting around the ATmega328P’s data sheet (https://pemag.au/link/ac4x), coupled with a little investigative reporting, reveals some interesting numerical factoids that I’ve summarised in Fig.7.
[Fig.7 lists clock cycles per operation as: operation: 8-bit values / 16-bit values]
ADD (+), SUB (–): 1 / 2
MUL (*): 2 / 10 to 12
DIV (/): 40 to 80 / 250 to 600
MOD (%): 50 to 90 / 300 to 600
AND (&), OR (|), XOR (^): 1 / 2
SHL (<<), SHR (>>): 1 / 2
CMP: 1 / 2
If (a <condition> b) then...else: 2 to 3 / 3 to 4
Fig.7: instructions vs. clock cycles on the ATmega328.
An if() statement requires a CMP operation followed by a conditional branch. The conditions can include ==, !=, <, >, <= and >=. The conditional branch will consume one clock cycle if the branch is not taken or two if it is.
Testing times
As we noted, the ATmega328P microcontroller powering the Arduino Uno R3 is an 8-bit device. It has 2KiB of RAM (data memory) and 32KiB of flash (program memory). The system clock runs at 16MHz by default. These numbers establish the boundaries of our resource triangle in Fig.6.
The utility function at the conceptual ‘bottom of the pile’ (the one that’s ultimately called by all our other graphics utility functions) is our old friend, GetNeoNum(). As you may recall, we originally implemented this using 16-bit integer data types. We also used the modulo operator. Based on the table shown in Fig.7, I think we are justified in saying, “Eeek!”.
I just created a simple test program (in the file named CB-May25-Code-08.txt) that has everything stripped out apart from key definitions, instantiations, global variables and our setup(), loop() and GetNeoNum() functions. As a reminder, the GetNeoNum() function is shown in Listing 8(a). The rationale behind the overall architecture of this function was discussed in the March 2025 issue.
Listing 8(a): our original GetNeoNum() using ints.
When we first created this function, the way it performed seemed quite spiffy. With our newfound knowledge, we might start looking at it with a more jaundiced eye.
Although we might turn our noses up at a 16MHz clock, that’s still sixteen million clock cycles in every second. That’s a lot of clock cycles and, to be honest, I haven’t noticed any lag in the response of my array.
On the other hand, anything
we can do that manages to both minimise memory and increase performance
must be a good thing. If we do this now,
we will reap the benefits in the future
when we create our games.
Except for the addition of a global
integer variable called CycleNum that
is initialised to 0, there’s nothing in the
definitions and suchlike that we haven’t
seen before. The test itself is found in the
loop() function as seen in Listing 8(b).
The bulk of the action takes place
in the nested loops between lines 67
and 75. This is where we cycle through
every pixel in the array, setting the
pixel colours.
What’s with line 72, where we generate the colour? In the past, I’ve had
tussles when trying to benchmark my
code with the Arduino’s compiler. Besides regular cunning optimisations, if
it thinks that any of the things in your
code aren’t doing anything useful,
there’s a chance it will eliminate them.
So, I’m using line 72 to generate a
varying index that will access a subset
of colours (0 to 7) in our colour wheel,
where the colours will change every
time we cycle through the loop.
Listing 8(b): our test sequence using ints.
As an aside, if you are interested in learning more about compilers, there’s an awesome book called Writing a C Compiler by Nora Sandler, published by No Starch Press (https://nostarch.com/writing-c-compiler).
However, be advised that this is a meaty
morsel that will make your brain ache
(well, it did mine).
The Arduino’s micros() function returns the number of microseconds (millionths of a second) since the program
started running. We call this function
on lines 65 and 77, and use the returned
values in line 78 to determine the time
taken to cycle through all the pixels.
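The subtraction used to turn two micros() readings into an elapsed time can be sketched as follows. The function name ElapsedMicros() and its parameter names are hypothetical (they are not taken from Listing 8(b)). The sketch assumes the readings are held in 32-bit unsigned variables, which matches the unsigned long that micros() returns on the Uno.

```cpp
#include <stdint.h>

// Hypothetical helper: elapsed time between two micros()-style readings.
// Unsigned subtraction wraps modulo 2^32, so the answer is still correct
// if the microsecond counter rolled over between the two readings.
uint32_t ElapsedMicros(uint32_t startTime, uint32_t endTime)
{
    return endTime - startTime;
}
```

On the Arduino itself, this would be used along the lines of taking startTime = micros() before the nested loops and calling ElapsedMicros(startTime, micros()) after them. According to the Arduino documentation, the micros() counter rolls over after approximately 70 minutes and, on 16MHz boards, has a resolution of four microseconds, so readings will always come out as multiples of four.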
Oodles of optimisations
We’ll start by generating an ‘as-is’
baseline (it can only get faster from
here). When we look at the instruction
clock cycle combos in Fig.7, coupled
with our GetNeoNum() function in Listing 8(a), we see that there’s an elephant
in the room and a fly in the ointment (I
never metaphor I didn’t like). That is
our use of the modulo operator on line
97, where we wrote, if ((y % 2) == 0).
All we are using this operator for is
to determine if we are dealing with an
odd or even row. Essentially, we are
dividing the y value (the row number)
by 2 and looking at the remainder (the
bit that ‘drops off the end’).
However, all we really need to do is
check to see if the least-significant bit
of the y value is 0 or 1. We can achieve
the same effect by saying if ((y & 1) == 0).
Since we are currently working with
16-bit integers, our & operation will
consume 2 clock cycles, as compared
to the % operation which consumed a
whopping 300–600 clock cycles.
I just made this change in the file
named CB-May25-Code-09.txt.
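For anyone who wants to convince themselves that the two tests really are interchangeable, here is a minimal sketch. The function names are hypothetical (in our actual code, the test lives inside GetNeoNum()), and I’m using int16_t to mirror the 16-bit int on the Uno.

```cpp
#include <stdint.h>

// Odd/even row test as originally written, using the modulo operator.
bool IsEvenRowUsingMod(int16_t y)
{
    return (y % 2) == 0;
}

// The same test performed by looking only at the least-significant bit.
bool IsEvenRowUsingAnd(int16_t y)
{
    return (y & 1) == 0;
}

// Hypothetical self-check: do both tests agree for rows 0 to numRows - 1?
bool RowTestsAgree(int16_t numRows)
{
    for (int16_t y = 0; y < numRows; y++) {
        if (IsEvenRowUsingMod(y) != IsEvenRowUsingAnd(y)) {
            return false;
        }
    }
    return true;
}
```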
Next, we can change all of our int variables (16 bits) into fixed-width int8_t
data types (8 bits). This includes the
return type of the GetNeoNum() function itself. I did that in the file named
CB-May25-Code-10.txt.
As seen in Fig.8, the results are… interesting. Remember that 1µs is a millionth of a second and our system clock
is running at 16MHz, so that means
we have 16 clock cycles in each 1µs.
Our baseline is… well, what it is. We
can’t really draw any conclusions at
this stage. The reason there’s a range of
time values is that the code performs
different combinations of operations
on each pass through the main loop.
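Converting between the two units is simple enough, but it’s easy to mix them up when reading the results. Here is a minimal sketch, assuming the Uno’s default 16MHz clock. The constant and function names are hypothetical.

```cpp
#include <stdint.h>

// 16MHz system clock = 16 clock cycles in every microsecond.
const uint32_t CLOCK_CYCLES_PER_US = 16;

// Hypothetical helper: convert a time in microseconds to clock cycles.
uint32_t MicrosToCycles(uint32_t microseconds)
{
    return microseconds * CLOCK_CYCLES_PER_US;
}
```

For example, a baseline pass of 1040µs corresponds to 1040 × 16 = 16,640 clock cycles.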
Now look at the results from our first
optimisation attempt. What happens
when we replace the ((y % 2) == 0)
with ((y & 1) == 0) in our GetNeoNum()
function? The answer is… absolutely
nothing! How can this be after the big
buildup I’ve given you?
The answer is in the Arduino’s clever
compiler. When it saw the ((y % 2) ==
0) in our original code, it said “Ah ha!”
to itself. Since the compiler could see
that this operation never changed, it
replaced it with ((y & 1) == 0) without
our knowing. As a result, our first two
tests are identical (which explains the
identical results).
Our next test, where we replace
16-bit int variables with their int8_t
counterparts, does provide some improvements in both program size and
runtime (phew!).
I really wanted to see what effect
the % operator would have. So, I added
one more test (in the file named
CB-May25-Code-11.txt). On line 72 in the
main loop() function, I replaced our
original (x + y + CycleNum) & 7 with
(x + y + CycleNum) % 8, which will
give the same functional result.
Since the values associated with this
operation change each time through the
loop, the compiler has no choice but
to implement the % operator as written. Wow! The code has increased by
68 bytes (not too bad), but the processing time has increased by a whopping
1612 to 1640µs (that’s roughly 26,000
clock cycles)! And this is
with our new 8-bit integers.
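Here is a minimal sketch of the two forms of the index calculation side by side. The function and parameter names are hypothetical (in the test program, the expression sits inline in loop()), and I’m assuming int8_t operands to match the latest version of the code. The two forms agree for the non-negative values we are using. They differ for negative signed values, which is one reason a compiler may not be free to swap one for the other without doing extra work.

```cpp
#include <stdint.h>

// Colour index using a bitwise AND to keep the low three bits (0 to 7).
int8_t ColorIndexUsingAnd(int8_t x, int8_t y, int8_t cycleNum)
{
    return (int8_t)((x + y + cycleNum) & 7);
}

// Colour index using the modulo operator. In C++, % truncates towards
// zero, so a negative sum gives a negative (or zero) result.
int8_t ColorIndexUsingMod(int8_t x, int8_t y, int8_t cycleNum)
{
    return (int8_t)((x + y + cycleNum) % 8);
}
```

With x = 3, y = 4 and cycleNum = 5, both functions return 4. With a sum of -1, the AND version returns 7 while the modulo version returns -1.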
Now I want to know how bad it would
be with our original 16-bit integers (I
will run that test and share the results
in a future column). Also, I have some
ideas for more optimisations. Why don’t
you noodle on this yourself and we’ll
compare notes later?
P = Program, D = Data, B = Bytes
Test                           P-Storage (B)    D-Storage (B)    Time (µs)
Baseline                       4178             308              1040 to 1044
Replace % with &               4178             308              1040 to 1044
Replace int with int8_t        4140             308              968 to 972
Add a % in loop() function     4208             308              2580 to 2612
Fig.8: the results of my attempts at speed optimisation of the code.

Tricky triadics
Earlier, I tasked you with working out the equations we could use to generate triadic colours as illustrated in Fig.2(c). When you think about it, this is not dissimilar to the way we generated our split complementary colours.
We can determine the index values of our two triadic colours using the expressions (i + (NUM_COLORS / 3)) % NUM_COLORS and (i + ((NUM_COLORS * 2) / 3)) % NUM_COLORS.
From our previous discussions, we
already know that, starting with our
original colour (cyan in this example), we always want to generate the
index values for our new colours by
travelling clockwise around our colour
wheel and then using the % (modulo)
operator to restrict the results to our 0
to 11 allowable index values.
Based on this, the first of these
equations is relatively obvious. We
start by adding (NUM_COLORS / 3) to
our current index value, which will
take us ⅓ (120°) clockwise around
the colour wheel.
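The two triadic expressions can be wrapped up as a pair of small functions, as in the following minimal sketch. The function names are hypothetical, and I’m assuming NUM_COLORS is defined as 12 to match our 0 to 11 index values (how it is declared in the real code may differ).

```cpp
#include <stdint.h>

// Assumption: 12 colours on the wheel, indexed 0 to 11.
const int NUM_COLORS = 12;

// Index of the colour one third (120 degrees) clockwise from index i.
int TriadicIndex1(int i)
{
    return (i + (NUM_COLORS / 3)) % NUM_COLORS;
}

// Index of the colour two thirds (240 degrees) clockwise from index i.
int TriadicIndex2(int i)
{
    return (i + ((NUM_COLORS * 2) / 3)) % NUM_COLORS;
}
```

Starting from an index of 6, for example, these return 10 and 2. The second result shows the % operator wrapping 14 back around to 2.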
To generate the second value, we
need to travel ⅔ (240°) around the
colour wheel in the clockwise direction. Our knee jerk reaction might be
to calculate this as ((NUM_COLORS /
3) * 2); that is, divide by three to generate the 120° value, then multiply by
two to generate the desired 240° value.
Instead, I decided to use ((NUM_
COLORS * 2) / 3), multiplying by two
to generate a 720° value and then dividing by three to generate the desired
240° value. Why? A divide followed
by a multiply returns the same result
as a multiply followed by a divide…
or does it?
The problem is that integer divisions in C/C++ discard (truncate) any
remainder. In our current implementation, we have 12 colours, so either
scenario will work because 12 can be
divided by 3 without any remainder.
For the sake of argument, though,
suppose we decided to use 14 colours in the future. In this case, 14 ÷ 3
= 4.66, which would be truncated to
4 and 4 × 2 = 8. By comparison, 14 ×
2 = 28 and 28 ÷ 3 = 9.33, which will
be truncated to 9.
Nine is closer to where we want to
be than eight. The point is that, when
working with integers, we typically
try to perform multiplications (that
make things bigger) before divisions
(that make things smaller) so we don’t
lose too much information during intermediate steps.
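The difference between the two orderings is easy to demonstrate with a minimal sketch. The function names are hypothetical; the arithmetic is exactly that described above.

```cpp
// Divide first, then multiply: the remainder is thrown away early.
int TwoThirdsDivideFirst(int numColors)
{
    return (numColors / 3) * 2;
}

// Multiply first, then divide: truncation happens only at the end.
int TwoThirdsMultiplyFirst(int numColors)
{
    return (numColors * 2) / 3;
}
```

With 12 colours, both functions return 8. With 14 colours, the divide-first version returns 8 while the multiply-first version returns 9. One thing to watch when multiplying first is that the intermediate value must still fit in the data type being used.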
Blushing in shame!
It’s happened again. I said that we
were going to perform some more experiments with our 8-segment bar graph
display, but we simply don’t have the
space or time. We will do this next
month, I promise!
Over to you. If you have any thoughts
that you’d care to share on anything
you’ve read here, please feel free to
contact me at max<at>clivemaxfield.com.
Until next time, make sure you have
a good one!
Useful Bits and Pieces from Our Arduino Bootcamp series
Arduino Uno R3 microcontroller module: https://pemag.au/link/ac2g
Solderless breadboard: https://amzn.to/3O2L3e8
8-inch (20cm) jumper wires (male-to-male): https://amzn.to/3O4hnxk
Long-tailed 0.1-inch (2.54mm) pitch header pins: https://pemag.au/link/ac2h
LEDs (assorted colours): https://amzn.to/3E7VAQE
Resistors (assorted values): https://amzn.to/3O4RvBt
Ceramic capacitors (assorted values): https://pemag.au/link/ac2i
16V 100µF electrolytic capacitors: https://pemag.au/link/ac2j
Momentary pushbutton switches: https://amzn.to/3Tk7Q87
Kit of popular SN74LS00 chips: https://pemag.au/link/ac2k
74HC595 8-bit shift registers: https://pemag.au/link/ac1n
Other stuff
Soldering guide: https://pemag.au/link/ac2d
Basic multimeter: https://pemag.au/link/ac2f
Components for Weird & Wonderful Projects, part 1
4-inch (10cm) jumper wires (optional): https://pemag.au/link/ac2l
8-segment DIP red LED bar graph displays: https://pemag.au/link/ac2c
Components for Weird & Wonderful Projects, part 2
144 tricolour LED strip (required): https://pemag.au/link/ac2s
Pushbuttons (assorted colours) (required): https://pemag.au/link/ac2t
7-Segment display modules (recommended): https://pemag.au/link/ac2u
2.1mm panel-mount barrel socket (optional): https://pemag.au/link/ac2v
2.1mm inline barrel plug (optional): https://pemag.au/link/ac2w
25-way D-sub panel-mount socket (optional): https://pemag.au/link/ac2x
25-way D-sub PCB-mount plug* (optional): https://pemag.au/link/ac2y
15-way D-sub panel-mount socket (optional): https://pemag.au/link/ac2z
15-way D-sub PCB-mount plug^ (optional): https://pemag.au/link/ac30
* one required for each game cartridge you build
^ one required for each auxiliary control panel you build
Components for Weird & Wonderful Projects, part 3
22 AWG multicore wire kit: https://pemag.au/link/ac3m
20 AWG multicore wire kit: https://pemag.au/link/ac3n
Components for Weird & Wonderful Projects, part 4
5V Buck converter module: https://pemag.au/link/ac2m
9V 2A power supply (optional): https://pemag.au/link/ac46
Bench power supply (optional): https://pemag.au/link/ac48
I have no idea why everyone calls me "Mad Max"!