Encoding Morse Code

In September 2022, I was developing firmware for ROM-2, a Romanian PocketQube, at ORION Space. An 8-bit microcontroller, the ATmega328P, served as the on-board computer; it had to handle the satellite's mission plans as well as a series of image packets from the camera payload during image transmission. This required us to be economical with the memory used by the on-board software, and the way telemetry was transmitted as Morse code was one of the places where memory could be saved.

The idea behind Morse code is to transmit pulses of continuous wave of varying lengths, representing dots and dashes (often called dits and dahs). It is with these dits and dahs that the characters of a message are encoded. Software that transmits Morse code must generate the dits and dahs corresponding to each character of the message. This correspondence between characters and their Morse codes needs to be stored somewhere in the software, which can easily be done with a look-up table. However, because of the memory constraint mentioned above, the problem can be stated as follows: pack the Morse code of each character into a single C/C++ char/uint8_t, so that it occupies only one byte (8 bits).

Morse code

Morse codes for letters, numbers, and special characters have been standardized by the International Telecommunication Union (ITU) as International Morse Code.

Alphabets

a .-	b -...	c -.-.	d -..	e .
f ..-.	g --.	h ....	i ..	j .---
k -.-	l .-..	m --	n -.	o ---
p .--.	q --.-	r .-.	s ...	t -
u ..-	v ...-	w .--	x -..-	y -.--
z --..

Numbers

1 .----	2 ..---	3 ...--	4 ....-	5 .....
6 -....	7 --...	8 ---..	9 ----.	0 -----

Special characters

, --..--	. .-.-.-	! -.-.--	: ---...	; -.-.-.
( -.--.	) -.--.-	" .-..-.	@ .--.-.	& .-...
? ..--..

It should be noted that there are more Morse codes than the ones listed above, and that not every character (let alone every Unicode code point) has its own Morse code.

The duration of a dit is the fundamental unit of time in Morse code; it, in turn, depends on the speed of transmission (usually measured in words per minute). Other time intervals are expressed in terms of the duration of a dit as shown below:

dah: 3 dits
between Morse signals: 1 dit
between letters: 3 dits
between words: 7 dits
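
The list above fixes every interval relative to the dit, but not the dit itself. Below is a minimal sketch of how the durations could be derived from a speed in words per minute. It assumes the common PARIS convention (one word counts as 50 dit units, so a dit lasts 1200 / WPM milliseconds), which is not stated in this article; the type and function names are hypothetical and not taken from the flight software.

```c
#include <stdint.h>

/* Durations in milliseconds. Names are illustrative only. */
typedef struct {
    uint16_t dit;         /* fundamental unit */
    uint16_t dah;         /* 3 dits */
    uint16_t signal_gap;  /* between Morse signals: 1 dit */
    uint16_t letter_gap;  /* between letters: 3 dits */
    uint16_t word_gap;    /* between words: 7 dits */
} MorseTiming;

/* Assumption: PARIS convention, i.e. dit = 1200 / wpm milliseconds.
 * A speed of 0 is treated as 1 WPM to avoid dividing by zero. */
MorseTiming morse_timing_from_wpm(uint8_t wpm)
{
    MorseTiming t;
    if (wpm == 0) {
        wpm = 1;
    }
    t.dit = (uint16_t)(1200u / wpm);
    t.dah = (uint16_t)(3u * t.dit);
    t.signal_gap = t.dit;
    t.letter_gap = (uint16_t)(3u * t.dit);
    t.word_gap = (uint16_t)(7u * t.dit);
    return t;
}
```

Under this assumption, 20 WPM gives a 60 ms dit, a 180 ms dah, and a 420 ms gap between words.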

Encoding Morse code

The length of a Morse code varies from one (e.g. e and t) to six (e.g. punctuation such as , and .), which means that a byte has enough room to hold any of them. We can easily encode the signals as bits, with 0s and 1s representing dits and dahs respectively. The question is: how do we know how far to parse? In other words, how do we distinguish, for example, a (.-) from j (.---)? Had a bit had three states, we could use the third state as a stop marker. Obviously we don't have that luxury, so we need to find a way to use the unused bits of the byte to store the length of the Morse code.

Mini-Morse

If the length of the Morse code is less than six, the first five bits of the byte store the Morse signals, i.e. 0s for dits and 1s for dahs. The last three bits store the length of the Morse code as a binary number, written least significant bit first in the examples below (so length 1 appears as 1 0 0 and length 4 as 0 0 1). For example:

e	.	0 - - - - 1 0 0
a	.-	0 1 - - - 0 1 0
d	-..	1 0 0 - - 1 1 0
c	-.-.	1 0 1 0 - 0 0 1
1	.----	0 1 1 1 1 1 0 1

Note: It really does not matter if the unused bits are assigned either zeros or ones.

When the length of the Morse code is six, we can no longer use the last three bits to encode the length because we would only have five bits left for the Morse signals. However, if the length six were stored as above, the last two bits of the byte would both be ones (i.e. - - - - - - 1 1), which is never the case for lengths from one to five. We can take advantage of this fact and use the first six bits of the byte to represent the Morse signals, while the last two bits are ones. For example:

;	-.-.-.	1 0 1 0 1 0 1 1
.	.-.-.-	0 1 0 1 0 1 1 1
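
The packing rules above can be sketched in C as follows. This is a minimal sketch, not the flight code: it assumes that the "first bit" of the byte is bit 0 (the least significant bit), so the length ends up in the top three bits, and it fills unused bits with zeros. The function name mini_encode and its return convention are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Packs a Morse code given as a string of '.' and '-' into one byte.
 * Assumption: the first signal goes into bit 0 (least significant bit).
 * Returns 1 on success, 0 if the code is empty, longer than six signals,
 * or contains a character other than '.' and '-'. */
int mini_encode(const char *code, uint8_t *out)
{
    size_t len = strlen(code);
    uint8_t byte = 0;
    size_t i;

    if (len < 1 || len > 6) {
        return 0;
    }
    for (i = 0; i < len; i++) {
        if (code[i] == '-') {
            byte |= (uint8_t)(1u << i);   /* dah -> 1 */
        } else if (code[i] != '.') {
            return 0;                     /* dit -> 0, anything else is invalid */
        }
    }
    if (len < 6) {
        byte |= (uint8_t)(len << 5);      /* length in the last three bits */
    } else {
        byte |= 0xC0u;                    /* length six: last two bits are ones */
    }
    *out = byte;
    return 1;
}
```

With this bit numbering, a (.-) is stored as 0x42 and . (.-.-.-) as 0xEA; written first bit first, these are the bit patterns shown in the examples.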

Decoding Morse code from the encoded byte starts with reading the length of the Morse code from the last three bits^*. The Morse signals are then parsed from the first bit of the byte up to that length.

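^*If the length is found to be six or seven, the Morse code has length six: the sixth signal shares the first of the three length bits, so the field reads six when that signal is a dit and seven when it is a dah.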
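
A decoder along these lines might look like the sketch below. It assumes, as an illustration only, that the first bit of the byte is bit 0 (the least significant bit), so the three length bits are the top three bits of the byte; the function name mini_decode is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Unpacks a Mini-Morse byte into a string of '.' and '-'.
 * Assumption: the first signal sits in bit 0 (least significant bit).
 * out must have room for 7 chars (six signals plus the terminator).
 * Returns the number of signals written (0 for an invalid byte). */
size_t mini_decode(uint8_t byte, char out[7])
{
    size_t len = (size_t)(byte >> 5);  /* last three bits hold the length */
    size_t i;

    if (len >= 6) {
        len = 6;  /* a field value of six or seven both mean length six */
    }
    for (i = 0; i < len; i++) {
        out[i] = ((byte >> i) & 1u) ? '-' : '.';
    }
    out[len] = '\0';
    return len;
}
```

For instance, 0x42 decodes to .- (a), while 0xEA, whose length field reads seven, decodes to the six-signal code .-.-.- (the full stop).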

Micro-Morse

I came across this idea on April 5, 2023, which rekindled my interest in this topic. It is much simpler than Mini-Morse and does not require us to store the length of the Morse code explicitly.

In this approach we start by encoding the Morse signals from the first bit of the byte. Once done, the remaining bits are filled with the opposite of the last Morse signal, i.e. if the Morse code ends with a dit (0), the remaining bits are filled with ones, and if it ends with a dah (1), they are filled with zeros. Here are some examples:

e	.	0 1 1 1 1 1 1 1
a	.-	0 1 0 0 0 0 0 0
d	-..	1 0 0 1 1 1 1 1
c	-.-.	1 0 1 0 1 1 1 1
1	.----	0 1 1 1 1 0 0 0
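
The fill rule can be sketched in C as follows. As an assumption for illustration, the first bit of the byte is taken to be bit 0 (the least significant bit). Because at least one fill bit must remain, this sketch accepts codes of up to seven signals. The function name micro_encode is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Packs a Morse code given as a string of '.' and '-' into one byte.
 * Assumption: the first signal goes into bit 0 (least significant bit).
 * Returns 1 on success, 0 if the code is empty, longer than seven
 * signals, or contains a character other than '.' and '-'. */
int micro_encode(const char *code, uint8_t *out)
{
    size_t len = strlen(code);
    uint8_t byte = 0;
    size_t i;

    if (len < 1 || len > 7) {
        return 0;
    }
    for (i = 0; i < len; i++) {
        if (code[i] == '-') {
            byte |= (uint8_t)(1u << i);   /* dah -> 1 */
        } else if (code[i] != '.') {
            return 0;                     /* dit -> 0, anything else is invalid */
        }
    }
    if (code[len - 1] == '.') {
        /* Code ends with a dit (0): fill the remaining bits with ones.
         * If it ends with a dah (1), they are already zeros. */
        byte |= (uint8_t)(0xFFu << len);
    }
    *out = byte;
    return 1;
}
```

With this bit numbering, e (.) is stored as 0xFE and 1 (.----) as 0x1E.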

To decode the Morse signals from the encoded byte, we scan backwards from the last bit until the bit value changes. The position of that change gives the length of the Morse code, and parsing that many bits from the first bit of the byte gives us the required Morse code.
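
A minimal sketch of this backward scan is given below, assuming for illustration that the first bit of the byte is bit 0 (the least significant bit) and the last bit is bit 7. The function name micro_decode is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Unpacks a Micro-Morse byte into a string of '.' and '-'.
 * Assumption: the first signal sits in bit 0, the last bit is bit 7.
 * out must have room for 8 chars (seven signals plus the terminator).
 * Returns the number of signals written; 0 means the byte holds no
 * bit change (0x00 or 0xFF) and therefore no valid code. */
size_t micro_decode(uint8_t byte, char out[8])
{
    unsigned fill = (byte >> 7) & 1u;  /* value of the fill bits */
    int pos = 7;
    size_t len;
    size_t i;

    /* Walk back from the last bit while the bits still equal the fill. */
    while (pos >= 0 && ((unsigned)(byte >> pos) & 1u) == fill) {
        pos--;
    }
    len = (size_t)(pos + 1);           /* index of the last signal, plus one */

    for (i = 0; i < len; i++) {
        out[i] = ((byte >> i) & 1u) ? '-' : '.';
    }
    out[len] = '\0';
    return len;
}
```

For example, 0xFE decodes to . (e) and 0x1E decodes to .---- (1).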

Conclusions

Both approaches satisfy our problem statement. Detecting the bit change in Micro-Morse yields the length of the Morse code, which Mini-Morse stores explicitly in the encoded byte. Once the length is determined, the two techniques are fundamentally identical.

If you are interested, you can find the software for both of them here.