...implementing real security...
This page is dedicated to the development of the "harmless little
board" or "HLB" for short.
However this is not the project's main web-page, please see
This page is intended to:
The "harmless little board" is a device for high-security encryption of voice- and data-traffic for transport using insecure public networks like the analog telephone network, ISDN, LAN, the internet and others.
While such devices are already commercially available, their price is high and their security doubtful. Many of these devices implement their own protocols and algorithms, incompatible with other devices and potentially weak because they are not based on common standards and/or not reviewed by the public. Others have been found to contain back doors, put in intentionally by interested "third parties".
In contrast to this, we build a device with real security.
This does not mean that the device is secure for everything under any circumstances. Instead we give a clear definition of the expected operating environment for the device to be secure. But within the limitations described there, this should be a very secure device. This is achieved by using these design guidelines:
The development-team consists of several individuals, considering the idea
to develop and build such a device attractive enough to devote some(?) of
their rare spare time to that. Even though most of the development is
currently done in hamburg/germany, everyone willing to contribute is
Wherever you are, the internet will provide a fast way to share thoughts and material.
To get in touch with current developments and the people involved, it's best if you subscribe to the hlb mailing-list first.
The "harmless little board" may be freely built by private, non-commercial users. No license fee or other restrictions apply. You may modify the hard- and/or software to serve your special needs if you wish to. You can build your hardware completely "from scratch", buying all the hardware from "trusted sources" and having the build-process in your own hands.
No commercial usage is allowed so far.
The developers reserve the right to use this design commercially.
Once the design is stable enough we plan to grant third parties the
non-exclusive right to produce and sell this device. This should give
users with no skill to assemble their own board and program their own
eproms the possibility to use the "harmless little board", too.
Boards should be quite cheap (we plan to be <$150), but in any case you
will still have the option to build one for yourself and pay only for
the material (assuming your time is for free).
We are not doing this work as part of a commercial project, but because we like the idea of what we build.
Our development-environment is based on LINUX-hosts, using the GNU-toolchain (GCC, GAS, LD, etc.) to cross-compile our code for the HLB target-platform. Newlib is used to provide a compact libc, and the gnu multiprecision library serves as the basis for public-key crypto calculations. As a gsm-codec we use the "free" reference implementation by the "TU Berlin". Because all the tools and libraries used to build the HLB-firmware are available in source form, the security of these packages may be verified. The GNU-toolchain is also very common and used with many projects, thereby making it easy to integrate software originally written for different platforms.
We use "C" as the preferred programming language for the project. Our sources are mainly written in C, along with some assembler-code for the boot-loader and low-level functions for efficiency. Optionally we could also add sources in one of the many other programming languages supported by the GNU-toolchain.
Heart of the device is the 80(120) MHz 32-bit risc microprocessor SH7709(A) (hitachi SH3-series), equipped with 128 Kbyte (flash-)ROM and 4 Mbyte SDRAM.
The user-interface consists of a "phone-line-emulation" where you can connect a standard telephone-set for audio i/o. Local to the board there is an additional small keyboard (4*4 matrix), an LC-display with 2 lines each 16 character wide (no real graphics possible), a smart-card interface and (optionally) a hardware random-number generator. There is also a serial port to (optionally) connect to your pc, this is used for encrypted transmission of data and to upload new firmware to the board itself.
For the network side we simply provide another serial port. There you will have to connect an external standard hayes-modem (min. 28800 baud) for analog lines or an external isdn terminal-adapter for isdn-lines. We intentionally do not integrate these parts with the rest of our design for two reasons:
The hardware is very flexible and it's relatively simple to add new devices. So we might design variants with on-board-ethernet, graphics-display, pcmcia-slot, irda-interface or any other medium in the future. However we want to keep things simple, so we start with the basic board for now.
The basic functionality of the device is not too complex. Once a connection has been established, compressed and encrypted audio-samples are exchanged over the line. Generating the session-key used to encrypt the traffic may be done using a public-key algorithm for key-exchange (for people you've never met before) or using pre-exchanged secret keys (maximum security).
Secret information is stored on the user's chip-card and should therefore be carried personally by the user. Stealing the user's chip-card will not make old encrypted conversations less secure. It only affects conversations carried out with this chip-card in the future. So it's not dangerous if someone takes the card away from you, but you must be able to notice this (so that you don't use this card to make calls in the future). Once you've ended a conversation (or if you restart a new key-exchange) the old session key is deleted everywhere, so there is no longer any possibility of restoring the contents of previous conversations.
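Deleting the old session key could be sketched as follows. This is only a minimal sketch and not taken from the HLB sources: the type session_key_t, the functions wipe() and session_key_destroy() and the 32-byte key size are assumptions made for illustration (twofish accepts keys of up to 256 bits). The volatile pointer is there so that the compiler cannot remove the stores as "dead".

```c
#include <stddef.h>
#include <stdint.h>

#define HLB_KEY_BYTES 32  /* assumption: 256-bit session key */

/* Hypothetical container for the session key; not from the HLB sources. */
typedef struct {
    uint8_t bytes[HLB_KEY_BYTES];
    int     valid;
} session_key_t;

/* Overwrite a buffer with zeros through a volatile pointer so the
 * compiler cannot drop the stores as "dead". */
static void wipe(void *buf, size_t len)
{
    volatile uint8_t *p = (volatile uint8_t *)buf;
    while (len--)
        *p++ = 0;
}

/* To be called when a conversation ends or a new key-exchange starts. */
void session_key_destroy(session_key_t *k)
{
    wipe(k->bytes, sizeof k->bytes);
    k->valid = 0;
}
```

Any copies of the key held elsewhere (for example an expanded key schedule inside the cipher) would need the same treatment.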
The algorithms for speech-compression and secret- as well as public-key encryption are (due to our modular design) not fixed. The defaults are full-rate-gsm speech-compression for the codec, twofish as secret-key-algorithm and diffie-hellman as public-key-algorithm.
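The idea behind the diffie-hellman key-exchange can be shown with toy numbers. The following is a sketch under loud assumptions: the function names are hypothetical, and the 32-bit operands are far too small to be secure. A real implementation would use large primes and a multiprecision library such as the gnu multiprecision library mentioned above.

```c
#include <stdint.h>

/* Modular exponentiation by square-and-multiply.
 * Operands stay below 2^32, so every product fits into 64 bits. */
static uint32_t mod_pow(uint32_t base, uint32_t exp, uint32_t mod)
{
    uint64_t result = 1 % mod;
    uint64_t b = base % mod;
    while (exp > 0) {
        if (exp & 1)
            result = (result * b) % mod;
        b = (b * b) % mod;
        exp >>= 1;
    }
    return (uint32_t)result;
}

/* Public value sent over the line: g^secret mod p. */
uint32_t dh_public(uint32_t g, uint32_t p, uint32_t secret)
{
    return mod_pow(g, secret, p);
}

/* Shared secret: (peer's public value)^secret mod p. */
uint32_t dh_shared(uint32_t peer_public, uint32_t p, uint32_t secret)
{
    return mod_pow(peer_public, secret, p);
}
```

With p = 23, g = 5 and the secrets 6 and 15, the two sides exchange the public values 8 and 19 and both arrive at the shared secret 2. The session key for the secret-key algorithm would then be derived from such a shared secret.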
If you connect a pc to the HLB, you can have encrypted data-exchange. In this case the HLB emulates a standard hayes-modem to the pc. This makes it possible to use standard, insecure (i.e. windows-)software for secure exchange of data.
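What "emulating a hayes-modem" means on the pc side can be sketched with a tiny command handler. This is an illustration only: hlb_at_command() is a hypothetical name, only three commands (AT, ATD, ATH) are covered, and a real device would report CONNECT only after the external modem link and the key-exchange have been completed.

```c
#include <ctype.h>

/* Hypothetical handler: maps one command line received from the pc to a
 * hayes result code. Only a tiny subset of the command set is covered. */
const char *hlb_at_command(const char *line)
{
    /* Every command line must start with the "AT" prefix. */
    if (toupper((unsigned char)line[0]) != 'A' ||
        toupper((unsigned char)line[1]) != 'T')
        return "ERROR";

    const char *cmd = line + 2;
    if (cmd[0] == '\0')
        return "OK";                       /* plain "AT": attention check */

    switch (toupper((unsigned char)cmd[0])) {
    case 'D':                              /* dial: needs a number */
        return cmd[1] != '\0' ? "CONNECT" : "ERROR";
    case 'H':                              /* hang up */
        return "OK";
    default:
        return "ERROR";
    }
}
```

Because the pc only ever sees these standard commands and result codes, existing terminal or dial-up software does not need to know that the traffic is encrypted on the way.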
The protocol on the network side will be compatible with "pgp-fone".
This software-package was recently released with full source-code and being
compatible with it makes it simple to have an encrypted conversation between
a pc- or mac-computer (running pgp-fone) and the HLB. It saves us the trouble
of writing our own pc- and mac-software.
However it has the drawback that the protocol is not exactly what we need. So we will probably have to extend the protocol in some way or implement our own protocol in parallel as well. But being compatible with pgp-fone is something we really want and it is therefore the first and default protocol we support.
The firmware within the HLB is typical for an embedded system. The boot-loader starts slowly from flash/eprom, sets up the entire hardware to a known state, copies itself to sdram, optionally starts a debugger or s-record-upload, finally searches for a valid system-image stored in flash/eprom (or via serial upload), copies the system to sdram and starts it. This system is already our "application program" and contains most of the device functionality. It may be updated locally with the pc via serial cable or (optionally) from a remote site using crypto to authenticate and verify the software-upload.
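How the boot-loader might decide whether a system-image is "valid" can be sketched as below. The header layout, the magic value and the simple byte-sum are all assumptions made for this example; the real HLB image format is not described on this page.

```c
#include <stdint.h>

#define IMG_MAGIC 0x484C4221u  /* "HLB!" in ASCII, a hypothetical marker */

/* Hypothetical image header; not the real HLB layout. */
typedef struct {
    uint32_t magic;     /* must equal IMG_MAGIC */
    uint32_t length;    /* payload size in bytes */
    uint32_t checksum;  /* sum of all payload bytes, modulo 2^32 */
} image_header_t;

static uint32_t byte_sum(const uint8_t *p, uint32_t len)
{
    uint32_t sum = 0;
    while (len--)
        sum += *p++;
    return sum;
}

/* Returns 1 if header and payload look like a usable image, 0 otherwise.
 * max_len bounds the payload so a corrupt length cannot run past the rom. */
int image_is_valid(const image_header_t *hdr, const uint8_t *payload,
                   uint32_t max_len)
{
    if (hdr->magic != IMG_MAGIC)
        return 0;
    if (hdr->length == 0 || hdr->length > max_len)
        return 0;
    return byte_sum(payload, hdr->length) == hdr->checksum;
}
```

A plain checksum like this only detects accidental corruption. For the remote update mentioned above, a cryptographic check would have to take its place.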
Cpu-load during speech-phase is about 60%. This includes full-duplex gsm-encoding/decoding (about 50%) as well as about 10% for secret-key-crypto. Speech-phase is the only phase which must be handled in realtime. Generation of public keys (currently a lengthy procedure taking several seconds) can partly be done in advance and is therefore not relevant to realtime performance. The numbers for cpu-load were taken without "hand-optimizing" the relevant c-sources, using a board clocked at 80 MHz. While it might be possible to significantly reduce cpu-load by rewriting some parts in assembler, we do not depend on that. The board is fast enough to run unoptimized c-versions of the algorithms in realtime.
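To get a feeling for these numbers, the load figures can be converted into cpu cycles per speech frame. The sketch below assumes the standard full-rate-gsm frame of 20 ms (160 samples at 8 kHz); the function names are made up for this example.

```c
#include <stdint.h>

#define CPU_HZ        80000000u  /* board clock used for the measurement */
#define GSM_FRAME_MS  20u        /* full-rate-gsm frame duration */

/* Cpu cycles available during one 20 ms speech frame. */
uint32_t cycles_per_frame(void)
{
    return CPU_HZ / 1000u * GSM_FRAME_MS;
}

/* Cycles per frame consumed by a task using load_percent of the cpu. */
uint32_t cycles_for_load(uint32_t load_percent)
{
    return cycles_per_frame() / 100u * load_percent;
}
```

At 80 MHz one frame offers 1,600,000 cycles. The measured 50% for the codec then corresponds to about 800,000 cycles per frame and the 10% for crypto to about 160,000, leaving roughly 640,000 cycles of headroom.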
At the moment we do not really have an "operating system". We have a set of bios-like calls and interrupt-service-routines for i/o-handling, virtual memory handling (currently not used) and multitasking (currently not used). Current software runs as a single task using direct physical addressing. This helps to keep things simple and therefore enhances security.
Future software might require direct connections to the internet and therefore an IP-stack on the HLB. We might then use RTEMS or LINUX as the HLB operating system. Both have already been ported to the processor used within the HLB. While offering great options this has the drawback of making it very hard to verify that the software is secure, mainly because the total code-size for the project will explode compared to the "minimal-system" approach.
This section serves as a mirror for the download of current HLB hard- and
software releases to increase server redundancy.
The material contained here is a copy of what is also available on the project's main web-page http://www.ccc.de/hlb/.
See section 3 of this web-page, "additional material", for some extra stuff not present on the project's main web-page.
This section provides all the files necessary to build and understand the
current HLB hardware.
This includes circuit-diagrams, pcb-layout and drilling plans as well as photos of our prototype-boards.
The current hardware is "board 2.0". This board was built as a prototype to evaluate the functionality and performance of this platform. It has no major bugs and we got it up and running quickly. It was successful in that it showed that the platform is suitable for our application and that all the basic hardware-concepts are ok.
However this is not our final hardware. There are a few minor problems with that board and we want to implement some circuits differently next time. We also want to increase the level of integration and use more smd-parts in order to reduce board-space. This has led to the ongoing development of "board 2.1". The new board will be available early this century. Some material related to current design-efforts can be found in section 3 of this web-page, "additional material". Unless you are willing to heavily modify your hardware for compatibility in the future, don't build a "board 2.0" and wait for the "board 2.1"-design instead.
This section provides all the files necessary to build and understand the
current HLB firmware.
This includes sources for the boot-loader, the main "application program" and anything else needed.
To download software for your host's GNU toolchain see section 3.3.
The software contained in this section is at this point rather useless for the general user wishing to make a secure phone call. These are development snapshots useful for developers working on this project. Other people might find it of value for a detailed understanding of the machine.
This section contains additional material usually not present on the project's main web-page. This includes additional peripheral devices not part of the "main-board"-hardware, infos on LINUX and RTEMS ports to the HLB, very new updates for parts of the HLB-firmware and similar material.
Here is some additional hardware you might consider using with your HLB.
This section contains information on how to build a hardware random number generator. Such a device is necessary to generate huge amounts of cryptographically secure random bits, needed for key generation and some cryptographic protocols.
The device shown here should be used with every HLB. While there are other methods to generate random bits, they can not really be secure in the absence of a source of significant entropy in the system. The HLB main-board has an input-pin for this purpose, and you should connect an external source of analog noise to this pin. This is where this circuit connects. This random number generator is a small stand-alone device delivering analog noise.
We intentionally do not integrate this functionality with the HLB main-board for two reasons:
The circuit shown here is for a prototype board. It is not exactly what we would need with the HLB. Instead it contains additional circuits (to deliver a digital noise output in addition to the analog output) not necessary for the HLB. So the circuit could be simplified if used with the HLB. However the digital output is nice if you want to operate the circuit on the printer port of your pc or with other systems having no analog input lines. Actually the prototype board as it was really built contains even more circuits. It generates its own 2 Hz clock, samples the digital noise signal with that clock and flashes leds according to the result. Looks really nice, you can see the random bits changing the leds.
Noise is generated using a zener-diode operating at the zener-voltage. This physically generates wide-band noise, visible as a small voltage across the zener-diode. This signal is then amplified with 3 transistors, filtered to a smaller bandwidth and finally demodulated using a standard fm-demodulator. The resulting audio is the circuit's analog noise signal. This is similar to an fm-radio tuned to an empty channel, just that we connect a zener-diode to the antenna-input to have a good noise-signal.
This section contains information on how to build a circuit to interface a loudspeaker and a microphone to the HLB's audio ad/da-converter. Such a device is necessary if you need a high quality microphone input and a clear, high volume loudspeaker output for your HLB's audio-port.
The circuit shown here was optimized for high quality using standard parts. This results in a lot of parts. Nevertheless the circuit-board could be quite small when laid out for smd. But even though its quality is great, this circuit is not intended to be part of the standard HLB main-board, mainly because of its complexity. We use it as a "reference device" to monitor or generate audio signals directly at the ad/da-converter's pins.
The microphone signal is amplified with a transistor first, then passes through a high-gain bandpass filter optimized for speech. The signal level at the output of this filter is held constant by regulating the signal gain of the first transistor stage. This filtered and gain-regulated microphone signal is then brought to the circuit's high-level symmetrical output stage, directly driving the a/d-converter's differential input pins. The microphone is muted by hardware when no call is made. For that an input pin is provided which connects to an external on-hook/off-hook switch. This signal is debounced before driving the audio mute switches and optionally driving the hook input at the cpu. Nearly any type of microphone may be connected (dynamic, carbon, capacitor), selectable by jumper.
The loudspeaker output is driven by a TBA820 low-cost audio amplifier. The amplifier's audio bandwidth is limited to speech in order to reduce its noise. The TBA820 audio input comes from the circuit's high-level symmetrical input stage, driven directly by the d/a-converter's differential output pins. The loudspeaker is also muted by hardware when no call is made (via the debounced hook switch input). Output power is 500 mW with 5 volts supply using a 4 ohm speaker.
The circuit-diagram shows the TBA820 supply connected to 5 volt, but if you have a higher voltage available (16 volt max.) it is better to use this as its supply. The rest of the circuit should be left operating at 5 volt. This not only increases output power to 1200 mW, but also has the advantage of increased decoupling between the high-power output stage (driving the loudspeaker) and the microphone amplifier (operating on very small signals) by using two independent supply voltages.
This section contains information on how to build a circuit to interface your telephone to the HLB's audio ad/da-converter. Such a device is necessary for the phone line emulation, an essential part of our design.
With our standard application such a circuit does not need to be able to generate a ring signal on the line (this requires high voltages with several watts). This is because the user's phone will only ring if there is a call on the "real" line coming from outside. In this case the telecom will provide the voltage necessary for ringing on their line and the user should connect his ringer to that line. This can be done in several ways:
The user's telephone is connected with the usual 2-wire interface to the circuit. There this signal is split to 4-wire with a bridge circuit built around an op-amp. This 4-wire signal can then be connected to the audio codec. Additionally there is a constant current source feeding the user's phone. It also detects the on-hook/off-hook condition and delivers it as a digital output signal to the cpu.
The bridge circuit needs a local emulation of the 2-wire's remote end (the user's phone), so that it can correctly subtract its own transmit signal from the 2-wire signal in order to generate the receive signal. This emulation of the user's phone is formed by the 1K/68n parts near the op-amp. This is a generic emulation suitable for most standard phones. It is good for more than 10 dB attenuation (local transmit signal at the local receive output). By hand-optimizing these values for your phone, attenuation may be increased up to 18 dB. For very old phones the 68n capacitor should not be used, but with most new phones it improves emulation and bridge performance.
The transistor used with the constant current source (typ. 2N2905) needs some cooling. It usually dissipates about 300 mW, with 750 mW being the worst-case maximum. So select an appropriate heat sink for reliable operation.
The digital hook output signal indicates the line current. While the line is normally sourced with 20 mA, the hook threshold is about 5 mA. The hook signal has an on-delay of less than 10 ms and an off-delay of less than 5 ms. So decoding of pulse-dialing with 40 ms breaks is possible.
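Decoding pulse-dialing from the hook signal could look roughly like the sketch below. It assumes that the firmware has already measured the hook signal as a list of intervals; the structure, the function name and both thresholds are assumptions for illustration and would have to be tuned against real phones. It relies on the usual decadic convention that ten pulses encode the digit 0.

```c
#include <stddef.h>

#define BREAK_MAX_MS   150  /* assumption: a longer break means "hung up" */
#define INTERDIGIT_MS  300  /* assumption: off-hook pause that ends a digit */

/* One interval of the digital hook signal. */
typedef struct {
    int off_hook;     /* 1 = line current flowing, 0 = break */
    int duration_ms;
} hook_interval_t;

/* Decode dialed digits into out[] (values 0..9). Returns the number of
 * digits, stopping early at a hang-up or when out[] is full. */
size_t decode_pulse_dial(const hook_interval_t *iv, size_t n,
                         int *out, size_t out_max)
{
    size_t digits = 0;
    int pulses = 0;

    for (size_t i = 0; i < n && digits < out_max; i++) {
        if (!iv[i].off_hook) {
            if (iv[i].duration_ms > BREAK_MAX_MS) {
                pulses = 0;            /* hang-up: drop the unfinished digit */
                break;
            }
            pulses++;                  /* short break = one dial pulse */
        } else if (iv[i].duration_ms >= INTERDIGIT_MS && pulses > 0) {
            out[digits++] = pulses % 10;   /* ten pulses encode 0 */
            pulses = 0;
        }
    }
    if (pulses > 0 && digits < out_max)    /* digit still open at the end */
        out[digits++] = pulses % 10;
    return digits;
}
```

With hook delays below 10 ms, a 40 ms break still shows up as a clearly measurable interval, which is what makes this kind of decoding feasible.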
This section contains information about the HLB's "glue logic" and some of the ways to implement it.
This section tries to explain why the glue is needed and how it works.
Hitachi's SH-series of embedded cpus has many advantages like a linear 32-bit architecture, "free" development tools, a glueless interface to all kinds of memory (including sdram), a rich set of integrated peripherals (3 serial ports, timer, pcmcia, irda, i/o-pins, dma etc.) and is fast enough to run our algorithms. But when compared to traditional DSPs, it lacks the possibility of glueless interfacing to DSP-like serial audio codecs. The serial ports are not compatible with standard codecs, so either you must convert formats here or you must exchange data with the codecs without using the serial ports. In any case we need some glue logic here. It is also the only place in the design where the cpu needs the help of external logic.
Commercial designs usually go the second way and implement the necessary logic within their gate-arrays. They encode/decode the codec's serial frames, converting serial data to/from 16-bit parallel words which are then quickly exchanged with the cpu using the standard parallel system bus.
While this provides a clean solution, it requires a substantial amount of logic when implemented with TTLs (something we still want to support) and a connection to the cpu using a parallel bus with many wires. We decided to take a different approach and convert serial data formats, something that turned out to need less logic and far fewer wires when connected to the cpu. The circuit connects with 5(6) wires to the cpu and with another 6 wires to the codec. It can be implemented using 6 standard TTL/CMOS-chips (available as smd).
When designing the necessary logic between the cpu and the codec we must understand their serial needs first. Both have several serial operating modes and special timing requirements. See the respective manuals for details. The following text tries to explain why we had to choose this particular configuration and what this means for the glue logic.
The cpu's serial port is closest to what we need if operated in synchronous mode. The hardware then consists of an 8-bit shift-register each for the receiver and the transmitter, potentially operating full-duplex and sharing a common clock signal. For audio codec applications we usually operate on 16-bit words, therefore the cpu must handle each word as 2 separate 8-bit packs with respect to the serial port hardware. The synchronous serial port may be clocked either with an internal clock (which is then output to the clock pin) or with an external clock signal.
When using an external clock (thereby allowing an external source to determine the time when the audio sample is shifted in/out) the cpu must react very quickly between the first and the second of the two 8-bit packs forming an audio sample. Because the serial port has only 1 byte of buffering, an overflow error would occur otherwise. This places unacceptable timing restrictions on the software. Instead we let the cpu determine when it is ready to transfer the 8-bit packs (maybe as a result of an external interrupt request), thereby also generating its own serial clock at this moment.
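Handling a 16-bit sample as two 8-bit packs is simple on the software side. The following sketch uses made-up function names and assumes that the high byte is transferred first; the actual byte and bit order depends on how the serial port and the codec are configured.

```c
#include <stdint.h>

/* Split a 16-bit audio sample into the two 8-bit packs handed to the
 * serial port. Assumption: the high byte goes first. */
void sample_to_packs(int16_t sample, uint8_t packs[2])
{
    uint16_t u = (uint16_t)sample;
    packs[0] = (uint8_t)(u >> 8);
    packs[1] = (uint8_t)(u & 0xFFu);
}

/* Rebuild a 16-bit audio sample from two received 8-bit packs. */
int16_t packs_to_sample(const uint8_t packs[2])
{
    uint16_t u = (uint16_t)(((uint16_t)packs[0] << 8) | packs[1]);
    return (int16_t)u;
}
```

For example, the sample 0x1234 becomes the packs 0x12 and 0x34. Because the cpu starts both transfers itself, it can never be overrun between the first and the second pack.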
The audio codec also has the option of either generating its own internal clock or being clocked from an external source to shift serial sample data. As we just decided to let the cpu generate and output its own serial clock, for simplicity we would like to operate the audio codec with this clock as its external clock source, not generating its own internal clock. The codec has two options when operated with an external serial clock, depending on how the sample rate is determined. In the first, the codec generates the sample rate internally; the transfer of serial sample data must then be synchronous to that sample clock with a timing difference of less than 1 us. As the start of the cpu's serial transfer is determined by its software, such timing is not easily possible. In the second, the sample rate depends on the moment when the cpu starts the serial transfer. This moment depends on the HLB's software with variable latencies. Because we want no jitter in the sampling rate, this solution is unacceptable, too. So there is no way to use an external clock for the codec's serial transfers; instead we must let the codec generate and output its own internal clocks.
These requirements already dictate most of the glue logic function. We know that both the cpu and the codec generate their own timing and serial shift clocks, so we will need an external buffer to hold sample data in between. A single 16-bit shift register is enough here. Both the cpu and the codec use a single shift clock delivering 16 clock pulses to simultaneously shift in and out a 16-bit audio sample (operating full-duplex). Then we arrange for the codec and the cpu to access the shift-register alternately.
The exchange of audio data starts when the timer within the codec has determined that it is time for a new sample. Its hardware will then start a serial transfer, simultaneously shifting out the last a/d-sample and shifting in the next d/a-sample. After this shifting is done, an interrupt is generated to the cpu. A short time later the cpu will react, simultaneously shifting in the last a/d-sample and shifting out the next d/a-sample.
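The alternating access to the shift register can be modelled in a few lines. This is a behavioural sketch only, with hypothetical names and the assumption that bits are shifted most-significant first: each 16-pulse transfer hands the old register contents to the device and leaves the device's word in the register.

```c
#include <stdint.h>

/* The external 16-bit shift register between cpu and codec. */
typedef struct {
    uint16_t bits;
} shift_reg_t;

/* One full-duplex transfer of 16 clock pulses: on every pulse the register
 * shifts out its top bit to the device and shifts in one bit from it.
 * Returns the word the device received. Assumption: MSB first. */
uint16_t transfer16(shift_reg_t *r, uint16_t device_out)
{
    uint16_t device_in = 0;

    for (int i = 0; i < 16; i++) {
        uint16_t out_bit = (uint16_t)(r->bits >> 15);            /* register -> device */
        uint16_t in_bit  = (uint16_t)((device_out >> 15) & 1u);  /* device -> register */

        device_in  = (uint16_t)((device_in << 1) | out_bit);
        r->bits    = (uint16_t)((r->bits << 1) | in_bit);
        device_out = (uint16_t)(device_out << 1);
    }
    return device_in;
}
```

If the register holds a d/a-sample left there by the cpu, the codec's transfer picks it up and deposits its a/d-sample. The cpu's following transfer then collects that a/d-sample and deposits the next d/a-sample, so a single register serves both directions.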
The hardware consists of the 16-bit shift register, along with a multiplexer on the shift register's clock and data input. While the codec delivers its 2.5 MHz shift clock all the time, its frame-sync signal indicates when the real shift of audio data takes place. The frame-sync signal becomes active at the beginning of the first bit and goes inactive at the end of the 16th bit time. So this frame-sync signal may be used to operate the multiplexer on the shift register's data input.
Generating the control signal for the multiplexer on the shift register's clock input is more complicated. The codec's clock is active all the time, so in order to avoid a clock spike at the beginning of the 17th bit time (frame-sync goes inactive and the clock goes active at the same time here, overlap possible) the clock signal must be deactivated already in the middle of the 16th bit time. This requires a 5-bit counter clocked in the middle of each bit-cell to count the data bits within each frame-sync and deactivate the clock as soon as it has counted to 16. For this purpose a flip-flop driving the clock multiplexer is provided. It is set by the edge of frame-sync becoming active and cleared by the bit-counter reaching 16 (in the middle of bit 16). The bit-counter itself is cleared by frame-sync not being active.
The multiplexer on the shift register's clock input may be simplified by knowing that the cpu's shift-clock will remain in the inactive state when no transfer is taking place. However the cpu's data output pin will keep its level after shifting is completed, requiring a "full" multiplexer on the shift register's data input. The output of the shift register is connected to the cpu and codec in parallel, as they ignore this signal while they are not shifting themselves.
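The behaviour of the clock gate can be checked with a small model stepped in half bit-times. This is a simplified sketch and not a description of the real circuit: the function name is hypothetical, and it assumes that the flip-flop is set early enough for the first clock pulse of a frame to pass. See the manuals for the exact edge timing.

```c
#include <stdint.h>

/* Behavioural model of the clock gate, stepped in half bit-times.
 * Frame-sync is active for bit times 1..16; the codec clock keeps running
 * for total_bits bit times. Returns how many clock pulses reach the
 * shift register. */
int gated_clock_pulses(int total_bits)
{
    int gate = 0;          /* flip-flop driving the clock multiplexer */
    uint8_t counter = 0;   /* 5-bit bit-counter */
    int prev_fs = 0;
    int pulses = 0;

    for (int bit = 1; bit <= total_bits; bit++) {
        int fs = (bit <= 16);      /* frame-sync level in this bit time */

        /* start of the bit time */
        if (fs && !prev_fs)
            gate = 1;              /* set by frame-sync becoming active */
        if (!fs)
            counter = 0;           /* counter is held clear outside the frame */
        if (gate)
            pulses++;              /* clock goes active and passes the gate */

        /* middle of the bit time */
        if (fs) {
            counter = (uint8_t)((counter + 1) & 0x1F);
            if (counter == 16)
                gate = 0;          /* close the gate before bit 17 starts */
        }
        prev_fs = fs;
    }
    return pulses;
}
```

Exactly 16 pulses pass, no matter how long the codec clock keeps running after the frame. The counter needs 5 bits because the value 16 itself must be representable, which a 4-bit counter (0..15) could not do.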
This section contains information on several ways to implement the glue logic.
This section contains information on the glue logic implemented using
standard TTL chips.
This version was developed, built and tested as part of "board 2.0". It works well.
This section contains information on the glue logic implemented using
the "ispLSI1016E" made by lattice semiconductor.
This version was developed for "board 2.1" but will probably not be used because the ispLSI1016E is a big 44-pin chip.
The chip costs about 3..6 $ in small quantities and the "ispLSI1016E" is compatible with the "MACH211".
Here is some additional software you might consider using with your HLB.
The RTEMS operating system could be ported to the HLB. It has already been ported to the hitachi SH1-processor (part of the general RTEMS distribution) which is a subset of the SH3-processor used within the HLB. So porting should be quite straightforward. However no real work has yet been done on porting RTEMS to the HLB.
Porting RTEMS to the HLB is expected to provide:
For further information on RTEMS see:
The LINUX operating system could be ported to the HLB. It has already been ported to the hitachi SH3-processor (part of the general LINUX distribution) which is the processor used within the HLB. So porting should be quite straightforward. Some successful work has been done in getting LINUX up and running on a commercial board similar to the HLB.
Porting LINUX to the HLB is expected to provide:
For results of getting LINUX up on a board similar to the HLB see:
For further information on LINUX on the SH3 see:
This section provides all the files necessary to build and understand the
current HLB development environment.
This includes binutils, gcc and the libraries along with instructions how to configure, compile and install them on your host.
Installation-procedures assume that you are working on a LINUX-based host or something close to that. If you want to use a dos or windows-based development environment you can do so, but you are on your own compiling the tools or using a cygnus GNU dos-distribution. You will run into all sorts of trouble with filenames, forward/backward slashes, makefile compatibility and such things. We do not recommend or support this configuration. Install LINUX or BSD instead!
Installation requires that you download several public packages first (binutils, gcc, newlib, gmp). Finally you download our installation script. Edit the installation script to reflect your needs. Then make sure all the files are in the same directory and run the installation script. It tries to configure, compile and install the complete toolchain.
Impressum: Volker Ernst, Grundstr.19, 20257 Hamburg, Germany.
All material © Copyright by Volker Ernst unless noted otherwise.