The Semiconductor Security War
If you want to check out the video first, it is below:
When I finished this video, I definitely felt like I did not do justice to the field of semiconductor security. The textbooks were miles thick and a video can only be so long.
People have been asking me about merchandise. I am not really a merch guy, but here’s a shirt that I have been wearing and I like it.
Modern chips own your life. For instance, take the A15 SOC that is sitting inside your iPhone. Inside that chip are multiple security assets of high corporate value: encryption keys, developer keys, DRM keys, and so on.
Furthermore, imagine how much of your life's business is conducted through your mobile phone. For instance, my phone has my biometric information, my bank access information, passwords to all my services, and so on.
Software security protections are frequently implemented with the tenet that "trust starts in silicon". But a house cannot be built on soft sand. Likewise, a secure system cannot be architected on top of compromised hardware.
In this video, I want to talk about the daunting problem of maintaining security in today's modern semiconductors.
Why Compromise Hardware?
Why compromise hardware? An attacker can have a variety of goals. And defenders have to consider all of them.
They might want to outright disable or destroy the system, usually at a specified time in the future. These attacks take the form of kill switches, backdoors, or hidden control circuitry.
Or they might be looking to just change the chip's behavior. For instance, pirates look to compromise cable TV cards so that they can get access to cable TV for free. iPhone jailbreaks arguably fall under this category as well.
Or they might want to leak or gain access to sensitive information stored on the device - like the encryption keys.
Or they might be seeking to steal IP from the chip itself. Integrated circuit counterfeiting and piracy are very real and can cost companies millions of dollars.
Furthermore, the stolen IP can be used to find additional, more damaging vulnerabilities in the overall chip design.
So there are a lot of purely commercial reasons to compromise hardware. And that assumes you aren't even a person of interest to some nation-state.
Non-Invasive Attacks
These compromises can be introduced in a couple of different ways: Invasive or Non-invasive. Let's start with the latter. Non-invasive attacks do not require any work to be done on the device prior to the attack. Because of this, attacks are often very scalable and little evidence is left after the deed is done. For this reason, they are considered very dangerous.
For instance, a timing side-channel attack. This is a passive attack where you try to acquire sensitive data by measuring the computation time in a piece of hardware.
I know it sounds crazy. But if you know the algorithm and can measure how long it takes to run on inputs you control, you can work backwards to the secret data it is processing. This is helpful for extracting encryption keys and passwords.
Examples of timing attacks have been presented at conferences against hardware implementations of RSA - a venerable public key cryptography system. The infamous Spectre security vulnerability is exploited with a timing attack.
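To make the idea concrete, here is a minimal Python sketch - my own toy example, not something from the video - of how a naive early-exit comparison leaks a secret through timing. The secret value, function names, and trial counts are all made up, and on a real, noisy machine you would need far more measurements to see the effect.

```python
import time

SECRET = b"hunter2"  # the value an attacker wants to recover

def naive_check(guess: bytes) -> bool:
    if len(guess) != len(SECRET):
        return False
    for g, s in zip(guess, SECRET):
        if g != s:          # returns as soon as a byte mismatches - the timing leak
            return False
    return True

def time_guess(guess: bytes, trials: int = 5000) -> float:
    start = time.perf_counter()
    for _ in range(trials):
        naive_check(guess)
    return time.perf_counter() - start

# Recover the secret one byte at a time: the candidate byte that takes the
# longest to reject shares the longest correct prefix with the secret.
recovered = b""
for position in range(len(SECRET)):
    timings = {}
    for candidate in range(256):
        guess = recovered + bytes([candidate])
        guess += b"\x00" * (len(SECRET) - len(guess))
        timings[candidate] = time_guess(guess)
    recovered += bytes([max(timings, key=timings.get)])

print(recovered)  # with enough trials (and low noise), converges on b"hunter2"
```

The same principle applies in silicon: if the number of clock cycles an operation takes depends on secret bits, an attacker with a stopwatch and patience can pull those bits out.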
Invasive Attacks and Trojans
An invasive hardware attack involves changing the physical layout of one or more integrated circuits. There are a number of ways to do this. For instance, someone might try to do this by swapping a legitimate design with an illegitimate one.
But the invasive attack of greatest recent concern would be to insert additional logic into the design - a hardware Trojan. These are malicious, intentional modifications of a circuit that result in undesired behavior.
One example of this - a very controversial one - is Bloomberg's 2018 report about Super Micro Computer.
That article made Bloomberg a rag in some people's eyes. Regardless of whether or not it is actually true, the attack described is a realistic threat vector. Just because this specific incident didn't happen does not mean it cannot happen at all.
The Design Process
I think it makes sense to briefly stop here so that we can review several major steps within the chip design and fabrication process. This will help us better understand how hardware Trojans can be inserted into a chip as it gets designed and fabricated.
Semiconductor design starts with a specification. That specification is turned into a high-level design representation of the chip - the register-transfer level, or RTL - written in a hardware description language like Verilog.
Modern SOCs are architected by integrating hundreds of pre-designed and pre-verified hardware blocks or "cores". Cores can be RTL designs - "soft" cores, as they are called - or designs already laid out - "hard" cores.
This IP block design methodology allows for design reuse, cost reduction, and helps meet time-to-market constraints. Without it, I don't think modern semiconductor design is possible at a commercial level.
These cores are provided by an ecosystem of industry players - original equipment manufacturers, semiconductor design houses, or even the foundry itself. None of these cores can be trusted and inserted into the final design as-is without verification and testing.
Then that high-level representation is synthesized into a netlist - essentially a list of logic gates and the connections between them.
This is done with third-party EDA tools like Cadence's Genus Synthesis Solution.
The gates in the netlist are then placed and physically routed using EDA software.
The design is then transmitted to a foundry for fabrication, usually in a file format called GDSII. After fabrication, the wafer is cut and packaged before being shipped into the rest of the supply chain.
Compromises of various types - depending on the attacker's goals and situation - can be introduced all throughout this design process, including at the fabrication level.
A Trojan inserted during the specification and the design stage before fabrication is called a pre-silicon attack.
A Trojan inserted during fabrication is called an in-silicon attack. And anything after that is referred to as a post-silicon attack.
Dependencies
What makes Trojans so hard to find is that we often don't know their type, size, or location. It is possible for someone - a disgruntled employee, a nation-state, anyone - to introduce malicious logic into these IPs via an untrusted third-party IP vendor, an untrusted foundry, a component taken off-the-shelf or even through an untrusted EDA tool.
The source of such an introduced flaw can be easily hidden, and often is only found long after the damage is done.
Furthermore, they are frequently designed to only activate under rare conditions - which aren't easily covered during verification checks. These covert attacks, as they are called, can sit unnoticed for many years.
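To get a feel for why rare triggers slip past testing, here is a toy behavioral model in Python - purely my own illustration, not any real Trojan design - of a 32-bit adder with a hidden trigger, plus the back-of-the-envelope odds of a random test ever hitting it.

```python
import random

TRIGGER = 0xDEADBEEF   # the rare activation pattern the attacker picked

def adder_with_trojan(a: int, b: int) -> int:
    """A 32-bit adder with a hidden payload."""
    result = (a + b) & 0xFFFFFFFF
    if a == TRIGGER:           # fires on just 1 of 2**32 possible values of a
        result ^= 0x1          # payload: silently flip the low bit
    return result

# A random functional test has a 1 in 2**32 (~2.3e-10) chance per vector of
# tripping the trigger, so even a large random test suite almost certainly
# never sees the misbehavior.
mismatches = 0
for _ in range(1_000_000):
    a, b = random.getrandbits(32), random.getrandbits(32)
    if adder_with_trojan(a, b) != ((a + b) & 0xFFFFFFFF):
        mismatches += 1

print(mismatches)                              # almost certainly 0
print(hex(adder_with_trojan(TRIGGER, 1)))      # 0xdeadbef1 - the payload fired
```

A real trigger would be spread across internal state and multiple cycles rather than a single input value, which only makes it harder to stumble onto.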
In- and Post-Silicon Tests
Many existing analyses are geared towards finding Trojans at the foundry level. This is especially the case if you are fabbing that chip at a location not previously vetted to be secure.
Some chipmakers have tried something called split-manufacturing. This is an obfuscation technique where different untrusted foundries fab different parts of the chip. No single foundry gets a whole view of the final product.
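Here is a cartoon of that idea in Python - a made-up four-gate netlist, not a real flow, which would split the design at a specific metal layer - just to show how each party ends up with only a partial view of the connectivity.

```python
# A cartoon of split-manufacturing (purely illustrative - the netlist is made up).
netlist = {
    "gates": ["U1:NAND", "U2:NOR", "U3:XOR", "U4:INV"],
    "wires": [
        ("U1", "U2", "lower"),   # routed in the lower metal layers
        ("U2", "U3", "upper"),   # routed in the upper metal layers
        ("U3", "U4", "upper"),
        ("U1", "U4", "lower"),
    ],
}

# Foundry A fabricates the transistors and lower metal: it sees every gate
# but only part of the connectivity between them.
foundry_a_view = {
    "gates": netlist["gates"],
    "wires": [w for w in netlist["wires"] if w[2] == "lower"],
}

# Foundry B only adds the upper metal routing afterwards: it sees the
# remaining connections but not what the gates underneath compute.
foundry_b_view = {
    "wires": [w for w in netlist["wires"] if w[2] == "upper"],
}

print(len(foundry_a_view["wires"]), "of", len(netlist["wires"]),
      "wires are visible to Foundry A")
```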
The most extreme method to detect a Trojan in a fabbed chip would be to take a ground-up approach. You de-package the entire chip, reverse engineer it, and look at it layer by layer so that you can see exactly how it has been fabbed.
Naturally this gets you the best results, but the chip is totally unusable at the end. You might have guaranteed that this particular chip is alright, but what about the others the fab might have made? So we can't say this is all that scalable.
The most common, non-destructive approach is the side channel technique. This looks at a chip's signals - its power draw, timing, temperature, radiation signature and so on - and compares them against what's given off by some trusted "golden" version of the chip.
So you do the destructive teardown examination on one such fabbed chip to establish that first "golden version". Then you compare its golden signals against those of subsequent chips in the same run. Sudden changes or delays in the current, timing profile, radiation, or power signals hint at the presence of a Trojan.
As process nodes get more advanced, chips get more complicated and transistors get smaller, which means the side channel variations you are looking for get subtler relative to normal manufacturing variation. Is that change just normal? Or is something else afoot? Something to think about.
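As a rough sketch of how such a golden-model comparison might be scripted - assuming you already have power traces in hand, and using simple mean-and-sigma statistics where real flows would use far more sophisticated analysis (the function names and numbers here are mine, purely for illustration):

```python
import statistics

def build_golden_profile(golden_traces: list[list[float]]) -> list[tuple[float, float]]:
    """Per-sample mean and standard deviation measured from trusted 'golden' chips."""
    profile = []
    for samples in zip(*golden_traces):
        profile.append((statistics.mean(samples), statistics.stdev(samples)))
    return profile

def screen_chip(trace: list[float], profile: list[tuple[float, float]], k: float = 4.0) -> bool:
    """Flag the chip if any sample strays more than k sigma from the golden mean.
    Too small a k and normal process variation raises false alarms;
    too large and a small Trojan slips through."""
    return any(abs(x - mean) > k * max(sigma, 1e-9)
               for x, (mean, sigma) in zip(trace, profile))

# Usage sketch with made-up power samples
golden = [[1.00, 1.20, 0.95], [1.02, 1.18, 0.97], [0.99, 1.21, 0.94]]
profile = build_golden_profile(golden)
print(screen_chip([1.01, 1.19, 0.96], profile))  # False - looks like the golden chips
print(screen_chip([1.01, 1.65, 0.96], profile))  # True - suspicious extra power draw
```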
And golden version detection techniques are not so effective at the design specification and RTL stages, because there is often no golden version of the chip design to compare against.
Pre-Silicon Tests
The Trojan adversary only wants the Trojan to activate under rare conditions. This means inserting it into a rarely-taken branch of the design - a metaphorical back alley. So ideally, you only want to include design code that actually gets used in the chip. Nothing more and nothing less.
I talked a little bit about verification in my previous video. Verification tests are frequently run to find and debug errors in the design.
Those tests simulate normal working conditions. You might already see the weakness of this approach when it comes to finding Trojans. You are basically hoping that the test hits and triggers the Trojan's rare activation case by sheer luck.
Researchers have proposed other methods. For instance, coverage analyses that look at the percentage of lines of HDL code that get exercised during an intensive verification test. Code that never activates gets flagged as a potential Trojan.
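Here is a toy version of that coverage idea in Python - real tools instrument HDL statements and branches rather than Python functions, and this "design" is made up, but the principle is the same:

```python
import random

executed = set()   # which branches the test suite has ever reached

def design_under_test(a: int, b: int) -> int:
    executed.add("add-path")
    result = (a + b) & 0xFFFF
    if a == 0xA55A and b == 0x5AA5:     # a rarely-taken branch - exactly where
        executed.add("rare-branch")     # a Trojan payload would like to hide
        result ^= 0xFFFF
    return result

all_branches = {"add-path", "rare-branch"}

# Run an "intensive" random verification suite against the design
for _ in range(100_000):
    design_under_test(random.getrandbits(16), random.getrandbits(16))

uncovered = all_branches - executed
print("Never exercised, flag for manual review:", uncovered)
```

The flagged code is not necessarily malicious - dead code and corner-case handlers trip the same alarm - but it tells a human reviewer where to look first.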
And then one of my favorites - ring oscillation. A ring oscillator is a circuit composed of an odd number of NOT gates linked together in a ring.
A NOT gate is a logic gate that negates its input. For instance, from "Your Mom" to "Not your Mom".
Designers put ring oscillators into their IC design and listen to their output frequency. If a Trojan within the design activates, it adds load or extra gates to the loop, which changes the oscillators' frequency. Like a fly caught in a spider web, I guess.
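Some idealized back-of-the-envelope math in Python (ignoring temperature, voltage, and process variation, all of which matter in practice, and using delay numbers I made up) shows why even a little extra delay in the loop produces a measurable shift:

```python
def ring_oscillator_freq(num_inverters: int, stage_delay_ps: float) -> float:
    """Ideal frequency in GHz: one period is the signal travelling the loop twice."""
    period_ps = 2 * num_inverters * stage_delay_ps
    return 1000.0 / period_ps   # convert picoseconds to GHz

baseline = ring_oscillator_freq(num_inverters=11, stage_delay_ps=15.0)

# A Trojan tapping into the loop adds capacitive load, slowing the affected
# stages. Modeled here as 1.5 ps of extra delay per stage.
with_trojan = ring_oscillator_freq(num_inverters=11, stage_delay_ps=15.0 + 1.5)

print(f"baseline: {baseline:.3f} GHz, with extra load: {with_trojan:.3f} GHz")
# A monitor that knows the expected frequency band can flag the shift.
```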
Design for Trust
The reality though is that you are basically reaching into the dark and hoping to get lucky. You can't count on that. So you will also want to have good security design practices - to "design for trust" as it is called. There are a few ways to do this.
The first way is to design the chip so that the techniques I just talked about have an easier time detecting a hidden Trojan. To facilitate a side-channel signal analysis, you might design the chip to minimize its background side channel signals. That way you can better hear potential side channel variations when running the analysis.
Another way is to make it harder for potential Trojan adversaries to insert the Trojan in the first place. For instance, camouflaging the logic gates in a semiconductor layout. This keeps attackers from understanding the original design, preventing them from successfully inserting a Trojan.
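A rough, made-up calculation in Python shows why this works: if each camouflaged cell could plausibly implement any of several different functions, an attacker trying to recover the netlist from the layout faces an exponential number of candidates.

```python
# Numbers here are purely illustrative.
candidates_per_cell = 3    # e.g. a camouflaged cell that could be NAND, NOR, or XOR
camouflaged_cells = 64     # how many such cells the designer sprinkled in

possible_netlists = candidates_per_cell ** camouflaged_cells
print(f"{possible_netlists:.3e} candidate netlists to rule out")   # ~3.4e30
```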
Here is a nifty one. Modern designs often have unused spaces that get filled with nonfunctional "filler" cells. A common Trojan insertion technique is to replace a filler cell with a Trojan.
But what if you were to wire up the filler cells into a circuit and then test them? If someone replaced a filler cell, the test will let us know that something is wrong.
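Here is a toy model of that check - my own sketch, not a real design-for-test scheme - where the filler cells are chained into a signature that a swap would break:

```python
def filler_chain_signature(cell_outputs: list[int]) -> int:
    """XOR the chained filler cell outputs together into a 1-bit signature."""
    signature = 0
    for bit in cell_outputs:
        signature ^= bit
    return signature

# At design time, every filler cell in this example is tied off to output 1,
# so an odd-length chain should always read back 1.
expected = filler_chain_signature([1] * 25)

# If an attacker swaps one filler cell for Trojan logic, the chain is broken -
# modeled here as that cell reading 0 - and the signature flips.
tampered = filler_chain_signature([1] * 24 + [0])

print(expected, tampered)   # 1 0 - the post-fabrication test catches the swap
```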
Conclusion
Increasing globalization, complexity, and aggressive commercial requirements have opened up more opportunities to steal and compromise semiconductors at the design level. The reality is that many vendors have to balance the security issue against these commercial concerns.
I’ve listed just a small subset of the detection techniques available across the whole field. Using them all is impractical. Since these commercial concerns are aggressive, more attention should be paid to making it harder for Trojans and other invasive attacks to be implanted in the first place. Prevention over treatment.
As with its software brethren, the hardware security industry is engaged in an ever-escalating battle with its adversaries. There is no silver bullet to address all of these concerns. A broad range of security steps should be taken, with the goal of making life much harder for the other side on the whole.