Here we run some attacks that try to reproduce, in the logic analyser, the same traces as an execution of the code in the BRAM.

To run the attacks from gdb, run the following commands (with openOCD running):

$ gdb-multiarch -ex "set architecture armv7" -ex "target extended-remote localhost:3333" --command=exec_from_bram.gdb
$ gdb-multiarch -ex "set architecture armv7" -ex "target extended-remote localhost:3333" --command=attack_0.gdb
[...]

To observe the traces in the logic analyser, run the following commands (there is no need to run the attacks first, as we saved the data from the logic analyser):

$ make -C ./hw/logic_analyzer/decode clean
$ make -C ./hw/logic_analyzer/decode

In the attacks, we read from the BRAM to produce the same traces as a fetch. We also add indirect branches and re-configure CoreSight and the MMU to try to get the same "trace_data" in the TPIU. When we run the attacks from gdb, we use openOCD.

0) The first step is to fetch and execute the code in the BRAM, to save the signals for a later comparison. It is recommended to hard-reset the board between each try so that we get a deterministic behaviour for CoreSight.

1) In the first attack, we show that we must use 9 registers to read 8 times 32 bits from the BRAM. Otherwise, the attack does not reproduce the same signals.

2) In the second attack, we show that, if we "add r10, pc, #16" and "mov pc, r10" after each read in the BRAM, this introduces a delay between the reads, and hence a mismatch with an execution of the code in the BRAM.

3) In the third attack, we show that, if we first prepare several registers with the destination addresses, then only "mov pc, [rx]" after each read in the BRAM, we obtain the same traces for a read access as for a fetch (there is a limit, since the number of registers is fixed by the architecture). We have a mismatch on the "trace_data" in the TPIU.
This is because we do not have the same destination addresses; if we configure the MMU to output the same addresses, we will have a match.

4) In the fourth attack, we reconfigure the MMU and re-run the code in the BRAM with the same destination addresses as in the third attack. Once again, it is recommended to hard-reset the board between each try so that we get a deterministic behaviour for CoreSight.

Note: the third attack might need to be run several times to match the signals from a fetch. But eventually we have a match: this means that the hardware monitor can be tricked.

The solution is to have enough indirect branches in the code in the BRAM so that an attacker runs out of registers to prepare the attack. ARM v7 microprocessors have thirteen general-purpose 32-bit registers, R0 to R12, plus three 32-bit registers with special uses: SP, LR, and PC (see the ARM v7 architecture reference manual). The attacker cannot use PC as a register for the attack, but (s)he can use SP and LR. So, 15 registers are available. The attacker must use 9 registers to read from the BRAM and simulate a fetch. This leaves 6 registers, so the attacker can prepare only 6 indirect branches before running the attack.

So, to defend ourselves, we must have code that contains at least 7 packs of 8 words, each with an indirect branch. This forces the attacker to re-create the destination address after the 6th read, which introduces a delay just like in the second attack.
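
The register budget behind this defence can be checked with a few lines of arithmetic. The sketch below is only an illustration of the counting argument in this README: the constant and function names are hypothetical and are not part of this repository, and the figure of 9 registers per BRAM read is taken from the first attack as stated above.

```python
# Counting argument from this README (illustrative names, not repository code).
GENERAL_PURPOSE = 13   # R0..R12
SPECIAL = 3            # SP, LR, PC
UNUSABLE = 1           # PC cannot be used as a scratch register by the attacker
READ_REGISTERS = 9     # registers needed to read 8 x 32 bits from the BRAM


def attacker_branch_budget() -> int:
    """Registers left to hold pre-computed indirect-branch destinations."""
    usable = GENERAL_PURPOSE + SPECIAL - UNUSABLE
    return usable - READ_REGISTERS


def min_protected_packs() -> int:
    """Smallest number of 8-word packs (one indirect branch each)
    that exceeds the attacker's branch budget."""
    return attacker_branch_budget() + 1


if __name__ == "__main__":
    print("branches the attacker can prepare:", attacker_branch_budget())
    print("packs needed in the BRAM code:", min_protected_packs())
```

If the number of registers needed for a read turns out to differ on another setup, only READ_REGISTERS needs to change to recompute the minimum number of packs.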