Writeup DGSE - CTF - Le Polyglotte

This article describes my solution for the 150-point challenge called “Le Polyglotte”.

Introduction

We have intercepted a top-secret file originating from Evil Country; it is almost certainly related to their nuclear program.
Nobody has managed to read its contents.
Can you do it for us? An archive was in the same folder; it may be useful to you.
The flag has the form DGSESIEE{x}, where x is a hash that you will find.

If you want to try it yourself, the message.pdf file is available here and secrets.zip is available here.

Start

We have a PDF file that displays the words Top Secret and many horizontal bars. It looks odd, and I am fairly sure this file hides a lot of things. We also have a password-protected ZIP archive that appears to contain two files.

unzip secrets.zip
Archive:  secrets.zip
[secrets.zip] hint.png password:
password incorrect--reenter:
   skipping: hint.png                incorrect password
   skipping: info.txt                incorrect password

Analysis

First, I would like to take a closer look at the PDF file. I decided to display the first 100 lines.

head -n 100 message.pdf

Information identified

JavaScript code

<script>var flag = [91,48,93,97,97,57,51,56,97,49,54];</script>
<script>for(i=0;i<flag.length;i++){flag[i] = flag[i]+4} alert(String.fromCharCode.apply(String, flag));</script>

From the second piece of JavaScript code, I deduce that the elements are decimal character codes. The script is not there for nothing, so I ran it. Here is the result:

_4aee=7<e5:

Not much information there. I still wonder what these values look like as ASCII characters without the offset.

#! /usr/bin/env python

chars = [91,48,93,97,97,57,51,56,97,49,54]

print(''.join(chr(char) for char in chars))

python wtf.py
[0]aa938a16

So we have a string that already looks more meaningful.

Only afterwards did I realize that simply removing the +4 from the JavaScript code returns the same string.
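
To make the effect of the +4 explicit, here is a minimal Python sketch that mirrors the JavaScript logic. The decode() helper and its shift parameter are names introduced here for illustration; they are not part of the challenge.

```python
def decode(values, shift=0):
    """Turn a list of character codes into a string, adding `shift` to each code first."""
    return "".join(chr(v + shift) for v in values)

# Values copied from the first <script> tag of message.pdf.
flag = [91, 48, 93, 97, 97, 57, 51, 56, 97, 49, 54]

print(decode(flag, shift=4))  # what the JavaScript alert() displays
print(decode(flag))           # the raw values, without the +4
```

The first call reproduces the string shown by the alert, the second one the readable [0] fragment.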

Base 16 chars

<5b 31 5d 34 64 38 36 32 64 35 61>
<43 65 20 64 6f 63 75 6d 65 6e 74 20 63 6f 6e 63 65 72 6e 65 20 6c 20 6f 70 65 72 61 74 69 6f 6e 20 73 6f 6c 65 69 6c 20 61 74 6f 6d 69 71 75 65 2e 0a 43 65 74 74 65 20 6f 70 65 72 61 74 69 6f 6e 20 65 73 74 20 73 74 72 69 63 74 65 6d 65 6e 74 20 63 6f 6e 66 69 64 65 6e 74 69 65 6c 6c 65 20 65 74 20 6e 65 20 64 6f 69 74 20 65 6e 20 61 75 63 75 6e 20 63 61 73 20 ea 74 72 65 20 64 65 76 6f 69 6c 65 65 2e 20 0a 4c 65 73 20 69 6e 66 6f 72 6d 61 74 69 6f 6e 73 20 73 75 72 20 6c 20 6f 70 65 72 61 74 69 6f 6e 20 73 6f 6e 74 20 64 69 73 73 65 6d 69 6e e9 65 73 20 64 61 6e 73 20 63 65 20 66 69 63 68 69 65 72 2e 0a 43 68 61 71 75 65 20 70 61 72 74 69 65 20 64 65 20 6c 20 69 6e 66 6f 72 6d 61 74 69 6f 6e 20 65 73 74 20 69 64 65 6e 74 69 66 69 65 65 20 70 61 72 20 75 6e 20 6e 6f 6d 62 72 65 20 70 61 72 20 65 78 20 3a 20 0a 5b 30 5d 61 65 37 62 63 61 38 65 20 63 6f 72 72 65 73 70 6f 6e 64 20 61 20 6c 61 20 70 72 65 6d 69 e8 72 65 20 70 61 72 74 69 65 20 64 65 20 6c 20 69 6e 66 6f 72 6d 61 74 69 6f 6e 20 71 75 20 69 6c 20 66 61 75 74 20 63 6f 6e 63 61 74 65 6e 65 72 20 61 75 20 72 65 73 74 65 2e>

The first block of hexadecimal values decodes to the following ASCII string.

[1]4d862d5a
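
One possible way to decode such a block is sketched below. The decode_hex_block() helper is a hypothetical name, and Latin-1 is assumed as the encoding because the longer block contains bytes such as ea, e9 and e8, which map to accented French letters in that charset.

```python
def decode_hex_block(block, encoding="latin-1"):
    """Decode a '<5b 31 ...>' style block of space-separated hex bytes."""
    hex_bytes = block.strip().strip("<>")
    # bytes.fromhex() ignores the spaces between byte pairs.
    return bytes.fromhex(hex_bytes).decode(encoding)

first_block = "<5b 31 5d 34 64 38 36 32 64 35 61>"
print(decode_hex_block(first_block))
```

The same function should work unchanged on the second, longer block.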

Even more interesting. Earlier we recovered [0], and now [1]. It looks like we will have to build a string from several [x] elements.

The second block, which is much longer, confirms this assumption.

Ce document concerne l operation soleil atomique.
Cette operation est strictement confidentielle et ne doit en aucun cas être devoilee. 
Les informations sur l operation sont disseminées dans ce fichier.
Chaque partie de l information est identifiee par un nombre par ex : 
[0]ae7bca8e correspond a la première partie de l information qu il faut concatener au reste.

In English:

This document concerns operation Soleil Atomique.
This operation is strictly confidential and must under no circumstances be disclosed.
The information about the operation is scattered throughout this file.
Each piece of information is identified by a number, for example:
[0]ae7bca8e corresponds to the first piece of information, which must be concatenated with the rest.

Cracking

At this point I would like a hint, and therefore more information. That is what the ZIP archive can offer. I don’t know the password, so let’s try a dictionary attack.

fcrackzip -uDp /usr/share/dict/rockyou.txt -v secrets.zip
found file 'hint.png', (size cp/uc 137659/137622, flags 9, chk 76d4)
found file 'info.txt', (size cp/uc    109/   107, flags 9, chk 9748)
checking pw doomedlover

PASSWORD FOUND!!!!: pw == finenuke

Nice password! Once extracted, we have a hint.png image that shows a fish, a blowfish to be exact. Could it be a reference to the Blowfish symmetric-key block cipher? The second file contains the following.

cat info.txt
Ange Albertini
key='\xce]`^+5w#\x96\xbbsa\x14\xa7\x0ei'
iv='\xc4\xa7\x1e\xa6\xc7\xe0\xfc\x82'
[3]4037402d4

Data extraction

First of all, Ange Albertini is very well known in the world of file formats, reverse engineering and many other things. This confirms once again the kind of challenge we face: finding data where there normally isn’t any. Incidentally, he gave a very interesting interview on the NoLimitSecu podcast!

The second and third lines start with key= and iv= respectively, which almost certainly points to an encryption algorithm. Finally, the last line gives us another element for our list.

The key value mixes \x escape sequences with printable ASCII characters, so the first step is to convert everything to hexadecimal. Here is what I get.

key=0xce5d605e2b35772396bb736114a70e69
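
As a quick cross-check, the conversion can be sketched in Python by pasting the two values from info.txt as bytes literals (this assumes they follow Python escaping rules, which their syntax suggests).

```python
# Values copied from info.txt, written as Python bytes literals.
key = b'\xce]`^+5w#\x96\xbbsa\x14\xa7\x0ei'
iv = b'\xc4\xa7\x1e\xa6\xc7\xe0\xfc\x82'

# Print the hexadecimal form and the length in bytes of each value.
print(key.hex(), len(key))
print(iv.hex(), len(iv))
```

The key is 16 bytes long and the IV is 8 bytes long, which matches the 64-bit block size of Blowfish.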

So we have a key, an initialization vector, a cipher and a file. All we have to do now is to decrypt it!

openssl enc -blowfish -d -iv c4a71ea6c7e0fc82 -in message.pdf -out output -K ce5d605e2b35772396bb736114a70e69
bad decrypt
139762295595840:error:06065064:digital envelope routines:EVP_DecryptFinal_ex:bad decrypt:crypto/evp/evp_enc.c:583:

It didn’t work. After reading a bit about polyglot files, I thought that maybe I should not decrypt but encrypt…

openssl enc -blowfish -e -iv c4a71ea6c7e0fc82 -in message.pdf -out output -K ce5d605e2b35772396bb736114a70e69

No error message, so it seems to have gone well. Strange, isn’t it?

file output
output: JPEG image data

We now have a picture! After adding the right extension (.jpeg) and opening it, we see a nice nuclear explosion… It must be hiding something.

binwalk output.jpeg

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
72613         0x11BA5         ELF, 64-bit LSB shared object, AMD x86-64, version 1 (SYSV)

An ELF in a picture! Why not! However, binwalk cannot extract it with the -e flag, so we’ll have to use good old dd. binwalk gives us the offset where the data is located. With bs=1, the skip argument counts in bytes, so it places the cursor exactly on this offset before copying.

dd skip=72613 bs=1 if=output.jpeg of=elf.bin
17091+0 records in
17091+0 records out
17091 bytes (17 kB, 17 KiB) copied, 0.0271926 s, 629 kB/s
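
For readers without dd at hand, the same carving step can be sketched in Python. The helper names are illustrative, and the sample buffer is synthetic data standing in for output.jpeg, not the real file.

```python
ELF_MAGIC = b"\x7fELF"

def find_elf(data):
    """Return the offset of the first ELF magic number, or -1 if absent."""
    return data.find(ELF_MAGIC)

def carve_from(data, offset):
    """Return everything from `offset` to the end, like dd skip=<offset> bs=1."""
    return data[offset:]

# Synthetic stand-in for output.jpeg: a JPEG-like header, padding, then a fake ELF header.
sample = b"\xff\xd8\xff\xe0" + b"\x00" * 16 + ELF_MAGIC + b"\x02\x01\x01"
offset = find_elf(sample)
carved = carve_from(sample, offset)
print(offset, carved[:4])
```

On the real file, reading output.jpeg in binary mode and calling carve_from() with the offset reported by binwalk should give the same bytes as the dd command.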

Did we extract what we intended?

file elf.bin
elf.bin: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=dc9ae4a3811cf6a4acd62a1a5ab812bfd81fbe8d, for GNU/Linux 3.2.0, not stripped

Yes, it seems so. Now we have to run this ELF to see what it asks for or does.

chmod +x elf.bin
./elf.bin
Operation Soleil Atomique
Entrez le mot de passe : booom!
Mauvais mot de passe

Reverse engineering

I like the radare2 tool for reverse engineering. I start by running the analysis with aaaa.

radare2 elf.bin
[0x00001140]> aaaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze function calls (aac)
[x] Analyze len bytes of instructions for references (aar)
[x] Check for objc references
[x] Check for vtables
[x] Type matching analysis for all functions (aaft)
[x] Propagate noreturn information
[x] Use -AA or aaaa to perform additional experimental analysis.
[x] Finding function preludes
[x] Enable constraint types analysis for variables

The first step in understanding what the program does (when it is not stripped, as is the case here) is to seek to the address of the main() function.

[0x00001140]> s main
[0x00001333]>

I recently discovered the decompiler integration in radare2. I already have it installed, but you can find all the information here.

Time to decompile now.

[0x00001333]> pdg
undefined8 main(undefined8 argc, char **argv)
{
    int64_t iVar1;
    uint8_t *arg1;
    undefined8 uVar2;
    int64_t in_FS_OFFSET;
    char **var_b0h;
    int64_t var_a4h;
    uint32_t var_94h;
    char *dest;
    int64_t var_88h;
    int64_t var_80h;
    int64_t var_50h;
    char *src;
    int64_t canary;

    canary = *(int64_t *)(in_FS_OFFSET + 0x28);
    sym.imp.puts("Operation Soleil Atomique");
    sym.imp.printf("Entrez le mot de passe : ");
    sym.imp.fgets(&src, 0x10, _reloc.stdin);
    iVar1 = sym.imp.strlen(&src);
    arg1 = (uint8_t *)sym.imp.malloc(iVar1 + 1);
    sym.imp.strcpy(arg1, &src, &src);
    sym.checkpassword((int64_t)arg1);
    if ((arg1[1] ^ *arg1) == 0x69) {
        if ((arg1[2] ^ arg1[1]) == 0x6f) {
            if ((arg1[3] ^ arg1[2]) == 0x38) {
                if ((arg1[4] ^ arg1[3]) == 0x56) {
                    if ((arg1[5] ^ arg1[4]) == 0x50) {
                        if ((arg1[6] ^ arg1[5]) == 0x57) {
                            if ((arg1[7] ^ arg1[6]) == 0x50) {
                                if ((arg1[8] ^ arg1[7]) == 0x56) {
                                    if ((arg1[9] ^ arg1[8]) == 6) {
                                        if (arg1[9] == 0x34) {
                                            sym.imp.puts("Bravo");
                                            sym.imp.exit(0);
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
    sym.imp.puts("Mauvais mot de passe");
    uVar2 = 0;
    if (canary != *(int64_t *)(in_FS_OFFSET + 0x28)) {
        uVar2 = sym.imp.__stack_chk_fail();
    }
    return uVar2;
}

We can easily spot functions such as puts(), printf() and fgets(). There is also a checkpassword() function that receives the address of the malloc()-allocated copy of the password read by fgets().

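What does the password check do? In the listing above, right after the call to checkpassword(), each pair of adjacent characters n-1 and n is combined with a bitwise XOR (^) and compared with a constant, and the last character, arg1[9], must equal 0x34. Because XOR is its own inverse, we can start from that last character and work backwards (from bottom to top) to recover the whole password.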

It is of course possible to do this by hand, but computers are made for this and do it very well!

#! /usr/bin/env python

password = [0x34]
password = [password[0] ^ 6] + password
password = [password[0] ^ 0x56] + password
password = [password[0] ^ 0x50] + password
password = [password[0] ^ 0x57] + password
password = [password[0] ^ 0x50] + password
password = [password[0] ^ 0x56] + password
password = [password[0] ^ 0x38] + password
password = [password[0] ^ 0x6f] + password
password = [password[0] ^ 0x69] + password

print(bytes(password).decode('utf-8'))

Test it.

python xor.py
[2]e3c4d24

Verify it.

./elf.bin
Operation Soleil Atomique
Entrez le mot de passe : [2]e3c4d24
Bravo
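
The recovery can also be written as a loop, together with a check that mirrors the nested conditions of the decompiled code. This is only a sketch: XOR_CHAIN, recover() and check() are names chosen here, and the constants are copied from the listing in the order of the if statements.

```python
# Constants from the decompiled checks: arg1[i] ^ arg1[i + 1] == XOR_CHAIN[i]
XOR_CHAIN = [0x69, 0x6F, 0x38, 0x56, 0x50, 0x57, 0x50, 0x56, 0x06]
LAST_CHAR = 0x34  # required value of arg1[9]

def recover(chain, last):
    """Walk the chain backwards: a ^ b == k implies a == b ^ k."""
    chars = [last]
    for k in reversed(chain):
        chars.insert(0, chars[0] ^ k)
    return bytes(chars)

def check(candidate, chain, last):
    """Mirror the nested if statements of the decompiled code."""
    pairs_ok = all(candidate[i] ^ candidate[i + 1] == k for i, k in enumerate(chain))
    return pairs_ok and candidate[len(chain)] == last

password = recover(XOR_CHAIN, LAST_CHAR)
print(password.decode("ascii"), check(password, XOR_CHAIN, LAST_CHAR))
```

Keeping the constants in a list makes it easy to compare them one by one with the decompiler output.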

I think we have reached the end of this challenge. We now have the 4 elements to assemble: [0]aa938a16, [1]4d862d5a, [2]e3c4d24 and [3]4037402d4.

Once assembled in index order inside DGSESIEE{} we get DGSESIEE{aa938a164d862d5ae3c4d244037402d4}, which is the flag expected for this challenge.
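
The assembly step can be sketched as follows. The assemble() helper is hypothetical; it only assumes that every fragment has the form [n]value and that the fragments are concatenated in increasing order of n.

```python
import re

def assemble(parts, prefix="DGSESIEE"):
    """Order '[n]value' fragments by n and wrap the concatenation in prefix{...}."""
    indexed = []
    for part in parts:
        match = re.fullmatch(r"\[(\d+)\](\w+)", part)
        if match is None:
            raise ValueError(f"unexpected fragment: {part!r}")
        indexed.append((int(match.group(1)), match.group(2)))
    body = "".join(value for _, value in sorted(indexed))
    return f"{prefix}{{{body}}}"

# The fragments can be given in any order; they are sorted by index.
fragments = ["[3]4037402d4", "[0]aa938a16", "[2]e3c4d24", "[1]4d862d5a"]
print(assemble(fragments))
```

Sorting by the numeric index avoids mistakes when the fragments were found out of order, as was the case here.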