Need QEMU help: TLS handshake failing (ESP-IDF)

I made a post over at the ESP32 forum, but I’m not sure I’ll get a response there (as this seems to be more of a QEMU problem).

I’ve got a Docker container running ESP-IDF (for development) along with a Mosquitto server for testing MQTT messaging. I can successfully connect and pub/sub to the Mosquitto broker with open MQTT (non-TLS) in QEMU as well as on a physical device. My full application is here: course-iot-with-esp-idf/workspace/apps/mqtts_demo/main/main.c at main · ShawnHymel/course-iot-with-esp-idf · GitHub

Here is my code for connecting over TCP on my physical ESP32 device. The hostname is the IP address of my computer on my local network. This works:

    esp_mqtt_client_config_t mqtt_cfg = {
        .broker.address.hostname = "10.0.0.100",
        .broker.address.transport = MQTT_TRANSPORT_OVER_TCP,
        .broker.address.port = 1883,
        .credentials.username = "iot",
        .credentials.authentication.password = "mosquitto"
    };

    // Initialize MQTT client
    esp_mqtt_client_handle_t mqtt_client = esp_mqtt_client_init(&mqtt_cfg);

    // Register event handler
    esp_ret = esp_mqtt_client_register_event(mqtt_client,
                                             ESP_EVENT_ANY_ID, 
                                             mqtt_event_handler, 
                                             NULL);
    if (esp_ret != ESP_OK) {
        ESP_LOGE(TAG, "Failed to register MQTT event handler (%d)", esp_ret);
        ESP_ERROR_CHECK(esp_mqtt_client_destroy(mqtt_client));
        abort();
    }

    // Start MQTT client
    esp_ret = esp_mqtt_client_start(mqtt_client);
    if (esp_ret != ESP_OK) {
        ESP_LOGE(TAG, "Failed to start MQTT client (%d)", esp_ret);
        ESP_ERROR_CHECK(esp_mqtt_client_destroy(mqtt_client));
        abort();
    }
  
   ...

Here is the esp_mqtt config for connecting over TCP in QEMU. The hostname is the IP address of the host container from within QEMU. This works:

    esp_mqtt_client_config_t mqtt_cfg = {
        .broker.address.hostname = "10.0.2.2",
        .broker.address.transport = MQTT_TRANSPORT_OVER_TCP,
        .broker.address.port = 1883,
        .credentials.username = "iot",
        .credentials.authentication.password = "mosquitto"
    };

Here is the esp_mqtt config for connecting over SSL/TLS on the physical ESP32. This works:

    esp_mqtt_client_config_t mqtt_cfg = {
        .broker.address.hostname = "10.0.0.100",
        .broker.address.transport = MQTT_TRANSPORT_OVER_SSL,
        .broker.address.port = 8883,
        .broker.verification.use_global_ca_store = false,
        .broker.verification.certificate = (const char *)mqtt_ca_cert_start,
        .broker.verification.certificate_len = mqtt_ca_cert_end - mqtt_ca_cert_start,
        .broker.verification.skip_cert_common_name_check = false,
        .broker.verification.common_name = "localhost",
        .credentials.username = "iot",
        .credentials.authentication.password = "mosquitto",
    };

Here is the esp_mqtt config for connecting over SSL/TLS in QEMU. This does NOT work:

    esp_mqtt_client_config_t mqtt_cfg = {
        .broker.address.hostname = "10.0.2.2",
        .broker.address.transport = MQTT_TRANSPORT_OVER_SSL,
        .broker.address.port = 8883,
        .broker.verification.use_global_ca_store = false,
        .broker.verification.certificate = (const char *)mqtt_ca_cert_start,
        .broker.verification.certificate_len = mqtt_ca_cert_end - mqtt_ca_cert_start,
        .broker.verification.skip_cert_common_name_check = false,
        .broker.verification.common_name = "localhost",
        .credentials.username = "iot",
        .credentials.authentication.password = "mosquitto",
    };

With mbedTLS logging on verbose, here is what I see in the terminal:

I (2245) mqtts_demo: Starting MQTT demo
I (2265) eth_qemu: Starting Ethernet...
I (2295) esp_eth.netif.netif_glue: 00:00:00:00:00:03
I (2295) esp_eth.netif.netif_glue: ethernet attached to netif
I (2395) network_wrapper: Waiting for WiFi to connect...
I (2395) eth_qemu: Ethernet started
I (2405) eth_qemu: Ethernet link up
I (2405) eth_qemu: Ethernet MAC address: 00:00:00:00:00:03
I (2405) network_wrapper: Connected to WiFi
I (2405) network_wrapper: Waiting for IP address...
I (3405) esp_netif_handlers: eth ip: 10.0.2.15, mask: 255.255.255.0, gw: 10.0.2.2
I (3405) eth_qemu: Ethernet IPv4 address obtained
I (3405) eth_qemu:   IP address: 10.0.2.15
I (3405) eth_qemu:   Netmask: 255.255.255.0
I (3405) eth_qemu:   Gateway: 10.0.2.2
I (12405) network_wrapper: Connected to IPv4 network
I (12405) mqtts_demo: Connecting to MQTT broker...
I (12455) mbedtls: ssl_tls.c:4606 => handshake

I (12455) mbedtls: ssl_msg.c:2353 => flush output

I (12455) mbedtls: ssl_msg.c:2362 <= flush output

I (12465) mbedtls: ssl_tls.c:4525 client state: MBEDTLS_SSL_HELLO_REQUEST

I (12465) mbedtls: ssl_msg.c:2353 => flush output

I (12465) mbedtls: ssl_msg.c:2362 <= flush output

I (12465) mbedtls: ssl_tls.c:4525 client state: MBEDTLS_SSL_CLIENT_HELLO

I (12465) mbedtls: ssl_client.c:919 => write client hello

--- Error: read failed: socket disconnected
--- Waiting for the device to reconnect................................

Even though the handshake succeeds in the real device, it gets to “write client hello” and then disconnects for some reason in QEMU. Does anyone have any clue as to why this is happening?

Hey @ShawnHymel :wave: I’ve never run an MQTT server/client or similar within QEMU, but my knee-jerk reaction is to validate you can actually get out on port 8883. Maybe you could reconfigure your Mosquito server to run TCP connections (which you’ve already validated) over 8883 instead of 1883 and test your client (in QEMU) using TCP at that port.

// It might be helpful to enable logging at the Transport layer as well in the hopes of getting a better indication of what’s wrong. A good starting tag might be transport_base

#include "esp_log.h"

...

esp_log_level_set("transport_base", ESP_LOG_ERROR);

Hi @stephenmurphy :wave: Thank you so much for helping out. I’m really struggling with this one.

I added esp_log_level_set("transport_base", ESP_LOG_VERBOSE); to the top of my main_app() but I still get the same output:

I (2371) network_wrapper: Connected to network
I (2371) network_wrapper: Waiting for IP address...
I (3371) esp_netif_handlers: eth ip: 10.0.2.15, mask: 255.255.255.0, gw: 10.0.2.2
I (3371) eth_qemu: Ethernet IPv4 address obtained
I (3371) eth_qemu:   IP address: 10.0.2.15
I (3371) eth_qemu:   Netmask: 255.255.255.0
I (3371) eth_qemu:   Gateway: 10.0.2.2
I (12371) network_wrapper: Connected to IPv4 network
I (12371) mqtts_demo: Connecting to MQTT broker...
I (12411) mbedtls: ssl_tls.c:4606 => handshake

I (12411) mbedtls: ssl_msg.c:2353 => flush output

I (12411) mbedtls: ssl_msg.c:2362 <= flush output

I (12411) mbedtls: ssl_tls.c:4525 client state: MBEDTLS_SSL_HELLO_REQUEST

I (12411) mbedtls: ssl_msg.c:2353 => flush output

I (12411) mbedtls: ssl_msg.c:2362 <= flush output

I (12411) mbedtls: ssl_tls.c:4525 client state: MBEDTLS_SSL_CLIENT_HELLO

I (12411) mbedtls: ssl_client.c:919 => write client hello

--- Error: read failed: socket disconnected
--- Waiting for the device to reconnect.......

What’s interesting is that I can run a basic HTTPS request using raw mbed TLS calls (application code here).

I (7459) eth_qemu: Ethernet IPv4 address obtained
I (7459) eth_qemu:   IP address: 10.0.2.15
I (7459) eth_qemu:   Netmask: 255.255.255.0
I (7459) eth_qemu:   Gateway: 10.0.2.2
I (16459) http_request: Connected to IPv4 network
I (16459) http_request: Initializing MbedTLS...
I (16469) http_request: Connecting to www.howsmyssl.com:443...
I (16469) main_task: Returned from app_main()
I (16549) http_request: Connected
I (16559) http_request: Performing SSL/TLS handshake...
I (17319) esp-x509-crt-bundle: Certificate validated
I (17779) http_request: Handshake complete
I (17779) http_request: Verifying peer X.509 certificate...
I (17779) http_request: Certificate verified
I (17779) http_request: Cipher suite is TLS-ECDHE-RSA-WITH-AES-128-GCM-SHA256
I (17779) http_request: Writing HTTP request...
I (17779) http_request: 110 bytes written
I (17779) http_request: Reading HTTP response...
HTTP/1.0 200 OK
Access-Control-Allow-Origin: *
Content-Length: 2647
Content-Type: application/json
Strict-Transport-Security: max-age=631138519; includeSubdomains; preload
Vary: Accept-Encoding
Date: Fri, 21 Feb 2025 20:31:13 GMT

{"given_cipher_suites":["TLS_AES_256_GCM_SHA384","TLS_AES_128_GCM_SHA256","TLS_AES_128_CCM_SHA256","TLS_AES_128_CCM_8_SHA256","TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_ECDSA_WITH_AES_256_CCM","TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384","TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384","TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA","TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA","TLS_ECDHE_ECDSA_WITH_AES_256_CCM_8","TLS_ECDHE_ECDSA_WITH_ARIA_256_GCM_SHA384","TLS_ECDHE_RSA_WITH_ARIA_256_GCM_SHA384","TLS_ECDHE_ECDSA_WITH_ARIA_256_CBC_SHA384","TLS_ECDHE_RSA_WITH_ARIA_256_CBC_SHA384","TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_ECDSA_WITH_AES_128_CCM","TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256","TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256","TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA","TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA","TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8","TLS_ECDHE_ECDSA_WITH_ARIA_128_GCM_SHA256","TLS_ECDHE_RSA_WITH_ARIA_128_GCM_SHA256","TLS_ECDHE_ECDSA_WITH_ARIA_128_CBC_SHA256","TLS_ECDHE_RSA_WITH_ARIA_128_CBC_SHA256","TLS_RSA_WITH_AES_256_GCM_SHA384","TLS_RSA_WITH_AES_256_CCM","TLS_RSA_WITH_AES_256_CBC_SHA256","TLS_RSA_WITH_AES_256_CBC_SHA","TLS_ECDH_RSA_WITH_AES_256_GCM_SHA384","TLS_ECDH_RSA_WITH_AES_256_CBC_SHA384","TLS_ECDH_RSA_WITH_AES_256_CBC_SHA","TLS_ECDH_ECDSA_WITH_AES_256_GCM_SHA384","TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA384","TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA","TLS_RSA_WITH_AES_256_CCM_8","TLS_ECDH_ECDSA_WITH_ARIA_256_GCM_SHA384","TLS_ECDH_RSA_WITH_ARIA_256_GCM_SHA384","TLS_RSA_WITH_ARIA_256_GCM_SHA384","TLS_ECDH_ECDSA_WITH_ARIA_256_CBC_SHA384","TLS_ECDH_RSA_WITH_ARIA_256_CBC_SHA384","TLS_RSA_WITH_ARIA_256_CBC_SHA384","TLS_RSA_WITH_AES_128_GCM_SHA256","TLS_RSA_WITH_AES_128_CCM","TLS_RSA_WITH_AES_128_CBC_SHA256","TLS_RSA_WITH_AES_128_CBC_SHA","TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256","TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256","TLS_ECDH_RSA_WITH_AES_128_CBC_SHA","TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256","TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256","TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA","TLS_RSA_WITH_AES_128_CCM_8","TLS_ECDH_ECDSA_WITH_ARIA_128_GCM_SHA256","TLS_ECDH_RSA_WITH_ARIA_128_GCM_SHA256","TLS_RSA_WITH_ARIA_128_GCM_SHA256","TLS_ECDH_ECDSA_WITH_ARIA_128_CBC_SHA256","TLS_ECDH_RSA_WITH_ARIA_128_CBC_SHA256","TLS_RSA_WITH_ARIA_128_CBC_SHA256","TLS_EMPTY_RENEGOTIATION_INFO_SCSV"],"ephemeral_keys_supported":true,"session_ticket_supported":true,"tls_compression_supported":false,"unknown_cipher_suite_supported":false,"beast_vuln":false,"able_to_detect_n_minus_one_splitting":false,"insecure_cipher_suites":{},"tls_version":"TLS 1.3","rating":"Probably Okay"}Minimum free heap size: 222904 bytes

Done

This tells me that I can perform a TLS handshake with an external server from within a QEMU environment (at least over port 443). I can talk to my local Mosquitto server over port 1883 from within QEMU (so it’s not a blocked port issue), and I can TLS handshake with Mosquitto from my physical device. I’m not sure what I’m missing: is it a QEMU config issue? Or is something missing/wrong in my application code?

A clue! Just after posting, I thought to check doing a manual merge and QEMU run with:

esptool.py --chip esp32s3 merge_bin --output build/merged.bin --flash_mode dio --flash_size 8MB --fill-flash-size 8MB 0x0 build/bootloader/bootloader.bin 0x8000 build/partition_table/partition-table.bin 0x10000 build/app.bin 
qemu-system-xtensa -nographic -machine esp32s3 -drive file=build/merged.bin,if=mtd,format=raw

This gives me an actual crash due to a double free event:

I (2224) mqtts_demo: Starting MQTT demo
I (2224) eth_qemu: Starting Ethernet...
I (2304) esp_eth.netif.netif_glue: 00:00:00:00:00:03
I (2304) esp_eth.netif.netif_glue: ethernet attached to netif
I (2404) eth_qemu: Ethernet started
I (2404) eth_qemu: Ethernet link up
I (2404) eth_qemu: Ethernet MAC address: 00:00:00:00:00:03
I (2404) network_wrapper: Waiting for network to connect...
I (2404) network_wrapper: Connected to network
I (2404) network_wrapper: Waiting for IP address...
I (3404) esp_netif_handlers: eth ip: 10.0.2.15, mask: 255.255.255.0, gw: 10.0.2.2
I (3404) eth_qemu: Ethernet IPv4 address obtained
I (3404) eth_qemu:   IP address: 10.0.2.15
I (3404) eth_qemu:   Netmask: 255.255.255.0
I (3404) eth_qemu:   Gateway: 10.0.2.2
I (12404) network_wrapper: Connected to IPv4 network
I (12404) mqtts_demo: Connecting to MQTT broker...
I (12404) mbedtls: ssl_tls.c:4606 => handshake

I (12404) mbedtls: ssl_msg.c:2353 => flush output

I (12404) mbedtls: ssl_msg.c:2362 <= flush output

I (12404) mbedtls: ssl_tls.c:4525 client state: MBEDTLS_SSL_HELLO_REQUEST

I (12404) mbedtls: ssl_msg.c:2353 => flush output

I (12404) mbedtls: ssl_msg.c:2362 <= flush output

I (12404) mbedtls: ssl_tls.c:4525 client state: MBEDTLS_SSL_CLIENT_HELLO

I (12404) mbedtls: ssl_client.c:919 => write client hello

qemu-system-xtensa: warning: [SHA] Error reading from GDMA buffer
free(): double free detected in tcache 2
Aborted (core dumped)

Google is not turning up anything right away, and I’m worried this is pretty deep in their TLS stack. I’m guessing [SHA] refers to their hashing code? Any suggestions on how to debug this further?

I’m assuming you are using ESP-IDF, if so make sure the “max log level” (I can’t remember the exact config name) is also set to at least VERBOSE in the menuconfig. That might be why you aren’t getting any log output. Also I did just pull that snippet from the latest master branch so it’s also possible that transport_base doesn’t exist but you should be able to verify that by navigating to the transport_ssl.c file in your local IDF install.

My guess is it’s the max log level not being set high enough though.

Here it is. CONFIG_LOG_MAXIMUM_LEVEL

https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-reference/kconfig.html#config-log-maximum-level

So…it works if I idf.py set-target esp32. I had been building for the ESP32S3. It seems there might be a bug in the QEMU emulator for specifically that chip.

I set LOG_MAXIMUM_LEVEL to 5 (verbose). Here’s what I’m getting when I run it for the ESP32S3:

I (12425) mqtts_demo: Connecting to MQTT broker...
D (12425) esp-tls: host:10.0.2.2: strlen 8
D (12425) esp-tls: [sock=54] Resolved IPv4 address: 10.0.2.2
D (12425) esp-tls: [sock=54] Connecting to server. HOST: 10.0.2.2, Port: 8883
V (12425) opencores.emac: emac_opencores_transmit: len=42
V (12425) opencores.emac: emac_opencores_transmit: desc 0 (0x600cd400) len=42 wr=1
V (12425) opencores.emac: emac_opencores_receive: desc 2 (0x600cd418) e=0 len=68 wr=0
V (12425) opencores.emac: emac_opencores_receive: desc 3 (0x600cd420) e=1 len=0 wr=1
V (12425) opencores.emac: emac_opencores_transmit: len=58
V (12445) opencores.emac: emac_opencores_transmit: desc 0 (0x600cd400) len=58 wr=1
V (12445) opencores.emac: emac_opencores_receive: desc 3 (0x600cd420) e=0 len=64 wr=1
V (12445) opencores.emac: emac_opencores_receive: desc 0 (0x600cd408) e=1 len=594 wr=0
V (12445) opencores.emac: emac_opencores_transmit: len=54
V (12445) opencores.emac: emac_opencores_transmit: desc 0 (0x600cd400) len=54 wr=1
V (12445) intr_alloc: esp_intr_alloc_intrstatus (cpu 1): checking args
V (12445) intr_alloc: esp_intr_alloc_intrstatus (cpu 1): Args okay. Resulting flags 0xE
D (12445) intr_alloc: Connected src 77 to int 2 (cpu 1)
D (12445) gdma: new group (0) at 0x3fca6240
D (12445) gdma: new pair (0,0) at 0x3fca6044
D (12445) gdma: new tx channel (0,0) at 0x3fca600c
D (12445) gdma: new rx channel (0,0) at 0x3fca6064
D (12485) esp-tls: handshake in progress...
--- Error: read failed: socket disconnected
--- Waiting for the device to reconnect......

Very odd. I wasn’t aware of this but ESP-IDF looks to use LwIP for the tcp/ip stack which would include the socket management.

Depending how much you want to dig to find the issue you could turn on debugging at the sockets layer. esp_netif_lwip would be a start into this. I know LwIP has plenty of good logging at each of it’s layers/components but I’m not immediately seeing how to enable it’s internal logging via ESP-IDF.

1 Like

Sorry to keep blasting you with responses. Looks like there is a config to enable output but not necessarily which module you want it from. It might get a bit noisy.

// I should keep reading. Yes it does let you specify what you want to enable. That should reveal a lot more information.

https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-reference/kconfig.html#config-lwip-debug

This is only so fresh in my mind because I just finished an mbedtls and LwIP integration on an STM32 based project. Good timing!

1 Like

@stephenmurphy no worries! I enabled the lwip debugging, and it definitely spammed my terminal :sweat_smile:

Considering the built-in https_mbedtls example runs in QEMU for the ESP32 and not the ESP32S3, I’m going to consider this a bug within ESP-IDF. I filed an issue: TLS handshake fails for ESP32S3 in QEMU (IDFGH-14706) · Issue #15446 · espressif/esp-idf · GitHub. I don’t have the time to keep digging, sadly, and considering I have a workaround for students (set the target to esp32 when using QEMU), I’ll just forge ahead with the content I want to teach.

Thank you again for helping me with this!

1 Like