Understanding the source code of a malicious email

Posted by Riccardo Alfieri on 29 Sep 2020

When an end-user views an email, it isn't always apparent if it's malicious or not. However, if you look at the email's source code, a wealth of information is revealed. By applying specific DNSBLs (blocklists) to a particular element of this source code, you can filter email with great success. But, before we discuss blocklists, you need to understand the anatomy of an email...

An email as it looks to an end-user

Let’s put some context around the difference between the information that an end-user sees when viewing a malicious email and the actual data contained within the source code. Firstly, take a look at the email below:

Figure 1 | End-user view of an email

Above is a typical example of how a spam email displays via end-user client software, such as Apple Mail or Outlook.

In the rendering of this email, some elements of the email content may start to ring some warning bells, i.e., an HTML link, a bitcoin address and an attachment. Unfortunately, to a casual user, these elements will probably seem benign.

This is exactly what cybercriminals are counting on. They want to exploit how an email appears to the average end-user, keeping elements hidden that could point to the potentially dangerous nature of the email.

To see what an email really contains, we have to unpack it and look at the “source code” or “raw code,” viewing the email header and email content.

An example of email source code

Take a deep breath – there is a considerable amount of information contained within email source code. However, we will explain it all, hopefully without boring you to death (or at least that’s the plan!)

Below is the same email as pictured above, only in “raw” format. It illustrates what information is seen and recorded by internet servers as they send and deliver email messages.

Email source code elements explained

The Received Chain

Think of the “Received” chain as a paper trail of all the servers that have handled your email, including your mail server. The reading order is “final connection/most recent” at the top, and the preceding connections are listed in reverse order, culminating with the “originating computer/oldest” at the bottom.

These are vital elements because they include all the IP addresses of the email servers that have handled your email and information like the HELO string and reverse DNS (rDNS[1]) of the IPs.

“HELO“ is a required element in a Simple Mail Transfer Protocol (SMTP) transaction. It is similar to the greeting that a postman might give to a colleague when handling over postal mail: “Hello, I am John from The Postal Service, and I have mail for someone on your route.”

Let’s take a deeper dive into our example email’s received chain, starting at the top and reading down:

Received Chain C – The most recent received line:

This is the most recent and most trusted “Received” line because the recipient’s mail server has registered it; therefore, it’s guaranteed not to have been tampered with.

In an attempt to confuse anti-spam systems, miscreants will sometimes try to scramble the “Received” line chain. With this in mind, you should pay particular attention to the subsequent “Received Lines” following on from this most trusted one.

The key elements in Line C are:

Connecting IP: 192.0.4.1 (#1)
rDNS of connecting IP: mta-out.someisp.example (#2)
HELO string: mta-out.someisp.example (#3)

Explanation: the sending mail server “mta-out.someisp.example” (#3) makes a connection to the recipient’s mail server “mx.email.example”, announcing itself via HELO with the IP “192.0.4.1″ (#1) and the rDNS of “mta-out.someisp.example” (#2). This is the final connection – if it is accepted (and in this case it was), then the email addressed to <[email protected]> is delivered.

Received Chain B:

This is where “smtp.someisp.example” acknowledges receipt of the email for the destination email address <[email protected]>.

Explanation: the mail server “smtp.someisp.example” connects to the outbound (Postfix) mail server “mta-out.someisp.example” for that ISP, telling the outbound that it has mail to be sent to <[email protected]>.

Received Chain A – the original transaction

This line represents the original email transaction that generated the email. Since this is the initial transaction, the pertinent detail here is the connecting IP.

The key elements in Line A are:

Originating local machine : [192.168.1.2]
rDNS of connecting IP: unknown
Connecting IP: 192.0.2.1 – (#5)

Explanation: the originating local machine, “192.168.1.2″, makes an outgoing connection to “smtp.someisp.example” and announces itself as “192.0.2.1″. Upon connection, the server “smtp.someisp.example” tries to look up the rDNS of the connecting IP, “192.0.2.1″ (#5), and fails, returning in an “unknown” result.

More email source code elements

Return-Path (#4)

The “Return-Path” is also known as the “Envelope-From” or “Mail-From.” This is the “actual” email address that sent the email.

In terms of old-fashioned snail-mail, the “Return-Path” is similar to a local postal services stamp, i.e., it is an accurate indicator of the letter’s origin. However, when you’re reading the actual contents of the letter, you can’t see the postal services stamp on the envelope.

In the case of email, you can’t see the sender’s real email address in the visual rendering of an email, as illustrated in figure 1.

Reply-To address (#6)

Upon clicking “Reply” to an email, this field serves to instruct your email client software regarding what email address to insert in the “To” field. Usually, with legitimate emails, the “Reply-to” and “From” in the original email will be the same.

Our email example shows that the apparent sender is “[email protected],” but if you were to reply to this email, the address would populate as “[email protected]”… and the bad actors are hoping you won’t notice that.

Sender-Address (From) (#7)

This is also commonly referred to as the “Friendly From.” The sender-address is comparable to the signature of the person who signs a letter you’ve received through the post, i.e., they can sign it from anyone.

In our example, the “Return-Path / Envelope-From” (#1) is different from the “Sender-Address (From)” (#7), aka “Friendly-From.” The Friendly-From name, which is visible to an end-user, is how the sender wants to appear to the recipient. Typically, it includes the sender’s name and email address.

Why do miscreants use different Return-Path and Friendly-From email addresses?

That’s simple – to trick people! Bad actors want the recipient of the email to believe it’s from someone credible, important, or someone who is known to them. They are trying to add legitimacy and believability.

Remember! In most cases, email client software only shows the Friendly-From. It’s like opening a letter and throwing away the envelope before handing it to someone to read; they only see the writing on the actual letter.

Website (#8)

The end-user has no visibility of the website address (URL) in the email they are viewing. This is because the sender has concealed the URL with a hyperlink associated with “click here.” The actual website URL is “www.worldlotterywinner.example” .

To see where a link will take you, hover over the link, and the details of the URL will display. In many cases, the URLs included in malicious emails will be directing you to a “spamvertised” product, a phishing website, something unsavory, or a malware dropper site.

Bitcoin Wallet (#9)

A Bitcoin “wallet” is a crypto-address controlled by cybercriminals, often used as an avenue for payment in sextortion spam campaigns. These are campaigns where the bad actor claims to have recorded the victim indulging in some “private activities.” They threaten to share the recording with work, family, and friends unless the cybercriminal receives payment via the bitcoin wallet.

There are many variations on this theme, playing on people’s fears with the sole intent of extorting money.

Attachment (#10)

In this example, the attachment is an Office document called “Further instructions.doc.” More than likely, it will be “Ëœweaponized’ in some shape or form. By this, we mean that if you were to download and open the document, malware would infect your computer. The cybercriminal can then use your computer for whatever purpose they want. This can include sending more spam, exfiltrating your personal data/passwords, or utilizing your computer in a distributed denial-of-service (DDoS) attack.

Congratulations – you made it through! We hope this deep dive into email source code helps provide some clarity.

What blocklists to apply to email source code

With so many email elements to check, it can become confusing as to what blocklist to use at what point in the email filtering process. Take a look at this article to understand what Spamhaus blocklists to use and where to use them to maximize your catch rates.

The domain name system looking up (or querying) the domain name associated with an IP address. https://en.wikipedia.org/wiki/Reverse_DNS_lookup

Blog

Understanding the source code of a malicious email

An email as it looks to an end-user

An example of email source code

Email source code elements explained

The Received Chain

Received Chain C – The most recent received line:

Received Chain A – the original transaction

More email source code elements

Return-Path (#4)

Reply-To address (#6)

Sender-Address (From) (#7)

Website (#8)

Bitcoin Wallet (#9)

Attachment (#10)

What blocklists to apply to email source code

Related Products

Data Query Service (DQS)

Proactive & preventative

Save on email infrastructure & management costs

Actionable

Resources

Additional protection with an expanding CSS dataset

Busting Blocklist Myths

BEST PRACTICE | DNSBLs and email filtering – how to get it right