What is URLhaus?
URLhaus is a platform by abuse.ch, one of the most well-regarded specialists in malware and botnet command and control (C&C) data. It provides extensive intelligence on malicious URLs that are being used for malware distribution. It can be used for threat hunting, research, automated ingestion into security information and event management (SIEM) tools and threat intelligence platforms (TIPs), incident response and much more.
To understand more about the data release, read here. If you’re after more detail on how the data might be used, read on.
Example 1: For manual investigations
Take a daily task for a Security Operations Center (SOC) analyst: monitoring their SIEM or TIP for any signals that deviate from a defined normal range.
The SOC analyst has noticed that, very recently, new connections have been made to an IP that falls outside of “normal”. Walking back through endpoint protection data reveals a new application on a number of endpoints within their company.
In a reactive response, a quick Google search reveals that this suspicious application is part of a software bundle that came packaged with perfectly legitimate and needed software. The SOC analyst acquires a copy of the application, analyzes it in a sandbox, and observes the sample attempting to connect to 101.52.247.105. The analyst, who has access to SIA, also wants to be proactive and see if the IP is known to be malicious. The following request is made (the IP, token and other values in these examples are placeholders):
curl -H 'Authorization: Bearer 12345TOKEN67890' \
  https://api.spamhaus.org/api/intel/v1/byobject/cidr/all/listed/live/1.2.3.4
With the following output:
{
"code": 200,
"results": [
{
"dataset": "BCL",
"ipaddress": "1.2.3.4",
"asn": "1234",
"cc": "CN",
"listed": 1717150475,
"seen": 1717150475,
"valid_until": 1717636836,
"abused": false,
"botname": "win.cobalt_strike",
"botname_malpedia": "win.cobalt_strike",
"dstport": 443,
"lat": xxx.7732,
"lon": xxx.722,
"shared": false
}
]
}
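The `listed`, `seen` and `valid_until` values above look like Unix epoch seconds, so a quick conversion makes it easier to judge how current a listing is. The following is a minimal sketch under that assumption; the helper names `to_utc` and `is_current` are illustrative only, and treating `valid_until` as the cut-off for "current" is a rule of thumb rather than documented API behavior.

```python
from datetime import datetime, timezone

# Record trimmed from the example response above; only a few fields are kept.
record = {
    "dataset": "BCL",
    "ipaddress": "1.2.3.4",
    "listed": 1717150475,
    "seen": 1717150475,
    "valid_until": 1717636836,
}


def to_utc(epoch_seconds: int) -> datetime:
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def is_current(rec: dict, now: datetime) -> bool:
    """Assumed rule of thumb: a listing is current until `valid_until` passes."""
    return now <= to_utc(rec["valid_until"])


for field in ("listed", "seen", "valid_until"):
    print(field, to_utc(record[field]).isoformat())

print("current now:", is_current(record, datetime.now(timezone.utc)))
```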
The response indicates that the IP is known to Spamhaus and has been observed hosting a malicious C&C server, and the timestamps show the listing to be very current at the time of searching. The analyst wants to broaden the search and see what malicious content exists not only at the IP, but across the ASN. The analyst uses the query:
curl -H 'Authorization: Bearer 12345TOKEN67890' \
  https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"asn":1234}'
This gives the output:
{
"ts": 1716425692,
"id": 2752947,
"url": "http://1.2.3.5/app/view/ta.sh"
}
{
"ts": 1716425210,
"id": 1234567,
"url": "http://1.2.3.5/app/view/ta.sh"
}
{
"ts": 1716196759,
"id": 1234567,
"url": "http://1.2.3.5/app/view/ta.sh"
}
{
"ts": 1716196159,
"id": 1234567,
"url": "http://1.2.3.5/app/view/ta.sh"
}
{
"ts": 1716193295,
"id": 1234567,
"url": "http://1.2.3.5/app/view/ta.sh"
}
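Output in this shape is a run of separate JSON objects rather than one JSON array, so `json.loads` on the whole body would fail. The sketch below assumes the response really does arrive as back-to-back objects, as printed above, and reduces the sightings to distinct URLs with their first and last `ts`. The function names are hypothetical.

```python
import json


def iter_json_objects(text: str):
    """Yield each JSON value from a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def summarize_urls(records):
    """Group sightings by URL, keeping the earliest and latest `ts` for each."""
    summary = {}
    for rec in records:
        entry = summary.setdefault(
            rec["url"], {"first": rec["ts"], "last": rec["ts"], "count": 0}
        )
        entry["first"] = min(entry["first"], rec["ts"])
        entry["last"] = max(entry["last"], rec["ts"])
        entry["count"] += 1
    return summary


# Three of the records shown above, used as sample input.
sample = """
{"ts": 1716425692, "id": 2752947, "url": "http://1.2.3.5/app/view/ta.sh"}
{"ts": 1716425210, "id": 1234567, "url": "http://1.2.3.5/app/view/ta.sh"}
{"ts": 1716193295, "id": 1234567, "url": "http://1.2.3.5/app/view/ta.sh"}
"""

print(summarize_urls(iter_json_objects(sample)))
```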
The results show a single URL, reported several times. Next, the analyst runs a quick check of that URL with the following query:
curl -H 'Authorization: Bearer 12345TOKEN67890' \
  https://api.spamhaus.org/api/intel/v2/byobject/url/last -X POST -d '{"url":"http://1.2.3.5/app/view/ta.sh"}'
This reveals that the single malicious URL at the ASN is still online and active. The ability to build a picture from a single online data source makes it a valuable research tool and reduces the time, tooling and energy required when using multiple online CTI vendors. The analyst can also be confident that the data supplied is regularly validated and timestamped, which helps in understanding the timeline of an entity’s activity.
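For anyone scripting this investigation rather than running curl by hand, the three requests in this example could be built in Python roughly as follows. This is a sketch using only the standard library; the endpoint paths, token and payloads are copied from the curl commands above, while the helper names `ip_lookup_request` and `url_request` are made up for illustration. The requests are only constructed and printed here, with the network call left commented out.

```python
import json
import urllib.request

API_BASE = "https://api.spamhaus.org/api/intel"
TOKEN = "12345TOKEN67890"  # placeholder token from the curl examples


def ip_lookup_request(ip: str) -> urllib.request.Request:
    """GET request for live listings covering one IP (v1 path as shown above)."""
    url = f"{API_BASE}/v1/byobject/cidr/all/listed/live/{ip}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {TOKEN}"})


def url_request(action: str, payload: dict) -> urllib.request.Request:
    """POST request to the v2 URL endpoints; `action` is "search" or "last"."""
    url = f"{API_BASE}/v2/byobject/url/{action}"
    body = json.dumps(payload).encode("utf-8")
    # As with `curl -d`, no Content-Type is set, so the client default applies.
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {TOKEN}"},
    )


steps = [
    ip_lookup_request("1.2.3.4"),
    url_request("search", {"asn": 1234}),
    url_request("last", {"url": "http://1.2.3.5/app/view/ta.sh"}),
]

for req in steps:
    print(req.get_method(), req.full_url)
    # To send for real (needs network access and a valid token):
    # with urllib.request.urlopen(req, timeout=30) as resp:
    #     print(resp.read().decode("utf-8"))
```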
Example 2: Automation into a threat intelligence platform
This example is for organizations that have more automated consumption pipelines and mature tooling, such as national CERTs.
National CERTs, and the security bodies that supply them, often request information split by geography. Confidently assigning geography to entities such as IPs is difficult. Proxies, VPNs and other techniques mean this difficulty will not be solved soon, and IP signals will remain dynamic.
A less dynamic source of information is the ASN. ASNs are allocated to organizations with registered addresses, and are more consistently visible over time. The networks they identify also route the traffic of the static and dynamic IPs within their remit. Qualifying malicious activity by ASN is a useful method of organizing threats across the globe within a threat intel platform (TIP).
For example, a broad start may be to take the ASN (1234) of Well Known Ltd, an ISP in the United States, and consume the malicious URLs listed for it, because one of your clients is US-based and does business primarily in the US:
curl -H 'Authorization: Bearer 12345TOKEN67890' \
  https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"asn":1234}'
This would give:
{
"ts": 1714924701,
"id": 1234567,
"url": "http://1.2.3.4:49459/i"
}
{
"ts": 1714923430,
"id": 1234567,
"url": "http://1.2.3.4:49459/i"
}
{
"ts": 1714922580,
"id": 1234567,
"url": "http://1.2.3.4:49459/i"
}
{
"ts": 1714921302,
"id": 1234567,
"url": "http://1.2.3.4:49459/i"
}
{
"ts": 1714920656,
"id": 1234567,
"url": "http://1.2.3.4:49459/i"
}
...more...
Equally, the starting point could be a newly discovered malicious IP, with its associated ASN used as the search query.
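Going from a malicious IP to an ASN search can be automated by reading the `asn` field out of the IP lookup response and turning it into a search body. Note that the lookup response in Example 1 shows the ASN as a string ("1234"), while the search query passes a number (1234), so the sketch below converts it. The function name is hypothetical and the response is a trimmed copy of the earlier example.

```python
import json

# Trimmed copy of the IP lookup response from Example 1.
ip_lookup_response = {
    "code": 200,
    "results": [
        {"dataset": "BCL", "ipaddress": "1.2.3.4", "asn": "1234", "cc": "CN"},
    ],
}


def asn_search_payloads(response: dict) -> list[str]:
    """Return one JSON search body per distinct ASN found in the lookup results."""
    asns = []
    for result in response.get("results", []):
        raw = result.get("asn")
        if raw is None:
            continue
        asn = int(raw)  # lookup shows "1234" (string); search uses 1234 (number)
        if asn not in asns:
            asns.append(asn)
    return [json.dumps({"asn": asn}) for asn in asns]


# Each string can be sent as the POST body of the url/search request.
print(asn_search_payloads(ip_lookup_response))
```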
In another scenario, an organization with a TIP can see potentially malicious events unfolding. The correlation of signals shows that IPs linked to businesses they protect need to be investigated further. This correlation of event data can trigger the consumption of malicious URL data from Spamhaus, allowing the organization to accurately identify potential malicious events affecting the businesses they protect, and the causes of those events.
Spamhaus is well placed to support the mature and specific data requirements of automated consumption, with established experience in this space. Users gain a source of data, rich with reputational signal on internet entities, that enables drilling down into the malicious methods being used to harm users, all from one API.
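One simple form of this correlation is to check whether the host of each malicious URL falls inside a network belonging to a protected business. The sketch below assumes URL records shaped like the search output above; the `PROTECTED_NETWORKS` list and the function names are hypothetical, and a real TIP would likely correlate on many more signals.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical networks belonging to the businesses being protected.
PROTECTED_NETWORKS = [ipaddress.ip_network("1.2.3.0/24")]


def host_ip(url: str):
    """Return the URL's host as an IP address, or None if it is a hostname."""
    host = urlsplit(url).hostname
    try:
        return ipaddress.ip_address(host) if host else None
    except ValueError:
        return None


def records_on_protected_networks(records):
    """Keep URL records whose host IP falls inside a protected network."""
    matches = []
    for rec in records:
        ip = host_ip(rec["url"])
        if ip is not None and any(ip in net for net in PROTECTED_NETWORKS):
            matches.append(rec)
    return matches


records = [
    {"ts": 1714924701, "id": 1234567, "url": "http://1.2.3.4:49459/i"},
    {"ts": 1714923430, "id": 1234567, "url": "http://example.com/i"},
]
print(records_on_protected_networks(records))
```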
Access the data – for free!
The URLhaus data via SIA is released as a beta version. Test out the data and influence ongoing product enhancements before a production-ready release. You can gain a long-term commercial license here – but to test out the data and share your feedback, sign up to the Developer License program here.
The Developer License is offered free for six months, with more details here. Happy hunting!