Product Details

What data is listed?

  • Approximately 7.5 million listings in total
  • Up to 75,000 newly observed IPs added every 24 hours
  • An average of 1 million “refreshed” listings added every 24 hours

This dataset contains IP addresses that show signs of compromise, including:

  • Malware infections
  • Worm infections
  • Trojan infections
  • Devices controlled by botnet command and control (C&C) servers
  • Third-party exploits, such as open proxies

Every IP address carries associated metadata, including bot names, first seen date, and valid until date. Both historical and live data are accessible to provide additional insight.

Benefits of Spamhaus Intelligence API

  1. Breadth

    Includes both live and historical data relating to IPs showing signs of compromise, with access to 20 different fields per infected IP.

  2. Control & convenience

    API access makes the data easily consumable across multiple applications, without the need to download the entire dataset.

  3. Real-time updates

    Threats are included in this dataset as soon as researchers observe them.

Data limits

Free usage of the beta service is limited to 1,000 queries per day and 20,000 queries per month in total.
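A client can keep its own count so that it stops querying before either limit is reached. The following is a minimal sketch, not part of the API: QuotaTracker and try_consume are hypothetical names, and how the service itself defines day and month boundaries (for example, which time zone) is an assumption left to the caller.

```python
from collections import Counter
from datetime import date

DAILY_LIMIT = 1_000     # free beta tier, queries per day
MONTHLY_LIMIT = 20_000  # free beta tier, queries per month


class QuotaTracker:
    """Client-side query counter (hypothetical helper, not part of the API)."""

    def __init__(self):
        self.per_day = Counter()
        self.per_month = Counter()

    def try_consume(self, today: date) -> bool:
        """Record one query if both budgets allow it; return False otherwise."""
        month = (today.year, today.month)
        if self.per_day[today] >= DAILY_LIMIT:
            return False
        if self.per_month[month] >= MONTHLY_LIMIT:
            return False
        self.per_day[today] += 1
        self.per_month[month] += 1
        return True
```

Calling try_consume before each request and skipping the request when it returns False keeps usage inside both budgets. Note that querying at the full daily rate exhausts the monthly budget after 20 days (20,000 / 1,000), so a steady consumer may prefer to pace itself below the daily cap.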

Included Fields

ipaddress

The IP address identified as the source of the bot-generated traffic. Always provided.

botname

The bot name associated with the detected activity. Where the detection can’t be clearly associated with a bot, “unknown” will be returned. Always provided.

seen

Unix timestamp of the last detected event for the given IP and the given bot name. Always provided.

first seen

Unix timestamp (rounded to the minute) of the first detection event for this IP and bot name combination. This will match the value of “seen” if it’s the first sighting of this type on this particular IP. When there has been no activity for this given combination for a month, this field is reset. Always provided.

listed

The Unix timestamp (rounded to the minute) of when the entry reached our database. Usually, this is very close to the value of “seen” unless the data is coming from batched processes. Always provided.

valid_until

Unix timestamp (rounded to the minute) of when the given entry will be considered “expired” from our dataset. Always provided.
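Because seen, listed, and valid_until are all Unix timestamps, they can be converted and compared directly. The sketch below assumes a record has already been parsed into a Python dict keyed by the field names above; the helper names and the sample values are hypothetical, and treating a record as expired exactly at valid_until is an assumption.

```python
from datetime import datetime, timezone


def to_utc(ts: int) -> datetime:
    """Convert a Unix timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def is_expired(record: dict, now: datetime) -> bool:
    """True once 'now' has reached the record's valid_until time."""
    return to_utc(record["valid_until"]) <= now


def listing_delay_seconds(record: dict) -> int:
    """Seconds between the last detected event and the database entry."""
    return record["listed"] - record["seen"]


# Hypothetical sample values, for illustration only.
sample = {"seen": 1700000000, "listed": 1700000040, "valid_until": 1700604840}
print(to_utc(sample["seen"]).isoformat())
print(listing_delay_seconds(sample))
print(is_expired(sample, datetime.now(timezone.utc)))
```

A large listing delay would be consistent with the batched processing mentioned under “listed”.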

detection

Human-readable form, briefly describing how the data was collected. This field only appears when the heuristic can involve multiple ways of collecting said data.

rule

An internal ID identifying the rule that produced the detection. Detections produced by different means or rules will show different IDs, even when they refer to the same detection. Always provided.

dstport

The destination port of the traffic triggering the detection. Not always disclosed/available.

helo

When detection results from SMTP traffic, this is the HELO string used in the SMTP session triggering the detection.

helos

An array enumerating all the HELO strings involved in the detection. Appears only in records for the MPD heuristic.

heuristic

The heuristic applied to generate the detection. This returns a limited number of possible values.

asn

The Autonomous System Number (ASN) announcing the IP, predominantly obtained from routeviews data.

lat

Geographic latitude of the IP. Only provided when geolocation data is available.

lon

Geographic longitude of the IP. Only provided when geolocation data is available.

cc

The ISO Country Code of the nation where the IP resides. Only provided when geolocation data is available.

protocol

IP protocol of the traffic triggering the detection. Usually either UDP or TCP.

srcip

Source IP of the traffic triggering the detection. Except in rare cases, this matches the IP address that is the subject of the listing.

uri

Specific to the “SINKHOLE” heuristic, and to HTTP sinkhole detections only. This is the URI of the HTTP request triggering the listing. Not always available.

useragent

Specific to the “SINKHOLE” heuristic, and to HTTP sinkhole detections only. This is the User-Agent header of the HTTP request triggering the listing. Not always available.

domain

Mostly specific to the “SINKHOLE” heuristic, and to HTTP sinkholes in particular. This is the domain or hostname that the traffic triggering the detection was trying to reach, i.e., the sinkholed domain. Often obtained from the “host” header of the HTTP request triggering the listing. Not always available.
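Since some fields are always provided and others appear only for certain heuristics, consuming code benefits from checking the former and reading the latter defensively. The following is a minimal sketch under the assumption that each record arrives as a dict keyed by the field names listed above; validate_record and sinkhole_details are hypothetical helpers, not part of the API.

```python
import ipaddress

# Fields the list above marks "Always provided". "first seen" carries the same
# marking but is left out here, because its exact key spelling in a response
# is not something this sketch wants to assume.
ALWAYS_PROVIDED = ("ipaddress", "botname", "seen", "listed", "valid_until", "rule")

# Fields described as specific (or mostly specific) to HTTP sinkhole detections.
SINKHOLE_FIELDS = ("uri", "useragent", "domain")


def validate_record(record: dict) -> list:
    """Return a list of problems found; an empty list means the record looks sane."""
    problems = [f"missing field: {name}" for name in ALWAYS_PROVIDED if name not in record]
    if "ipaddress" in record:
        try:
            ipaddress.ip_address(record["ipaddress"])
        except ValueError:
            problems.append("ipaddress is not a valid IP address")
    return problems


def sinkhole_details(record: dict) -> dict:
    """Read the optional sinkhole fields, using None for any that are absent."""
    return {name: record.get(name) for name in SINKHOLE_FIELDS}
```

Reading optional fields with dict.get avoids a KeyError on records where a field such as uri or useragent is not available.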

Incident response

Integrate this data with your current applications to gain visibility into where issues have occurred, enabling you to rapidly connect the dots.

Online real-time risk assessment

Where online transactions or account creations are occurring, check whether the connecting IP address carries additional risk, i.e., whether it is compromised.
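Such a check can be reduced to a small decision function over whatever listing records a lookup returns for the connecting IP. This sketch assumes the records are dicts with the botname, seen, and valid_until fields described earlier; assess_ip and its return shape are hypothetical, and ignoring records whose valid_until has passed is a policy choice, not something the service prescribes.

```python
def assess_ip(records: list, now_ts: int) -> dict:
    """Summarize the unexpired listings for one connecting IP address."""
    active = [r for r in records if r["valid_until"] > now_ts]
    return {
        "compromised": bool(active),
        "botnames": sorted({r["botname"] for r in active}),
        "last_seen": max((r["seen"] for r in active), default=None),
    }
```

An application could, for example, require additional verification for a sign-up when compromised is True, rather than rejecting it outright.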

Monitoring trends

Historical data enables you to build a picture of a compromised IP address over time, monitor trends, and create visualizations.
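As one way to turn historical records into a trend, detections can be bucketed by UTC day and bot name using the seen timestamp. This is a minimal sketch assuming records are dicts with the seen and botname fields described earlier; daily_counts is a hypothetical helper and bucketing by UTC day is an arbitrary choice.

```python
from collections import Counter
from datetime import datetime, timezone


def daily_counts(records: list) -> Counter:
    """Count records per (UTC day, botname) pair, keyed by ISO date string."""
    counts = Counter()
    for r in records:
        day = datetime.fromtimestamp(r["seen"], tz=timezone.utc).date().isoformat()
        counts[(day, r["botname"])] += 1
    return counts
```

The resulting counter can be fed to any plotting or reporting tool to visualize how activity for a given bot name changes from day to day.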