Web content and reputation classification plays a critical role in enforcing security policies and protecting users from threats. Attackers increasingly host malware, phishing kits, and command-and-control infrastructure on legitimate-looking websites. Being able to identify and categorize traffic, even without inspecting encrypted payloads, is essential. Whether it's blocking access to high-risk domains and known malicious websites (phishing, malware, etc.) or preventing access to content not suitable for work environments (adult websites, gambling, etc.), classification allows network and security teams to determine which applications and web destinations should be allowed or denied.
How does Web Content and Reputation filtering work?
HPE Aruba Networking Access Points (APs) and Gateways (GWs) have been performing web content and reputation classification for more than 10 years. This was initially a simple task: many websites were unencrypted, and networks could simply read the plaintext HTTP headers. As the web evolved and HTTPS encryption became the norm, APs and gateways turned to analyzing unencrypted metadata and traffic patterns to identify traffic flows. One key technique, performed by the Deep Packet Inspection (DPI) engine of both APs and Gateways, is inspecting the Server Name Indication (SNI) field of the TLS handshake, which reveals the destination domain in plaintext (unless encrypted via ECH). This destination domain is then checked against OpenText's Threat Intelligence Web Classification and Reputation Service (formerly known as BrightCloud).
This service classifies over 1 billion domains and 43 billion URLs, and processes 1.5 million classifications per day. That's far more than a network device can hold (and keep up to date) in memory. Instead, APs and GWs keep a local cache with the classification of every domain visited by any client device. These entries remain in the cache for 72 to 144 hours (depending on the reputation) and keep getting refreshed as clients revisit those domains. Gateways additionally download, on a daily basis, a local database of the 1 million most representative global URLs.
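The caching behavior can be sketched as a toy model in Python. The TTL values mirror the 72–144 hour window described above, and `lookup` stands in for the cloud query; the mapping from reputation to TTL is an assumption for illustration only:

```python
import time

# Illustrative TTLs spanning the 72-144 hour window (assumed mapping)
TTL_HOURS = {"trustworthy": 144, "suspicious": 72}

class WebCCCache:
    """Toy model of the per-device classification cache: entries expire
    after a reputation-dependent TTL and are refreshed on every hit."""

    def __init__(self, lookup):
        self.lookup = lookup   # callable that queries the cloud service
        self.entries = {}      # domain -> (category, reputation, expires_at)

    def classify(self, domain, now=None):
        now = time.time() if now is None else now
        hit = self.entries.get(domain)
        if hit and hit[2] > now:
            category, reputation, _ = hit          # fresh cache hit
        else:
            category, reputation = self.lookup(domain)  # miss: ask the cloud
        ttl = TTL_HOURS.get(reputation, 72) * 3600
        self.entries[domain] = (category, reputation, now + ttl)  # refresh
        return category
```

As long as clients keep visiting a domain, its entry keeps sliding forward and never expires; only domains nobody visits age out.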
To avoid impacting the user experience when a domain is accessed for the first time, both APs and GWs default to allowing access in the absence of a known classification for that domain. Because the most popular 1M domains are proactively downloaded by the gateways, these can instead be configured to "drop packets during WebCC cache miss". This is the most popular setting in environments where, for decorum or security reasons, administrators want to ensure undesired content can never be accessed, even briefly.
What challenges come with protocols such as QUIC or ECH? How can I overcome them?
As the web becomes increasingly encrypted, networks face a growing challenge: how to classify and control web traffic without doing full SSL/TLS decryption. Privacy enhancements like TLS 1.3, Encrypted Client Hello (ECH), and protocols like QUIC and DNS-over-HTTPS (DoH) limit visibility into traffic content. However, with the right strategies, it’s still possible to implement effective web content filtering without breaking encryption or introducing privacy concerns.
Block DNS-over-HTTPS (DoH)
DoH allows clients to bypass corporate DNS infrastructure by encrypting DNS queries within HTTPS. This creates two problems:
- Bypassing corporate DNS policies by using Google or Cloudflare DNS directly from the browser.
- Breaking SNI-based classification, as DoH is often used alongside ECH (Encrypted Client Hello), which hides the domain name from passive inspection.
Forcing DNS traffic through your corporate servers restores your ability to classify traffic by domain and apply category-based rules. Blocking DoH on HPE Aruba Networking APs and GWs is as simple as blocking access to the "dns-over-https" web category. Here's an example of a policy that would do this:
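The policy amounts to a single deny rule on that web category. The fragment below is an illustrative, ArubaOS-style session ACL sketch — treat it as pseudo-configuration and verify the exact ACE keywords and category names against your platform and release:

```
! Illustrative only — confirm exact syntax for your platform/release
ip access-list session block-doh
  any any any webcc-category dns-over-https deny
  any any any permit
```

With known DoH resolvers denied, clients fall back to ordinary DNS, which your corporate servers can see and enforce.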
Block QUIC (and force fall back to TLS/TCP)
QUIC is a modern transport protocol used by Google, YouTube, Facebook, and others. It runs over UDP and encrypts all handshake data (via TLS 1.3), including SNI, making it opaque to most firewalls.
Blocking QUIC forces clients to fall back to TLS over TCP, where the SNI (Server Name Indication) remains visible and can be inspected for domain classification even without full decryption. To do so in HPE Aruba Networking APs and GWs, create a policy that only allows UDP 443 for corporate-sanctioned applications. Legitimate uses of UDP 443 include streaming applications (which benefit from QUIC's accelerated negotiation) and the WireGuard tunnels often created by ZTNA agents (such as the Atmos agent). A good example would be the following policy, where "streaming" and "wireguard" are allowed, but unknown traffic using UDP 443 is blocked.
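As with DoH, this can be sketched as an ArubaOS-style session ACL. Again, this is illustrative pseudo-configuration — the application and category names ("streaming", "wireguard") and the exact ACE grammar vary by release, so check your CLI reference before deploying:

```
! Illustrative only — confirm app/category names and syntax for your release
ip access-list session limit-quic
  any any app-category streaming permit
  any any app wireguard permit
  any any udp 443 deny
  any any any permit
```

Rule order matters: sanctioned UDP/443 uses are permitted first, the catch-all UDP/443 deny blocks everything else (forcing the TCP fallback), and all other traffic passes through untouched.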
Conclusion
In summary, web content filtering is only as effective as the visibility that supports it. For APs and GWs to reliably enforce robust filtering policies, it's essential to block protocols like DoH and QUIC that obscure domain visibility. Try this out in your network and let us know how it goes!