UpGuard flags massive U.S. dataset containing billions of emails and Social Security numbers

🇺🇸 United States, 🇩🇪 Germany

cybersecurity, cloud hosting, data-brokers, identity protection

Wed, Feb 18, 2026

InsightsWire News, 2026

Discovery and removal. A cybersecurity research team uncovered a large, openly reachable data repository during a January sweep and traced its hosting to the German cloud firm Hetzner. The researchers could not identify a clear owner to contact, so they notified the host; the provider reported that its customer removed the resource on January 21. Because of the dataset’s size and sensitivity, the team avoided downloading the full corpus and instead analyzed a representative subset.

Contents and validation. Aggregated counts reported by the investigators included roughly 3 billion email/password entries and around 2.7 billion records tied to Social Security numbers. From a sampled pool of about 2.8 million rows, validation checks suggested that approximately one quarter of the SSNs appeared legitimate; applying that rate to the full 2.7 billion records yields an estimated 675 million potentially valid SSNs. Cultural markers embedded in password text pointed to U.S.-origin credentials concentrated around the mid-2010s, indicating that many elements may be recycled from older breaches.
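The extrapolation above can be reproduced in a few lines. The sketch below is illustrative only: the article does not describe which validation checks the investigators ran, so `is_plausible_ssn` is a hypothetical stand-in that applies only the publicly documented structural rules for SSNs (area number not 000, 666, or 900 to 999; group number not 00; serial number not 0000). Passing this check does not mean a number was ever issued. The function names are assumptions introduced for this example.

```python
import re

# Nine digits, with optional hyphens after the area and group numbers.
SSN_PATTERN = re.compile(r"(\d{3})-?(\d{2})-?(\d{4})")


def is_plausible_ssn(value: str) -> bool:
    """Structural check only (hypothetical stand-in for the team's validation)."""
    match = SSN_PATTERN.fullmatch(value.strip())
    if match is None:
        return False
    area, group, serial = match.groups()
    # Area numbers 000, 666 and 900-999 are never assigned.
    if area in ("000", "666") or area.startswith("9"):
        return False
    # Group 00 and serial 0000 are never assigned.
    if group == "00" or serial == "0000":
        return False
    return True


def extrapolate_valid(total_records: int, valid_fraction: float) -> int:
    """Scale a validity rate measured on a sample up to the full record count."""
    return round(total_records * valid_fraction)


# Figures from the article: about 2.7 billion SSN-linked records, about 25% valid.
estimate = extrapolate_valid(2_700_000_000, 0.25)
print(f"{estimate:,}")  # 675,000,000
```

The estimate is only as good as the sample: it assumes the roughly 2.8 million sampled rows are representative of the full set, which the article does not establish.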

Possible origins and attacker tradecraft. While investigators did not identify a single provenance for the aggregation, similar high-volume caches have often arisen in two ways: large-scale recombinations of historical breach dumps, or direct exfiltration from infected endpoints via commodity infostealer malware. The latter typically harvests locally stored credentials, session tokens and browser-stored secrets, creating a heterogeneous mix that can include streaming, social, government, and cryptocurrency-related logins. That landscape makes it difficult to attribute a dataset’s origin without deeper forensic traces, but it does mean the exposed trove could contain both recycled and device-sourced data.

Risk profile and persistence. Two structural issues amplify danger: wide reuse of login data across services and the permanent nature of SSNs as identity anchors, making them especially valuable to fraud actors. Crucially, interviews conducted by the team found that a nontrivial portion of affected people had not yet experienced misuse, implying the database contains latent, unexploited material. Because threat actors routinely recombine and resell historical leaks—or harvest live credentials from infected endpoints—an aggregated mega-set raises the odds of large-scale account takeover and identity fraud even years after the original intrusions.

Operational implications and mitigation. Responders should treat this kind of discovery as an active threat: prioritize notification and remediation for high-value targets, force credential resets and session revocations where possible, and scan for exposed credentials on underground markets. At a systems level, platform defenders and cloud providers should accelerate automated exposure detection, session revocation tooling and broader adoption of hardware-backed multi-factor authentication. For individuals, eliminating password reuse, enabling MFA, and improving endpoint hygiene (including anti-malware controls and limiting stored credentials in browsers) reduce the most immediate exploitation vectors.
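One way to check whether a password already circulates in known breach corpora, without disclosing the password itself, is a hash-prefix (k-anonymity) lookup. The article does not name any tool; the sketch below assumes the response format used by the Pwned Passwords range service, where the client sends the first five hex characters of the SHA-1 digest and receives lines of `SUFFIX:COUNT`. The network request is deliberately omitted, and the function names are hypothetical.

```python
import hashlib


def split_sha1(password: str) -> tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, range_body: str) -> int:
    """Look up a suffix in a range response made of 'SUFFIX:COUNT' lines.

    Returns the reported breach count, or 0 if the suffix is absent.
    """
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix.upper():
            return int(count)
    return 0


# Only the prefix would leave the machine; the suffix is compared locally.
prefix, suffix = split_sha1("password")

# Made-up response body for illustration; real counts would come from the service.
fake_response = f"{'0' * 35}:3\n{suffix}:42\n"
print(prefix, count_in_range(suffix, fake_response))
```

Because only a five-character prefix is transmitted, the lookup service cannot tell which of the many matching hashes the client was interested in.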
