Data Sources Explained

Understand where our breach data comes from and what types of information we index.

ReconX Team · 2 min read

Where Does Our Data Come From?

ReconX indexes breach data from multiple sources to provide comprehensive coverage. Understanding these sources helps you interpret your search results.

Stealer Logs

Information-stealing malware (an "infostealer") collects data from infected computers. This data often includes:

  • Saved browser passwords and cookies
  • Autofill data from forms
  • Cryptocurrency wallet files
  • Desktop files and documents
  • Application credentials

Stealer logs are particularly valuable because the credentials in them tend to be recent and still in use, and each one usually comes with context showing which service it belongs to.

Database Dumps

When websites and services are breached, attackers often dump the entire user database. These dumps typically contain:

  • Email addresses and usernames
  • Password hashes (sometimes plaintext)
  • Registration dates and IP addresses
  • Profile information

Database dumps reveal exposure at specific services and can indicate how your password was stored: as a salted hash, as an unsalted hash, or in plaintext.
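As a rough illustration of how the format of a password field can hint at the storage scheme, here is a minimal Python sketch. The function name classify_password_field is hypothetical and not part of ReconX. The prefix and length checks are common heuristics only: several algorithms produce output of the same length, so treat the result as a guess rather than a verdict.

```python
import re

def classify_password_field(value: str) -> str:
    """Guess how a password field from a dump was stored (heuristic only)."""
    # bcrypt hashes carry a version prefix and embed their own salt.
    if value.startswith(("$2a$", "$2b$", "$2y$")):
        return "bcrypt (salted hash)"
    # Argon2 encoded hashes start with "$argon2" and also embed a salt.
    if value.startswith("$argon2"):
        return "argon2 (salted hash)"
    # Bare hex digests: the length suggests, but does not prove, the algorithm.
    if re.fullmatch(r"[0-9a-fA-F]{32}", value):
        return "32 hex chars (possibly MD5)"
    if re.fullmatch(r"[0-9a-fA-F]{40}", value):
        return "40 hex chars (possibly SHA-1)"
    if re.fullmatch(r"[0-9a-fA-F]{64}", value):
        return "64 hex chars (possibly SHA-256)"
    return "unrecognized (possibly plaintext)"


if __name__ == "__main__":
    for sample in ["5f4dcc3b5aa765d61d8327deb882cf99", "hunter2"]:
        print(sample, "->", classify_password_field(sample))
```

A bare hex digest with no separate salt column is often a sign of unsalted storage, though the salt may simply live in another field of the dump.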

Combo Lists

Combo lists are aggregated collections of email:password pairs compiled from multiple breaches. They're often:

  • Sorted and deduplicated
  • Formatted for credential stuffing attacks
  • Missing context about the original source

Although they carry less detail than raw breach data, combo lists show which credentials attackers are actively using.
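To make the email:password format concrete, here is a minimal Python sketch of how such a list could be parsed and deduplicated for analysis. The function name parse_combo_lines is hypothetical, and the sketch assumes a colon separator with the email first; real lists may use other separators or field orders.

```python
def parse_combo_lines(lines):
    """Parse "email:password" lines into sorted, deduplicated pairs.

    Assumption: the first colon separates the email from the password,
    so passwords that themselves contain colons are kept intact.
    """
    seen = set()
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and lines that do not match the assumed format.
        if not line or ":" not in line:
            continue
        email, password = line.split(":", 1)
        # Email addresses are compared case-insensitively; passwords are not.
        seen.add((email.lower(), password))
    return sorted(seen)


if __name__ == "__main__":
    sample = [
        "b@example.com:pw1",
        "A@example.com:x:y",
        "b@example.com:pw1",
        "not a combo line",
    ]
    for email, password in parse_combo_lines(sample):
        print(email, password)
```

Note that nothing in a parsed pair says which breach it came from, which is the missing-context limitation described above.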

Paste Sites

Attackers and researchers sometimes post breach data on paste sites like Pastebin. We monitor these sites for:

  • Credential dumps
  • Configuration files
  • API keys and secrets

Dark Web Sources

Some breach data is traded or posted on dark web forums and marketplaces. We safely collect this data without participating in illegal activities.

Data Processing

When we receive new breach data, our system:

  1. Indexes the raw content for full-text search
  2. Extracts structured data (emails, URLs, IPs, etc.)
  3. Categorizes the source type
  4. Makes it searchable within hours
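
The extraction step (step 2) could be approximated with a few regular expressions. The following Python sketch is a simplified illustration, not ReconX's actual pipeline: the function name extract_indicators and the patterns are assumptions, and the patterns are deliberately loose (IPv4 only, no full email address validation).

```python
import re

# Deliberately simple patterns; real-world extraction needs stricter rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_indicators(text: str) -> dict:
    """Pull emails, URLs, and IPv4 addresses out of raw text."""
    # Drop dotted quads whose octets exceed 255 (e.g. "999.1.1.1").
    ips = [
        ip for ip in IPV4_RE.findall(text)
        if all(int(octet) <= 255 for octet in ip.split("."))
    ]
    return {
        "emails": sorted({m.lower() for m in EMAIL_RE.findall(text)}),
        "urls": sorted(set(URL_RE.findall(text))),
        "ipv4": sorted(set(ips)),
    }


if __name__ == "__main__":
    raw = (
        "URL: https://login.example.com/auth\n"
        "User: Alice@Example.com\n"
        "IP: 203.0.113.7 and 999.1.1.1\n"
    )
    print(extract_indicators(raw))
```

Keeping the raw text indexed alongside the extracted fields (step 1) means a value these patterns miss can still be found with a full-text search.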

Data Freshness

We continuously index new breaches as they become available. However, there can be a delay between when a breach occurs and when the data surfaces publicly. Freshness varies by source type:

  • Stealer logs - Often indexed within days of collection
  • Database dumps - Can take weeks to months to surface
  • Combo lists - Compiled over time from various sources

Ethical Considerations

ReconX operates under strict ethical guidelines:

  • We don't purchase or participate in breach marketplaces
  • Data is provided for defensive research only
  • We comply with relevant data protection regulations
  • Access is restricted to verified security professionals

Limitations

No breach intelligence platform has complete coverage. Data we may not have includes:

  • Private breaches that never become public
  • Recently compromised data not yet leaked
  • Highly targeted attacks against specific organizations