The Unattributable “Db8151dd” Data Breach

Category: IT Technology · Published: 3 years ago

I was reticent to write this blog post because it leaves a lot of questions unanswered, questions that we should be able to answer. It's about a data breach with almost 90GB of personal information in it across tens of millions of records - including mine. Here's what I know:

Back in Feb, Dehashed reached out to me with a massive trove of data that had been left exposed on a major cloud provider via a publicly accessible Elasticsearch instance. It contained 103,150,616 rows in total, the first 30 of which look like this:

[Image: the first 30 rows of the exposed data]

The globally unique identifier beginning with "db8151dd" features heavily on these first lines, hence the name I've given the breach. I've had to give it this name because frankly, I've absolutely no idea where it came from, nor does anyone else I've worked with on this.

My delving into the breach began back in Feb with a tweet:

I'm trying to trace down the origin of a *massive* breach someone sent me. Looks very much like a data aggregator but I can't attribute it. Came from a cloud hosted IP so no clues there. My own data is there, anyone see any clues indicating the source? https://t.co/GHBoWN93Fy

— Troy Hunt (@troyhunt) February 23, 2020

I embedded my own record which you can pore through in more detail on Pastebin:

It's mostly scrapable data from public sources, albeit with some key differences. Firstly, my phone number is not usually exposed and that was in there in full. Yes, there are many places that (obviously) have it, but this isn't a scrape from, say, a public LinkedIn page. Next, my record was immediately next to someone else I've interacted with in the past as though the data source understood the association. I found that highly unusual as it wasn't someone I'd expect to see a strong association with and I couldn't see any other similar folks. But it's the next class of data in there which makes this particularly interesting and I'm just going to quote a few snippets here:

Recommended by Andie [redacted last name]. Arranged for carpenter apprentice Devon [redacted last name] to replace bathroom vanity top at [redacted street address], Vancouver, on 02 October 2007.

Met at the 6th National Pro Bono Conference in Ottawa in September 2016

Met on 15-17 October 2001 in Vancouver for the Luscar/Obed/Coal Valley arbitration.

It feels like a CRM. These are records of engagement, the likes of which you'd capture in order to later recall who had been met where and what they'd done. It wasn't just simple day-to-day business interaction stuff either; there was also this:

But then there's also a bunch of legal summaries, for example "CASE CLOSING SUMMARY ON USA V. [redacted]" and "10/3/11 detention hrg in court 20 min plus travel split with [redacted]"

— Troy Hunt (@troyhunt) February 23, 2020

But nowhere - absolutely nowhere - was there any indication of where the data had originated from. The closest I could get to that at all was the occurrence of the following comments which appeared over and over again:

This contact information was synchronized from Exchange. If you want to change the contact information, please open OWA and make your changes there.

Exported from Microsoft Outlook (Do not delete)

Contact Created By Evercontact
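As an illustration only, here is a minimal Python sketch of how recurring comments like these could be surfaced from a set of records. It is not the tooling used in this investigation; the `notes` field name, the `most_common_notes` helper and the sample records are all hypothetical.

```python
from collections import Counter


def most_common_notes(records, field="notes", top=3):
    """Count how often each exact note string recurs across records.

    `records` is assumed to be an iterable of dicts; the `field` name
    is hypothetical and would depend on the real schema.
    """
    counts = Counter()
    for record in records:
        note = record.get(field)
        # Skip records with no note, or a note that is not text.
        if isinstance(note, str) and note.strip():
            counts[note.strip()] += 1
    return counts.most_common(top)


# Made-up sample data, reusing two of the comment strings quoted above.
sample_records = [
    {"notes": "Contact Created By Evercontact"},
    {"notes": "Exported from Microsoft Outlook (Do not delete)"},
    {"notes": "Contact Created By Evercontact"},
    {"notes": "Met at a conference"},
    {"name": "record without a notes field"},
]

print(most_common_notes(sample_records, top=2))
```

Strings that float to the top of a count like this hint at the tools the data passed through, but as this case shows, they don't necessarily identify who owned it.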

Evercontact did actually reach out and we discussed the breach privately but it got us no closer to a source. I communicated with multiple infosec journalists (one of whose own personal data was also in the breach) and still, we got no closer. Over the last 3 months I kept coming back to this incident time and time again, looking at the data with fresh eyes and each time, coming up empty. And just before you ask, no, cloud providers won't disclose which customer owns an asset but they will reach out to those with unsecured assets.

Today is the end of the road for this breach investigation and I've just loaded all 22,802,117 email addresses into Have I Been Pwned. Why load it at all? Because every single time I ask about whether I should add data from an unattributable source, the answer is an overwhelming "yes":

If I have a MASSIVE spam list full of personal data being sold to spammers, should I load it into @haveibeenpwned ?

— Troy Hunt (@troyhunt) November 15, 2016

So, mark me down for another data breach of my own personal info. There's nothing you or I can do about it beyond being more conscious than ever about just how far our personal information spreads without our consent and indeed, without our knowledge. And, perhaps most alarmingly, this is far from the last time I'll be writing a blog post like this.

Edit: No, I don't load complete and individual records into HIBP, only email addresses. As such, only the presence of an address is searchable; the data associated with the address is neither stored nor retrievable.
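To make the "addresses only" idea concrete, here is a rough Python sketch of reducing full records to a de-duplicated set of email addresses and discarding everything else. This is not HIBP's actual pipeline: the `email` field name, the `extract_unique_emails` helper, the simplified address pattern and the sample records are all assumptions made for illustration.

```python
import re

# Deliberately simple pattern for illustration; real-world address
# validation is considerably more involved.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_unique_emails(records, field="email"):
    """Return a sorted list of unique, lower-cased email addresses.

    Only the address is kept; every other attribute of the record
    (phone number, notes and so on) is dropped.
    """
    seen = set()
    for record in records:
        value = record.get(field)
        if not isinstance(value, str):
            continue
        candidate = value.strip().lower()
        if EMAIL_RE.fullmatch(candidate):
            seen.add(candidate)
    return sorted(seen)


# Made-up records; only the addresses should survive.
sample_records = [
    {"email": "Alice@Example.com", "phone": "555-0100", "notes": "Met in 2016"},
    {"email": "alice@example.com", "phone": "555-0100"},
    {"email": "bob@example.org"},
    {"email": "not-an-email"},
    {"phone": "555-0199"},
]

print(extract_unique_emails(sample_records))
```

The effect is that someone can later check whether an address was present, but nothing that was attached to that address in the source data is retained.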

