Jarkko Sakkinen

Edited 3 months ago
In the #CrowdStrike outage, the biggest surprise is not the bug but how unprepared they were for a rollback.

The lesson learned from the whole thing, IMHO, is that companies running these platforms should have a test suite and a well-exercised rollback process for faulty patches.

I.e. I'd focus on the only thing that can be fixed permanently: the rollback process at scale. Faulty patches come and go.

#infosec
I'm assuming they already have a staging environment for new patches; that is not what I'm referring to.

In addition, it would probably save some money to have a well-crafted rollback for patches that have gone from staging to live outside CrowdStrike's internal environments.

I mean, it is exactly the kind of bug where a single hour lost in rollback probably costs some unimaginable sum of money, so I think it'd be a business-smart "local maximum" to optimize ;-)
So if anyone who has any responsibility for causing the bug reads this: don't feel too bad about yourself, I'm not casting the first stone! This scenario should have been anticipated beforehand.