AI.Dev and Cassandra Summit 2023 - Building a CyberGrid on Cassandra
On December 12, 2023, I delivered this presentation at the Linux Foundation's AI.dev & Cassandra Summit in San Jose. The deck and recording are included below.
An updated deck for my talk on Big Data in Cybersecurity can be downloaded here.
WitFoo Precinct persists and replicates data on Apache Cassandra, a big-data NoSQL platform. Precinct 6.1.3 is built on Cassandra 3.11. In preparation for the upgrade to Cassandra 4.0, the following lab and production testing was conducted.
WitFoo Precinct clusters consisting of 1 Management, 1 Streamer, and 3 Data nodes were deployed in AWS using the official Marketplace images. The instances ran on c5d.2xlarge hardware (16 GB RAM, 8 CPU cores) and were configured to use AWS GP2 SSD drives (the recommended default).
When we founded WitFoo five years ago, we wanted to analyze data in SIEM and other data stacks to provide craft knowledge that would stabilize communications within cybersecurity teams and between those teams and their organizations. A few months into that journey, we realized there were fundamental problems in how existing SIEM and log aggregators collected and stored data, which prompted us to add big data processing to the scope of our venture.
First, the nature of evolution discards noise. Much like the concept in biology, only fit, useful facts survive the evolution process. When exposed to more complex systems, noise goes the way of the dodo bird. A “possible SQL injection attack on MySQL” event becomes irrelevant when vulnerability reports show the targeted server isn’t running MySQL. As data becomes a more mature, evolved object, the irrelevant events fall away.
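The MySQL example can be sketched in a few lines of Python. This is only an illustration of the idea, not how Precinct implements it: the `Event` class, the `prune_irrelevant` function, the host names, and the shape of the vulnerability-report data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical security event; the fields are assumptions for illustration."""
    target_host: str
    required_service: str  # service the attack needs in order to matter
    description: str


def prune_irrelevant(events, services_by_host):
    """Keep only events whose required service is reported on the target host.

    Events aimed at hosts with no vulnerability report are kept, because
    missing data is not evidence that the event is noise.
    """
    kept = []
    for event in events:
        known_services = services_by_host.get(event.target_host)
        if known_services is None or event.required_service in known_services:
            kept.append(event)
    return kept


# Hypothetical vulnerability-report summary: host -> services found running.
services_by_host = {
    "db01": {"postgresql"},
    "db02": {"mysql"},
}

events = [
    Event("db01", "mysql", "possible SQL injection attack on MySQL"),
    Event("db02", "mysql", "possible SQL injection attack on MySQL"),
    Event("db03", "mysql", "possible SQL injection attack on MySQL"),
]

surviving = prune_irrelevant(events, services_by_host)
for event in surviving:
    print(event.target_host, "-", event.description)
```

The event against `db01` falls away because the report shows that host is not running MySQL. The event against `db03` is kept only because nothing is known about that host yet; dropping it instead would be an equally defensible design choice.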