Big Data

Unpacking my ‘go-to’ architecture for transforming data into actionable business intelligence.

Note: This page is a work in progress; please forgive incomplete descriptions.

This architecture reflects my current technical acumen and business philosophies:

  • The customer is the most important person in the room
  • Do more with less
  • Secure by design
  • Learn & apply
  • Follow the data
  • Simplicity accelerates

I use it within my own projects — and as a baseline for related consulting work.

It’s a secure, performant and modular platform, ideal for delivering:

  • Product innovation
  • Customer relationship analysis
  • Project prioritization

It is inspired by, and continues to evolve with, AWS reference architectures and business intelligence best practices:

  • Immediate availability
  • Broad & deep capabilities
  • Trusted & secure

And continuously validated against the AWS Well-Architected and NIST Cybersecurity frameworks.

Push-Button Deployment

The architecture is maintained as a library of CloudFormation templates and is deployable on demand, either as individual modules or end-to-end.
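As a rough illustration of "individual modules or end-to-end", the sketch below works out which stacks to deploy, and in what order, for a given selection. The module names and their dependencies are hypothetical placeholders rather than the actual template library, and the code makes no AWS calls; it only plans the order.

```python
from graphlib import TopologicalSorter

# Hypothetical module catalogue: each module lists the modules it depends on.
# These names and dependencies are assumptions made for illustration only.
MODULES = {
    "core": [],
    "search": ["core"],
    "warehouse": ["core"],
    "pattern-recognition": ["core"],
    "model-training": ["core"],
}


def deployment_order(selected=None):
    """Return the modules to deploy, dependencies first.

    selected=None means an end-to-end deployment (every module).
    """
    wanted = set(MODULES) if selected is None else set(selected)
    unknown = wanted - set(MODULES)
    if unknown:
        raise ValueError(f"Unknown modules: {sorted(unknown)}")

    # Pull in any dependencies the caller did not list explicitly.
    pending = list(wanted)
    while pending:
        name = pending.pop()
        for dep in MODULES[name]:
            if dep not in wanted:
                wanted.add(dep)
                pending.append(dep)

    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {name: MODULES[name] for name in sorted(wanted)}
    return list(TopologicalSorter(graph).static_order())


# A single add-on module pulls in the core stack automatically.
print(deployment_order(["warehouse"]))
# End-to-end: every module, with "core" first.
print(deployment_order())
```

Each name in the returned list would then map to one CloudFormation stack, created in that order.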

Design Considerations

  • Why not Terraform? CloudFormation covers everything required in a universal format (YAML/JSON), so there is no compelling reason to take on the added cost and complexity of a third-party tool.
  • Why AWS? In a word…ecosystem. The breadth and quality of services, coupled with their integration, provide the most compelling business case. And after working with nearly all of AWS’s services over the past 10+ years — and applying their published best practices and reference architectures — AWS has earned my trust…and my business. I’ve also used GCP and Azure quite a bit, but have always ended up back on AWS due to performance or architectural reasons.
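To make the YAML/JSON point from the first consideration concrete, here is a minimal sketch of a CloudFormation template built as a plain Python dictionary and serialized to JSON. The resource (an encrypted S3 bucket with public access blocked) and the logical names `DataBucket` and `DataBucketName` are assumptions chosen to echo the "secure by design" principle; they are not taken from the actual template library.

```python
import json


def bucket_template(description):
    """Build a minimal CloudFormation template as a plain dict.

    The logical IDs used here are hypothetical placeholders.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": description,
        "Resources": {
            "DataBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Encrypt objects at rest by default.
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {
                                "ServerSideEncryptionByDefault": {
                                    "SSEAlgorithm": "AES256"
                                }
                            }
                        ]
                    },
                    # Block every form of public access.
                    "PublicAccessBlockConfiguration": {
                        "BlockPublicAcls": True,
                        "BlockPublicPolicy": True,
                        "IgnorePublicAcls": True,
                        "RestrictPublicBuckets": True,
                    },
                },
            }
        },
        "Outputs": {
            # Ref on an S3 bucket resolves to the bucket name.
            "DataBucketName": {"Value": {"Ref": "DataBucket"}}
        },
    }


template_json = json.dumps(bucket_template("Illustrative data bucket"), indent=2)
print(template_json)
```

Because the template is ordinary JSON, it can be versioned, reviewed and validated with standard tooling before it is ever deployed.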

The Bigger Picture

This architecture also fills a key role in a more comprehensive cloud strategy.

So let’s dive in…

Core Components

Laying the groundwork for extracting useful insights from large volumes of data.

Processing data of high volume, variety and velocity.

Expansion Modules

With this solid foundation in place, we can easily extend functionality with several add-on modules:

  • Search (Elasticsearch)
  • Data Warehouse
  • Pattern Recognition
  • Model Training

Let’s dive a little deeper into those modules…

Elasticsearch Module

This add-on module provides a high-volume search engine.

Quickly and easily find details in data.

Warehouse Service

This add-on module enables a data warehouse platform.

Store and query large volumes of data for analytical workloads.

Pattern Recognition Service

This add-on module provides an exploratory machine learning platform.

Identify hidden patterns buried deep in data.

Model Training Service

This add-on module provides a machine learning model training pipeline.

The pipeline includes algorithm selection and data cleaning.