75 Years After V-E Day, Leveraging Big Data, Analytics, & Data Visualization to Illuminate the Past
On May 8th, 1945, World War II ended in Europe after nearly six protracted years of battle.
In the 75 years since, little has gone unsaid of WWII. The war’s broad strokes have been hashed and rehashed.
Yet, at the intersection of data science and history is the potential to unearth new histories of WWII that couldn’t have been told until today.
Two years ago, I dreamed up the idea to build an interactive dashboard to tell the story of an Army Air Force combat group during WWII using big data.
Behind the dashboard was my own family’s story: my grandfather Wally took his service on a B-24 crew in Europe to the grave. A lifelong struggle with PTSD left his wartime service a mystery to everyone in his life. Because I’m a data scientist by trade, I turned to data in telling Wally’s story twenty years after he died.
When I began researching Wally’s service with the 44th Bomb Group, I stumbled upon newly digitized data sources from WWII. Although a growing amount of digital information about WWII is available, the databases housing it made the data impossible to aggregate and analyze.
I wanted to look beyond Wally’s story to the 44th Bomb Group at large to contextualize Wally’s service. For example, knowing Wally flew 42 missions in the war meant little without understanding the average number of missions an airman in the 44th Bomb Group flew over the course of the war.
With much manual effort and a serendipitous hunch, I determined that Wally flew more missions than 99.5% of all 5,000 airmen in the 44th Bomb Group. It was an insight gleaned from the data that fundamentally reshaped how I perceived Wally’s war and the grandfather I knew.
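The comparison described here is a percentile rank: the share of airmen who flew fewer missions than a given airman. A minimal sketch in Python, assuming a simple list of per-airman mission counts; the helper name `percent_flew_fewer` and the sample numbers are hypothetical and are not drawn from the 44th Bomb Group data:

```python
def percent_flew_fewer(mission_counts, missions):
    """Return the percentage of airmen who flew fewer missions than `missions`."""
    if not mission_counts:
        raise ValueError("mission_counts must not be empty")
    fewer = sum(1 for count in mission_counts if count < missions)
    return 100.0 * fewer / len(mission_counts)


# Hypothetical mission counts for ten airmen (illustrative only, not real data).
sample_counts = [3, 9, 11, 15, 20, 25, 30, 35, 35, 42]

# Share of the sample that flew fewer than 42 missions.
print(percent_flew_fewer(sample_counts, 42))
```

Run against a full roster, the same calculation would place one airman's mission count in the context of the whole group.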
I knew if I could create a manipulable data set, there were hundreds more insights, like the one I gleaned about my grandfather, waiting to be found.
The dashboard idea was born from my desire to look at the relationship between missions flown, combat losses, and demographics for all 5,000 men in the 44th Bomb Group — a monumental task never done before.
I began by exploring how I could leverage the 21st-century digital tools designed for business analytics to tell a new history of the war with big data. These tools, which are the staples of my day job studying human capital issues, proved remarkably useful in looking to the past.
What I learned is that unearthing insights from 75-year-old data isn’t markedly different from extracting insights from new data.
Before building the dashboard, I spent the better part of a year web scraping and cleaning 1.5 million data points about demographics, wartime missions, and combat losses for all 5,000 airmen who served in the 44th Bomb Group over the course of the war.
In the process, I learned that 75-year-old data is replete with quality issues. Because the records were originally captured by hand and then manually entered into a database, the scale and scope of the inconsistencies to rectify were profound. Getting to the final dataset required over 10,000 lines of data cleaning syntax.
With cleaned data in hand, I built the dashboard wireframe, aiming to tell a visual story about the collective impact and singular stories of 5,000 bomber boys who served with the 44th Bomb Group.
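To give a sense of what this kind of cleaning involves, here is a small sketch in Python that reconciles inconsistent hand-entered spellings of a home state. It is an illustration only, not the cleaning syntax used for the dashboard; the `STATE_ALIASES` table and the `clean_state` helper are assumptions made for the example:

```python
import re

# Hypothetical lookup from variant spellings to one canonical state name.
STATE_ALIASES = {
    "ny": "New York",
    "n.y.": "New York",
    "new york": "New York",
    "pa": "Pennsylvania",
    "penn": "Pennsylvania",
    "pennsylvania": "Pennsylvania",
}


def clean_state(raw):
    """Collapse whitespace, ignore case, and map known variants to one spelling."""
    key = re.sub(r"\s+", " ", raw).strip().lower()
    # Unknown values fall back to title case so they can be reviewed later.
    return STATE_ALIASES.get(key, key.title())


# Illustrative raw entries, not taken from the real records.
raw_entries = ["  N.Y. ", "PENN", "new   york", "ohio"]
print([clean_state(entry) for entry in raw_entries])
```

Real records need many more rules than this (dates, names, ranks, and so on), which is how a cleaning script can grow to thousands of lines.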
After a four-month sabbatical from work, the 44th Bomb Group Data Dashboard became a reality in Google Data Studio.
The 44th Bomb Group Data Dashboard tells a new history of WWII that wasn’t possible until today, 75 years after the war in Europe ended on V-E Day.
Drawing from 1.5 million data points and featuring 100+ data visualizations, the dashboard sits at the intersection of data science and history. It tells the story of the 44th Bomb Group’s collective impact in WWII and singular stories of the 5,000 men who served in the group, including that of my grandfather Wally.
What follows is a never-before-told big data history of the 44th Bomb Group during WWII, featuring data insights and excerpts from the 44th Bomb Group Data Dashboard:
The 44th Bomb Group was the first B-24 Heavy Bomber group stationed in England. They were pioneers of the air war.
The 44th Bomb Group flew into Fortress Europe nearly two years before the Allies touched down on European soil. They flew 29 continuous months from late 1942 until V-E Day in 1945. Over half of the 44th Bomb Group’s missions (57%) were flown in 1944. In five months of combat operations in 1945, 21% of all wartime missions were flown – the same amount flown in all twelve months of 1943.
The 44th Bomb Group fought in every major battle in the European Theater of Operations. Over 344 missions, the 44th Bomb Group bombed 8 distinct countries, including 117 unique cities in Germany and 66 unique cities in France.
The 44th Bomb Group’s losses were unprecedented. Over half (53%) of their B-24 bombers were lost in combat. Over one-quarter (26%) of all airmen who served in the 44th were killed or became prisoners of war (n=1280).
Completing a tour of duty was statistically improbable. At the end of the war, a tour consisted of 35 missions. 44th airmen killed in action flew an average of 9 missions; airmen who became prisoners of war averaged 11 missions.
By demographics, the 5,000 airmen who served in the 44th Bomb Group came from every walk of life:
- Half came from the 6 most populous states in the country; the other half came from the remaining states.
- They came from every state except Alaska.
- Almost one-quarter (23%) were born in New York, Pennsylvania, or Massachusetts.
- One-fifth (21%) were born in California, Texas, Illinois and Ohio — the most populous states in the 1940 census not located in the northeast.
Race & Citizenship
- The men of the 44th Bomb Group were diverse in nearly every way except race: 95% of airmen in the 44th Bomb Group were white.
Note: African Americans were not permitted to serve as combat pilots in the Army Air Corps, with the exception of a limited number of African American men accepted in 1941 to serve as combat pilots: the Tuskegee Airmen.
- Almost 8 in 10 (77%) were single when they enlisted. On average, this group was 22 years old.
- Only 1 in 10 (13%) were married at enlistment. They were older than their single counterparts with an average age of 27.
The never-told big data history of the 44th Bomb Group’s collective impact, which has been the focus thus far, leaves out the singular stories of the 5,000 airmen who served in the group.
The beauty of crafting a narrative with big data is the exponential range of storytelling possibilities: you can look at the “big picture” while also zooming in on the granular stories that are “needles in a haystack,” all with just a few clicks of a mouse.
Dashboards are useful tools in filtering big data to pinpoint singular stories. By creatively leveraging the standard data elements available in dashboard software like Google Data Studio, it’s possible to build a user interface that resembles a database so users can easily pinpoint a single data point using a variety of filters.
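The filtering idea can be sketched outside any dashboard tool. The Python example below narrows a small roster with a combination of filters, much as a dashboard's filter controls would. The `Airman` fields, the `filter_airmen` helper, and the sample records are all hypothetical; this is not how Google Data Studio is implemented, and the records are not real airmen:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Airman:
    name: str
    home_state: str
    marital_status: str
    missions: int


def filter_airmen(airmen, **criteria):
    """Return the airmen whose fields equal every supplied criterion."""
    return [
        airman
        for airman in airmen
        if all(getattr(airman, field) == value for field, value in criteria.items())
    ]


# Made-up records for illustration only.
roster = [
    Airman("Airman A", "Ohio", "single", 12),
    Airman("Airman B", "Ohio", "married", 35),
    Airman("Airman C", "Texas", "single", 9),
    Airman("Airman D", "Ohio", "single", 42),
]

# Stack two filters to narrow the roster.
matches = filter_airmen(roster, home_state="Ohio", marital_status="single")
print([airman.name for airman in matches])
```

Each added filter shrinks the result set, which is what lets a user move from the full group down to a single record.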
In telling the 44th Bomb Group’s story, the counterpoint to the story of the group’s collective impact is the ability to search for the story of a single airman, a functionality that is the cornerstone of the 44th Bomb Group Data Dashboard.
V-E Day 75 is a reminder of the presence of the past. At the intersection of big data and history is an unparalleled opportunity to shed new light on the stories of our forefathers.
Data scientists are in a unique position to creatively leverage the tools of our trade to innovate in unearthing new narratives of the past with big data, analytics, and data visualization.
The intersection of data science and history led me back to my grandfather Wally’s war. The story of his quiet heroism and survival against all odds came to life from data points I cobbled together. It took 75 years to learn of Wally’s profound sacrifices in the war.
Big data has the capacity to tell deeply human stories from our personal and collective histories.
Let the commemoration of the 75th anniversary of V-E Day encourage explorations at the intersection of history and data science:
- Continue learning from the collective impact and singular stories of 44th Bomb Group airmen during WWII by exploring the full dashboard at www.44thbombgroup.org.
- Begin exploring how to leverage your skills as a data scientist in unearthing stories of the past, including those of your forefathers.