You know what’s weird? You can work in one of the most progressive, well-paid, remote-flexible industries on Earth—and still feel like you’re quietly falling apart. Behind all the stand-ups and Slack jokes, there’s burnout, anxiety, depression. And way too many people thinking they’re the only ones feeling it.
That’s why this project matters. Not just because you’ll sharpen your data skills—but because it’s personal, raw, and real. You’re not just looking at charts. You’re holding up a mirror to an industry that still struggles to talk openly about mental health.
Let’s walk through how you can turn a public dataset into an emotionally honest, visually rich exploration of how people in tech view mental health—and how that view changes depending on where they live, who they work for, and what they believe.
First Things First: The Dataset
You’re working with the 2014 Mental Health in Tech Survey from OSMI (Open Sourcing Mental Illness). It’s freely available here:
https://www.kaggle.com/datasets/osmi/mental-health-in-tech-survey
Don’t be fooled by the year. The questions are still painfully relevant. The dataset includes anonymized responses from over 1,000 tech workers, with questions covering:
- Employer attitudes toward mental health
- Personal experiences, family history, and treatment
- Comfort talking to coworkers and managers
- Demographics: age, gender, country, company size, etc.
No modeling needed. No NLP. This is a pure exploration project—with the power to say something meaningful through data storytelling.
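A first look can be as small as the sketch below. It assumes pandas, and it uses a three-row stand-in for the real file so it runs anywhere. The column names are what I recall from the Kaggle CSV, so treat them as assumptions and check `df.columns` on your own download.

```python
import io
import pandas as pd

# Tiny stand-in for the real file so the sketch runs anywhere.
# With the Kaggle download you would call pd.read_csv("survey.csv") instead
# (that filename is an assumption).
sample_csv = io.StringIO(
    "Age,Gender,Country,no_employees,supervisor,work_interfere\n"
    "29,Male,United States,6-25,Yes,Often\n"
    "34,female,United Kingdom,More than 1000,No,\n"
    "41,M,Canada,26-100,Some of them,Rarely\n"
)
df = pd.read_csv(sample_csv)

# How big is it, and which columns have the most blanks?
n_rows, n_cols = df.shape
missing_per_column = df.isna().sum().sort_values(ascending=False)

print(n_rows, n_cols)
print(missing_per_column.head())
```

Sorting the missing counts puts the leakiest columns at the top, which tells you early which questions people tended to skip.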
What You’ll Learn by Doing This
If you’re newer to data science, this one teaches the foundations really well:
- Filtering data by conditions (age groups, gender, company size)
- Handling missing values (yep, there are plenty here)
- Creating visual narratives with pie charts, bar plots, and stacked comparisons
- Segmenting responses by demographic slices
You’ll also build empathy—a skill data folks rarely talk about but desperately need more of. This isn’t just “how many people answered yes or no.” It’s about context. Feeling. Weight.
Start with the Right Question
Before you open the notebook, ask yourself: What do I want to understand?
You’re not here to “analyze everything.” That’s how projects die. Pick a question that sticks in your gut. A few options:
- Do younger people feel more open talking about mental health at work?
- Are managers more or less likely to seek help themselves?
- Does company size influence how mental health is handled?
- Is there a stigma difference between countries?
Start with one or two. Let the rest come later.
Hot tip: If a chart doesn’t feel like it could make someone pause and feel something, scrap it and dig deeper.
Phase 1: Clean Gently, Don’t Overkill
Real-world data is messy. This one’s no exception.
You’ll see weird age entries (someone put 323). You’ll see gender written a hundred different ways (“M”, “male”, “Man”, “cis male”). You’ll see skipped questions.
Clean lightly:
- Set reasonable age bounds (18–100 is fine).
- Normalize gender to 3–5 broad categories (Male, Female, Nonbinary, Prefer not to say, Other).
- Drop rows with too many nulls, but keep some blanks—they can still tell a story.
This isn’t Kaggle competition cleaning. It’s okay to keep some mess. That mess is human.
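Here is one way the light cleaning above could look in pandas. It’s a sketch on a made-up mini-sample, not the one true cleaning script. The column names (`Age`, `Gender`, `supervisor`, `coworkers`), the short gender lookup, and the "at least 3 answers" threshold are all assumptions you should adjust after looking at the real data.

```python
import pandas as pd

# Hypothetical mini-sample; column names are assumed to match the real CSV.
df = pd.DataFrame({
    "Age": [29, 323, 34, -1, 41, 27, 30],
    "Gender": ["M", "male", "Female ", "cis male", "non-binary", None, "F"],
    "supervisor": ["Yes", "No", None, "Yes", "No", "Yes", None],
    "coworkers": ["Some of them", "No", "Yes", "Yes", None, "No", None],
})

# 1. Keep ages inside the 18-100 bounds.
df = df[df["Age"].between(18, 100)].copy()

# 2. Drop rows that answered fewer than 3 of the 4 columns,
#    but keep rows with only a blank or two.
df = df.dropna(thresh=3).copy()

# 3. Map free-text gender answers onto a few broad buckets.
#    The lookup is deliberately short; extend it after eyeballing
#    df["Gender"].value_counts() on the real data.
GENDER_MAP = {
    "m": "Male", "male": "Male", "man": "Male", "cis male": "Male",
    "f": "Female", "female": "Female", "woman": "Female", "cis female": "Female",
    "non-binary": "Nonbinary", "nonbinary": "Nonbinary",
}

def normalize_gender(value):
    # Treating a blank as "Prefer not to say" is a judgment call.
    if pd.isna(value):
        return "Prefer not to say"
    return GENDER_MAP.get(str(value).strip().lower(), "Other")

df["gender_clean"] = df["Gender"].map(normalize_gender)
print(df[["Age", "Gender", "gender_clean"]])
```

Keeping the original `Gender` column next to `gender_clean` means you can always go back and see what people actually wrote.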
Phase 2: Slice It by Demographics
This is where the beginner twist comes in. Don’t just count yes/no answers. Ask: who said them?
Here are some demographic lenses worth slicing the responses through:
- Age groups (18–25, 26–35, 36–45, 46+)
- Gender
- Country (group into “US,” “UK,” “India,” “Rest of world” to keep it readable)
- Company size (Small: up to 100, Medium: 100–1000, Large: more than 1000)
For each, pick one or two key mental health questions to compare, like:
- “Would you feel comfortable discussing a mental health issue with your supervisor?”
- “Do you believe there’s a stigma attached to mental health in your workplace?”
- “Have you sought treatment for a mental health condition?”
And then visualize. Don’t just say “32% of all respondents…”—say “45% of women at companies with fewer than 100 people felt comfortable talking to their manager.”
That’s a story.
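To get numbers like that, you bin the demographic, then take the share of each bin that gave a particular answer. A minimal sketch with invented rows, assuming the comfort question lives in a column called `supervisor` with answers like "Yes", "No", and "Some of them":

```python
import pandas as pd

# Invented rows for illustration; "supervisor" is an assumed column name.
df = pd.DataFrame({
    "Age": [22, 24, 30, 33, 35, 40, 44, 50],
    "supervisor": ["Yes", "No", "Yes", "Yes", "No", "Some of them", "No", "Yes"],
})

# Bin ages into the groups used above: 18-25, 26-35, 36-45, 46+.
# pd.cut uses right-closed intervals by default, so 25 lands in "18-25".
bins = [17, 25, 35, 45, 100]
labels = ["18-25", "26-35", "36-45", "46+"]
df["age_group"] = pd.cut(df["Age"], bins=bins, labels=labels)

# Share of each age group that answered a plain "Yes", as a percentage.
df["comfortable"] = df["supervisor"].eq("Yes")
pct_yes = (
    df.groupby("age_group", observed=False)["comfortable"]
      .mean()
      .mul(100)
      .round(1)
)
print(pct_yes)
```

The mean of a True/False column is just the fraction of True values, which is why `.mean()` gives you the percentage directly. Swap `age_group` for gender, country, or company size to change the lens.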
Visuals That Actually Land
You don’t need fancy dashboards. Just good, clear plots that tell a truth.
- Bar charts for comparing yes/no responses across groups
- Pie charts sparingly—great for one-off views (e.g., gender breakdown)
- Stacked bar charts to show proportions within subgroups
- Line plots if you want to show how responses shift with age (but only if the trend is clear)
One of the best visuals you can make:
“Comfort discussing mental health with manager” by company size
(Spoiler: people at smaller companies often feel more comfortable—but that’s not always good news. Sometimes it’s just because HR doesn’t exist.)
Make sure your charts don’t just look nice. They should say something, even at a glance.
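Here’s a sketch of the table behind that company-size chart. It assumes the size column is called `no_employees` and holds six text ranges ("1-5" through "More than 1000"), which is how I remember the Kaggle file. Verify the labels with `value_counts()` before trusting the mapping. The rows are invented.

```python
import pandas as pd

# Invented rows; column names and size labels are assumptions about the CSV.
df = pd.DataFrame({
    "no_employees": ["1-5", "6-25", "26-100", "100-500", "500-1000",
                     "More than 1000", "More than 1000", "6-25"],
    "supervisor": ["Yes", "Yes", "No", "Some of them", "No", "No", "Yes", "Yes"],
})

# Collapse the six survey ranges into Small / Medium / Large.
SIZE_BUCKETS = {
    "1-5": "Small", "6-25": "Small", "26-100": "Small",
    "100-500": "Medium", "500-1000": "Medium",
    "More than 1000": "Large",
}
# An ordered categorical keeps the rows in Small, Medium, Large order.
df["company_size"] = pd.Categorical(
    df["no_employees"].map(SIZE_BUCKETS),
    categories=["Small", "Medium", "Large"],
    ordered=True,
)

# Row-normalized crosstab: each row sums to 100%.
table = (
    pd.crosstab(df["company_size"], df["supervisor"], normalize="index")
      .mul(100)
      .round(1)
)
print(table)
```

If matplotlib is installed, `table.plot(kind="bar", stacked=True)` turns this straight into a stacked bar chart. Normalizing by row matters here: raw counts would mostly show which company size had the most respondents, not who feels comfortable.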
Dig Into the Stigma Questions
This is where it gets meaty.
The survey includes questions like:
- “Do you think discussing a mental health issue with your employer would have negative consequences?”
- “Would you bring up a mental health issue in a job interview?”
- “Does your employer offer resources for mental health?”
These are gold for storytelling.
You could show how different groups feel about speaking up. Maybe women are more hesitant. Maybe older employees worry more about consequences. Maybe people in the US vs UK differ wildly.
Here’s one idea:
Make a chart titled “Who stays silent?”
Show the % of each group who said they wouldn’t disclose a mental health issue to their manager. It’s a gut punch. And it sticks.
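A rough sketch of the numbers behind a “Who stays silent?” chart. It treats a flat "No" on the (assumed) `supervisor` column as staying silent, and it uses a hypothetical `gender_clean` column like the one you’d build during cleaning. The rows are invented.

```python
import pandas as pd

# Invented rows; "gender_clean" and "supervisor" are assumed column names.
df = pd.DataFrame({
    "gender_clean": ["Male", "Male", "Male", "Female", "Female", "Nonbinary"],
    "supervisor": ["No", "Yes", "No", "No", "Some of them", "No"],
})

# For each group: percent who answered "No", plus how many people that is based on.
silent = (
    df.assign(stays_silent=df["supervisor"].eq("No"))
      .groupby("gender_clean")["stays_silent"]
      .agg(pct_silent="mean", n="size")
)
silent["pct_silent"] = silent["pct_silent"].mul(100).round(1)
silent = silent.sort_values("pct_silent", ascending=False)
print(silent)
```

Keep the `n` column next to the percentage. In this toy sample the top group is 100% silent, but it’s one person, and a chart that hides that would be telling a story the data can’t back up.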
Go Light on Stats, Heavy on Insight
This isn’t the place for logistic regression or confidence intervals.
Stick to:
- Simple percentages
- Group means
- Counts and proportions
If you want to impress, don’t throw stats—throw insights.
Like this:
“Only 12% of people in large companies said they’d discuss a mental health issue in an interview. That means 88% are walking in the door pretending everything’s fine.”
No math required. Just empathy + clarity.
Wrap It All Up with a Message That Hits
Don’t just end with a chart.
End with something felt. A real sentence. Something you believe after doing this.
Here’s one example:
“The data doesn’t say mental health is ignored in tech. It says people want to talk—but most still don’t feel safe doing it. That gap is where the real work begins.”
That’s what makes this a portfolio piece, not just an assignment. You show you care. You show you get the human side of data. And recruiters will remember that.
TL;DR: What You’ll Build and Why It Matters
- A cleaned, thoughtful version of the OSMI 2014 survey dataset
- Charts that reveal how openness about mental health in tech varies with age, gender, company size, and country
- A few surprising, even uncomfortable truths about how stigma still lingers in one of the world’s most forward-thinking industries
- A human-centered data story that shows off your filtering, grouping, and visual storytelling skills
More than any technical project, this one says something about you. About how you think, what you care about, and what kind of data scientist you’re becoming.
And if you do it well? Someone scrolling through your portfolio won’t just see your charts. They’ll see the person behind them.