Big Data: Does Size Matter?
Ebook, 417 pages, 8 hours


About this ebook

What is Big Data, and why should you care?

Big data knows where you've been and who your friends are. It knows what you like and what makes you angry. It can predict what you'll buy, where you'll be the victim of crime and when you'll have a heart attack. Big data knows you better than you know yourself, or so it claims.

But how well do you know big data?

You've probably seen the phrase in newspaper headlines, at work in a marketing meeting, or on a fitness-tracking gadget. But can you understand it without being a Silicon Valley nerd who writes computer programs for fun?

Yes. Yes, you can.

Timandra Harkness writes comedy, not computer code. The only programmes she makes are on the radio. If you can read a newspaper you can read this book.

Starting with the basics – what IS data? And what makes it big? – Timandra takes you on a whirlwind tour of how people are using big data today: from science to smart cities, business to politics, self-quantification to the Internet of Things.

Finally, she asks the big questions about where it's taking us: is it too big for its boots, or does it think too small? Are you a data point or a human being? Will this book be full of rhetorical questions?

No. It also contains puns, asides, unlikely stories and engaging people, inspiring feats and thought-provoking dilemmas. Leaving you armed and ready to decide what you think about one of the decade's big ideas: big data.
Language: English
Release date: Jun 2, 2016
ISBN: 9781472920065
Author

Timandra Harkness

Timandra Harkness is a writer, comedian and broadcaster who has been performing on scientific, mathematical and statistical topics since the latter days of the 20th Century. She has written about travel for the Sunday Times, motoring for the Telegraph, science & technology for WIRED, BBC Focus Magazine and Men's Health Magazine, and on being 'Seduced by Stats' for Significance (the Journal of the Royal Statistical Society). She is a regular on BBC Radio, resident reporter on social psychology series The Human Zoo as well as writing and presenting documentaries and BBC Radio 4's FutureProofing series. In 2010 she co-wrote and performed Your Days Are Numbered: The Maths of Death with stand-up mathematician Matt Parker, which was a sell-out hit at the Edinburgh Fringe before touring the rest of the UK and Australia. Science comedy since then includes solo show Brainsex, cabarets and gameshows. She is currently writing a new comedy show about Big Data. Meanwhile, she puts her MC skills to more serious uses, hosting and chairing events with Cheltenham Science Festival, the British Council, the Institute of Ideas, the Wellcome Collection and a Robotics conference in Moscow, among many others. In her spare time she is studying for an Open University degree in Mathematics & Statistics. @TimandraHarknes/ timandraharkness.com


Reviews for Big Data

Rating: 3.5 out of 5 stars (5 ratings, 1 review)

  • Rating: 4 out of 5 stars
    I received an advance copy of this for review from NetGalley.com.

    Ms. Harkness tackles a huge subject (snark) and handles it well in three parts:
    - the history of "big data" ... think census as the biggest driver;
    - what it's done for us (or my take might be *to* us) from business and science to ? and politics;
    - her own ideas on the future of big data and a prodding to the reader

    First, I am impressed with the access she had to some pretty amazing people/places. Large Hadron Collider? How cool! There are more...you'll just have to read the book.

    That businesses try to use big data to target customers is no surprise. Big data may shape predictions of trends, or reactions to new marketing, but I really don't know if it can be used to influence individuals (something she addresses in a different context in a later chapter). Facebook and Amazon seem to use a combination of data mining to achieve their ends. And that particular link is creepy...seconds after looking at something on Amazon, my wife gets the exact item suggested when she switches over to Facebook. I don't because I block ads, and on my devices where I can't, I don't go to Amazon. But I do get offers based on questionable data...I live in Texas, so I must be interested in guns, right? I rant about FoxNews, so I must want to see more of it, right? That would be wrong... Big data may be the masses of searches and purchases, but to truly be effective, there must be a component of small data to tailor correctly to the individual.

    Ms. Harkness addresses the biggest dataset, that being human language, in a bit about an artificial intelligence truly understanding language. I agree with her skepticism about a computer understanding the language, but I also thought that communication is not necessarily the same as understanding. A self-learning database and front end can continually ask "I didn't get that...?" and given sufficient storage and processing, someday pass Turing's test. She notes "Irony is one of these things that it's hard for a computer to get." Which is why Ray Kurzweil's singularity is probably a lot farther out than he thinks. Or not. An AI will think differently (it has to) and not likely need the myriad nuances of human language. More on AI, she talks about virtual butlers prompting us to pack for a trip, or tying various activities and predicting something wholly inappropriate that a human assistant would clearly not do. Letting Big Sister plan our lives is too much for me.

    On science, she brilliantly distills into commonspeak the processes required to parse an unimaginable amount of data from a nanosecond collision of very high energy particles. The transition from "large data" - collecting lots of a single or limited type of information to be used in the originally envisioned context - to "big data" - collecting all the things that can be thought of and more to determine relationships not anticipated - is one of the key changes in big data in science. Inexpensive storage makes it possible to save nearly everything...to be examined later. I was surprised that the book only briefly mentions weather prediction. That's a prime example of big data being used to make specific (statistically close, that is) near-term predictions, with the goal of extending the window out as more data and analysis become available.

    I was fascinated by the chapter on politics, and political parties in Britain identifying individual swing voters and targeting them with personalized letters or messages. I don't know if people here read the crap sent, but I know I don't. (Nor do I listen to specifics said in debates or stump speeches...one must put into context whether whatever is said can even come about...) When she cites a website that urges voters to check out voting records, look at speeches, even ask the representatives direct questions, she asks the question "So can big data help us make properly informed decisions about who to vote for?" That fell flat with me. I think that is small data, because answers must be combined with evaluations of effectiveness.

    She also mentions postcodes (zip codes for us USA readers) making it easy to put people into groups according to where they live. She brings this up several times, in several contexts, such as postcodes correlating people to locations with a high incidence of, say, lung cancer. That doesn't take into account people moving. Maybe in Britain they don't move as easily or as frequently as in the US. Of course, there are many parochial non-movers here, but it stood out to me each time she brought it up that to be of any value, people would have to be microchipped or something.

    I liked her citing the twisted words big data users (or I note, anybody with an agenda) can spew. Cancer Research UK said studies showed "those who ate the most processed meat had around a 17 per cent higher risk of bowel cancer." But the risk is relative, as normal bowel cancer rates are 6%, so that increase would only change the risk to 6.5%. I ran across that some years ago when a vendor claimed to decrease inefficiency by 30%, conveniently neglecting to mention that the original efficiency was already a substantial 97%, so that 30% reduction in inefficiency only resulted in an increase of about 1 percentage point in efficiency! One might surmise that I never got a callback after I asked the question.

    Ms. Harkness, in addition to being a journalist, is also a comedian, and she infuses humor throughout. She made me laugh when she said big data made better food crops, medicine and even better Guinness. I am of the opinion that Guinness is awful, so...my note was {snort}! And when she was talking about the first American census, created as a way to "share out the burden of the War of Independence", she noted that "[b]oth representation and taxation would be allocated according to population."

    I did have a few quibbling points... She talks about Laplace's confidence that science and mathematics could (eventually) explain everything - Laplace's Demon - and she quotes Laplace (in English): "We may regard the present state of the universe as the effect of its past and the cause of its future." but goes on to paraphrase "If everything in the future is determined by what happened in the past, that leaves no room for us to make choices." Unless I misunderstand her wording, that isn't what Laplace was saying...he said the *present* state of the universe is a product of its past, not that a future is determined from a past. I don't think Laplace was excluding the human free will, or random factors.

    A couple of quibbling notes on the structure and presentation of the book:
    - Formatting is different from what I've been used to for the past fifty years. When quoting someone, Ms. Harkness uses a single quote mark to open the quote...having read British authors, that's not new to me, but what was new was that continuation paragraphs don't open with another quote mark. Instead, the only closing mark is at the end of the quote, sometimes several paragraphs, and even another page away. While it's not hard to break that code, it is more confusing than necessary to keep track of which is her voice and which is her interviewee's.
    - My copy had no index, but it did have a placeholder. I did not see a list of references or a similar placeholder. While I understand Ms. Harkness is taking a more conversational approach in her narrative (she has many footnotes, but most are slight clarifications/explanations, or humor), she quotes a number of sources without providing references. That may just be me, and not important to other readers, but I sometimes like to dig deeper. In particular, she quoted an activist in Oakland, on the use of a Domain Awareness Center, referring to a white paper from the "Monterey Naval Academy". The quote was in quotes (I'd say "quotes" to mean quotation marks if this were not a UK book...) to indicate that it was a quote, and yet I happen to know, as would most people, that the "Naval Academy" is actually the Naval Postgraduate School. In Monterey.
    - I would have liked to be able to copy text in this review, but the permissions forbade it. As such, I'm reduced to typing quotes and I admit little patience for that.

    I liked this nugget:
    "How can you take a dataset and repurpose it and do something interesting with it?" Her example was using Major League Baseball data to determine if left-handed people live longer than right-handed. She does note the limitation of the dataset being gender one-sided, but that *is* an interesting take.

Book preview

Big Data - Timandra Harkness

A NOTE ON THE AUTHOR

Timandra Harkness is a writer, comedian and broadcaster, who has been performing on scientific, mathematical and statistical topics since the latter days of the 20th Century. She has written about travel for the Sunday Times, motoring for the Telegraph, science & technology for WIRED, BBC Focus Magazine and Men’s Health Magazine, and on being ‘Seduced by Stats’ for Significance (the Journal of the Royal Statistical Society). She is a regular on BBC Radio, resident reporter on social psychology series The Human Zoo, and writes and presents documentaries and BBC Radio 4’s Future Proofing series.

In 2010 she co-wrote and performed Your Days Are Numbered: The Maths of Death, with stand-up mathematician Matt Parker, which was a sell-out hit at the Edinburgh Fringe before touring the rest of the UK and Australia. Science comedy since then includes solo show Brainsex, cabarets and gameshows. Meanwhile, she puts her MC skills to more serious uses, hosting and chairing events with Cheltenham Science Festival, the British Council, the Institute of Ideas, the Wellcome Collection and a Robotics conference in Moscow, among many others.

Also available in the Bloomsbury Sigma series:

Sex on Earth by Jules Howard

p53: The Gene that Cracked the Cancer Code by Sue Armstrong

Atoms Under the Floorboards by Chris Woodford

Spirals in Time by Helen Scales

Chilled by Tom Jackson

A is for Arsenic by Kathryn Harkup

Breaking the Chains of Gravity by Amy Shira Teitel

Suspicious Minds by Rob Brotherton

Herding Hemingway’s Cats by Kat Arney

Electronic Dreams by Tom Lean

Sorting the Beef from the Bull by Richard Evershed and Nicola Temple

Death on Earth by Jules Howard

The Tyrannosaur Chronicles by David Hone

Soccermatics by David Sumpter

Goldilocks and the Water Bears by Louisa Preston

Science and the City by Laurie Winkless

Bring Back the King by Helen Pilcher

Furry Logic by Matin Durrani and Liz Kalaugher

Built on Bones by Brenna Hassett

My European Family by Karin Bojs

4th Rock from the Sun by Nicky Jenner

Patient H69 by Vanessa Potter

Catching Breath by Kathryn Lougheed

For Linda,

who would have made this a better book,

were you here to read it.

BIG DATA

DOES SIZE MATTER?

Timandra Harkness


Contents

Part 1: What is it? Where did it come from?

Chapter 1: What is data? And what makes it big?

Chapter 2: Death and taxes. And babies.

Chapter 3: Thinking machines

Part 2: What has big data ever done for us?

Chapter 4: Big business

Chapter 5: Big science

Chapter 6: Big society

Chapter 7: Data-driven democracy

Part 3: Big ideas?

Chapter 8: Big Brother

Chapter 9: Who do we think you are?

Chapter 10: Are you a data point or a human being?

Chapter 11: Even bigger data

Appendix: Keeping your data private

Acknowledgements

Index

part 1: What is it? Where did it come from?

‘What is this book?’ asked my stepmother, Juliet.

‘Is it for people like me, who keep hearing the phrase ‘big data’ and want to be able to talk about it at dinner parties?’

‘Yes,’ I said, ‘that’s exactly what it is.’

Not only for Juliet, and not just at dinner parties – it’s a book for anyone who gets the feeling big data is interesting and important, and should be talked about, but doesn’t want to study mathematics or computer programming.

In 10 chapters I aim to get you from the most basic ideas to some of the thorniest issues we need to be arguing about.

On the way, you’ll meet some of the people, ideas and projects I’ve been lucky enough to encounter around the world. Much of this book is in other people’s words, telling their own stories or introducing concepts that help me understand why big data matters. I’ve tried to structure it so each new idea builds naturally on what’s gone before.

That means it’s written to be read in order. You can dip in and out if you prefer, of course. Hey, it’s your book, you can wallpaper your bathroom with it if you like.¹ But I think you’ll get more out of it if you read from beginning to end.

Big data is a huge subject, and changing so fast I sometimes felt I was running as fast as I could just to stand still. The extra chapter I’ve added for the paperback edition is probably out of date already. The subject matter of any one of these chapters could fill an entire book. So there are things I only touch upon, or miss out altogether. It doesn’t mean they’re not important or interesting. I hope I will give you enough of an overview that you will be able to go and find out more for yourself.

I have my own opinions on what is great, and not so great, about big data. I don’t want you to accept them. I want you to make up your own mind. That’s kind of the point of the whole book.

But just as important to me is that you enjoy reading it. I hope you do.

Note

1 Unless you got it out of the library.

chapter one

What is data? And what makes it big?

What is data?¹

Thirty thousand years ago, in central Europe, somebody scratched 57 notches into a wolf bone. Those 57 notches, grouped into fives just as you might tally something² today, are the earliest known recorded data.

We don’t know anything more about who scratched them, or even if the notches were all made by the same person. We have no idea what they denote, only that they were a record of something. Which may not seem like much to you, but it represented a breakthrough in how our ancestors were able to keep track of things.

Imagine for a moment that the fact it’s a wolf bone is significant, and knowing how many wolves you’d killed was important for some reason. Perhaps you wanted to see if the local wolf pack was getting bigger or smaller, or whether the new flint-tipped arrows were more efficient than the old wooden ones, or just to win an argument about which member of your tribe was the best wolf-killer and got to sit nearest the fire.

You could hang on to a trophy from each wolf, and just see which pile of skulls is biggest, but that takes up room, and is vulnerable to being eaten by dogs. If you can represent each wolf with a notch, all you have to do is compare bones and see which has more notches.

Somebody in an Ice Age cave, in what is now the Czech Republic, had invented digital data.

Today, you can download Wild Wolf Data from the comfort of your own computer. The International Wolf Center in Minnesota, USA, fits wild wolves with tracking collars: radio collars since 1968, and more recently GPS collars that use satellite links to track the wolf’s position. This has allowed them to locate individual wolves at any given time, but also to study patterns of wolf movement and behaviour, and even to predict likely conflicts between the wolves and their human neighbours.

The technology is more advanced, but the basic principle is the same: turn your information into numbers, and record it in a form that’s easy to use and share. GPS data, tracking wolves through the forests of America, is digital information, by which we simply mean that it comes in numbers that you could, in theory, count on your fingers, your digits.

You’d need a lot of fingers, but that’s where computers come in handy.

Today, computing technology is so cheap, compact and powerful that domestic washing machines use computers to control laundry cycles. Impressive. And yet it’s still easier to keep track of wild wolves in Minnesota than of your own socks.

Without computers, big data would be impossible, so let’s take a quick look at their unstoppable rise.

A century of computers

The earliest computers wore petticoats. Until the twentieth century, ‘computer’ was a job title, and people, mainly women, were paid to do mathematics, with the aid of primitive technology such as log tables and slide rules, both of which were still in use well into the space age.

The first computer in the modern sense was built by IBM in 1944, in partnership with Harvard University. The Automatic Sequence Controlled Calculator, affectionately known as the Mark 1, was 2.4m (8ft) high and more than 15m (50ft) long. It weighed nearly 4,535kg (5 tons) and worked by a combination of electrical and mechanical parts, relay switches, rods and wheels. Computer historian John Kopplin described it as sounding ‘like a roomful of ladies knitting’.

The Mark 1 could add together 23-digit numbers in under a second. Multiplication took around five seconds, and division over 10 seconds. It received its program and data in the form of holes punched into paper tape and cards.

Mathematician Grace Hopper became the chief programmer. Her work was central to the development of computer programming, but you may be more entertained to learn that she was the first person to debug a computer: she removed a moth that got stuck in the mechanism.³ Most of the Mark 1’s early tasks related to the Second World War. Grace Hopper was officially part of the US Naval Reserve, and remained so until she retired, aged 79, with the rank of Rear Admiral (lower half⁴).

For tasks such as predicting the path of artillery shells, you put numbers in, and you got numbers out. But after the war, both business and government wanted to use the computer for a wider range of tasks. Human beings don’t naturally converse in a string of ones and zeroes.⁵ They wanted to use recognisable words and syntax to set tasks for the Mark 1, and to understand the answers that came back. Hopper led a team who developed a new programming language⁶ using words and structures from the English language, so that non-specialists could work more easily with computers.

The development of what today we’d call software was a step towards introducing the power of computing into non-mathematical areas of human life. But the hardware was unwieldy and expensive. When Harvard’s Howard Aiken, the inventor of the Mark 1, was asked in 1947 to estimate how many computers the US might buy, he said six. It would take a transformation in how they worked, and how they were built, to get us to the present day.

The laptop on which I’m writing this book is smaller and lighter than the machines used to punch the data cards for the Mark 1, works millions of times faster and costs a fraction of the price. Instead of rods and wheels, it uses electronic circuits printed on tiny slivers of silicon: cheaper and less susceptible to moths. The Mark 1 was handmade by experts, but my computer is mass-produced by machines and assembled by people with a few specific skills.

The miniaturisation of electronic components, combined with processes that make them cheaper to produce, gave rise to Moore’s Law, coined by Gordon Moore, co-founder of microchip company Intel. Moore’s Law says that the amount of processing power you can fit on to a chip will double every couple of years, while costs of production fall.⁷ In 1965, he predicted that:

Integrated circuits will lead to such wonders as home computers – or at least terminals connected to a central computer – automatic controls for automobiles, and personal portable communications equipment. The electronic wristwatch needs only a display to be feasible today.

You can now carry in your pocket a computer far more powerful than all the computers that existed in the world 50 years ago. The fact that we use so much of this technological power to play games, or count our own footsteps, is an indication of how ubiquitous, how effortless, it has become.
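
If you want to feel how quickly that doubling compounds, here is a minimal back-of-the-envelope sketch (the two-year doubling period and the 50-year span are the figures mentioned above; everything else is just illustration):

```python
# Rough illustration of Moore's Law compounding: a doubling every two
# years (the figure used above), carried on for 50 years.
doubling_period_years = 2
years = 50

doublings = years / doubling_period_years   # 25 doublings in 50 years
growth_factor = 2 ** doublings              # 2 to the power 25

print(f"{doublings:.0f} doublings gives roughly {growth_factor:,.0f} times "
      "the processing power on a chip")
# 25 doublings gives roughly 33,554,432 times the processing power on a chip
```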

So data can be any kind of information, so long as it’s expressed as numbers, in a digital form that computers can store, process and manipulate.

OK, so what’s special about big data?

At a conference in New York I have coffee with Roger Magoulas, the man reputed to have invented the term ‘big data’.

He’s diffident, but he does admit that he first used the phrase in 2006, ‘and after that the term started being used a lot more’. In 2009, he contributed to a special big data issue of O’Reilly’s Radar newsletter, taking examples from Barack Obama’s US presidential election campaign and fast-growing social media sites. Magoulas spotted some new developments that went beyond size.

Prediction, for example. Companies weren’t just analysing the past, they were using data to look forwards as well as backwards.

And instead of data they collected themselves, people were using the masses of information available on the internet. It might be indirect information, what Magoulas calls ‘faint signal’ data, but with enough of it and the right techniques, it could give answers.

Those techniques meant harnessing machines that could teach themselves. Machines could learn to make sense of information, even when it came in a form designed for human-to-human communication.

Magoulas is a polymath. He does write computer code, but he’s equally interested in the human side, in asking the right questions, in understanding the ways that human beings can draw meaning from digital information.

‘There’s a few things you can automate,’ he says, ‘but most of it is to augment people. Nothing should make the decision for you, it should make you a better decision-maker because you’re getting these new inputs.’

Big data is sometimes described in terms of three Vs, defined by an analyst called Doug Laney in 2001 when it was plain ‘data’. Volume, velocity and variety identify three of the qualities that Roger Magoulas also noted: there’s a lot of data, it’s coming at you very fast and in different forms.

But I have my own acronym⁸ to sum up what’s special enough about big data to be worth writing (or reading) this book: big DATA.

Big is for big, obviously. DATA spells out four key elements that make it new and distinctive: it deals with many Dimensions, it’s Automatic, it’s Timely, and it uses AI, Artificial Intelligence. I’ll go through those one at a time:

Big

It’s difficult to define the bigness of data in absolute terms. Partly because it’s expanding so fast that between me typing a number and this book being printed, it would already be out of date. To give you an idea, O’Reilly’s 2009 big data special reports scientists handling ‘some of the largest known data sets’ of several petabytes.

A petabyte is 1,024 terabytes. On my desk is a portable hard drive that fits into my pocket. I used it to back up the manuscript of this book, and pretty much everything on my computer, plus my entire music collection, but it’s far from full. It holds 1 terabyte, 1TB, of data, cost me less than US$100, and could contain nearly 200,000 copies of the complete works of Shakespeare.

I could fit 1,024 of them, a petabyte, into a large suitcase. So what would have been one of the largest known datasets in 2009 would now fit on to a luggage trolley.
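
As a quick sanity check on those figures, here is the arithmetic spelled out (the roughly 5MB estimate for the complete works of Shakespeare as plain text is my assumption, not a figure from the book):

```python
# Back-of-the-envelope check of the storage figures above. The ~5 MB
# estimate for the complete works of Shakespeare as plain text is an
# assumption made for illustration.
shakespeare_bytes = 5 * 10**6            # assumed ~5 MB of plain text
terabyte = 2**40                         # 1 TB (strictly, 1 TiB) in bytes
petabyte = 1024 * terabyte               # 1,024 terabytes, as in the text

print(f"Shakespeares per terabyte: {terabyte // shakespeare_bytes:,}")
print(f"Shakespeares per petabyte: {petabyte // shakespeare_bytes:,}")
# Shakespeares per terabyte: 219,902
# Shakespeares per petabyte: 225,179,981
```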

There are very few measures that make sense on an everyday level. In the internationally recognised unit of measurement for Very Large Things, the ‘to the Moon and back’, if all the information currently available as digital data could be put on to CDs, it would stretch to the Moon and back between three and 20 times. Though, by the time you read this, that’ll be 100 times, or 1,000.

Does that help? CDs are already old-fashioned in computing, because they don’t really hold enough data. One reason the world’s stock of data is growing so fast, doubling every three years by some estimates, is that we use more data to say the same thing. If you have a camera in your cellphone, it’s probably 10mp or more – that’s 10 megapixels, 10 million cells of colour and light, in that photo you took of your mates in that bar. No wonder it takes nearly 3MB of data to store it.

So part of the proliferation of data is deceptive; we’re just recording the same things in more detail. But there is genuinely more of the stuff. Filing cabinets full of paper have become computer servers full of digital data, which is more compact to store, and easier to find. So what?

Data analysts talk about ‘data mining’, as if all the information is already buried beneath our feet, and we just need to dig down through the dirt to bring back the diamonds. Big data has an air of completeness, of everything already being in there somewhere. Instead of asking questions with a survey, data analysts put queries to the data that’s already collected. This is a big change in how information is understood.

If you read that 93 per cent of women agree a certain face cream is brilliant, you may be impressed. If, like me, you check the small print and find it was 93 per cent of a survey of 28 people, commissioned by the manufacturer, not so much. But if the same company had somehow surveyed all 331,548 women who bought the product in a year, and 93 per cent of them say it’s brilliant, the face cream may be worth a try.

It doesn’t take a degree in statistics⁹ to understand that a selective sample isn’t a completely accurate guide to the whole picture.

Scientists use the letter n to tell you how many items they studied: ‘n = 11’ means you had 11 wolves, or patients, or women using face cream, in your study or experiment. Now data scientists talk about ‘n = all’, meaning they have the whole population in their dataset.

Somebody, somewhere, still had to decide which information to collect, but it’s easy to gather that data just in case, and decide later whether it’s useful.

D is for dimensions

A space scientist, Dr Sima Adhiya, once told me a story about her grandmother. In India, where the grandmother lived, the crickets were a constant background noise. And she told her granddaughter that the song of the crickets could tell you how hot the weather was that day.

When Sima grew up and became a scientist, she discovered that her grandmother was right.

A scientific paper, The Cricket as a Thermometer, published in 1897 by Amos E. Dolbear, expressed the relationship between temperature and how fast the crickets chirp in an equation known as Dolbear’s Law.¹⁰ So if you didn’t have a thermometer, but you were within earshot of some crickets, you could tell the temperature to the nearest degree by counting chirps with a stopwatch.

How did Dolbear discover this? Tantalisingly, he doesn’t say. His main interest was in turning sound waves into electrical signals, and vice versa. He invented something very like the telephone before Alexander Graham Bell, and patented the wireless telegraph before Marconi. So it’s possible that he used some ingenious apparatus to turn the cricket sounds into electrical waves before measuring their frequency.

But making a note of the temperature on successive days, and counting chirps per minute at the same time, would be enough for him to find the mathematical relationship. Temperature and chirps per minute are two very different types of thing, but by expressing both as numbers, and treating them as different dimensions of the same moment in time, Dolbear found a correlation close enough that one could predict the other.
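
As a small illustration of that kind of pattern-spotting, here is a sketch that fits a straight line to (chirps per minute, temperature) pairs. The readings are invented for the example; the comparison uses the commonly quoted form of Dolbear's Law, temperature in Fahrenheit = 50 + (chirps per minute - 40) / 4.

```python
# Fitting a straight line to (chirps per minute, temperature) pairs, in the
# spirit of Dolbear's pattern-spotting. The readings below are invented:
# they follow the commonly quoted form of Dolbear's Law,
# T_Fahrenheit = 50 + (chirps_per_minute - 40) / 4, with a little noise.
import numpy as np

chirps_per_minute = np.array([80, 100, 120, 140, 160, 180])
temperature_f = np.array([60.5, 64.8, 70.2, 75.1, 79.6, 85.3])

slope, intercept = np.polyfit(chirps_per_minute, temperature_f, 1)
print(f"temperature is roughly {slope:.3f} * chirps + {intercept:.1f}")
# Dolbear's Law rearranges to T = 0.25 * chirps + 40, so we expect a slope
# near 0.25 and an intercept near 40.
```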

Anything that can be turned into numbers can be a dataset. I could compare my tea-drinking against words written every day, and turn it into an equation to predict how many teabags I will need to finish this book.¹¹ I could go further and download weather data, to see if the weather has any effect on how much I write and/or how much tea I drink.¹²

If Dolbear were alive today, he could use a digital recording device to record the song of the crickets, and a computer to analyse the frequencies and compare them to the readings from a digital thermometer. In fact, he could write a computer program to do it all for him. Although, as he’d be nearly 180 years old, he might prefer to hire some young person to write it for him. Then he could get back to squinting at the controls of his computerised washing machine.

Having data in digital form is the first step towards making this kind of pattern-spotting possible, and it means you can link datasets of very different types. Perhaps D should stand for Datasets instead of Dimensions. Or for Diverse. The point is that you can now combine utterly Different types of information to learn something new.

A is for automatic

Think of how many things you do every day that involve computer technology.

In London, where I live, you can no longer pay cash to travel by bus. You can use an Oyster card, or you can tap a bank card directly on to the yellow pad on the bus. Whichever you use, the travel company deducts money from your account. When I run out of credit on my Oyster card, as I regularly do, I used to have to pay a higher cash fare to get a paper ticket. Now I just swipe my debit card.

The ease of collecting data by machine makes it very simple to gather up more than just a total of fares taken. I haven’t registered my Oyster card with my name or address on the Transport for London system, which is why it runs out, instead of automatically buying itself extra credit when the balance falls below £10. I do, however, top it up using my bank or credit card, so it’s already linked in their system with my name and address.

The more we do everyday things via computers, cellphones and plastic cards, the more information is automatically hoovered up and stored on a computer somewhere. Transport for London doesn’t only know how many passengers boarded their trains, buses and trams, but also where we got on and off¹³ and a swathe of other information such as where I live, and possibly whether this is my regular commute.

When information was collected by people, who had to write it down or type it into a machine, decisions about what to collect were tough. Now we’re far beyond recording data being easier than using flint to make notches in a wolf bone. In many cases, it’s now easier to record data than not to record it. Recording data is the default.

Every time you use a cellphone it stores all sorts of information in digital form: not only the numbers you call, and how long you talk for, but where you are whenever your phone is turned on. If you have a smartphone, it’s full of cunning little bits of kit, such as accelerometers and GPS receivers. That’s how the wonderful apps work that let you point your phone at a star and find out it’s the planet Jupiter.

You may already be one of the people using technology to capture your own data. Many cellphones come preloaded with apps related to health and fitness. You can track how many steps you take, how many calories you burn, even your heart rate. Or go further and turn other aspects of your life into data. How happy are you? How many tweets have you sent? How many cups of tea have you drunk today?¹⁴

It’s easy to automate both the collection of the data and the processes that turn it into useful information. In order to keep a tally of how many steps you take every day, your smartphone performs detailed calculations using the accelerometers that track changes of angle as you move. You don’t want to read those calculations. You just want a total of steps that day.

Or do you? Perhaps you want to combine it with other information. If you wanted to lose weight, you could combine it with an app that tells you how many calories you’re taking in by analysing photographs of your meals. If you’re competitive, you could share your total online and compare yourself against others. You might want a map of where you went, like Murphy Mack from San Francisco.

Murphy, like many keen cyclists, uses an app called Strava. Using GPS technology in a device such as a smartphone or satnav, Strava tracks your route, your speed, and how far up and down hill you went. Then it turns that data into various formats, such as a training calendar, personal best records, and a map of where you went.

Murphy rode his bicycle for 29km (18 miles) through San Francisco and uploaded his data. It wasn’t just his own pulse racing after that ride, because the red line on the map drew a heart shape. Not the most perfectly rounded heart shape, as San Francisco’s street map is mostly a rectangular grid, but discernibly a heart, with ‘marry me, Emily’ spelled out inside.

If you get marks for effort when proposing marriage, Murphy gets top marks. San Francisco is far more hilly than the neat map grid suggests. Small wonder that his 80-minute ride burned off 749 calories, so we can tell exactly how much effort he went to, physically at least. And Emily said yes.

This is a frivolous example, though it’s very important to Murphy and Emily. But the very fact that you only need a free app, a smartphone and the internet to automatically turn your cycle ride into data, and back again into a romantic picture, reveals how easily big data becomes part of your life.

T is for time

Have you had your first barbecue of the year yet? Or are you waiting for the weather to get just a bit warmer? How far ahead will you decide? A couple of weeks or half an hour beforehand when you’re out shopping and see the sausages on display?

We may not need to plan ahead, but supermarkets do. If they get it wrong, they’ll either miss out on millions of sales of burgers, charcoal and beer, or be left with unsold stocks of perishable meat and hot dog buns. How can they predict the first big barbecue weekend of the year?

The weather is one factor. Supermarket chain Tesco calculated that a 10° Celsius (18° Fahrenheit) rise in temperature means they will sell three times as much meat. In other words, we all want to barbecue in spring. After June, even though it’s warmer, the novelty of charred meat and marauding insects wears off.

Data analytics company Black Swan uses more sources of data, captured almost in real time, to give supermarkets a more reliable prediction. By collating extra information, such as how many people are searching ‘BBQ’ online or the general mood of social media posts, they claim to have saved at least one supermarket chain millions of dollars in wasted stock or missed sales.

This example shows two characteristics of big data. The picture changes continuously, as data comes in almost as fast as it’s produced. And because the software produces not just a snapshot but a moving picture, it can be extended into the future.
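
As a toy version of extending the moving picture into the future, here is a sketch that fits a trend to a week of daily search counts and projects it a few days ahead. The numbers are invented, and real systems like the ones described here blend many signals rather than relying on a single straight-line fit.

```python
# Toy forecast: fit a straight-line trend to seven days of (invented)
# "BBQ" search counts and extend it three days into the future. This only
# illustrates the idea of turning an incoming stream into a forward look.
import numpy as np

days = np.arange(7)                                        # days 0..6
searches = np.array([120, 150, 155, 190, 240, 310, 400])   # invented counts

slope, intercept = np.polyfit(days, searches, 1)
for day in (7, 8, 9):
    forecast = slope * day + intercept
    print(f"day {day}: about {forecast:.0f} searches expected")
```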

You may not care whether you can buy barbecue food when the mood takes you, or whether retailers have to throw away unsold chicken drumsticks. But the same approach can be used to predict more important things, such as medical needs: the spread of colds and flu, for example. Black Swan worked with a large pharmaceutical company that wanted to know which areas were being hit by those seasonal illnesses, or were about to be hit. This was for commercial reasons. They knew which of their products sell better when people have colds, or don’t have one yet and are trying not to catch one.

By combining the company’s own sales information with weather forecasts, web searches and social media posts, data analysts were able to identify, down to postcode level, trends in the spread of coughs and sneezes. The company was able to provide targeted online advertising in the places most likely to need vitamins and cold remedies.

Black Swan are now applying similar techniques to predicting how many staff need
