What is Data Science?
Data science continues to evolve as one of the most promising and in-demand career paths for skilled professionals. Today, successful data professionals understand that they must advance beyond the traditional skills of analyzing large amounts of data, data mining, and programming. To uncover useful intelligence for their organizations, data scientists must master the full spectrum of the data science life cycle and possess the flexibility and understanding to maximize returns at each phase of the process.
The Data Science Life Cycle
The image represents the five stages of the data science life cycle: Capture (data acquisition, data entry, signal reception, data extraction); Maintain (data warehousing, data cleansing, data staging, data processing, data architecture); Process (data mining, clustering/classification, data modeling, data summarization); Analyze (exploratory/confirmatory, predictive analysis, regression, text mining, qualitative analysis); Communicate (data reporting, data visualization, business intelligence, decision making).
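As a rough illustration of how the five stages hand off to one another, here is a minimal Python sketch using only the standard library. The sample data, the function names, and the choice of a per-city mean as the "analysis" are all assumptions made for illustration; real projects would use far richer tooling at each stage.

```python
import csv
import io
import statistics

# Hypothetical raw input standing in for a real data source.
RAW_CSV = """city,temp_c
Oslo,4
Cairo,31
Lima,
Oslo,6
Cairo,29
"""

def capture(text):
    """Capture: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def maintain(rows):
    """Maintain: drop incomplete records and convert types."""
    return [
        {"city": r["city"], "temp_c": float(r["temp_c"])}
        for r in rows
        if r["temp_c"]
    ]

def process(rows):
    """Process: group readings by city."""
    groups = {}
    for r in rows:
        groups.setdefault(r["city"], []).append(r["temp_c"])
    return groups

def analyze(groups):
    """Analyze: summarize each group with its mean."""
    return {city: statistics.mean(temps) for city, temps in groups.items()}

def communicate(summary):
    """Communicate: format the findings as a short report."""
    return [f"{city}: {mean:.1f} C" for city, mean in sorted(summary.items())]

report = communicate(analyze(process(maintain(capture(RAW_CSV)))))
print("\n".join(report))
```

Each function takes the previous stage's output as its input, which mirrors the idea that weak work in an early stage (for example, leaving the incomplete Lima record in place) would carry through to everything after it.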
The term “data scientist” was coined as recently as 2008 when companies realized the need for data professionals who are skilled in organizing and analyzing massive amounts of data. 1 In a 2009 McKinsey & Company article, Hal Varian, Google's chief economist and UC Berkeley professor of information sciences, business, and economics, predicted the importance of adapting to technology’s influence on, and reconfiguration of, different industries. 2
“The ability to take data — to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it — that’s going to be a hugely important skill in the next decades.”
Effective data scientists are able to identify relevant questions, collect data from a multitude of different data sources, organize the information, translate results into solutions, and communicate their findings in a way that positively affects business decisions. These skills are required in almost all industries, making skilled data scientists increasingly valuable to companies.
What Does a Data Scientist Do?
In the past decade, data scientists have become necessary assets and are present in almost all organizations. These professionals are well-rounded, data-driven individuals with high-level technical skills who are capable of building complex quantitative algorithms to organize and synthesize large amounts of information used to answer questions and drive strategy in their organization. This is coupled with the experience in communication and leadership needed to deliver tangible results to various stakeholders across an organization or business.
Data scientists need to be curious and results-oriented, with exceptional industry-specific knowledge and communication skills that allow them to explain highly technical results to their non-technical counterparts. They possess a strong quantitative background in statistics and linear algebra as well as programming knowledge focused on data warehousing, mining, and modeling to build and analyze algorithms.
They must also be able to utilize key technical tools and skills, including:
R
Python
Apache Hadoop
MapReduce
Apache Spark
NoSQL databases
Cloud computing
D3
Apache Pig
Tableau
IPython notebooks
GitHub
Why Become a Data Scientist?
Glassdoor ranked data scientist as the #1 Best Job in America in 2018 for the third year in a row. 4 As increasing amounts of data become more accessible, large tech companies are no longer the only ones in need of data scientists. The growing demand for data science professionals across industries, big and small, is being challenged by a shortage of qualified candidates available to fill the open positions.
The need for data scientists shows no sign of slowing down in the coming years. LinkedIn listed data scientist as one of the most promising jobs in 2017 and 2018, along with multiple data-science-related skills as the most in-demand by companies. 5
The statistics listed below represent the significant and growing demand for data scientists.
Demand increase by 2020: 28%
Number of job openings: 4,524
Average base salary: $120,931
Best Job in America 2016, 2017, 2018: #1
Where Do You Fit in Data Science?
Data is everywhere and expansive. A variety of terms related to mining, cleaning, analyzing, and interpreting data are often used interchangeably, but they can involve different skill sets and levels of data complexity.
Data Scientist
Data scientists examine which questions need answering and where to find the related data. They have business acumen and analytical skills as well as the ability to mine, clean, and present data. Businesses use data scientists to source, manage, and analyze large amounts of unstructured data. Results are then synthesized and communicated to key stakeholders to drive strategic decision-making in the organization.
Skills needed: Programming skills (SAS, R, Python), statistical and mathematical skills, storytelling and data visualization, Hadoop, SQL, machine learning
Data Analyst
Data analysts bridge the gap between data scientists and business analysts. They are given the questions an organization needs answered and then organize and analyze data to find results that align with high-level business strategy. Data analysts are responsible for translating technical analysis into qualitative action items and effectively communicating their findings to diverse stakeholders.
Skills needed: Programming skills (SAS, R, Python), statistical and mathematical skills, data wrangling, data visualization
Data Engineer
Data engineers manage exponentially growing amounts of rapidly changing data. They focus on the development, deployment, management, and optimization of data pipelines and infrastructure to transform and transfer data to data scientists for querying.
Skills needed: Programming languages (Java, Scala), NoSQL databases (MongoDB, Cassandra DB), frameworks (Apache Hadoop)
Data Science Career Outlook and Salary Opportunities
Data science professionals are rewarded for their highly technical skill set with competitive salaries and great job opportunities at big and small companies in most industries. With over 4,500 open positions listed on Glassdoor, data science professionals with the appropriate experience and education have the opportunity to make their mark in some of the most forward-thinking companies in the world. 6
Below are the average base salaries for the following positions: 7
Data analyst: $65,470
Data scientist: $120,931
Senior data scientist: $141,257
Data engineer: $137,776
Gaining specialized skills within the data science field can distinguish data scientists even further. For example, machine learning experts utilize high-level programming skills to create algorithms that continuously gather data and automatically adjust their function to be more effective.
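To make the idea of an algorithm that adjusts itself as data arrives more concrete, here is a minimal sketch of online learning in plain Python. It assumes a deliberately simple setting: a one-parameter linear model trained by gradient steps on a synthetic stream where the true relationship is y = 3x. The class name, learning rate, and data are illustrative assumptions, not a description of any particular production system.

```python
class OnlineLinearModel:
    """Fits y = w * x one observation at a time using gradient steps."""

    def __init__(self, learning_rate=0.05):
        self.w = 0.0  # start with no knowledge of the relationship
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x

    def update(self, x, y):
        """Adjust the weight using a single new observation.

        Returns the prediction error measured before the adjustment.
        """
        error = self.predict(x) - y
        self.w -= self.learning_rate * error * x
        return error

# Synthetic stream of observations (an assumption for this sketch):
# the true relationship is y = 3 * x.
stream = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0)] * 50

model = OnlineLinearModel()
errors = [abs(model.update(x, y)) for x, y in stream]

print(f"learned weight: {model.w:.4f}")
print(f"first error: {errors[0]:.4f}, last error: {errors[-1]:.4f}")
```

The model never sees the whole dataset at once. Each incoming observation nudges the weight slightly, so the prediction error shrinks as more data flows through.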