Data science has many definitions depending on who is doing what. However, in simple terms, Data Science is a combination of skills, expertise, and acumen in Mathematics, Technology, and Business Strategy. To do well in Data Science, you need a combination of domain expertise, data engineering, scientific method, math, statistics, advanced computing, visualization, and a hacker mindset.
The following questions range from easy to difficult, in no particular order. You will find the answer to each question in bold and italicized text. If you find an issue with any question, or you want to suggest questions and answers of your own, feel free to contact me.
A Type I error occurs if you do which of the following?
- fail to reject the null hypothesis when it is false.
- reject the null hypothesis when it is true.
A Type II error occurs if you do which of the following?
- fail to reject the null hypothesis when it is false.
- reject the null hypothesis when it is true.
Alpha (α) is which of the following?
- the probability that you correctly reject the null hypothesis.
- the probability of committing a Type I error.
Power is which of the following?
- the probability that you correctly reject the null hypothesis.
- the probability of committing a Type I error.
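To make the four questions above concrete, here is a minimal simulation sketch. It assumes a two-sided z-test on normally distributed data with a known standard deviation of 1; the function name `rejection_rate`, the sample size, and the true mean of 0.5 are illustrative choices, not part of the quiz. When the null hypothesis is true, the rejection rate estimates the Type I error rate (which should sit near alpha); when it is false, the rejection rate estimates power, and one minus power is the Type II error rate.

```python
import random
from statistics import NormalDist


def rejection_rate(true_mean, n=30, alpha=0.05, trials=5000, seed=0):
    """Fraction of simulated samples in which H0: mean == 0 is rejected."""
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for alpha = 0.05.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        # z statistic under the assumption that sigma = 1 is known.
        z = sample_mean * n ** 0.5
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# H0 is true: every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false: every rejection is a correct rejection (power).
power = rejection_rate(true_mean=0.5)
type_ii_rate = 1 - power

print(f"Type I error rate: {type_i_rate:.3f}")
print(f"Power:             {power:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

The exact figures depend on the seed and the number of trials, so treat them as estimates rather than fixed values.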
Sample size influence refers to which of the following?
- the effect of the number of trials on the p-value.
- the difference between the observed statistic and the hypothesized value.
Effect size refers to which of the following?
- the effect of the number of trials on the p-value.
- the difference between the observed statistic and the hypothesized value.
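The two ideas above can be separated with a short sketch. It assumes a two-sided z-test with a known standard deviation; `p_value` is a hypothetical helper and the numbers are only illustrative. Holding the effect size fixed and increasing the number of trials shrinks the p-value, and so does holding the number of trials fixed and increasing the effect size.

```python
from statistics import NormalDist


def p_value(effect, n, sigma=1.0):
    """Two-sided p-value for a z-test.

    effect: observed statistic minus the hypothesized value
    n:      number of trials (observations)
    sigma:  standard deviation, assumed known for this sketch
    """
    z = effect * n ** 0.5 / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same effect size, more trials: the p-value drops.
for n in (25, 100, 400):
    print(f"effect=0.2, n={n:>3}: p = {p_value(0.2, n):.5f}")

# Same number of trials, larger effect: the p-value also drops.
for effect in (0.2, 0.5):
    print(f"effect={effect}, n= 25: p = {p_value(effect, 25):.5f}")
```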
In general, what information does a histogram depict?
- the changes in an outcome measurement over time.
- the number (or fraction) of observations that fall within certain ranges of a particular outcome measurement.
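What a histogram counts can be sketched in a few lines. The helper `histogram_counts`, the data values, and the bin edges below are made up for illustration; the last bin is treated as closed on the right so the maximum value is not dropped.

```python
def histogram_counts(values, bin_edges):
    """Count how many values fall into each [low, high) bin.

    The final bin also includes its upper edge.
    """
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    for v in values:
        for i in range(n_bins):
            low, high = bin_edges[i], bin_edges[i + 1]
            is_last = i == n_bins - 1
            if low <= v < high or (is_last and v == high):
                counts[i] += 1
                break
    return counts


values = [1, 2, 2, 3, 5, 7, 8, 8, 9, 10]
edges = [0, 5, 10]

counts = histogram_counts(values, edges)
fractions = [c / len(values) for c in counts]

print(counts)     # [4, 6]
print(fractions)  # [0.4, 0.6]
```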
True or False? By hunting for correlations in data samples, one can often find entirely spurious patterns.
- true.
- false.
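A quick way to see how this happens is to generate data that is pure noise and go looking for the best correlation. The sketch below assumes 200 unrelated candidate variables with 20 observations each; those sizes, the seed, and the `pearson` helper are arbitrary choices for illustration. Even though nothing here is related to anything else, the strongest correlation found tends to look impressive.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(42)
n_obs, n_candidates = 20, 200

# The target and every candidate are independent random noise.
target = [rng.gauss(0, 1) for _ in range(n_obs)]
candidates = [
    [rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_candidates)
]

strongest = max(abs(pearson(target, c)) for c in candidates)
print(f"Strongest |correlation| among {n_candidates} noise variables: "
      f"{strongest:.2f}")
```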
True or False? Probability and statistics provide mathematical tools for estimating the likelihood of random events.
- True.
- False.
What is one way of characterizing the relationship between data science and artificial intelligence (AI)?
- AI nearly always involves data science, but only some data science projects involve AI.
- Data science and AI are different names for the same thing.
What is the best use of the rules that are developed in neural networks?
- for automating decision-making processes.
- for helping humans make decisions based on general principles.
Application Programming Interfaces (APIs) generally serve what functions in a data science project?
- APIs allow for accessing data and including it in a data science program.
- APIs allow for acquiring data that was not structured for sharing.
What are the elements that constitute “big data”?
- volume, velocity, and variety of data.
- programming, maths/statistics and domain expertise.
How does linear regression provide rules for decision-making?
- Linear regression uses coefficients to combine multiple input variables into a single output variable; these coefficients can then be used for decision making.
- Linear regression finds the observations that best serve as examples for decision making.
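As a rough illustration of coefficients combining several inputs into one output, here is a small least-squares sketch with two input variables. The data is invented so that it follows y = 1 + 2·x1 + 3·x2 exactly, and the helper names and the decision threshold of 20 are hypothetical. The fit solves the two-variable normal equations directly, which is fine for a toy example but not how a production library would do it.

```python
def fit_two_inputs(x1, x2, y):
    """Ordinary least squares for y = b0 + b1*x1 + b2*x2.

    Solves the 2x2 normal equations on centered data with Cramer's rule.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]

    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * t for a, t in zip(c1, cy))
    s2y = sum(b * t for b, t in zip(c2, cy))

    det = s11 * s22 - s12 * s12  # zero if the inputs are collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


def predict(coefs, a, b):
    """Combine two inputs into a single output using the coefficients."""
    b0, b1, b2 = coefs
    return b0 + b1 * a + b2 * b


x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]  # generated from 1 + 2*x1 + 3*x2

coefs = fit_two_inputs(x1, x2, y)
score = predict(coefs, 3, 5)

# A hypothetical decision rule built on top of the fitted coefficients.
decision = "approve" if score >= 20 else "review"
print(coefs, score, decision)
```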
What elements make up data science?
- hacking/programming, math/statistics, and domain expertise.
- descriptive analysis, prescriptive analysis, and predictive analysis.
Why do the functions COVARIANCE.P and COVARIANCE.S return different values when using the same data?
- COVARIANCE.S adds one to the number of data points.
- COVARIANCE.S subtracts one from the number of data points.
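The difference between the two spreadsheet functions comes down to the divisor. The sketch below is a plain Python re-implementation for illustration, not the spreadsheet's own code; the `covariance` helper and the data are assumptions. The population version divides by n, while the sample version divides by n − 1.

```python
def covariance(xs, ys, sample):
    """Covariance of two equal-length sequences.

    sample=False divides by n      (population, like COVARIANCE.P)
    sample=True  divides by n - 1  (sample, like COVARIANCE.S)
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    divisor = n - 1 if sample else n
    return total / divisor


xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

cov_p = covariance(xs, ys, sample=False)  # 10 / 4 = 2.5
cov_s = covariance(xs, ys, sample=True)   # 10 / 3, about 3.33

print(cov_p, cov_s)
```

Because the divisor is smaller, the sample covariance is always larger in magnitude than the population covariance on the same data, and the gap narrows as the number of data points grows.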
To determine the probability of a result, Bayes’ rule combines accuracy, false positives and ….?
- sample deviation.
- base rate.
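Here is a small worked sketch of Bayes' rule for a yes/no test. The function name `posterior` and the numbers (90% accuracy on true cases, a 5% false positive rate, a 1% base rate) are illustrative assumptions only.

```python
def posterior(sensitivity, false_positive_rate, base_rate):
    """Probability the condition is present given a positive result.

    sensitivity:         P(positive | condition present)
    false_positive_rate: P(positive | condition absent)
    base_rate:           P(condition present) before testing
    """
    true_positives = sensitivity * base_rate
    false_positives = false_positive_rate * (1 - base_rate)
    return true_positives / (true_positives + false_positives)


result = posterior(sensitivity=0.9, false_positive_rate=0.05, base_rate=0.01)
print(f"P(condition | positive) = {result:.3f}")  # about 0.154
```

With a low base rate, most positive results come from the much larger group without the condition, which is why the answer is far below the test's 90% accuracy.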
These few questions and answers are just the start. More will be added.
Enjoy!