# Bayesian Statistics

Frequency analysis computes the probability of a specific occurrence from prior data alone; new information is not taken into account. If there is a specific change in the present, the prior data will not reflect it.

Bayesian statistics, also based on the concept of probability, starts from that prior estimate and updates it with new evidence to forecast the future trend.
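As a minimal sketch of how new evidence can revise a prior probability, the snippet below applies Bayes' theorem, P(H|E) = P(E|H) · P(H) / P(E). The function name, parameter names and all numbers are illustrative assumptions, not taken from the article.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) given P(H), P(E | H) and P(E | not H)."""
    # Total probability of seeing the evidence at all.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: the event has a 1% prior probability, the evidence
# shows up in 90% of true cases and in 5% of false ones.
posterior = bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(f"prior = 0.010, posterior = {posterior:.3f}")
```

With these assumed inputs the estimate rises well above the 1% prior, because the evidence is far more likely when the event is real than when it is not.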

## The 5 Basic Statistics Concepts Data Scientists Need to Know

towardsdatascience.com

MORE IDEAS FROM THE ARTICLE

In data science, probability is the chance that something will happen, expressed as a number between 0 and 1 (or as a percentage). A probability of 0 means the event will not occur, while 1 means we are certain it will happen.

A box plot, a typical diagram of a dataset, carries a lot of information.

1. If the box is short, the data points are similar; if it is tall, the data has a wide range and high variance.
2. The median (the line in the middle of the box) is often a more reliable measure of the center than the mean, because it is less affected by outlier values.
3. The lower edge of the box marks a smaller percentile (the 25th), and the upper edge a larger one (the 75th).
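The numbers a box plot draws can be computed directly. The sketch below uses Python's standard `statistics` module on a made-up sample containing one outlier; the data and variable names are assumptions for illustration only.

```python
import statistics

# Hypothetical sample; 40 is a deliberate outlier.
data = [2, 4, 4, 5, 6, 7, 8, 9, 40]

# Quartiles: the 25th, 50th (median) and 75th percentiles.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # height of the box

mean = statistics.mean(data)
print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}, mean={mean:.2f}")
```

In this sample the outlier pulls the mean above the upper quartile, while the median stays in the middle of the bulk of the data.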

Statistics is the use of math to perform technical analysis of data. Instead of guesstimating, we get concrete, factual information from the data.

Statistical features are probably the most widely used statistical concept in data science. They include important measurements like bias, variance, mean, median and percentiles, all of which are easy to compute in code.
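As a small illustration of how code-friendly these measurements are, the following sketch computes a few of them with Python's standard library. The sample values are hypothetical.

```python
import statistics

samples = [12, 15, 11, 14, 18, 15, 13]  # hypothetical measurements

mean = statistics.mean(samples)
median = statistics.median(samples)
variance = statistics.pvariance(samples)  # population variance
std_dev = statistics.pstdev(samples)      # population standard deviation

print(f"mean={mean}, median={median}, variance={variance:.2f}, std={std_dev:.2f}")
```

`pvariance` treats the list as the whole population; `statistics.variance` would give the sample variance instead, which divides by n - 1.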

Sometimes we want to compare or classify datasets that have an uneven number of samples for the different classes or types. Undersampling evens out such a dataset by keeping only as many samples of the larger class as there are in the smaller one.

Oversampling does the opposite: it copies examples of the smaller class until it has as many examples as the other class. The copies are produced so that the distribution of the smaller class is maintained.
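Both ideas can be sketched with plain random sampling. The helper names `undersample` and `oversample`, the toy data and the fixed seed are assumptions for this example; real projects often use more refined resampling strategies.

```python
import random


def undersample(majority, minority, rng):
    """Keep only as many majority samples as there are minority samples."""
    return rng.sample(majority, len(minority)), list(minority)


def oversample(majority, minority, rng):
    """Duplicate random minority samples until both classes are the same size."""
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return list(majority), list(minority) + extra


rng = random.Random(0)  # fixed seed so the sketch is repeatable
majority = [("neg", i) for i in range(10)]  # hypothetical class with 10 samples
minority = [("pos", i) for i in range(3)]   # hypothetical class with 3 samples

under_major, under_minor = undersample(majority, minority, rng)
over_major, over_minor = oversample(majority, minority, rng)
print(len(under_major), len(under_minor), len(over_major), len(over_minor))
```

Undersampling discards data from the larger class, while oversampling repeats data from the smaller one, so which is preferable depends on how much data is available.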

The process of reduction in the number of dimensions (or feature variables) in datasets is known as Dimensionality Reduction.

If a cube contains 1000 points, we can reduce its dimensionality by projecting the 3D data onto a 2D plane. We can also remove feature variables to reduce the data volume. This is generally done with features that have a low correlation with the output we want to predict, and is called feature pruning.
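One simple way to sketch feature pruning is to compute each feature's Pearson correlation with the target and drop the features that fall below a cutoff. The feature names, values and the 0.5 threshold below are all hypothetical choices made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical feature columns and target values.
features = {
    "size": [1, 2, 3, 4, 5, 6],
    "rooms": [1, 1, 2, 2, 3, 3],
    "noise": [3, 1, 2, 3, 1, 2],
}
target = [10, 20, 31, 39, 52, 60]

threshold = 0.5  # assumed cutoff; tune for the problem at hand
kept = {
    name: column
    for name, column in features.items()
    if abs(pearson(column, target)) >= threshold
}
print(sorted(kept))
```

Correlation only captures linear relationships, so a feature pruned this way may still carry useful non-linear information.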

The common probability distributions are:

1. Uniform Distribution: A simple on-or-off distribution. It has a single constant value inside a given range, and anything outside that range is 0.
2. Normal (or Gaussian) Distribution: This distribution is symmetric around its mean, with the same standard deviation in all directions. It tells us the average value of the dataset along with the spread of the data.
3. Poisson Distribution: This is similar to the Normal Distribution but also has skewness. When the skewness is low, the spread is roughly the same in all directions; when it is high, the data spreads differently in different directions.
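The three distributions above can be written as short density (or mass) functions using their standard textbook formulas. The function and parameter names are choices made for this sketch.

```python
import math


def uniform_pdf(x, low, high):
    """Constant density inside [low, high], zero outside."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def normal_pdf(x, mean, std):
    """Bell-shaped density, symmetric around the mean."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def poisson_pmf(k, rate):
    """Probability of exactly k events when `rate` events are expected."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


print(uniform_pdf(0.5, 0, 1), uniform_pdf(2, 0, 1))
# Symmetric: equal density one unit either side of the mean.
print(normal_pdf(-1, 0, 1), normal_pdf(1, 0, 1))
# Skewed: unequal probability two units either side of the mean (rate = 2).
print(poisson_pmf(0, 2), poisson_pmf(4, 2))
```

The last two lines show the difference in shape: the Normal values match on both sides of the mean, while the Poisson values do not.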

RELATED IDEAS

Risk analysis is the process of assessing the likelihood of an adverse event occurring within the corporate, government, or environmental sector.

Risk analysis is the study of the underlying uncertainty of a given course of action and refers to the uncertainty of forecasted cash flow streams, the variance of portfolio or stock returns, the probability of a project's success or failure, and possible future economic states.

• Math enables you to select the right machine learning algorithm. It gives insight into how the model works, including how to select the right model parameters and validation strategies.
• Math helps with creating the right confidence intervals and uncertainty measurements for the model.
• Math is needed to understand aspects such as metrics, training time, model complexity, number of parameters, and number of features.
• By knowing the math behind a machine learning model, you can develop a customised model.

### 6 Math Foundation to Start Learning Machine Learning

towardsdatascience.com

Small, daily fluctuations, for instance in the stock market or in polls, are often just statistical noise.

To avoid drawing faulty conclusions about the causes, ask for the "margin of error" relating to the numbers. If the difference is smaller than the margin of error, there is probably no real difference.
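For a poll, the rule of thumb above can be sketched with the usual approximation for the margin of error of a proportion at roughly 95% confidence, 1.96 · sqrt(p(1 − p)/n). The poll figures and sample size are hypothetical, and a simple random sample is assumed.

```python
import math


def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate margin of error for a polled proportion (z=1.96 is ~95%)."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Hypothetical poll: support moved from 48% to 50% with 1000 respondents.
last_week, this_week, respondents = 0.48, 0.50, 1000

moe = margin_of_error(this_week, respondents)
change = abs(this_week - last_week)
is_probably_noise = change < moe

print(f"change={change:.3f}, margin of error={moe:.3f}, noise={is_probably_noise}")
```

Here the two-point change is smaller than the margin of error of about three points, so under these assumptions it should not be read as a real shift.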

### The seven deadly sins of statistical misinterpretation, and how to avoid them

theconversation.com
