When you roll a die or pick a card from a deck, you have a *limited number of outcomes possible*.

This type of data is called discrete data: it can take only a countable set of distinct values.

Recording time or measuring a person’s height, by contrast, can yield **infinitely** many values within a given interval.

This type of data is called Continuous Data, which can have any value within a given range.

That range can be **finite or infinite**.

- **Discrete uniform distribution**: All outcomes are equally likely
- **Bernoulli Distribution**: Single trial with two possible outcomes
- **Binomial Distribution**: A sequence of Bernoulli events
- **Poisson Distribution**: The probability of an event occurring a given number of times in an interval
- **chi-square distribution**

**Uniform distribution**

Uniform distribution refers to a statistical distribution in which all outcomes are **equally likely**.

Consider rolling a six-sided die. Each of the six numbers 1, 2, 3, 4, 5, and 6 is equally likely to come up on your next roll, so each has a probability of 1/6. This is an example of a discrete uniform distribution.

As a result, the uniform distribution graph contains bars of equal height representing each outcome. In our example, the height is a probability of 1/6 (0.166667).
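The die example can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the helper name `discrete_uniform_pmf` is made up for this example, and exact fractions are used so that the probabilities sum to exactly 1.

```python
from fractions import Fraction

def discrete_uniform_pmf(outcomes):
    """Map each outcome to the same probability 1/n (hypothetical helper)."""
    n = len(outcomes)
    return {outcome: Fraction(1, n) for outcome in outcomes}

# A fair six-sided die: every face gets probability 1/6.
die_pmf = discrete_uniform_pmf([1, 2, 3, 4, 5, 6])
for face, prob in die_pmf.items():
    print(face, prob, round(float(prob), 6))

# The probabilities of all outcomes must sum to 1.
total = sum(die_pmf.values())
print("total:", total)
```

Each printed row corresponds to one bar of equal height in the distribution graph.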

It can be used as a starting point to derive more complex distributions.

Any event with a **single trial and only two outcomes** follows a Bernoulli distribution. Flipping a coin once or choosing between True and False on a quiz question are examples of Bernoulli trials.

We have the probability of one of the outcomes, **p**. From p, we can deduce the probability of the other outcome by subtracting it from the total probability (1), which gives **1 − p**.

p(Head) = p = 0.3

p(Tail) = q = 1 − p = 1 − 0.3 = 0.7
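As a rough sketch of the coin example, the snippet below encodes the two probabilities and checks them with a simulation. The function name `bernoulli_pmf`, the seed, and the number of simulated trials are assumptions made for illustration only.

```python
import random

def bernoulli_pmf(k, p):
    """P(X = k) for one Bernoulli trial: p for success (1), 1 - p for failure (0)."""
    if k not in (0, 1):
        raise ValueError("k must be 0 or 1")
    return p if k == 1 else 1 - p

p_head = 0.3
print("p(Head):", bernoulli_pmf(1, p_head))
print("p(Tail):", bernoulli_pmf(0, p_head))

# Simulate many independent flips; the share of heads should be close to p.
rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
trials = 100_000
heads = sum(rng.random() < p_head for _ in range(trials))
print("simulated share of heads:", heads / trials)
```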

The Bernoulli distribution is used for categorical variables with two categories.

The binomial distribution describes the **sum of the outcomes of repeated, independent events that each follow a Bernoulli distribution**.

Therefore, the binomial distribution is used for binary-outcome events in which the probabilities of success and failure are the same in every trial.

An example of a binomial event would be flipping a coin multiple times to count the number of heads and tails.

It is written as **B(n, p)**, where

n = number of trials

p = probability of success in each trial
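To make B(n, p) concrete, here is a small sketch that computes the probability of each possible number of heads. The values n = 10 and p = 0.5 (ten flips of a fair coin) are illustrative choices, and `binomial_pmf` is a name introduced only for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # illustrative values: ten flips of a fair coin
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
for k, prob in enumerate(pmf):
    print(k, round(prob, 4))

# The mean number of successes should be close to n * p.
mean = sum(k * prob for k, prob in enumerate(pmf))
print("mean:", mean)
```

The most likely count is 5 heads, and the probabilities fall off symmetrically on either side because p = 0.5.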

Poisson distribution deals with the frequency with which an event occurs within a specific interval.

Instead of the probability of an event, Poisson distribution requires knowing how often it happens in a particular period or distance.

For example, suppose a cricket chirps two times in 7 seconds on average. We can use the Poisson distribution to determine the likelihood of it chirping five times in 15 seconds.

It is represented with the notation **Po(λ)**, where λ is the expected number of events in the interval.

The expected value and the variance of a Poisson distribution are both equal to λ.

Here X, the number of events, is a discrete random variable.
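The cricket example can be worked through as follows. The sketch assumes the average rate stays constant, so the expected number of chirps in 15 seconds is λ = (2/7) × 15 = 30/7; the function name `poisson_pmf` is introduced here for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam): exactly k events in the interval."""
    return exp(-lam) * lam**k / factorial(k)

rate_per_second = 2 / 7        # two chirps in 7 seconds on average
lam = rate_per_second * 15     # expected chirps in a 15-second interval
p_five = poisson_pmf(5, lam)
print("lambda:", round(lam, 4))
print("P(5 chirps in 15 s):", round(p_five, 4))

# Numerical check that mean and variance both equal lambda
# (the sum is truncated at k = 59, where the remaining terms are negligible).
ks = range(60)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)
print("mean:", round(mean, 4), "variance:", round(variance, 4))
```

Under these assumptions the probability of exactly five chirps comes out at roughly one in six.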

- **Normal Distribution**: Symmetric distribution of values around the mean
- **Student’s t Distribution**: Small sample size approximation of a normal distribution
- **Exponential distribution**: Models elapsed time between two events
- **Weibull Distribution**
- **Non-normal distributions**
- **Lognormal distribution**
- **F distribution**

In a normal distribution, data is symmetrically distributed with **no skew.**

When plotted, the data follows a **bell shape**, with most values clustering around a central region and tapering off as they go further away from the center.

It is represented as **N(µ, σ²)**, where µ is the mean and σ² is the variance.

The curve is symmetric about its center.

Therefore the *mean, mode, and median* are all equal to the **same value**, and the values are distributed symmetrically around the mean.

The area under the distribution curve equals **1** (all the probabilities must sum up to 1).

**68-95-99.7 Rule**

**68%** of the data points will fall within one standard deviation of the mean.

**95%** of the data points will fall within two standard deviations of the mean.

**99.7%** of the data points will fall within three standard deviations of the mean.
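The three percentages can be checked numerically. For a normal distribution, the probability of falling within k standard deviations of the mean is erf(k/√2), whatever the values of µ and σ. The helper name `prob_within` below is an assumption made for this sketch.

```python
from math import erf, sqrt

def prob_within(k):
    """P(|X - mu| <= k * sigma) for a normally distributed X."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * prob_within(k):.2f}%")
```

The rounded figures 68, 95, and 99.7 in the rule are approximations of these values.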

Student’s t distribution is a statistical distribution that resembles the normal distribution in its bell shape but has **heavier tails**.

The t distribution is used instead of the normal distribution when you have **small sample sizes**.

For example, if we are dealing with the total number of apples sold by a shopkeeper in **a month**, we can use the normal distribution, whereas if we are dealing with the total sold in **a day**, i.e., a smaller sample, we can use the t distribution.

A critical difference between Student’s t distribution and the normal distribution is that, apart from the mean and variance, we must also define the degrees of freedom for the distribution.

In statistics, the number of **degrees of freedom** is the number of values in the final calculation of a statistic that are free to vary.

A Student’s t distribution is represented as **t(k)**, where k represents the number of degrees of freedom. For k > 1, the expected value exists and equals the center of the distribution (0 for the standard form); for k > 2, the variance is k/(k − 2).
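One way to see the heavier tails is to compare the density of t(k) with the standard normal density at a point far from the center. The sketch below uses the standard formula for the t density; the function names and the choice of x = 3 are assumptions for illustration.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, k):
    """Density of the standard Student's t distribution with k degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large k.
    coef = exp(lgamma((k + 1) / 2) - lgamma(k / 2)) / sqrt(k * pi)
    return coef * (1 + x * x / k) ** (-(k + 1) / 2)

def normal_pdf(x):
    """Density of the standard normal distribution N(0, 1)."""
    return exp(-x * x / 2) / sqrt(2 * pi)

x = 3.0  # a point three units from the center, out in the tail
print("normal:", normal_pdf(x))
for k in (2, 5, 30, 1000):
    print(f"t({k}):", t_pdf(x, k))
```

With few degrees of freedom the tail density is several times larger than the normal one, and it approaches the normal density as k grows.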

The exponential distribution is one of the most widely used continuous distributions.

It is used to **model the time taken between different events**.

For example, in physics, it is often used to measure radioactive decay; in engineering, to measure the time associated with receiving a defective part on an assembly line; and in finance, to measure the likelihood of the next default for a portfolio of financial assets.

Another common application of exponential distributions is in survival analysis (e.g., the **expected life of a device/machine**).
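As a hedged sketch of the waiting-time idea, the snippet below simulates exponentially distributed times and compares them with two known properties: the mean is 1/rate, and the probability of waiting longer than t is exp(−rate · t). The rate of 0.5 events per time unit, the seed, and the function name are hypothetical choices for this example.

```python
import random
from math import exp

def exponential_survival(t, rate):
    """P(T > t) for T ~ Exponential(rate): no event has occurred by time t."""
    return exp(-rate * t)

rate = 0.5  # hypothetical: on average one event every 2 time units
rng = random.Random(42)  # fixed seed, chosen arbitrarily for repeatability
samples = [rng.expovariate(rate) for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
empirical_survival = sum(s > 3 for s in samples) / len(samples)

print("sample mean:", sample_mean, "theory:", 1 / rate)
print("P(T > 3):", empirical_survival, "theory:", exponential_survival(3, rate))
```

In a survival-analysis reading, `exponential_survival(t, rate)` is the chance that a device is still working at time t.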

The Weibull distribution is **a two-parameter family of curves**.

Weibull distributions model non-negative data: the curve begins at **zero**, and for some parameter values it takes the shape of **an exponential curve**.

This data distribution is often used for **reliability tests** and can help us predict how long it will take for a **system to fail.**

It models a broad range of random variables, largely in the nature of a time to failure or time between events.

Terms:

α is referred to as the shape parameter, and β is the scale parameter.

When α=1, the Weibull distribution is an exponential distribution with λ=1/β, so the exponential distribution is a special case of both the Weibull distributions and the gamma distributions.
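The special case α = 1 can be verified numerically. The sketch assumes the common parameterization of the Weibull density, f(x) = (α/β)(x/β)^(α−1) exp(−(x/β)^α) for x > 0, which is consistent with λ = 1/β above; the function names and the value β = 2 are illustrative.

```python
from math import exp

def weibull_pdf(x, alpha, beta):
    """Weibull density with shape alpha and scale beta (assumed parameterization)."""
    if x <= 0:
        return 0.0  # simplification: the boundary x = 0 is treated as outside the support
    return (alpha / beta) * (x / beta) ** (alpha - 1) * exp(-((x / beta) ** alpha))

def exponential_pdf(x, lam):
    """Exponential density with rate lam."""
    if x <= 0:
        return 0.0  # same simplification at the boundary
    return lam * exp(-lam * x)

beta = 2.0  # illustrative scale parameter
for x in (0.5, 1.0, 3.0):
    # With alpha = 1 the two densities should agree, using lam = 1 / beta.
    print(x, weibull_pdf(x, 1.0, beta), exponential_pdf(x, 1.0 / beta))
```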

A non-normal distribution may **lack symmetry**, may have **extreme values**, or may have a flatter or steeper “dome” than a typical bell.

There is nothing inherently wrong with non-normal data; some traits simply do not follow a bell curve.

For example, *data about coffee and alcohol consumption are rarely bell shaped.*

The lognormal distribution is the continuous probability distribution of a random variable whose logarithm is normally distributed.

Thus, if the random variable X is log-normally distributed, then **Y = log(X)** has a normal distribution.

Equivalently, if Y has a normal distribution, then the exponential function of Y, X = exp(Y) , has a log-normal distribution.

A random variable which is log-normally distributed takes only positive real values. It is a convenient and useful model for measurements in exact and engineering sciences, as well as medicine, economics and other topics (e.g., energies, concentrations, lengths).
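The relationship between X and Y can be illustrated with a short simulation: draw Y from a normal distribution, set X = exp(Y), and confirm that X is always positive and that log(X) recovers the normal parameters. The values of `mu` and `sigma`, the seed, and the sample size are arbitrary choices for this sketch.

```python
import random
from math import exp, log
from statistics import mean, median, stdev

rng = random.Random(1)  # fixed seed, chosen arbitrarily for repeatability
mu, sigma = 0.5, 0.75   # illustrative mean and standard deviation of Y

y = [rng.gauss(mu, sigma) for _ in range(50_000)]  # Y is normally distributed
x = [exp(v) for v in y]                            # X = exp(Y) is log-normal
log_x = [log(v) for v in x]                        # log(X) brings back Y

print("min of X:", min(x))  # always positive
print("mean of log(X):", mean(log_x), "stdev of log(X):", stdev(log_x))
print("mean of X:", mean(x), "median of X:", median(x))
```

The mean of X sits above its median, which reflects the long right tail of a log-normal variable.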

The **F-distribution** or **F-ratio**, also known as the **Fisher–Snedecor distribution** (after Ronald Fisher and George W. Snedecor), is a continuous probability distribution that arises frequently as the null distribution of a test statistic, most notably in the analysis of variance (ANOVA) and other F-tests.

The graph above shows examples of chi-square distributions with different values of k, the degrees of freedom, which determine the shape of a chi-square distribution.

They’re widely used in hypothesis tests, including the **chi-square goodness of fit test and the chi-square test of independence.**

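**Steps in hypothesis testing**

1. State the hypotheses, H0 and H1.
2. Choose the **significance value**, α = 0.05.
3. Compute the **degrees of freedom** = n − 1, where n is the number of categories of the categorical variable.
4. Find the **decision boundary**: check the chi-square table.
5. Calculate the **test statistic**:

   $$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

   where $f_e$ is the expected outcome and $f_o$ is the observed outcome.
6. **P**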
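The first five steps can be sketched for a goodness-of-fit test of whether a die is fair. Everything specific here is an assumption for illustration: the observed counts are made-up data for 60 rolls, and the critical value 11.070 is the standard chi-square table entry for 5 degrees of freedom at α = 0.05.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# H0: the die is fair. H1: the die is not fair.
observed = [8, 9, 12, 11, 6, 14]  # hypothetical counts for faces 1..6 (60 rolls)
expected = [sum(observed) / len(observed)] * len(observed)  # 10 per face under H0

alpha = 0.05
df = len(observed) - 1    # degrees of freedom = n - 1
critical_value = 11.070   # chi-square table value for df = 5, alpha = 0.05

stat = chi_square_statistic(observed, expected)
reject_h0 = stat > critical_value

print("chi-square statistic:", round(stat, 3))
print("degrees of freedom:", df)
print("reject H0:", reject_h0)
```

With these made-up counts the statistic falls below the decision boundary, so the data give no reason to reject the hypothesis of a fair die.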

A data distribution is a graphical representation of data that was collected from a sample or population. It is used to organize and disseminate large amounts of information in a way that is meaningful and simple for audiences to digest.

