Hypothesis Testing in Python

Hypothesis testing is a core statistical technique, and Python is a common tool for carrying it out.

Many Python libraries are useful for statistics and machine learning.

Libraries such as NumPy and SciPy help with hypothesis testing in Python.

Before starting with hypothesis testing, we first have to understand what a hypothesis is:

Hypothesis

A hypothesis is a conjecture or explanation, based on limited data, that serves as the basis for further research. Keeping that definition in mind, let's continue with a simpler explanation.

A hypothesis is a proposition put forward as a basis for reasoning, without assuming in advance what result will follow from it.

For example:

Suppose we argue that sleeping for 8 hours leads to better marks than sleeping for fewer hours.

Hypothesis Testing

The term "hypothesis testing" refers to a statistical procedure in which an assumption about a population parameter is tested.

It is used to judge how plausible the hypothesis is.

In data science and machine learning work, hypothesis tests are commonly carried out in Python.

Purpose of Hypothesis Testing

It is a crucial step in statistics. Using sample data, a hypothesis test evaluates which of two mutually exclusive statements about a population is better supported. A hypothesis test is what allows us to state that a result is statistically significant.

Basis of Hypothesis:

The normal distribution and the standard normal distribution are the foundation of hypothesis testing.

  • Normal Distribution: A variable is said to be normally distributed, or to have a normal distribution, if its distribution has the shape of a normal curve, a distinctive symmetric bell-shaped curve. The normal curve is the graph of a normal distribution.
  • Standard Normal Distribution: The term "standard normal distribution" refers to a normal distribution with a mean of 0 and a standard deviation of 1.
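
To make the link between the two distributions concrete, the sketch below standardises a small sample by subtracting its mean and dividing by its standard deviation. The values are invented for illustration and do not come from any of the tutorial's data files.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration
values = np.array([12.0, 15.0, 9.0, 18.0, 11.0, 14.0])

# z-score: subtract the mean, then divide by the standard deviation
z_scores = (values - values.mean()) / values.std()

print("z-scores:", z_scores)
print("mean of z-scores:", z_scores.mean())
print("std of z-scores:", z_scores.std())
```

After standardising, the values have a mean of 0 and a standard deviation of 1 (up to floating-point rounding), which is the scale on which the standard normal distribution is defined.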

Significant Parameters of Hypothesis Testing

Null Hypothesis

In inferential statistics, the null hypothesis is a general statement or default position that there is no relationship between two measured phenomena or no association between groups.

In other words, it is a basic assumption, made on the basis of knowledge of the problem or subject.

For instance: a company produces 50 items every day.

Alternative Hypothesis

In hypothesis testing, the alternative hypothesis is the statement that contradicts the null hypothesis. It usually asserts that the observations are the result of a real effect (with some amount of chance variation superimposed).

For instance: the company does not produce 50 items every day.

Level Of Significance

The level of significance is the fixed probability of wrongly rejecting the null hypothesis when it is actually true. It is the probability of a type I error, and the researcher sets it according to the consequences of making that error. It is the threshold against which statistical significance is assessed, so it determines whether the null hypothesis is rejected or not. If a result is statistically significant, the data are considered inconsistent with the null hypothesis.

It is impossible to accept or reject a hypothesis with 100% certainty, so we choose a level of significance, typically 5%.

The level of significance is denoted by the symbol alpha and is usually set to 0.05 (5%). This means we accept a 5% chance of rejecting a null hypothesis that is actually true, which corresponds to a 95% confidence level.

Type I error

We reject the null hypothesis even though it is true. The probability of a type I error is denoted by alpha. In hypothesis testing, the alpha region is the part of the normal curve that marks the critical (rejection) region.

Type II error

We fail to reject the null hypothesis even though it is false. The probability of a type II error is denoted by beta. The beta region is the part of the normal curve that marks the acceptance region.
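
One way to see what alpha means in practice is to simulate many experiments in which the null hypothesis is true and count how often it is rejected anyway. The sketch below is an illustration under assumed settings (2,000 simulated experiments of 20 normally distributed values each, with a fixed random seed); none of these numbers come from the tutorial.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
alpha = 0.05
n_experiments, sample_size = 2000, 20

# Each row is one experiment drawn from a population whose true mean is 0,
# so the null hypothesis "mean == 0" is true in every experiment.
samples = rng.normal(loc=0.0, scale=1.0, size=(n_experiments, sample_size))

# One t-test per row
_, p_values = ttest_1samp(samples, 0.0, axis=1)

# Fraction of experiments in which the true null hypothesis was rejected
type_1_rate = float(np.mean(p_values < alpha))
print("Observed type I error rate:", type_1_rate)
```

The observed rate should land close to alpha, because every rejection in this simulation is by construction a type I error.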

One-tailed test

A test of a statistical hypothesis is one-tailed if the rejection region lies on only one side of the sampling distribution.

For instance: testing whether more than 80% of a college's 4,000 students have taken up data science.

Two-tailed test

The two-tailed test uses a critical region on both sides of the distribution to assess whether a sample statistic is greater than or less than a specified range of values. If the sample statistic falls into either critical region, the null hypothesis is rejected in favour of the alternative hypothesis.

For instance: testing whether a school's enrolment differs from 4,500 students, or whether the share of organisations that have adopted data analytics differs from 83%.

P Value

The P value, or calculated probability, is the probability of obtaining the observed results, or more extreme ones, when the null hypothesis (H0) of a study question is true. What counts as "extreme" depends on how the hypothesis is being tested.

When the P value falls below the chosen level of significance, the null hypothesis is rejected: the sample is taken to provide enough evidence in favour of the alternative hypothesis. This does not by itself mean that the difference is large or important; whether it matters in a real-world case is a separate judgement.
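
Because the meaning of "extreme" depends on whether the test is one-tailed or two-tailed, the same test statistic can give different P values. The sketch below assumes a z statistic of 1.8, a made-up number used only for illustration, and converts it to both kinds of P value with the standard normal distribution.

```python
from scipy.stats import norm

z = 1.8  # hypothetical test statistic, for illustration only

p_one_tailed = norm.sf(z)            # area in the upper tail only
p_two_tailed = 2 * norm.sf(abs(z))   # area in both tails

print("one-tailed P value:", p_one_tailed)
print("two-tailed P value:", p_two_tailed)
```

With alpha = 0.05, this statistic is significant in the one-tailed test (P value about 0.036) but not in the two-tailed test (P value about 0.072), so the choice of test has to be made before looking at the data.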

For example:

Suppose you have a coin and you don't know whether it is fair or biased.

Let's decide with a null and an alternative hypothesis.

H0 = the coin is fair

H1 = the coin is biased, with alpha = 0.05

Let's work out the P value.

If the first toss gives tails, the P value is 50%.

If the second toss gives tails again, the P value is 25%.

After six tails in a row, the P value is about 1.5%.

We set our confidence level to 95%, which means our error rate (alpha) is 5%.

The P value of 1.5% is below that level, so the null hypothesis no longer stands and we reject it.
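
The arithmetic in the coin example can be reproduced in a few lines. This is a minimal sketch that treats the P value as the probability of getting that many tails in a row from a fair coin, which is the one-sided reading used in the example.

```python
alpha = 0.05

# Probability of n tails in a row when the coin is fair (H0 true)
for n_tails in range(1, 7):
    p_value = 0.5 ** n_tails
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"{n_tails} tails in a row: P value = {p_value:.4f} -> {decision}")
```

The P value first drops below 0.05 at five tails in a row (0.5^5 = 0.03125), and after six it is 0.015625, the "about 1.5%" quoted in the example.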

Degrees of Freedom:

Degrees of freedom are the number of independent values in a statistical calculation that are free to vary.

For a better understanding of hypothesis testing, let's look at some widely used types of hypothesis test in Python.

  • T Test (Student's T test)
  • Z Test
  • ANOVA Test
  • Chi-Square Test

T test

The t-test is a kind of inferential statistic used to assess whether there is a significant difference between the means of two groups that may be related in certain features.

It is typically used when the data, such as the results recorded from tossing a coin 100 times, would follow a normal distribution and may have unknown variances.

The t-test is a hypothesis-testing tool that lets you test an assumption applicable to a population.

There are three types of t-test:

  1. One Sampled T-test
  2. Two Sampled T-test
  3. Paired Sampled T-test

One Sampled T-test

The One Sample t Test determines whether the sample mean is statistically different from a known or hypothesised population mean. The One Sample t Test is a parametric test.

For example: you have to check whether the mean of a sample of 10 ages is 30 or not.

Code:

from scipy.stats import ttest_1samp
import numpy as np

# Load the sample of ages from ages.csv
samples_of_all_ages = np.genfromtxt("ages.csv")
print(samples_of_all_ages)

data_mean = np.mean(samples_of_all_ages)
print(data_mean)

# H0: the population mean age is 30
t_stat, pval = ttest_1samp(samples_of_all_ages, 30)
print("P-value", pval)

if pval < 0.05:    # alpha value is 0.05 or 5%
    print("Null Hypothesis is being rejected")
else:
    # "accepted" here means that we fail to reject H0
    print("Null Hypothesis is being accepted")

Output:

[30. 23. 45. 31. 34. 36. 42. 60. 23. 30.]
35.4
P-value 0.16158645446293013
Null Hypothesis is being accepted
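
To show what ttest_1samp does under the hood, the sketch below recomputes the statistic by hand from the ten ages printed in the output above: the difference between the sample mean and the hypothesised mean, divided by the standard error, with a two-sided P value taken from the t distribution with n - 1 degrees of freedom. The variable names are illustrative only.

```python
import numpy as np
from scipy import stats

# The ten ages shown in the output above
ages = np.array([30, 23, 45, 31, 34, 36, 42, 60, 23, 30], dtype=float)
hypothesised_mean = 30

n = ages.size
sample_std = ages.std(ddof=1)             # sample standard deviation
standard_error = sample_std / np.sqrt(n)
t_stat = (ages.mean() - hypothesised_mean) / standard_error

# Two-sided P value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_stat), df=n - 1)

print("t statistic:", t_stat)
print("P value:", p_manual)
```

The hand-computed P value should agree with the one reported by ttest_1samp for the same data.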

Two Sampled T-test

The Independent Samples t Test, also known as the two-sample t-test, compares the means of two independent groups to determine whether there is statistical evidence that the associated population means are significantly different. It is also a parametric test.

Code:

from scipy.stats import ttest_ind
import numpy as np

# Load the two independent samples
first_Week = np.genfromtxt("week1.csv", delimiter=",")
second_Week = np.genfromtxt("week2.csv", delimiter=",")

print("Data of First Week :")
print(first_Week)
print("\nData of Second Week :")
print(second_Week)

mean_Of_First_Week = np.mean(first_Week)
mean_Of_Second_Week = np.mean(second_Week)
print("\nfirst_Week mean value:", mean_Of_First_Week)
print("second_Week mean value:", mean_Of_Second_Week)

std_Of_First_Week = np.std(first_Week)
std_Of_Second_Week = np.std(second_Week)
print("\nfirst_Week std value:", std_Of_First_Week)
print("second_Week std value:", std_Of_Second_Week)

# H0: both weeks have the same population mean
ttest, pval = ttest_ind(first_Week, second_Week)
print("\np-value", pval)

if pval < 0.05:
    print("\nREJECTING THE NULL HYPOTHESIS")
else:
    print("\nACCEPTING THE NULL HYPOTHESIS")

Output:

Data of First Week :
[1.2354 3.5685 5.8974 3.7894 6.8945 5.5685 5.6448 6.4752 7.7841 6.7846
 4.5684 6.4579 7.5461 8.9456 6.5489 4.5858 9.4563 8.1523 7.8945 2.5613
 1.5632 9.8945 7.5612 7.5647 8.8945]


Data of Second Week :
[1.2154 3.5455 7.8974 7.7894 6.8945 7.5685 2.6448 6.4752 5.7841 1.7846
 2.5684 5.4579 2.5461 4.9456 3.5489 2.5858 7.4563 6.1523 5.8945 3.5613
 2.5632 3.8945 5.5612 5.5647 6.8945]


first_Week mean value: 6.233504
second_Week mean value: 4.831784


first_Week std value: 2.306689087931878
second_Week std value: 2.0245040191968005


p-value 0.029934418917393343


REJECTING THE NULL HYPOTHESIS
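
By default ttest_ind assumes that both groups have the same variance. When that assumption is in doubt, SciPy can run Welch's t-test instead by passing equal_var=False. The sketch below applies both variants to the two weeks of data printed in the output above; choosing Welch's test here is an illustration, not something the original example does.

```python
import numpy as np
from scipy.stats import ttest_ind

# Values copied from the output above
first_week = np.array([
    1.2354, 3.5685, 5.8974, 3.7894, 6.8945, 5.5685, 5.6448, 6.4752, 7.7841,
    6.7846, 4.5684, 6.4579, 7.5461, 8.9456, 6.5489, 4.5858, 9.4563, 8.1523,
    7.8945, 2.5613, 1.5632, 9.8945, 7.5612, 7.5647, 8.8945,
])
second_week = np.array([
    1.2154, 3.5455, 7.8974, 7.7894, 6.8945, 7.5685, 2.6448, 6.4752, 5.7841,
    1.7846, 2.5684, 5.4579, 2.5461, 4.9456, 3.5489, 2.5858, 7.4563, 6.1523,
    5.8945, 3.5613, 2.5632, 3.8945, 5.5612, 5.5647, 6.8945,
])

# Pooled-variance t-test (the default used in the example above)
_, p_pooled = ttest_ind(first_week, second_week)

# Welch's t-test: does not assume equal variances
_, p_welch = ttest_ind(first_week, second_week, equal_var=False)

print("pooled P value:", p_pooled)
print("Welch P value: ", p_welch)
```

With equal group sizes and similar spreads, as here, the two P values are nearly identical and both fall below 0.05; the variants differ more when sample sizes or variances are unbalanced.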

Paired Sampled T-test

The paired sample t-test is also called the dependent sample t-test. It is a univariate test that looks for a significant difference between two related variables. An example would be measuring a person's blood pressure before and after a particular treatment, condition, or time period.

H0: the mean difference between the two samples is 0

H1: the mean difference between the two samples is not 0

For Example:

We have taken blood pressure readings from 120 people before and after a certain treatment.

Code:

import pandas as pd
from scipy import stats

test_samples = pd.read_csv("blood.csv")

# Summary statistics of both columns (wrap in print() to display them in a script)
test_samples[['bp_before','bp_after']].describe()

# H0: the mean difference between bp_before and bp_after is 0
ttest, pval = stats.ttest_rel(test_samples['bp_before'], test_samples['bp_after'])
print(pval)

if pval < 0.05:
    print("REJECTING THE NULL HYPOTHESIS")
else:
    print("ACCEPTING THE NULL HYPOTHESIS")

Output:

0.0011297914644840823

REJECTING THE NULL HYPOTHESIS
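
The hypotheses above are stated in terms of the difference between the two measurements, and that is how the test works: a paired t-test is equivalent to a one-sample t-test on the per-person differences, tested against 0. The sketch below checks this on six hypothetical readings (invented for illustration, not taken from blood.csv).

```python
import numpy as np
from scipy import stats

# Hypothetical blood-pressure readings for six people (not from blood.csv)
bp_before = np.array([150.0, 142.0, 160.0, 155.0, 148.0, 165.0])
bp_after = np.array([144.0, 140.0, 151.0, 150.0, 149.0, 158.0])

# Paired t-test on the two columns
t_paired, p_paired = stats.ttest_rel(bp_before, bp_after)

# One-sample t-test on the differences, H0: mean difference is 0
differences = bp_before - bp_after
t_diff, p_diff = stats.ttest_1samp(differences, 0.0)

print("paired:      t =", t_paired, " p =", p_paired)
print("differences: t =", t_diff, " p =", p_diff)
```

Both calls report the same statistic and P value, which is why the paired test needs the two samples to be matched one to one.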

Z-Test

The Z-test is used in place of the t-test when the following conditions hold:

  • Sample sizes should be equal where possible.
  • The data should be selected at random from a population in which each item has an equal chance of being chosen.
  • The sample size should be greater than 30; otherwise, use a t-test.
  • The data should be normally distributed.
  • The data points should be independent of one another. In other words, no data point is related to or affected by another.

For Example:-

We again use the blood pressure data, this time for a one-sample Z-test.

Code:

import pandas as pd
from statsmodels.stats import weightstats as stests

test_samples = pd.read_csv("blood.csv")

# H0: the population mean of bp_before is 156
ztest, pval = stests.ztest(test_samples['bp_before'], x2=None, value=156)
print(float(pval))

if pval < 0.05:
    print("REJECTING THE NULL HYPOTHESIS")
else:
    print("ACCEPTING THE NULL HYPOTHESIS")


Output:

0.6651614730255063

ACCEPTING THE NULL HYPOTHESIS
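
The statistic behind a one-sample Z-test is the distance between the sample mean and the hypothesised mean, measured in standard errors, with the P value read from the standard normal distribution. The sketch below computes it by hand on synthetic readings; the generated data, its mean and spread, and the seed are all assumptions standing in for the bp_before column, which is not reproduced in this tutorial.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

# Synthetic readings standing in for bp_before (assumed, not real data)
bp_before = rng.normal(loc=155.0, scale=11.0, size=120)

hypothesised_mean = 156
standard_error = bp_before.std(ddof=1) / np.sqrt(bp_before.size)
z_stat = (bp_before.mean() - hypothesised_mean) / standard_error

# Two-sided P value from the standard normal distribution
p_value = 2 * norm.sf(abs(z_stat))

print("z statistic:", z_stat)
print("P value:", p_value)
```

Using the sample standard deviation in place of a known population value is what makes the "more than 30 observations" condition matter: with a large sample the estimate is stable enough for the normal approximation to be reasonable.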

The Z-test can also be run as a two-sample test.

Two Sample Test

As with the t-test, here we compare the sample means of two independent data sets to see whether they are equal.

H0: the difference between the means of the two groups is 0.

H1: the difference between the means of the two groups is not 0.

For Example:-

We use the same blood pressure data for the sake of convenience.

Code:

import pandas as pd
from statsmodels.stats import weightstats as stests

test_samples = pd.read_csv("blood.csv")

# H0: the difference between the two means is 0
ztest, pval = stests.ztest(test_samples['bp_before'], x2=test_samples['bp_after'], value=0, alternative='two-sided')
print(float(pval))

if pval < 0.05:
    print("REJECTING THE NULL HYPOTHESIS")
else:
    print("ACCEPTING THE NULL HYPOTHESIS")

Output:

0.002162306611369422

REJECTING THE NULL HYPOTHESIS

ANOVA Test (F-test)

The t-test works well for comparing two groups, but sometimes we want to compare more than two groups at once. For instance, to determine whether voter age varies with a categorical variable such as race, we would need to compare the means of each level or group. We could run a separate t-test for each pair of groups, but doing so increases the chance of false positives. Analysis of variance, or ANOVA, is a statistical inference test that allows several groups to be compared at the same time.

F = Variability between groups / Variability within groups

Note:

Unlike the z- and t-distributions, the F-distribution has no negative values, because the between-group and within-group variability are always positive as a result of squaring each deviation.

One Way F-test (ANOVA)

It tells whether two or more groups are similar or not, based on the similarity of their means and the F-score.

Example:

Checking the similarity among three types of plants on the basis of their category and weight.

Csv file data in a sheet format:

Sr no.  weight  group
1       4.17    ctrl
2       5.58    ctrl
3       5.18    ctrl
4       6.11    ctrl
5       4.5     ctrl
6       4.61    ctrl
7       5.17    ctrl
8       4.53    ctrl
9       5.33    ctrl
10      5.14    ctrl
11      4.81    trt1
12      4.17    trt1
13      4.41    trt1
14      3.59    trt1
15      5.87    trt1
16      3.83    trt1
17      6.03    trt1
18      4.89    trt1
19      4.32    trt1
20      4.69    trt1
21      6.31    trt2
22      5.12    trt2
23      5.54    trt2
24      5.5     trt2
25      5.37    trt2
26      5.29    trt2
27      4.92    trt2
28      6.15    trt2
29      5.8     trt2
30      5.26    trt2

Code:

import pandas as pd
from scipy import stats

df_anova = pd.read_csv('PlantGrowth.csv')
df_anova = df_anova[['weight','group']]

# Split the weights into one series per group
grps = pd.unique(df_anova.group.values)
d_data = {grp: df_anova['weight'][df_anova.group == grp] for grp in grps}

# H0: all three groups have the same mean weight
F, p = stats.f_oneway(d_data['ctrl'], d_data['trt1'], d_data['trt2'])
print("p-value for significance is: ", p)

if p < 0.05:
    print("REJECTING THE NULL HYPOTHESIS")
else:
    print("ACCEPTING THE NULL HYPOTHESIS")

Output:

p-value for significance is:  0.0159099583256229

REJECTING THE NULL HYPOTHESIS
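
The F ratio defined earlier (variability between groups divided by variability within groups) can be computed by hand from the plant weights in the table above. The sketch below uses the usual one-way ANOVA sums of squares and reads the P value from the F distribution; the variable names are illustrative only.

```python
import numpy as np
from scipy import stats

# Plant weights from the table above, keyed by group
groups = {
    "ctrl": np.array([4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14]),
    "trt1": np.array([4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69]),
    "trt2": np.array([6.31, 5.12, 5.54, 5.50, 5.37, 5.29, 4.92, 6.15, 5.80, 5.26]),
}

all_weights = np.concatenate(list(groups.values()))
grand_mean = all_weights.mean()
k, n_total = len(groups), all_weights.size

# Between-group: how far each group mean lies from the grand mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups.values())
# Within-group: how far each value lies from its own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups.values())

# Each sum of squares is divided by its degrees of freedom
f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)

print("F statistic:", f_manual)
print("P value:", p_manual)
```

The hand-computed P value should match the one printed by f_oneway, and the degrees of freedom (k - 1 between groups, n - k within groups) are the two parameters of the F distribution used to obtain it.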

Two Way F-Test(Anova)

When we have two independent variables and two or more groups, we utilise the two-way F-test, which is an extension of the one-way F-test. The 2-way F-test cannot identify the dominating variable. Post-hoc testing must be carried out if individual significance needs to be verified.

Before fitting the model, it is useful to examine the grand mean crop yield (the mean crop yield without regard to any subgroup), the mean crop yield for each level of each individual factor, and the mean crop yield for each combination of the factors.
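As a minimal sketch of these summaries, the code below computes the grand mean, the mean for each factor level, and the mean for each fertilizer and water combination. It uses only the standard library, and the rows are copied from the crop yield table in the example that follows. The variable names are illustrative.

```python
from collections import defaultdict
from statistics import mean

# (Fert, Water, Yield) rows copied from the crop yield table in this example
rows = [
    ("A", "High", 27.4), ("A", "High", 33.6), ("A", "High", 29.8),
    ("A", "High", 35.2), ("A", "High", 33.0),
    ("B", "High", 34.8), ("B", "High", 27.0), ("B", "High", 30.2),
    ("B", "High", 30.8), ("B", "High", 26.4),
    ("A", "Low", 32.0), ("A", "Low", 32.2), ("A", "Low", 26.0),
    ("A", "Low", 33.4), ("A", "Low", 26.4),
    ("B", "Low", 26.8), ("B", "Low", 23.2), ("B", "Low", 29.4),
    ("B", "Low", 19.4), ("B", "Low", 23.8),
]

# Grand mean: every yield, ignoring both factors
grand_mean = mean(y for _, _, y in rows)

# Group the yields by each factor and by each combination of factors
by_fert = defaultdict(list)
by_water = defaultdict(list)
by_cell = defaultdict(list)
for fert, water, y in rows:
    by_fert[fert].append(y)
    by_water[water].append(y)
    by_cell[(fert, water)].append(y)

print("Grand mean:", round(grand_mean, 2))
for label, table in (("Fert", by_fert), ("Water", by_water), ("Fert x Water", by_cell)):
    for key, ys in table.items():
        print(label, key, round(mean(ys), 2))
```

Comparing the combination means with the factor means gives a first impression of whether the effect of fertilizer depends on the water level, which is what the interaction term in the model tests formally.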

Example with Code:

Crop yield under different fertilizers and water levels.

CSV file data in sheet format:

Fert  Water  Yield
A     High   27.4
A     High   33.6
A     High   29.8
A     High   35.2
A     High   33
B     High   34.8
B     High   27
B     High   30.2
B     High   30.8
B     High   26.4
A     Low    32
A     Low    32.2
A     Low    26
A     Low    33.4
A     Low    26.4
B     Low    26.8
B     Low    23.2
B     Low    29.4
B     Low    19.4
B     Low    23.8

Code:

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Load the crop yield data (columns: Fert, Water, Yield)
df_anova2 = pd.read_csv("https://raw.githubusercontent.com/Opensourcefordatascience/Data-sets/master/crop_yield.csv")

# Fit a linear model with both factors and their interaction
model = ols('Yield ~ C(Fert)*C(Water)', df_anova2).fit()
print(f"Overall model F({model.df_model: .0f},{model.df_resid: .0f}) = {model.fvalue: .3f}, p = {model.f_pvalue: .4f}")

# Type 2 ANOVA table: one row per factor, the interaction, and the residual
res = sm.stats.anova_lm(model, typ=2)
print(res)

Output:

Overall model F( 3, 16) =  4.112, p =  0.0243

Chi-Square Test:

The chi-square test is used when there are two categorical variables from the same population. It determines whether there is a significant association between the two variables.

For instance, voters in an election poll may be categorised by gender (male or female) and by voting choice (Democrat, Republican, or Independent). To find out whether gender is related to voting choice, we could perform a chi-square test of independence.

Example with code:

Is a liking for shopping related to gender?

CSV file data in sheet format:

Gender  Shopping?
Male    No
Female  Yes
Male    Yes
Female  Yes
Female  Yes
Male    Yes
Male    No
Female  No
Female  No

Code:

import pandas as pd
from scipy import stats
from scipy.stats import chi2

# Build the contingency table of gender against shopping preference
df_chi_test = pd.read_csv("shop.csv")
contingency_table = pd.crosstab(df_chi_test["Gender"], df_chi_test["Shopping?"])
print('contingency_table :-\n', contingency_table)

# Values that are observed
values_observed = contingency_table.values
print("Observed Values :-\n", values_observed)

# Expected counts under independence (fourth item returned by chi2_contingency)
values_expected = stats.chi2_contingency(contingency_table)[3]
print("Expected Values :-\n", values_expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
row_no, column_no = contingency_table.shape
ddof = (row_no - 1) * (column_no - 1)
print("Degree of Freedom:-", ddof)

alpha = 0.05

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells
chi_square_terms = (values_observed - values_expected) ** 2 / values_expected
chi_square_stats = chi_square_terms.sum(axis=0).sum()
print("chi-square statistic:-", chi_square_stats)
critical_value = chi2.ppf(q=1 - alpha, df=ddof)
print('critical_value:', critical_value)

# p-value
p_value = 1 - chi2.cdf(x=chi_square_stats, df=ddof)
print('p-value (level of marginal significance within a statistical hypothesis test) :', p_value)
print('Level of Significance: ', alpha)

# Decision based on the critical value
if chi_square_stats >= critical_value:
    print("Reject H0: the two categorical variables are significantly associated.")
else:
    print("Retain H0: no significant association between the two categorical variables.")

# Decision based on the p-value (equivalent to the rule above)
if p_value <= alpha:
    print("Reject H0: the two categorical variables are significantly associated.")
else:
    print("Retain H0: no significant association between the two categorical variables.")

Output:

contingency_table :-
 Shopping?  No  Yes
Gender
Female      2    3
Male        2    2
Observed Values :-
 [[2 3]
 [2 2]]
Expected Values :-
 [[2.22222222 2.77777778]
 [1.77777778 2.22222222]]
Degree of Freedom:- 1
chi-square statistic:- 0.09000000000000008
critical_value: 3.841458820694124
p-value (level of marginal significance within a statistical hypothesis test) : 0.7641771556220945
Level of Significance:  0.05
Retain H0: no significant association between the two categorical variables.
Retain H0: no significant association between the two categorical variables.
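
The expected values and the chi-square statistic above can be reproduced without SciPy, which makes the arithmetic behind the test easier to follow. The sketch below is a minimal illustration using only the standard library, with the observed counts copied from the contingency table above. The erfc shortcut for the p-value is valid only for 1 degree of freedom, as in this 2 x 2 table, and the variable names are illustrative.

```python
from math import erfc, sqrt

# Observed counts from the contingency table above
# (rows: Female, Male; columns: No, Yes)
observed = [[2, 3], [2, 2]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence: row total * column total / grand total
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# For 1 degree of freedom only, the upper-tail probability of the
# chi-square distribution equals erfc(sqrt(x / 2))
p_value = erfc(sqrt(chi_square / 2))

print("Expected:", expected)
print("chi-square:", round(chi_square, 4), "p-value:", round(p_value, 4))
```

This statistic carries no continuity correction. By default, chi2_contingency applies Yates' correction to 2 x 2 tables, so the statistic and p-value it returns can differ from these values, while the expected counts are unaffected.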