Lab Session 1 Statistics For Data Science
Data Science 1st Sessional Pdf
In this lab session of statistics for data science, we will dive into the statistical concepts you need to start your journey as a data scientist or analyst. The axelnine repository for INFX 573 Statistics in Data Science contains labs and assignments for the class INFX 573 Statistical Foundations of Data Science, including Lab 1 (labs, lab 1, lab 1.pdf).
Class Sessions Data Science Club
The course Statistics for Data Science 1 revolves around data: what it is, its collection, its types, its analysis, and its summarization. We begin with understanding the difference between ...

Statistics is the science of collecting, analyzing, and interpreting data to uncover patterns and make decisions. In data science, it acts as the backbone for understanding data and building reliable models. There are commonly two types of statistics: descriptive and inferential.

Build an intuitive understanding of concepts in statistics: sample, population, correlation, p-value, significance, and others. Be able to write Python code that generates elaborate and beautiful visuals. Make simulations using Python code that showcase various statistical concepts.

The function glimpse is designed to give you an overview of the data set. It gives basic info on the data set and allows a peek at the various columns and the types of variables and values they hold. You should now also see the data in the Environment tab. In RStudio, click on it and see what happens.
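The ideas of sample, population, and correlation mentioned above can be illustrated with a short simulation. The following is a minimal sketch in Python, assuming NumPy is installed; the height and weight variables, the distribution parameters, and the seed are all made up for illustration and do not come from the course material.

```python
import numpy as np

# Fixed seed so the sketch is reproducible; the value 0 is arbitrary.
rng = np.random.default_rng(0)

# Hypothetical "population": 100,000 heights (cm), normal with mean 170 and sd 10.
population = rng.normal(loc=170.0, scale=10.0, size=100_000)

# A sample is a small random subset of the population, drawn without replacement.
sample = rng.choice(population, size=50, replace=False)

population_mean = population.mean()
sample_mean = sample.mean()
sample_sd = sample.std(ddof=1)  # ddof=1 gives the sample standard deviation

# Correlation: a hypothetical second variable that depends linearly on the
# first one, plus random noise.
weights = 0.5 * sample + rng.normal(scale=5.0, size=sample.size)
r = np.corrcoef(sample, weights)[0, 1]

print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
print(f"sample sd       = {sample_sd:.2f}")
print(f"correlation r   = {r:.2f}")
```

The sample mean should land near the population mean without matching it exactly, which is the gap inferential statistics tries to quantify.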
Lab 1 Introduction To Statistics And Data Analysis
Complete this guided project in under 2 hours. This is a hands-on project to give you an overview of how to use statistics in data science.

    import numpy as np
    from scipy import stats

    N = 10
    # Gaussian distributed data with mean = 2 and var = 1
    x = np.random.randn(N) + 2
    # Gaussian distributed data with mean = 0 and var = 1
    y = np.random.randn(N)

    # Sample variances (ddof=1) of each group
    var_x = x.var(ddof=1)
    var_y = y.var(ddof=1)
    # Pooled standard deviation (valid here because both groups have size N)
    sd = np.sqrt((var_x + var_y) / 2)
    print("standard deviation =", sd)

    # Calculating the t statistic
    tval = (x.mean() - y.mean()) / (sd * np.sqrt(2 / N))

    # Degrees of freedom
    dof = 2 * N - 2
    # One-sided tail probability; abs() keeps it valid when tval is negative
    pval = 1 - stats.t.cdf(abs(tval), df=dof)
    print("t = " + str(tval))
    print("p = " + str(2 * pval))

    # Cross-checking using the built-in function from the SciPy package
    tval2, pval2 = stats.ttest_ind(x, y)
    print("t = " + str(tval2))
    print("p = " + str(pval2))

By the end of this session, you'll have a solid understanding of statistics in the context of data science and be able to summarize and interpret data.

The document outlines Lab 1 for Stats 10, focusing on R basics and data analysis.
It includes objectives, pre-lab preparation, collaboration policies, and detailed instructions for using R and RStudio, along with exercises covering vectors, data visualization, and statistical functions.
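To make the p-value and significance ideas behind the two-sample t test more concrete, the sketch below repeats the test many times on data where the two groups do not actually differ. It assumes NumPy and SciPy are installed; the number of experiments, the group size, the seed, and the 0.05 threshold are illustrative choices and are not taken from any of the labs described here.

```python
import numpy as np
from scipy import stats

# Fixed seed for reproducibility; the value 42 is arbitrary.
rng = np.random.default_rng(42)

n_experiments = 2000  # number of simulated studies (illustrative)
n = 10                # observations per group (illustrative)
alpha = 0.05          # conventional significance threshold

# Both groups are drawn from the same N(0, 1) distribution,
# so the null hypothesis of equal means is true in every experiment.
a = rng.standard_normal((n_experiments, n))
b = rng.standard_normal((n_experiments, n))

# One independent two-sample t test per row (axis=1).
t_values, p_values = stats.ttest_ind(a, b, axis=1)

# Fraction of experiments declared "significant" even though nothing differs.
false_positive_rate = np.mean(p_values < alpha)
print(f"share of p-values below {alpha}: {false_positive_rate:.3f}")
```

When the null hypothesis is true, the share of p-values below the threshold should be close to the threshold itself, which is what a significance level of 0.05 means in practice.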