1. What do you mean by Data Analytics? Data Analytics is the process of examining, cleaning, transforming, and modeling data to discover useful information, draw conclusions, and support decision-making.
2. Define Data Analytics Framework? A structured approach that provides guidelines, processes, and tools for collecting, processing, analyzing, and interpreting data to extract meaningful insights.
3. Define Tools of data analytics? Software and programming languages used for data analysis such as R, Python, MATLAB, Excel, Tableau, SAS, and SQL.
4. Explain the applications of Data analytics? Business intelligence, marketing analysis, healthcare analytics, financial analysis, social media analytics, fraud detection, and predictive modeling.
5. What do you mean by statistics and probability distribution? Statistics is the science of collecting and analyzing data. Probability distribution describes how values of a random variable are distributed and their likelihood of occurrence.
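As a small illustration of a probability distribution, the sketch below tabulates the binomial distribution for a fair coin tossed four times. The function name `binomial_pmf` and the chosen values of `n` and `p` are assumptions made for this example only.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p (hypothetical helper)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Illustrative setup: 4 tosses of a fair coin.
n, p = 4, 0.5
distribution = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

for k, prob in distribution.items():
    print(f"P(X = {k}) = {prob:.4f}")

# The probabilities of all possible values add up to 1.
total = sum(distribution.values())
print(f"Total probability = {total:.4f}")
```

Each value of the random variable (number of heads) is paired with its likelihood, which is what the definition above describes.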
1. What do you mean by ANOVA? Analysis of Variance - a statistical test used to compare means of three or more groups to determine if they are significantly different.
2. Define One-Way ANOVA? Statistical test comparing means of three or more independent groups based on one factor or independent variable.
3. Define Two-Way ANOVA? Statistical test that examines the influence of two different categorical independent variables on one continuous dependent variable, including the interaction between the two factors.
4. Explain the applications of ANOVA? Quality control, medical research, A/B testing, agricultural experiments, and comparing performance across different groups or treatments.
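5. What do you mean by p-value, F-statistics and F-crit value? The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. The F-statistic is the ratio of between-group variance to within-group variance. The F-crit (critical) value is the threshold from the F-distribution at the chosen significance level; the null hypothesis is rejected when the F-statistic exceeds it.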
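For the one-way ANOVA described above, the F-statistic can be sketched by hand as the between-group mean square divided by the within-group mean square. The function name `one_way_anova_f` and the three small groups are hypothetical, chosen only to keep the arithmetic easy to follow; in practice the p-value and F-crit value would come from statistical software or an F table.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F-statistic, df_between, df_within) for a list of groups.
    Hypothetical helper: a sketch of the textbook calculation."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Illustrative data only: three treatments, three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

If the computed F exceeds the F-crit value for the same degrees of freedom at the chosen significance level, the group means are judged significantly different.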
1. Define Regression? Statistical method for modeling relationships between a dependent variable and one or more independent variables.
2. How to perform Regression analysis on Datasets? Load data, identify dependent/independent variables, fit regression model, evaluate model performance, and interpret coefficients.
3. Explain the application of Regression? Prediction, forecasting, risk assessment, price modeling, sales analysis, and understanding variable relationships.
1. What is Correlation? Statistical measure of the extent to which two variables are linearly related. The correlation coefficient ranges from -1 (perfect negative relationship) through 0 (no linear relationship) to +1 (perfect positive relationship).
2. How to perform Correlation analysis on Datasets? Calculate the correlation coefficient between variables using statistical software (for example, cor() in R), then interpret the strength and direction of the relationship.
3. Explain the application of Correlation? Portfolio management, market research, quality control, feature selection in machine learning, and relationship analysis.
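The correlation calculation described above can be sketched as the Pearson coefficient computed from first principles. The function name `pearson_r` and the sample values are assumptions for illustration, not data from any real study.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences
    (hypothetical helper)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    # Sum of products of deviations from the means.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

# Illustrative data only.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(hours, scores)
print(f"r = {r:.4f}")
```

The sign of `r` gives the direction of the relationship and its magnitude gives the strength; a value near +1 or -1 indicates a strong linear relationship.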
1. What is linear regression? Statistical method that models linear relationship between dependent variable and independent variable(s) using a straight line.
2. How to perform linear regression on Datasets? Use lm() function in R, specify formula (y ~ x), fit model, and analyze results using summary() function.
3. What is the equation of linear regression? y = β₀ + β₁x + ε, where y is dependent variable, x is independent variable, β₀ is intercept, β₁ is slope, and ε is error term.
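To connect the equation above to the fitting steps, here is a minimal sketch of ordinary least squares for one independent variable. It estimates β₀ and β₁ and reports R² as a simple measure of fit. The function names and the data are hypothetical; this is a plain Python illustration of the idea, not a reproduction of what lm() does internally.

```python
from statistics import mean

def fit_simple_linear_regression(x, y):
    """Least-squares estimates (intercept, slope) for y = b0 + b1 * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    my = mean(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_simple_linear_regression(x, y)
r2 = r_squared(x, y, b0, b1)
print(f"y = {b0:.2f} + {b1:.2f} * x, R^2 = {r2:.2f}")
```

The residuals, the differences between each observed y and the fitted line, play the role of the error term ε in the equation.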
1. What is Data Visualization? Graphical representation of data using charts, plots, and graphs to communicate information clearly and efficiently.
2. Define Histogram? Graph of adjacent bars showing the frequency distribution of continuous data, where the data range is divided into bins or intervals and each bar's height is the count of values in that bin.
3. Define Syntax of Bar Chart?
In R: barplot(height, names.arg, main, xlab, ylab, col)
4. Define Pie Chart? Circular chart divided into sectors representing proportions or percentages of different categories.
5. Define Syntax of Histogram?
In R: hist(x, breaks, main, xlab, ylab, col)
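The binning step behind a histogram can be sketched without any plotting library. The function name `histogram_counts` and the data are assumptions for illustration. This sketch uses equal-width, left-closed bins with the maximum value placed in the last bin, which is a simplification and may differ from the default binning of hist() in R.

```python
def histogram_counts(values, num_bins):
    """Split the range of values into equal-width bins and count the
    values falling into each one (hypothetical helper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be identical")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value is assigned to the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts

# Illustrative data only.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
edges, counts = histogram_counts(values, num_bins=4)

# Text rendering: one row per bin, one '#' per observation.
for i, count in enumerate(counts):
    print(f"{edges[i]:4.1f} to {edges[i + 1]:4.1f} | {'#' * count}")
```

Changing `num_bins` plays the same role as the `breaks` argument in the syntax above: fewer bins smooth the shape, more bins show finer detail.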
1. How to perform case study? Define problem, collect relevant data, analyze using appropriate statistical methods, interpret results, and draw conclusions.
2. Explain different parameters used in case study? User engagement metrics, growth rates, demographic data, behavioral patterns, conversion rates, and performance indicators.
3. What is the outcome of the case study? Insights into business performance, user behavior patterns, recommendations for improvement, and data-driven decision support.
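Two of the parameters listed above, conversion rate and growth rate, reduce to simple ratios. The sketch below shows one common way to compute them. The function names and all figures are hypothetical and do not come from any actual case study.

```python
def conversion_rate(conversions, visitors):
    """Fraction of visitors who completed the target action."""
    return conversions / visitors

def growth_rate(previous, current):
    """Relative change from the previous period to the current one."""
    return (current - previous) / previous

# Hypothetical figures for illustration only.
rate = conversion_rate(conversions=50, visitors=2000)
growth = growth_rate(previous=1200, current=1500)

print(f"Conversion rate: {rate:.1%}")
print(f"User growth:     {growth:.1%}")
```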
1. What is 2D Plot? Two-dimensional graphical representation showing relationship between two variables on x and y axes.
2. Define barh plot? Horizontal bar chart where bars extend horizontally from y-axis, useful for categorical data comparison.
3. Define Pie plot? Circular chart showing proportional data as slices of a pie, created using pie() function in MATLAB.
4. Define scatter plot? Plot showing individual data points on x-y coordinate system to visualize correlation between two variables.
5. Define stem plot? Plot displaying discrete data as vertical lines from x-axis to data points, useful for discrete sequences.
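For the pie plot described above, the size of each slice follows directly from the category's share of the total. A minimal sketch of that calculation is given below. The function name `pie_slices` and the category values are assumptions for illustration; a plotting function such as pie() performs the equivalent scaling before drawing.

```python
def pie_slices(data):
    """Map each category to (percentage of total, slice angle in degrees).
    Hypothetical helper for illustration."""
    total = sum(data.values())
    return {
        label: (value * 100 / total, value * 360 / total)
        for label, value in data.items()
    }

# Illustrative data only.
sales = {"A": 30, "B": 50, "C": 20}
slices = pie_slices(sales)

for label, (percent, angle) in slices.items():
    print(f"{label}: {percent:.1f}% of the pie, {angle:.1f} degrees")
```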
1. What is 3D Plot? Three-dimensional graphical representation showing relationships between three variables (x, y, z coordinates).
2. Define contour plot? 2D representation of 3D surface using contour lines connecting points of equal value.
3. Define 3D Line plot? Three-dimensional line connecting data points in 3D space, created using plot3() function.
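4. Define waterfall plot? 3D plot similar to a mesh plot but drawn with lines along one dimension only, so the surface appears as a series of slices showing how the data changes from row to row; created using waterfall() function in MATLAB.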