# 20 Important Data Analyst Interview Questions In 2021

## Introduction

By 2022, according to the World Economic Forum, 85% of companies will have adopted Big Data Analytics and related technologies. A direct result is increased demand for analytics-related roles, one of them being the Data Analyst. Understanding the top data analyst interview questions will help you land your dream job.

Some core skills expected of a Data Analyst are:

• Knowledge of database management systems and SQL (Structured Query Language)
• Data visualization and presentation skills
• A good command of analytical tools such as Tableau, Excel, R and Python, with enough programming to apply scientific analytical techniques

Looking to apply for a Data Analyst role? Cracking a Data Analyst job interview requires skills in several areas, and you will be tested on domain knowledge as well as on your analytical skills. How well you have understood the domain you worked in will also weigh significantly with the interviewer, so be thorough with that domain and understand these important data analyst interview questions. The skill sets you will be tested on are statistics, SQL, visualization and Business Intelligence tools such as Excel and Tableau, and data analysis tools that involve programming, such as Python and R.

Let’s get started with the Data Analyst interview questions.

## Q1. What is a Normal Distribution?

Also known as the Gaussian distribution or bell curve, a normal distribution is a symmetric, bell-shaped probability distribution centred on its mean. Its standard deviation describes how the data is spread around that mean: the flatter the curve, the greater the spread, and the more peaked the curve, the smaller the spread around the mean.
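The link between spread and the shape of the curve can be checked numerically. The following is a minimal sketch using Python's standard `statistics.NormalDist`; the mean of 50 and the standard deviations of 5 and 15 are made-up values chosen only for illustration.

```python
from statistics import NormalDist

# Two normal distributions with the same mean but different spreads
# (illustrative values, not taken from any real data set).
narrow = NormalDist(mu=50, sigma=5)
wide = NormalDist(mu=50, sigma=15)

# Height of each curve at the mean: the smaller the spread, the taller the peak.
peak_narrow = narrow.pdf(50)
peak_wide = wide.pdf(50)

# Probability of a value falling within one standard deviation of the mean.
within_one_sigma = narrow.cdf(55) - narrow.cdf(45)

print(f"peak height, sigma=5:  {peak_narrow:.4f}")
print(f"peak height, sigma=15: {peak_wide:.4f}")
print(f"within one sigma:      {within_one_sigma:.3f}")
```

The narrow distribution has the higher peak, and roughly 68% of its values lie within one standard deviation of the mean, which holds for any normal distribution.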

## Q2. What is A/B Testing?

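A/B testing is a statistical hypothesis test for an experiment that compares two variants, A and B, of a single variable. It is an analytical method used to estimate population parameters based on sample statistics, for example to decide whether the difference observed between the two variants is likely to be real or due to chance.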
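One common way to evaluate an A/B test on conversion rates is a two-proportion z-test. The sketch below is one possible approach, not the only one; the function name `two_proportion_z_test` and the visitor counts are hypothetical and exist only for this example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 200 of 5000 visitors convert on A, 260 of 5000 on B.
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value (conventionally below 0.05) suggests that the difference between the two variants is unlikely to be due to chance alone.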

## Q3. Explain the statistical power of sensitivity?

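Sensitivity is a metric that helps in validating the accuracy of a classifier. The classifier can be Logistic Regression, Support Vector Machine, Random Forest and so on.

Sensitivity is the ratio of correctly predicted positive events (true positives) to the total number of actual positive events (true positives plus false negatives). It is also known as recall or the true positive rate.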
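The ratio can be computed directly from a list of actual labels and a list of predicted labels. This is a minimal sketch that assumes binary labels with 1 marking the positive class; the function name `sensitivity` and the sample labels are illustrative.

```python
def sensitivity(actual, predicted):
    """True positive rate: TP / (TP + FN), with 1 marking the positive class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no actual positive events to evaluate")
    return tp / (tp + fn)


# Made-up labels: five actual positives, three of which the classifier found.
actual = [1, 1, 1, 1, 0, 0, 0, 1]
predicted = [1, 0, 1, 1, 0, 1, 0, 0]

print(sensitivity(actual, predicted))
```

Here three of the five actual positives are predicted correctly, so the sensitivity is 3/5. Note that the false positive at the sixth position does not affect sensitivity at all.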

## Q4. Discuss the difference between univariate, bivariate and multivariate analysis.

• Univariate analysis

It is a technique used to analyse and describe a single variable at a time, for example through its mean, median, range or frequency distribution. It does not deal with relationships between variables.

• Bivariate analysis

Bivariate analysis involves the analysis of two variables with the objective of determining an empirical relationship between them.

• Multivariate analysis

Multivariate analysis is the analysis of more than two variables, often including more than one statistical outcome variable, studied together in a single experiment.

## Q5. What is the difference between variance and covariance?

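Variance is a measure of variability within a single feature of a data set. It is a measure of dispersion, not of central tendency, and indicates how far the values spread around the mean. Covariance describes how two random variables vary together: a positive covariance means they tend to move in the same direction, a negative covariance means they move in opposite directions. Its sign shows the direction of the relationship between the two features, while its magnitude depends on their units, which is why correlation is used when a standardized measure is needed.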
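Both measures are easy to compute by hand. The sketch below uses the standard `statistics` module for variance and a small helper, `sample_covariance`, written for this example; the `hours` and `scores` values are invented.

```python
from statistics import mean, variance


def sample_covariance(xs, ys):
    """Sample covariance of two equally long sequences (divides by n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Invented data: hours studied and test scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

var_hours = variance(hours)                  # spread within one feature
cov_hs = sample_covariance(hours, scores)    # how two features move together

print(f"variance of hours: {var_hours}")
print(f"covariance of hours and scores: {cov_hs}")
```

The covariance of a variable with itself equals its variance, which is a quick way to see how the two concepts are related.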

## Q6. Explain the Waterfall chart and its usage.

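A waterfall chart is a form of data visualization that shows the impact of successive positive and negative values on the net outcome using a vertical bar graph. Each bar starts where the previous one ended, which makes it useful for explaining how a starting figure, such as opening revenue, becomes a final figure.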
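The data behind a waterfall chart is a running total. The following sketch computes where each bar would start and end, leaving the actual drawing to a charting tool; the labels and amounts are made up for illustration.

```python
from itertools import accumulate

# Made-up successive changes that lead to a net outcome.
changes = [
    ("Starting revenue", 1000),
    ("New sales", 300),
    ("Refunds", -150),
    ("Discounts", -50),
]

# Running total after each change: this is where each bar ends.
running = list(accumulate(amount for _, amount in changes))
# Each bar starts where the previous bar ended; the first starts at zero.
starts = [0] + running[:-1]

for (label, amount), start, end in zip(changes, starts, running):
    print(f"{label:<18} change {amount:>6}  bar from {start:>5} to {end:>5}")

net_outcome = running[-1]
print(f"Net outcome: {net_outcome}")
```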

## Q7. Explain the use of conditional formatting in Excel, to highlight negative values in a column.

• Select the column to which you wish to apply conditional formatting.
• Select the Conditional Formatting option from the ribbon.
• Select Highlight Cells Rules and click on the Less Than option.
• Input the value 0 to indicate negative numbers.
• Select the colour you want Excel to highlight the data in.

## Q8. Can you make a Pivot Table from multiple tables?

Yes. As long as there is a relationship between the tables, for example a shared key column, you can create a pivot table from multiple tables.

## Q9. Explain Normalization and its various forms.

Normalization is a technique employed in database design to minimize data redundancy in tables and improve data integrity. The stages of normalization are called normal forms. The first three normal forms are:

• First Normal Form – 1NF mandates that each attribute should be in its most atomic form, as far as is practical. For example, an address should be composed of atomic values such as house number and street number, not a composite string containing the full address.
• Second Normal Form – Builds on 1NF and mandates that every non-key attribute is fully dependent on the whole primary key of its table, not on just a part of it.
• Third Normal Form – Builds on 2NF and mandates that there should not be any transitive dependency for non-prime attributes, that is, no non-key attribute should depend on another non-key attribute.
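The idea of removing a transitive dependency can be sketched with plain Python data structures standing in for tables. The table and column names (`flat_orders`, `customer_city` and so on) are hypothetical and chosen only for this example.

```python
# A flat "orders" table. customer_city depends on customer_id, which in turn
# depends on order_id: a transitive dependency that Third Normal Form removes.
flat_orders = [
    {"order_id": 1, "customer_id": 10, "customer_city": "Pune", "product": "Laptop"},
    {"order_id": 2, "customer_id": 10, "customer_city": "Pune", "product": "Mouse"},
    {"order_id": 3, "customer_id": 11, "customer_city": "Delhi", "product": "Laptop"},
]

# Customers table: each city is stored once per customer.
customers = {
    row["customer_id"]: {"customer_city": row["customer_city"]}
    for row in flat_orders
}

# Orders table: keeps only attributes that depend directly on order_id.
orders = [
    {"order_id": row["order_id"],
     "customer_id": row["customer_id"],
     "product": row["product"]}
    for row in flat_orders
]

print(customers)
print(orders)
```

After the split, a change of city for customer 10 is made in one place instead of in every order row, which is the reduction in redundancy that normalization aims for.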

## Q10. Explain the different types of Joins

There are 4 major types of joins: Inner Join, Left Join, Right Join and Full Outer Join.

• Inner Join: Inner Join is the most common type of join. It is used to return the intersection of the tables involved, in other words, the rows where the join condition is true.
• Left Join: Left Join is used to return every row from the left table in the join statement, together with only the matching rows from the right table. Where there is no match, the right-hand columns are NULL.
• Right Join: Right Join is used to return every row from the right table in the join statement, together with only the matching rows from the left table. Where there is no match, the left-hand columns are NULL.
• Full Join: Full Join returns all the records when there is a match in either of the tables. It thus returns all the rows from both the left and right side of the join, with NULLs where one side has no match.
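The four join types can be tried out with Python's built-in `sqlite3` module. This is a sketch under a few assumptions: the `employees` and `departments` tables and their rows are invented, and because older SQLite versions do not support RIGHT and FULL OUTER JOIN, the right join is emulated by swapping the tables in a LEFT JOIN and the full join by a UNION of the two left joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept_id INTEGER);
    CREATE TABLE departments (dept_id INTEGER, dept_name TEXT);
    INSERT INTO employees VALUES ('Asha', 1), ('Ben', 2), ('Chen', NULL);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Finance'), (3, 'Legal');
""")

# Inner join: only rows where the join condition is true.
inner = conn.execute("""
    SELECT e.name, d.dept_name FROM employees e
    INNER JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.name
""").fetchall()

# Left join: every employee, with NULL where no department matches.
left = conn.execute("""
    SELECT e.name, d.dept_name FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.name
""").fetchall()

# Right join, emulated by swapping the tables in a LEFT JOIN.
right = conn.execute("""
    SELECT e.name, d.dept_name FROM departments d
    LEFT JOIN employees e ON e.dept_id = d.dept_id
    ORDER BY d.dept_name
""").fetchall()

# Full outer join, emulated as the UNION of both left joins.
# UNION also removes duplicate rows, which is acceptable for this toy data.
full = conn.execute("""
    SELECT e.name, d.dept_name FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.dept_id
    UNION
    SELECT e.name, d.dept_name FROM departments d
    LEFT JOIN employees e ON e.dept_id = d.dept_id
""").fetchall()

conn.close()

for label, rows in [("inner", inner), ("left", left), ("right", right), ("full", full)]:
    print(label, rows)
```

The employee without a department appears only in the left and full results, and the department without employees appears only in the right and full results. In databases that support them, `RIGHT JOIN` and `FULL OUTER JOIN` can be written directly.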

## Q11. Explain the difference between joining and blending in Tableau

Joining refers to combining data within a single data source or database, such as joining tables under a single schema. Blending refers to bringing together data from two or more different sources.

## Q12. What is the main difference between heat map and treemap?

A heat map is used for comparing different categories with the help of colour and size. A treemap helps you visualize the data in almost the same way, with the addition of highlighting hierarchical data and part-to-whole relationships.

A few other questions are listed below for your review.

Q13. Explain the aggregation and disaggregation of data.

Q14. Explain the creation of stories in Tableau.

Q15. How is code written in basic syntax style in SAS?

Q16. Explain interleaving in SAS.

Q17. Highlight the differences between the Do Index, Do While and Do Until loops.

Q18. In SAS, what is the ANYDIGIT function used for?

Q19. What is the difference between VAR X1 - X3 and VAR X1 -- X3?

Q20. In SAS, what are the trailing @ and @@ used for?

## Conclusion

There are various topics that you need to revise before you get to the interview. Ask yourself the Data Analyst interview questions that you would ask if you had to select a candidate, keeping in mind the skills required for the job.

If you are interested in making a career in the Data Science domain, our 11-month in-person Postgraduate Certificate Diploma in Data Science course can help you immensely in becoming a successful Data Science professional.