Description
Exploratory Data Analysis (EDA) is a process for analysing data to identify patterns, relationships, and outliers. It’s an important first step in data analysis because it helps you understand your data and design meaningful statistical analyses. We perform EDA in Python and R to discover value in our data, and we structure our initiatives with the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework.
Goal
EDA is used to explore and learn about data, rather than to confirm hypotheses.
Techniques
EDA uses a variety of techniques, including:
- Bivariate analysis: Examines the relationship between two variables. Techniques include scatter plots, correlation coefficients, cross-tabulation, and line graphs.
- Multivariate analysis: Explores the interactions and dependencies between multiple variables.
- Descriptive statistics: Summarises key features of the data, such as mean, median, mode, and standard deviation.
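The descriptive and bivariate techniques listed above can be illustrated with a minimal Python sketch. It uses only the standard library; the `ad_spend` and `sales` figures and the helper names `describe` and `pearson` are hypothetical and chosen purely for illustration, not taken from any real project.

```python
import statistics

# Hypothetical sample data: monthly advertising spend and resulting sales.
ad_spend = [10.0, 12.0, 15.0, 18.0, 20.0, 25.0]
sales = [110.0, 118.0, 135.0, 150.0, 158.0, 190.0]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


summary = describe(ad_spend)   # descriptive statistics for one variable
r = pearson(ad_spend, sales)   # bivariate relationship between two variables
print(summary)
print(f"correlation: {r:.3f}")
```

In practice a data-frame library would usually do this work in one or two calls; the hand-written version is shown only to make each calculation explicit.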
When performing EDA, we follow the industry process named CRISP-DM, the Cross-Industry Standard Process for Data Mining. As the name suggests, the process applies across industries, regardless of the content or size of the data. We use the CRISP-DM framework to structure all of our projects because of its robustness and versatility.
EDA is a crucial stage of any project, as it helps you understand your data and pick out the valuable elements to store in your data dictionary. When designing and implementing models, the requirements for solution delivery must be clear; otherwise, the desired outcome will not be met. Performing EDA lets you explore the core data values and highlights any data quality issues or potential integration problems.
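As a rough illustration of the kind of data quality check EDA can surface, the sketch below counts missing values per column and exact duplicate rows. The `records` data and the helper names `missing_counts` and `duplicate_count` are hypothetical; real projects would apply such checks to their own sources.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"id": 1, "region": "North", "revenue": 120.0},
    {"id": 2, "region": None, "revenue": 95.5},
    {"id": 3, "region": "South", "revenue": None},
    {"id": 1, "region": "North", "revenue": 120.0},  # exact duplicate of the first row
]


def missing_counts(rows):
    """Count missing (None) values in each column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (1 if value is None else 0)
    return counts


def duplicate_count(rows):
    """Count rows that exactly repeat an earlier row."""
    seen = set()
    duplicates = 0
    for row in rows:
        # Sort by column name so the key does not depend on insertion order.
        key = tuple(sorted(row.items(), key=lambda item: item[0]))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


missing = missing_counts(records)
duplicates = duplicate_count(records)
print(missing)
print(f"duplicate rows: {duplicates}")
```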
We use CRISP-DM for its iterative approach and robust structure. The framework allows continuous improvements to the data pipeline while adding layers of validation to ensure that changes align with the project scope. It also emphasises business understanding and data understanding, which reinforces the project requirements and key deliverables.