3 Techniques for Addressing Data Quality Issues
3.1 Assignment: 20 Marks
3.1.1 Instructions
Organize into groups, with a maximum of 5 members per group.
Each member should have a distinct role and individual contribution toward the final submission.
3.1.2 Task
Identify and explore methods to address common data quality issues (e.g., handling missing values, detecting and managing outliers, and resolving data inconsistencies).
Use a sample dataset to demonstrate these methods in R. Your implementation should include clear R code and explanations of the methodologies used.
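As a starting point for the task above, the following is a minimal sketch in base R showing one possible method per issue type: median imputation for missing values, the 1.5 x IQR rule for outliers, and string normalization for inconsistent category labels. The `survey` data frame, its column names, and its values are invented for illustration and are not part of the assignment. These methods are examples only; groups should choose and justify their own.

```r
# Hypothetical sample data: all names and values here are invented for illustration.
survey <- data.frame(
  id     = 1:8,
  age    = c(23, 31, NA, 45, 29, 38, 27, 210),
  gender = c("Male", "male", "F", "Female ", "M", "female", "MALE", "f"),
  stringsAsFactors = FALSE
)

# 1. Missing values: count NAs per column, then impute age with the median.
na_counts  <- colSums(is.na(survey))
age_median <- median(survey$age, na.rm = TRUE)
survey$age_imputed <- ifelse(is.na(survey$age), age_median, survey$age)

# 2. Outliers: flag values outside 1.5 * IQR of the quartiles.
q     <- unname(quantile(survey$age_imputed, probs = c(0.25, 0.75)))
iqr   <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr
survey$is_outlier <- survey$age_imputed < lower | survey$age_imputed > upper

# 3. Inconsistencies: trim whitespace, lower-case, then map to standard labels.
gender_key <- tolower(trimws(survey$gender))
survey$gender_clean <- ifelse(
  gender_key %in% c("m", "male"), "Male",
  ifelse(gender_key %in% c("f", "female"), "Female", NA_character_)
)

print(na_counts)
print(survey)
```

Flagging outliers in a separate column, rather than deleting rows, keeps the original values available so the report can explain whether each flagged value was removed, corrected, or retained.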
3.1.3 Submission Requirements
1. Report: The group must prepare a comprehensive report that details:
- An overview of data quality issues and the chosen methods.
- The step-by-step process for implementing these methods in R.
- Sample R code with commentary and explanations.
- Insights or findings based on the sample dataset.
2. Presentation:
- Each member must present only the part of the work they personally contributed to.
3.1.4 Evaluation Criteria
Quality and accuracy of the report, including clarity of explanations and code functionality.
Effectiveness and relevance of the techniques chosen for data quality issues.
Depth of individual contributions in both the report and the presentation.
Cohesiveness and clarity of the group presentation.
To earn marks, each member must contribute to both the written report and the presentation.
3.1.5 Deadline
Submit the report and presentation files by November 9, 2024.
Final presentations: November 10-12, 2024