Details
ISBN: 9798400704239 (print)
Data Science (DS) has emerged as a new academic discipline in which students are introduced to data-centric thinking and to generating data-driven insights through programming. Unlike traditional introductory Computer Science (CS) education, which focuses on program syntax and core CS topics (e.g., algorithms and data structures), introductory DS education emphasizes skills such as analyzing data to gain insights by making effective use of programming libraries (e.g., re, NumPy, pandas, scikit-learn). To better understand learners' needs and pain points when they are introduced to DS programming, we investigated a large online course on data manipulation designed for graduate students who do not have a CS or Statistics undergraduate degree. We qualitatively analyzed students' incorrect code submissions for computational notebook-based assignments in Python. We identified common mistakes and grouped them into the following themes: (1) programming language and environment misconceptions, (2) logical mistakes due to misunderstanding the data or problem statement or to incorrectly handling missing values, (3) semantic mistakes due to incorrect use of DS libraries, and (4) suboptimal coding. Our work offers instructors insights into student needs in introductory DS courses and ways to improve course pedagogy, along with recommendations for developing assessment and feedback tools to support students in large courses.