llm-workshop-2025

Exploratory data analysis (EDA) using LLMs

Session objective. Learn to perform exploratory data analysis of tabular data with the help of LLMs. This includes data visualization, interpretation, and statistical summarization.


Session A. Load and clean the data

Step 1. Download the data: Project Management CSV

Step 2. Upload it to ChatGPT

Step 3. See the first 5 rows in the CSV

display the first 5 rows of the raw data

Step 4. Clean the rows

remove empty and unnecessary header cells
display the first 5 rows of the clean data
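
For reference, the cleaning prompts in Step 4 roughly correspond to the sketch below. It is a minimal illustration using only the Python standard library, not the code ChatGPT will necessarily produce. The sample text in `RAW_CSV`, the helper `clean_rows`, and the `header_marker` parameter are all hypothetical; the layout of the real workshop file may differ.

```python
import csv
import io

# Hypothetical raw export: a title row, an empty row, then the real header.
RAW_CSV = """Project Management Data,,,
,,,
Task Name,Start Date,End Date,Progress
Design,2025-01-06,2025-01-10,100
,,,
Build,2025-01-13,2025-01-24,60
Test,2025-01-27,2025-01-31,0
"""

def clean_rows(text, header_marker="Task Name"):
    """Drop everything above the real header row and any fully empty rows."""
    rows = list(csv.reader(io.StringIO(text)))
    # Locate the real header: the first row whose first cell equals the marker.
    start = next(
        i for i, row in enumerate(rows)
        if row and row[0].strip() == header_marker
    )
    header = [cell.strip() for cell in rows[start]]
    records = []
    for row in rows[start + 1:]:
        if not any(cell.strip() for cell in row):
            continue  # skip rows where every cell is empty
        records.append(dict(zip(header, (cell.strip() for cell in row))))
    return records

clean = clean_rows(RAW_CSV)
for record in clean[:5]:  # first 5 rows of the clean data
    print(record)
```

Comparing this output with the table ChatGPT displays is a quick way to check that no valid rows were dropped during cleaning.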

Session B. Perform classical EDA

Classical EDA involves using summary statistics, distributions, and visualizations to understand the structure and quality of data. It helps detect patterns, correlations, and anomalies before applying advanced models.

PDF with Classical EDA Prompts
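
As a rough idea of what a "summary statistics" prompt computes behind the scenes, here is a minimal sketch using Python's standard library. The values in `days_required` and `status` are made up for illustration and are not taken from the workshop dataset; `summarize` is a hypothetical helper.

```python
import statistics
from collections import Counter

# Hypothetical values standing in for one numeric and one categorical column.
days_required = [5, 12, 5, 3, 20, 8, 5]
status = ["Done", "In Progress", "Done", "Not Started", "In Progress", "Done", "Done"]

def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

summary = summarize(days_required)   # numeric column: location and spread
frequencies = Counter(status)        # categorical column: value counts
print(summary)
print(frequencies.most_common())
```

Numeric columns are summarized by location and spread, while categorical columns are summarized by how often each value occurs.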


Session C. Perform advanced EDA

Step 5. Outlier Detection

Outlier detection is the process of finding data points that are significantly different from the majority of the dataset. It helps identify errors, rare events, or unusual behaviors that may impact analysis or decision-making.

detect numeric outliers based on advanced models

(takes time)

why are they outliers?
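
The prompt above asks for advanced models. To make the idea concrete, the sketch below deliberately swaps in a much simpler technique, the interquartile range (IQR) rule, which flags values far outside the middle 50% of the data. The numbers are hypothetical, `iqr_outliers` is an illustrative helper, and the multiplier `k=1.5` is a common convention rather than anything prescribed by the workshop.

```python
import statistics

# Hypothetical "days required" values; 90 is planted as an obvious outlier.
days_required = [4, 5, 6, 5, 7, 6, 5, 90, 4, 6]

def iqr_outliers(values, k=1.5):
    """Return (value, reason) pairs for points outside the IQR fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    flagged = []
    for v in values:
        if v < low:
            flagged.append((v, f"below lower fence {low:.2f}"))
        elif v > high:
            flagged.append((v, f"above upper fence {high:.2f}"))
    return flagged

outliers = iqr_outliers(days_required)
print(outliers)
```

The reason string attached to each flagged value is a small-scale version of the "why are they outliers?" follow-up: every outlier should come with an explanation that can be checked against the data.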

Step 6. Multivariate analysis

Multivariate analysis looks at the relationships between three or more variables at once to uncover complex patterns. It helps in understanding how multiple factors interact together rather than in isolation.

create and display in chat a summary table of the relationship between project name, days required, and progress
how many tasks have been completed? display a table in chat with their respective details
create a timeline chart of all completed tasks
The timeline chart involves the 'Task Name', 'Start Date', 'End Date', and 'Progress' columns.
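
The completed-tasks prompts can be checked by hand with a sketch like the one below. It uses the four column names mentioned above, but the rows themselves are hypothetical. Two assumptions are made: a task counts as completed when 'Progress' equals 100, and the duration is the number of calendar days between 'Start Date' and 'End Date'. The real dataset may encode progress and dates differently.

```python
from datetime import date

# Hypothetical rows using the column names mentioned above.
tasks = [
    {"Task Name": "Design", "Start Date": "2025-01-06", "End Date": "2025-01-10", "Progress": 100},
    {"Task Name": "Build", "Start Date": "2025-01-13", "End Date": "2025-01-24", "Progress": 60},
    {"Task Name": "Test", "Start Date": "2025-01-27", "End Date": "2025-01-31", "Progress": 0},
    {"Task Name": "Kickoff", "Start Date": "2025-01-02", "End Date": "2025-01-03", "Progress": 100},
]

def duration_days(task):
    """Calendar days between the start and end dates (ISO format assumed)."""
    start = date.fromisoformat(task["Start Date"])
    end = date.fromisoformat(task["End Date"])
    return (end - start).days

# Assumption: a task is completed when Progress is 100.
completed = [t for t in tasks if t["Progress"] == 100]
# ISO date strings sort chronologically, which gives the timeline order.
completed.sort(key=lambda t: t["Start Date"])

print(f"{len(completed)} completed tasks")
for t in completed:
    print(f'{t["Task Name"]:<10} {t["Start Date"]} -> {t["End Date"]} ({duration_days(t)} days)')
```

The sorted list is the same information a timeline chart would draw, one bar per task from its start date to its end date.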


Step 7. Sorting

sort in-progress tasks based on the days required
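
A minimal sketch of the sorting prompt follows, again on made-up rows. It assumes that "in progress" means a 'Progress' value strictly between 0 and 100 and that the dataset has a numeric 'Days Required' column; both are assumptions about the data, not facts from the workshop file.

```python
# Hypothetical rows; the column names are assumptions about the dataset.
tasks = [
    {"Task Name": "Build", "Days Required": 11, "Progress": 60},
    {"Task Name": "Docs", "Days Required": 3, "Progress": 20},
    {"Task Name": "Design", "Days Required": 4, "Progress": 100},
    {"Task Name": "Review", "Days Required": 7, "Progress": 45},
    {"Task Name": "Test", "Days Required": 4, "Progress": 0},
]

# Keep tasks that are started but not finished, shortest first.
in_progress = sorted(
    (t for t in tasks if 0 < t["Progress"] < 100),
    key=lambda t: t["Days Required"],
)

for t in in_progress:
    print(f'{t["Task Name"]:<8} {t["Days Required"]:>3} days  {t["Progress"]}%')
```

If ChatGPT's result includes tasks at 0% or 100%, it is worth asking how it defined "in progress".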

(Optional) Session D. Practice Activity

display all the unique values in each column
make a scatterplot and analyze correlation between task name and days required
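
For the first practice prompt, the expected result can be sketched as follows. The rows and the 'Assigned To' and 'Status' columns are hypothetical and serve only to show the shape of the answer.

```python
# Hypothetical rows; the column names are illustrative only.
rows = [
    {"Task Name": "Design", "Assigned To": "Ana", "Status": "Done"},
    {"Task Name": "Build", "Assigned To": "Ben", "Status": "In Progress"},
    {"Task Name": "Test", "Assigned To": "Ana", "Status": "Not Started"},
    {"Task Name": "Docs", "Assigned To": "Ben", "Status": "In Progress"},
]

# Collect the distinct values of each column, sorted for readable output.
unique_values = {
    column: sorted({row[column] for row in rows})
    for column in rows[0]
}

for column, values in unique_values.items():
    print(f"{column}: {values}")
```

Listing unique values is a quick way to spot inconsistent spellings or unexpected categories before any further analysis.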