Plot Pandas DataFrames Using AI Assistants

Tested prompts for plotting pandas DataFrames with AI tools, compared across 5 leading AI models.

Best by judge score: Claude Haiku 4.5 (8/10)

If you have a pandas DataFrame and need to visualize it fast, AI coding assistants can generate the matplotlib, seaborn, or plotly code for you in seconds. Instead of digging through documentation to remember whether to use df.plot(), sns.barplot(), or px.line(), you describe your data and what you want to see, and the AI writes the chart code. That is the core value here: you skip the syntax lookup and usually get runnable code that needs little editing.

The typical situation is this: you have a DataFrame loaded, you know what insight you want to surface, but you are not sure which chart type fits or how to format it correctly. AI tools handle exactly that gap. You paste your column names, describe the relationship you want to show, and get back a full code block ready to run in a Jupyter notebook or Python script.

This page shows you how five major AI models handle a realistic DataFrame plotting prompt, compares their outputs, and gives you the practical context to pick the right tool and write the right prompt every time. Whether you are building a quick exploratory chart or a polished visualization for a report, the workflow is the same.

When to use this

Use an AI tool to plot pandas DataFrames when you need working visualization code quickly and do not want to look up the exact syntax for a specific chart type or library. This fits exploratory data analysis, one-off reporting, and situations where you are switching between matplotlib, seaborn, and plotly and need a reliable starting point without memorizing each API.

  • You have a DataFrame ready and need a bar, line, scatter, or heatmap chart but cannot recall the exact seaborn or plotly syntax
  • You are doing exploratory data analysis in Jupyter and want to iterate through chart types rapidly without stopping to read docs
  • You need to customize a chart with axis labels, color schemes, or annotations and want a complete code scaffold to edit rather than build from scratch
  • You are new to data visualization in Python and want to learn by seeing correct, working examples generated for your specific data
  • You are switching from one plotting library to another and need a quick translation of a chart you already built

When this format breaks down

  • Your DataFrame contains sensitive or proprietary data columns you cannot paste into a third-party AI chat interface due to privacy or compliance rules
  • You need pixel-perfect, brand-specific chart styling that requires deep iteration with your design team, not code generation
  • You are working with very large DataFrames (over a million rows), where the AI cannot advise on rendering performance or downsampling strategies without seeing real profiling data
  • You need interactive dashboards with state management and callbacks, such as a full Dash or Streamlit app, where the complexity goes well beyond a single plot function call

The prompt we tested

You are a Python data visualization assistant that helps users plot pandas DataFrames. Given the user's DataFrame description and visualization goal below, produce ready-to-run Python code using pandas and matplotlib (or seaborn/plotly when clearly better) to create the requested plot.

Respond in under 250 words. Output must include exactly one fenced ```python code block that runs end-to-end assuming the user's DataFrame is named `df`, with clear axis labels, a title, and plt.show() at the end. Keep prose concise and avoid unnecessary data-cleaning steps unless the user's input requires them.

User's DataFrame and plotting request:
I have a pandas DataFrame called df with columns 'date' (datetime), 'product', and 'revenue'. It covers 3 products over 12 months of 2024. I want to visualize monthly revenue trends per product so I can compare which product is growing fastest.

Return: (1) a brief recommendation of the best chart type for this data, (2) a complete, copy-pasteable Python code block including imports and sample df.head() assumptions, (3) short notes explaining key styling choices (labels, title, figure size, color) and one suggested variation.

Model comparison

Model                     Quality   Cost      Latency   Tokens
Claude Opus 4.7           8         $0.0589   10.2s     399 / 705
Claude Haiku 4.5          8         $0.0028   5.8s      273 / 513
GPT-5.4                   7         $0.0057   7.7s      252 / 339
Gemini 2.5 Pro            7         $0.0032   17.9s     261 / 568
Grok 4.1 Fast Reasoning   7         $0.0002   12.7s     399 / 302

Outputs

Claude Opus 4.7 8/10
**Recommendation:** A multi-line chart with one line per product is ideal here—it makes trend direction and relative growth rates easy to compare at a glance.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Assumes df.head() looks like:
#         date   product  revenue
# 0 2024-01-01        A     1200
# 1 2024-01-01        B      950
# 2 2024-01-01        C      780

# Aggregate to monthly revenue per product
df['date'] = pd.to_datetime(df['date'])
monthly = (df.groupby([pd.Grouper(key='date', freq='MS'), 'product'])['revenue']
             .sum()
             .unstack('product'))

# Plot
fig, ax = plt.subplots(figsize=(10, 6))
monthly.plot(ax=ax, marker='o', linewidth=2, colormap='tab10')

ax.set_title('Monthly Revenue Trends by Product (2024)', fontsize=14, fontweight='bold')
ax.set_xlabel('Month')
ax.set_ylabel('Revenue')
ax.legend(title='Product', frameon=False)
ax.grid(True, linestyle='--', alpha=0.5)
fig.autofmt_xdate()
plt.tight_layout()
plt.show()
```

**Styling notes:**
- **Figure size (10×6):** wide enough to space out 12 monthly ticks without crowding.
- **Markers + lines:** markers highlight discrete monthly points while lines convey trend.
- **`tab10` colormap:** categorically distinct colors for 3 products, colorblind-friendlier than defaults.
- **Dashed grid at low alpha:** aids reading values without dominating the chart.
- **Bold title, clean legend (no frame):** keeps focus on the data.

**Variation:** To compare *growth rates* directly rather than absolute levels, normalize each product to its January value (`monthly / monthly.iloc[0] * 100`) and plot as an indexed trend — the fastest grower will have the steepest line.
Judge notes: The output is production-ready and complete, but the prose slightly exceeds necessity and the variation note, while useful, is described but not shown as runnable code, leaving a small gap between promise and delivery.
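
The variation described in the Claude Opus output (indexing each product to its January value) is explained but not shown as code. A minimal sketch of how it might look follows. The sample data is hypothetical and stands in for the user's `df`, and `pivot_table` is used here in place of the `Grouper` aggregation purely for brevity.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data standing in for the user's df (3 products, 12 months).
dates = pd.date_range("2024-01-01", periods=12, freq="MS")
df = pd.DataFrame({
    "date": list(dates) * 3,
    "product": ["A"] * 12 + ["B"] * 12 + ["C"] * 12,
    "revenue": [1000 + 50 * i for i in range(12)]
             + [800 + 80 * i for i in range(12)]
             + [1200 + 20 * i for i in range(12)],
})

# One row per month, one column per product.
monthly = df.pivot_table(index="date", columns="product",
                         values="revenue", aggfunc="sum")

# Index every product to its first month, so January = 100 for all of them.
indexed = monthly / monthly.iloc[0] * 100

fig, ax = plt.subplots(figsize=(10, 6))
indexed.plot(ax=ax, marker="o", linewidth=2)
ax.set_title("Revenue Indexed to January 2024 (January = 100)")
ax.set_xlabel("Month")
ax.set_ylabel("Indexed revenue")
ax.legend(title="Product")
plt.tight_layout()
plt.show()
```

Because every line starts at 100, the product whose line ends highest grew fastest in relative terms, even if its absolute revenue is the smallest.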
Claude Haiku 4.5 8/10
## Recommendation

A **line plot** is ideal here—it clearly shows revenue trends over time and makes growth rates easy to compare across products.

## Code

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample DataFrame structure (adjust if needed)
df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=36, freq='MS'),
    'product': ['A']*12 + ['B']*12 + ['C']*12,
    'revenue': [10000 + i*500 for i in range(12)] + 
               [12000 + i*800 for i in range(12)] + 
               [8000 + i*300 for i in range(12)]
})

# Pivot for easier plotting
pivot_df = df.pivot_table(values='revenue', index='date', columns='product')

# Create line plot
plt.figure(figsize=(10, 6))
for product in pivot_df.columns:
    plt.plot(pivot_df.index, pivot_df[product], marker='o', label=product, linewidth=2)

plt.xlabel('Date', fontsize=11, fontweight='bold')
plt.ylabel('Revenue ($)', fontsize=11, fontweight='bold')
plt.title('Monthly Revenue Trends by Product (2024)', fontsize=13, fontweight='bold')
plt.legend(title='Product', frameon=True, shadow=True)
plt.grid(True, alpha=0.3)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
```

## Key Styling Notes

- **Line markers** ('o') highlight individual months for precision
- **Legend with shadow** makes product identification clear
- **Grid lines** (alpha=0.3) aid value reading without clutter
- **Bold labels/title** improve readability
- **Rotated x-axis** prevents date overlap

## Suggested Variation

Replace the line plot with a **stacked area chart** (`pivot_df.plot(kind='area', stacked=True)`) to emphasize total revenue composition, though this sacrifices easy individual product comparison.
Judge notes: Fully production-ready with clean code, clear structure, and useful variation, but minor nitpick: the sample data construction uses a workaround (repeated lists) rather than a cleaner `np.tile`/`pd.concat` approach, and the legend shadow styling is a trivial flourish that slightly oversells polish.
GPT-5.4 7/10
A multi-line time series chart is the best fit: it shows month-to-month revenue trends clearly and makes growth-rate comparisons across products easy.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Assumes df has columns: 'date' (datetime), 'product', 'revenue'
print(df.head())

# Ensure datetime and aggregate to monthly revenue per product
df['date'] = pd.to_datetime(df['date'])
monthly = (
    df.groupby([pd.Grouper(key='date', freq='MS'), 'product'], as_index=False)['revenue']
      .sum()
)

# Plot
sns.set_style("whitegrid")
plt.figure(figsize=(10, 6))

sns.lineplot(
    data=monthly,
    x='date',
    y='revenue',
    hue='product',
    marker='o',
    linewidth=2.5,
    palette='tab10'
)

plt.title('Monthly Revenue Trends by Product (2024)')
plt.xlabel('Month')
plt.ylabel('Revenue')
plt.xticks(monthly['date'].sort_values().unique(), rotation=45)
plt.legend(title='Product')
plt.tight_layout()
plt.show()
```

Notes:
- Labels and title make the chart immediately interpretable.
- A 10×6 figure keeps 12 monthly points readable without crowding.
- Distinct colors plus markers help compare products and spot growth patterns quickly.
- `whitegrid` improves readability for trend comparisons.

Suggested variation:
- If you want to emphasize overall change from January to December, try a slope chart or normalize each product to January = 100 for relative growth comparison.
Judge notes: The output is solid and mostly production-ready, but the `plt.xticks()` line using `.unique()` on sorted dates may produce inconsistent tick rendering, and the prompt explicitly asked for a sample `df.head()` assumption with actual sample data in the code (not just a print statement), making it not fully copy-pasteable without a pre-existing `df`.
Gemini 2.5 Pro 7/10
A line chart is the best choice here. It excels at showing trends over time for different categories, making it simple to compare the growth trajectories of your products on the same axes.

The code below uses seaborn's `lineplot`, which automatically groups the data by the 'product' column and assigns a unique color to each, creating a clear and comparative visualization.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

# --- Sample DataFrame Creation ---
# This code simulates a DataFrame with your described structure.
dates = pd.to_datetime(pd.date_range(start='2024-01-01', periods=12, freq='MS'))
products = ['Product A', 'Product B', 'Product C']
data = []
for product in products:
    # Simulate different growth trends for visualization
    base_revenue = np.random.randint(800, 1500)
    growth_factor = 1 + np.random.uniform(0.05, 0.2)
    for i, date in enumerate(dates):
        revenue = base_revenue * (growth_factor ** (i/12)) + np.random.uniform(-50, 50)
        data.append([date, product, revenue])
df = pd.DataFrame(data, columns=['date', 'product', 'revenue'])

# --- Plotting Code ---
plt.style.use('seaborn-v0_8-whitegrid')
plt.figure(figsize=(12, 6))

# Create a line plot with a distinct line for each product
sns.lineplot(
    data=df,
    x='date',
    y='revenue',
    hue='product', # Automatically groups and colors by product
    marker='o'
)

# Add title and labels for clarity
plt.title('Monthly Revenue Trends by Product (2024)', fontsize=16)
plt.xlabel('Month', fontsize=12)
plt.ylabel('Revenue', fontsize=12)

plt.tight_layout()
plt.show()
```

**Notes:**
*   **Styling:** The `figsize` provides ample space for 12 months of data. The `hue` parameter is key, as it automatically handles the color-coding and legend. A title and axis labels are included for context.
*   **Variation:** You could create a bar chart grouped by month (`seaborn.barplot`) to compare exact revenue figures more directly for specific months, rather than focusing on the trend.
Judge notes: The output delivers a functional, well-structured response but falls short on two points: the sample DataFrame uses randomized data with non-reproducible results (missing `np.random.seed()`), and the "Notes" section is slightly thin—it doesn't explain the `marker='o'` choice or mention legend placement, which a demanding user would expect.
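
The reproducibility gap the judge mentions can be closed by seeding the random generator. The sketch below assumes the same growth formula as the Gemini output and wraps it in a hypothetical helper, `make_sample_df`; the seed value 42 is arbitrary.

```python
import numpy as np
import pandas as pd

def make_sample_df(seed: int = 42) -> pd.DataFrame:
    """Build a reproducible 3-product, 12-month sample (made-up values)."""
    rng = np.random.default_rng(seed)  # seeded generator, so reruns match
    dates = pd.date_range("2024-01-01", periods=12, freq="MS")
    rows = []
    for product in ["Product A", "Product B", "Product C"]:
        base = rng.integers(800, 1500)
        growth = 1 + rng.uniform(0.05, 0.2)
        for i, date in enumerate(dates):
            revenue = base * growth ** (i / 12) + rng.uniform(-50, 50)
            rows.append({"date": date, "product": product, "revenue": revenue})
    return pd.DataFrame(rows)

df = make_sample_df()
print(df.head())
```

Calling `make_sample_df()` twice with the same seed returns identical frames, so a chart built from it looks the same on every run.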
Grok 4.1 Fast Reasoning 7/10
1. **Best chart**: Seaborn lineplot (hue=product) – ideal for comparing multi-series time trends with smooth lines and markers.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Sample df.head() assumption:
#           date product  revenue
# 0 2024-01-15  ProductA    1000
# 1 2024-01-20  ProductB     800
# ... (datetime, 3 products, 12 months)

df['month'] = df['date'].dt.to_period('M')
monthly_df = df.groupby(['month', 'product'])['revenue'].sum().reset_index()

plt.figure(figsize=(12, 6))
sns.lineplot(data=monthly_df, x='month', y='revenue', hue='product', marker='o', linewidth=2.5)
plt.title('Monthly Revenue Trends by Product (2024)')
plt.xlabel('Month')
plt.ylabel('Total Revenue ($)')
plt.xticks(rotation=45)
plt.legend(title='Product')
plt.tight_layout()
plt.show()
```

3. **Notes**: Figsize=(12,6) for wide view; marker='o', linewidth=2.5 for emphasis; seaborn palette auto-assigns distinct colors (e.g., blue/orange/green); descriptive labels/title. **Variation**: Add `sns.lineplot(..., ci=None)` to disable confidence intervals if data is aggregated totals. (148 words)
Judge notes: The code has a subtle bug: `x='month'` passes Period objects to seaborn, which can cause axis label rendering issues (e.g., showing ordinal integers instead of readable month labels) without converting to string or datetime first, making the code not fully copy-paste reliable.
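
One way to avoid the Period issue the judge describes is to convert the monthly periods back to timestamps before plotting. This is a sketch under the assumption that the data has one row per sale; the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sale-level rows standing in for the user's df.
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-15", "2024-01-20", "2024-02-03", "2024-02-10"]),
    "product": ["ProductA", "ProductB", "ProductA", "ProductB"],
    "revenue": [1000, 800, 1100, 900],
})

# Bucket by month, then turn each Period back into a month-start Timestamp
# so plotting libraries receive an ordinary datetime column.
df["month"] = df["date"].dt.to_period("M").dt.to_timestamp()
monthly_df = df.groupby(["month", "product"], as_index=False)["revenue"].sum()

print(monthly_df.dtypes)
```

`monthly_df` can then be passed to `sns.lineplot(data=monthly_df, x='month', y='revenue', hue='product')` as in the original output.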

What makes these work

  1. Name your columns explicitly

    Always include your actual column names in the prompt, not placeholders like 'column A'. The AI builds function arguments and column references directly from what you provide, so wrong or vague column names produce code that fails on the first run. Paste the output of df.columns.tolist() directly into your prompt for accuracy.

  2. Specify the library upfront

    Matplotlib, seaborn, and plotly have very different APIs and produce different output types. If you say 'make a chart' without specifying, the AI will pick one, and it may not be the library already in your environment. State the library in the first sentence of your prompt to avoid dependency mismatches.

  3. Describe the data shape, not just the goal

    Tell the AI whether your DataFrame is already aggregated or still raw. If you want a bar chart of average sales per region but your DataFrame has one row per transaction, the AI needs to know to add a groupby before plotting. Without that context, the chart code may run but produce a misleading result.

  4. Ask for tight_layout and show in one block

    Request that the AI include plt.tight_layout() and plt.show() at the end of every matplotlib block. These are frequently omitted in generated code. Without tight_layout() labels can overlap, and without show() the chart may never appear when the code runs as a plain Python script rather than in a notebook. Making this a standing instruction in your prompt saves a consistent debugging step.
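
To illustrate the third and fourth points together, here is a minimal sketch of the aggregate-then-plot pattern for the "average sales per region" case. The column names `region` and `sales` and the sample values are assumptions made for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical raw data: one row per transaction, not yet aggregated.
df = pd.DataFrame({
    "region": ["North", "North", "South", "South", "West"],
    "sales": [120.0, 80.0, 200.0, 100.0, 90.0],
})

# Aggregate first: plotting raw rows would draw one bar per transaction.
avg_sales = df.groupby("region")["sales"].mean().sort_values(ascending=False)

fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(avg_sales.index, avg_sales.values)
ax.set_title("Average Sales per Region")
ax.set_xlabel("Region")
ax.set_ylabel("Average sales")
plt.tight_layout()  # keeps labels from overlapping or being clipped
plt.show()          # needed when running as a script
```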

More example scenarios

#01 · E-commerce monthly revenue line chart
Input
I have a pandas DataFrame with columns 'month' (string, Jan through Dec) and 'revenue_usd' (float). I want a line chart showing revenue over time with markers at each data point, a title of 'Monthly Revenue 2024', and the y-axis formatted as dollar amounts. Use matplotlib.
Expected output
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(df['month'], df['revenue_usd'], marker='o')
ax.set_title('Monthly Revenue 2024')
ax.set_xlabel('Month')
ax.set_ylabel('Revenue (USD)')
ax.yaxis.set_major_formatter(mticker.StrMethodFormatter('${x:,.0f}'))
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
#02 · Healthcare patient outcome heatmap
Input
I have a DataFrame where rows are hospital departments (strings) and columns are outcome categories like 'recovered', 'readmitted', 'referred'. Values are counts. I want a seaborn heatmap with annotations showing the actual numbers, a diverging color palette, and clear axis labels.
Expected output
import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 6))
sns.heatmap(df, annot=True, fmt='d', cmap='RdYlGn', linewidths=0.5)
plt.title('Patient Outcomes by Department')
plt.xlabel('Outcome Category')
plt.ylabel('Department')
plt.tight_layout()
plt.show()
#03 · Marketing campaign scatter plot with color grouping
Input
My DataFrame has columns 'spend_usd', 'conversions', and 'channel' (values: email, paid_search, social). I want a plotly scatter plot where each channel is a different color, with hover tooltips showing all three values and a trendline per group.
Expected output
import plotly.express as px

fig = px.scatter(
    df,
    x='spend_usd',
    y='conversions',
    color='channel',
    hover_data=['spend_usd', 'conversions', 'channel'],
    trendline='ols',  # needs the statsmodels package installed
    title='Campaign Spend vs Conversions by Channel'
)
fig.update_layout(xaxis_title='Spend (USD)', yaxis_title='Conversions')
fig.show()
#04 · HR headcount bar chart by department and year
Input
I have a DataFrame with columns 'department', 'year' (2022, 2023, 2024), and 'headcount'. I want a grouped bar chart using seaborn that shows all three years side by side for each department, with a legend and rotated x-axis labels.
Expected output
import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 6))
sns.barplot(data=df, x='department', y='headcount', hue='year', palette='muted')
plt.title('Headcount by Department and Year')
plt.xlabel('Department')
plt.ylabel('Headcount')
plt.xticks(rotation=30, ha='right')
plt.legend(title='Year')
plt.tight_layout()
plt.show()
#05 · Financial returns distribution histogram
Input
I have a DataFrame with a column 'daily_return' containing daily percentage returns for a stock over two years. I want a histogram with 50 bins, a KDE overlay, a vertical line at the mean, and axis labels using matplotlib and seaborn together.
Expected output
import seaborn as sns
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 5))
sns.histplot(df['daily_return'], bins=50, kde=True, ax=ax, color='steelblue')
ax.axvline(df['daily_return'].mean(), color='red', linestyle='--', label=f"Mean: {df['daily_return'].mean():.2f}%")
ax.set_title('Distribution of Daily Returns')
ax.set_xlabel('Daily Return (%)')
ax.set_ylabel('Frequency')
ax.legend()
plt.tight_layout()
plt.show()

Common mistakes to avoid

  • Pasting raw data instead of schema

    Pasting hundreds of rows of actual data into the prompt wastes tokens and often causes the AI to focus on the values instead of the structure. Share your column names, data types, and a two or three row sample using df.dtypes and df.head(3) output instead. You get better code with less noise.

  • Ignoring data type mismatches

    If your date column is stored as a string and you ask for a time-series line chart, the AI may write code that runs but treats the dates as plain text, so they are ordered as strings (alphabetically after a sort or groupby) rather than chronologically. Always mention column data types in your prompt, especially for dates and categorical variables that look like numbers.

  • Accepting the first output without running it

    AI-generated plot code sometimes references a variable name or column that does not match your actual DataFrame. Run the code immediately after receiving it and paste any error back into the chat. Most errors are resolved in one follow-up because the AI can see exactly what went wrong.

  • Skipping chart type reasoning

    Asking for 'a chart' without specifying type results in the AI making an assumption that may not match your analytical goal. A bar chart and a heatmap can both show the same data but answer different questions. Decide what relationship or comparison you want to communicate before prompting, and say it explicitly.

  • Not specifying figure size or output format

    Default figure sizes in matplotlib are often too small for reports or too large for notebooks. If you need a specific output, such as a PNG at 300 DPI for a slide deck, include that in the prompt. The AI will add the correct savefig arguments and figsize parameter rather than leaving you to find them in the docs.
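
Two of the mistakes above, string-typed dates and an unspecified output format, can be handled in a few lines. The sketch below assumes month labels such as 'Oct 2024' stored as text; the sample values and the output path are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: dates stored as strings and listed in alphabetical order.
df = pd.DataFrame({
    "date": ["Feb 2024", "Jan 2024", "Oct 2024"],
    "revenue": [200.0, 100.0, 300.0],
})

# Parse to real datetimes, then sort so the x-axis is chronological.
df["date"] = pd.to_datetime(df["date"], format="%b %Y")
df = df.sort_values("date")

fig, ax = plt.subplots(figsize=(8, 4.5))  # figure size in inches
ax.plot(df["date"], df["revenue"], marker="o")
ax.set_title("Revenue over Time")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue")
fig.autofmt_xdate()
fig.tight_layout()

# Hypothetical output location; 300 DPI is a common choice for slides and print.
out_path = Path(tempfile.gettempdir()) / "revenue_trend.png"
fig.savefig(out_path, dpi=300)
```

Calling `savefig` before any `plt.show()` avoids writing an empty image on backends that clear the figure once the window closes.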

Frequently asked questions

Which AI tool is best for generating pandas DataFrame plot code?

ChatGPT, Claude, Gemini, and GitHub Copilot all handle standard plotting tasks well. For quick one-off charts in a chat interface, ChatGPT and Claude tend to produce clean, complete code blocks. For inline suggestions while writing a script or notebook, Copilot is faster because it sees your existing code and variable names directly.

Can I use AI to convert a matplotlib chart to plotly?

Yes, and this is one of the most practical use cases. Paste your existing matplotlib code and ask the AI to rewrite it using plotly express. The AI handles the API translation and adds interactivity like hover tooltips automatically. You will usually need to adjust layout details manually, but the core translation is accurate.

How do I prompt AI to plot a pandas DataFrame with multiple subplots?

Describe how many subplots you need, what each one shows, and whether they share axes. For example: 'Create a 2x2 grid of subplots using matplotlib. Top left: line chart of revenue by month. Top right: bar chart of units sold by product. Share the x-axis across rows.' Structured prompts like this produce usable subplot code on the first attempt.
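
For reference, a 2x2 grid along those lines might look like the sketch below. The summary tables and the two bottom panels are assumptions added to fill the grid. The sketch shares the x-axis within each column rather than across rows, because the two panels in a row plot different kinds of x values (months versus products).

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical pre-aggregated summaries.
by_month = pd.DataFrame({
    "month": range(1, 7),
    "revenue": [100, 120, 130, 125, 150, 170],
    "units": [10, 12, 13, 12, 15, 16],
})
by_product = pd.DataFrame({
    "product": ["A", "B", "C"],
    "units": [30, 25, 23],
    "revenue": [320, 260, 215],
})

# sharex="col": panels stacked in the same column use the same x-axis.
fig, axes = plt.subplots(2, 2, figsize=(12, 8), sharex="col")

axes[0, 0].plot(by_month["month"], by_month["revenue"], marker="o")
axes[0, 0].set_title("Revenue by month")

axes[0, 1].bar(by_product["product"], by_product["units"])
axes[0, 1].set_title("Units sold by product")

axes[1, 0].plot(by_month["month"], by_month["units"], marker="o")
axes[1, 0].set_title("Units sold by month")
axes[1, 0].set_xlabel("Month")

axes[1, 1].bar(by_product["product"], by_product["revenue"])
axes[1, 1].set_title("Revenue by product")
axes[1, 1].set_xlabel("Product")

fig.tight_layout()
plt.show()
```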

Does the AI need to see my actual data to generate plot code?

No. The AI only needs your column names, their data types, and what you want to visualize. You do not need to paste any rows of actual data. Sharing df.dtypes is enough to convey structure; df.head(3) prints real rows, so for sensitive or proprietary datasets replace those values with made-up ones before pasting.
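
A small helper can produce a schema-only description to paste into a prompt. The function name `describe_schema` and the sample frame are hypothetical. The sketch reports column names, dtypes, and row count, and never includes cell values.

```python
import pandas as pd

def describe_schema(df: pd.DataFrame, name: str = "df") -> str:
    """Summarize column names and dtypes without including any cell values."""
    lines = [f"DataFrame `{name}` with {len(df)} rows and {df.shape[1]} columns:"]
    for column, dtype in df.dtypes.items():
        lines.append(f"- {column!r}: {dtype}")
    return "\n".join(lines)

# Hypothetical frame for demonstration.
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-02-01"]),
    "product": ["A", "B"],
    "revenue": [1200.5, 950.0],
})

print(describe_schema(df))
```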

Can AI help me choose which chart type to use for my DataFrame?

Yes. Describe your columns and the question you are trying to answer, and ask the AI to recommend a chart type with a reason. For example: 'I have sales by region and by month. What chart type best shows both dimensions at once?' The AI will typically suggest a grouped bar chart or heatmap and explain the tradeoff between them.

What if the generated plot code throws a KeyError or AttributeError?

Paste the full error message and the first few lines of your DataFrame output back into the chat. KeyErrors almost always mean a column name in the generated code does not match your actual DataFrame. AttributeErrors usually mean the AI assumed a method exists on a data type that it does not. Both are fast fixes with one follow-up message.
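
Before re-prompting, it can help to confirm which column names the generated code expects but your DataFrame lacks. The sketch below uses a hypothetical helper, `check_columns`, built on the standard library's `difflib` to suggest near matches; the sample column names are made up.

```python
import difflib
import pandas as pd

def check_columns(df: pd.DataFrame, required: list[str]) -> dict[str, list[str]]:
    """Map each missing column name to similarly spelled columns in df."""
    available = [str(c) for c in df.columns]
    problems = {}
    for name in required:
        if name not in available:
            problems[name] = difflib.get_close_matches(name, available, n=3)
    return problems

# Hypothetical frame whose names differ slightly from what the plot code expects.
df = pd.DataFrame({"Date": [], "product": [], "revenue_usd": []})

problems = check_columns(df, ["date", "product", "revenue"])
print(problems)
```

An empty result means every expected column exists, so the error likely comes from somewhere else, such as a dtype mismatch.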