๐Ÿ† Kaggle Competition โ€” March Machine Learning Mania 2026

Kaggle Submission Guide

Follow these steps to upload your predictions to the Kaggle leaderboard. This is completely separate from the CMU MMML competition: different file, different format, different website.

Warning: This is NOT the CMU submission.

The CMU Submit page handles the internal Carnegie Mellon MMML competition, which uses two separate files (MNCAATourneyPredictions.csv and WNCAATourneyPredictions.csv) in a different format (WTeamID, LTeamID). This page covers the public Kaggle competition, which requires one combined file (KaggleSubmission.csv) in ID, Pred format, submitted directly on kaggle.com.

At a Glance

Competition: March Machine Learning Mania 2026
Platform: kaggle.com (not the CMU site)
File to submit: KaggleSubmission.csv (combined Men's + Women's)
Required rows: 132,133 rows + 1 header row
Columns: ID, Pred
Scoring metric: Brier score (mean squared error of probabilities)
Submissions per day: 5 maximum; choose which one counts
Score before tournaments: 0.0 (correct format confirmed; real scoring starts when games begin)

Step 1 โ€” Understand the File Format

What Kaggle expects

Kaggle requires a single CSV file covering every possible matchup among Division I men's teams and among Division I women's teams, not just the teams selected for the NCAA tournament. This lets you submit before Selection Sunday without knowing the bracket.

The file has exactly two columns:

# Format of KaggleSubmission.csv
ID,Pred
2026_1101_1102,0.5
2026_1101_1103,0.5
2026_1101_1104,0.5
...
ID column
  • Format: YEAR_TeamID1_TeamID2
  • Example: 2026_1101_1102
  • TeamID1 is always the lower number
  • Men's and women's TeamIDs do not overlap; both are in this one file
Pred column
  • A probability between 0 and 1
  • Probability that TeamID1 (lower ID) wins
  • Example: 0.75 = 75% chance team 1101 beats team 1102
  • Scored with Brier score (MSE): closer to the real result is better
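The ID and Pred rules above can be sketched in a few lines of Python. This is only an illustration of the convention, not the pipeline's own code: the function name `submission_row` and its parameters are hypothetical. It shows that when a model's probability is expressed for the higher-ID team, both the ID order and the probability have to be flipped.

```python
def submission_row(season: int, team_a: int, team_b: int, p_a_wins: float) -> tuple[str, float]:
    """Return (ID, Pred) in the Kaggle convention: lower TeamID first,
    Pred = probability that the lower-ID team wins.

    Hypothetical helper for illustration; not part of the project pipeline.
    """
    if team_a == team_b:
        raise ValueError("a team cannot play itself")
    if not 0.0 <= p_a_wins <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if team_a < team_b:
        # team_a is already the lower ID, so its win probability is Pred
        return f"{season}_{team_a}_{team_b}", p_a_wins
    # team_b is the lower ID, so Pred is the complement of team_a's probability
    return f"{season}_{team_b}_{team_a}", 1.0 - p_a_wins


# Both calls describe the same matchup and give the same row.
print(submission_row(2026, 1101, 1102, 0.75))
print(submission_row(2026, 1102, 1101, 0.25))
```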
Why all possible matchups? Kaggle changed the format so you don't need to wait for Selection Sunday to submit. By predicting every possible pairing, any bracket that forms will already have a prediction in your file. Your 2026 submission will show 0.0 on the leaderboard until the tournaments begin; that is normal and expected.

Scoring โ€” Brier Score

Kaggle evaluates your submission using the Brier score, which is equivalent to mean squared error between your predicted probability and the actual binary outcome (1 = lower-ID team won, 0 = higher-ID team won).

# Brier score formula (lower is better, 0.0 is perfect)
Brier = mean( (Pred - Outcome)² )

# Example: you predict 0.7 and the lower-ID team wins (Outcome = 1)
error = (0.7 - 1.0)² = 0.09   # small error, good prediction

# Example: you predict 0.7 but the lower-ID team LOSES (Outcome = 0)
error = (0.7 - 0.0)² = 0.49   # large error, bad prediction

# Baseline: always predicting 0.5 gives Brier = 0.25, whatever the outcomes
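For readers who want to reproduce these numbers, here is a minimal Python sketch of the same formula. The function name `brier_score` is an illustrative choice, and this is not Kaggle's scoring code; it only applies the mean-squared-error definition given above.

```python
def brier_score(preds: list[float], outcomes: list[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not preds or len(preds) != len(outcomes):
        raise ValueError("preds and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(preds, outcomes)) / len(preds)


# The two worked examples above (values are approximate because of floating point)
print(brier_score([0.7], [1]))  # about 0.09
print(brier_score([0.7], [0]))  # about 0.49

# Always predicting 0.5 scores 0.25 regardless of who wins
print(brier_score([0.5, 0.5, 0.5], [1, 0, 1]))
```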

Step 2 โ€” Download the Submission File

The combined Men's + Women's submission file is pre-generated by the pipeline. It already contains all 132,133 required matchup predictions in the correct Kaggle format.

๐Ÿ† Kaggle Submission (Men's + Women's combined)

KaggleSubmission.csv 132,133 rows ยท All Division I men's & women's team pairs
Columns: ID, Pred
ID format: 2026_TeamID1_TeamID2 (lower ID first)
Pred: probability lower-ID team wins
Generated by python -m src.predict --combine. Covers every C(N,2) pairing of men's teams and every C(N,2) pairing of women's teams in a single file.
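The "every C(N,2) pairing" idea can be sketched as follows. This is an illustration, not the code behind `src.predict`: the helper `all_matchup_ids` is hypothetical, and the team counts 365 and 363 are an assumption chosen only because they reproduce the stated total of 132,133; the actual numbers of men's and women's teams are not given on this page.

```python
from itertools import combinations
from math import comb


def all_matchup_ids(season: int, team_ids: list[int]) -> list[str]:
    """Enumerate every unordered pair of teams as YEAR_LowID_HighID."""
    ordered = sorted(set(team_ids))
    # combinations() over a sorted list always yields the lower ID first
    return [f"{season}_{a}_{b}" for a, b in combinations(ordered, 2)]


# Three teams give C(3,2) = 3 rows
print(all_matchup_ids(2026, [1103, 1101, 1102]))

# ASSUMED team counts (not from the source): 365 men's and 363 women's teams
# would give exactly the required number of rows.
print(comb(365, 2) + comb(363, 2))  # 132133
```

Men's and women's IDs would be enumerated separately and concatenated, since the file contains no men's-versus-women's pairings.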
Download KaggleSubmission.csv
Need to regenerate this file? If you have retrained the model with newer data, regenerate the Kaggle submission by running:
python -m src.predict --data-dir data/raw --combine
Then run python scripts/export_site_data.py to update the download link on this page.

Step 3 โ€” Verify the File

Before uploading, confirm these requirements

Kaggle will reject your submission if these conditions are not met. Check each one before uploading.

  • [Required] File name ends in .csv. Kaggle also accepts .zip, .gz, or .7z archives of the CSV, but plain .csv is simplest.
  • [Required] Exactly 132,133 data rows (plus one header row). Kaggle explicitly requires this count. The file has both men's and women's matchups.
  • [Required] Header row is exactly ID,Pred. Column names are case-sensitive. No extra spaces.
  • [Required] ID format: 2026_TeamID1_TeamID2 with TeamID1 < TeamID2 (lower number always first). Example: 2026_1101_1102.
  • [Required] Pred values are between 0 and 1. These are probabilities, not scores or rankings. Extreme values like 0.000001 or 0.999999 are allowed.
  • [Required] No missing values. Every row must have both an ID and a Pred. No blank cells, no NaN.
# Quick command-line check (run from repo root; adjust the path if your copy of the file is saved elsewhere):
python -c "
import pandas as pd
df = pd.read_csv('predictions/submission.csv')
print('Rows:', len(df))                                    # should be 132133
print('Columns:', list(df.columns))                        # should be ['ID', 'Pred']
print('Any nulls:', df.isnull().any().any())               # should be False
print('Pred range:', df.Pred.min(), 'to', df.Pred.max())   # should be within [0,1]
print('Sample IDs:', df.ID.head(3).tolist())               # should start with 2026_
"
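The quick check prints values for you to eyeball. A stricter check that tests each item in the checklist could look like the sketch below. It uses only the Python standard library; the function name `validate_submission`, its parameters, and the wording of the messages are illustrative assumptions, and passing it does not guarantee that Kaggle's own validator will accept the file.

```python
import csv
import math
import re

ID_PATTERN = re.compile(r"^(\d{4})_(\d+)_(\d+)$")


def validate_submission(path: str, expected_rows: int = 132_133, season: str = "2026") -> list[str]:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    seen = set()
    n_rows = 0
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header != ["ID", "Pred"]:
            problems.append(f"header is {header!r}, expected ['ID', 'Pred']")
        for line_no, row in enumerate(reader, start=2):
            n_rows += 1
            if len(row) != 2 or not row[0] or not row[1]:
                problems.append(f"line {line_no}: missing or extra cell")
                continue
            match = ID_PATTERN.match(row[0])
            if not match or match.group(1) != season:
                problems.append(f"line {line_no}: malformed ID {row[0]!r}")
            elif int(match.group(2)) >= int(match.group(3)):
                problems.append(f"line {line_no}: lower TeamID must come first")
            if row[0] in seen:
                problems.append(f"line {line_no}: duplicate ID {row[0]!r}")
            seen.add(row[0])
            try:
                pred = float(row[1])
            except ValueError:
                problems.append(f"line {line_no}: Pred {row[1]!r} is not a number")
                continue
            if math.isnan(pred) or not 0.0 <= pred <= 1.0:
                problems.append(f"line {line_no}: Pred {row[1]!r} is outside [0, 1]")
    if n_rows != expected_rows:
        problems.append(f"found {n_rows} data rows, expected {expected_rows}")
    return problems
```

Calling `validate_submission("predictions/submission.csv")` from the repo root and printing the result would list any failed checks one per entry.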

Step 4 โ€” Upload to Kaggle

How to upload your submission

You must be logged in to Kaggle and have accepted the competition rules before you can submit. Follow these steps exactly.

1. Open the March Machine Learning Mania 2026 competition page on kaggle.com.
2. Log in to your Kaggle account. If you don't have one, create a free account at kaggle.com.
3. Click "Join Competition" and accept the competition rules if you have not already done so. You cannot submit without accepting the rules.
4. Click the "Submit Predictions" tab (or the blue "Submit" button on the competition page).
5. Click "Browse Files" or drag-and-drop KaggleSubmission.csv into the upload area.
6. (Optional but recommended) Enter a short Submission Description so you can identify this version later (e.g., "XGBoost ensemble v3, retrained Mar 16"). Max 500 characters.
7. Click "Make Submission". Kaggle will validate the file format. If valid, it will appear in your submission list with a score of 0.0 (this is correct; real scoring starts when the tournaments begin).
What the Kaggle upload form looks like:
Competition: March Machine Learning Mania 2026
Remaining submissions today: 5 (resets every 24 h)
Upload area: "No file chosen. Drag and drop .csv / .zip / .gz / .7z"
Description box: 0 / 500 characters
Expected rows: 132,133 rows + header (Kaggle will reject anything else)
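The steps above use the web form. As an alternative, the official Kaggle command-line tool has a `kaggle competitions submit` command, and the sketch below wraps it from Python. It assumes the `kaggle` package is installed and an API token is configured, neither of which this page covers. The competition slug is deliberately left as a parameter because it is not stated here; copy it from the competition URL. The function names are illustrative.

```python
import subprocess

MAX_DESCRIPTION_CHARS = 500  # limit shown on the upload form


def build_submit_command(csv_path: str, description: str, competition: str) -> list[str]:
    """Build the argument list for `kaggle competitions submit`."""
    if len(description) > MAX_DESCRIPTION_CHARS:
        raise ValueError(f"description exceeds {MAX_DESCRIPTION_CHARS} characters")
    return [
        "kaggle", "competitions", "submit",
        "-c", competition,   # competition slug (assumption: taken from the URL)
        "-f", csv_path,      # file to upload
        "-m", description,   # submission description
    ]


def submit(csv_path: str, description: str, competition: str) -> None:
    """Run the upload; raises CalledProcessError if the CLI reports failure."""
    subprocess.run(build_submit_command(csv_path, description, competition), check=True)


# Inspect the command without running it (slug shown as a placeholder)
print(build_submit_command("KaggleSubmission.csv", "XGBoost ensemble v3", "<competition-slug>"))
```

Each upload made this way still counts against the daily limit, and you still have to accept the competition rules on the website first.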

Step 5 โ€” Select Which Submission Counts

You must manually select your final submission

Kaggle will not necessarily pick the submission you want. You have up to 5 uploads per day, but only one counts toward the final leaderboard. If you do nothing, Kaggle falls back on its own automatic selection, which may not be the entry you intended; do not rely on it.

Critical: Manual selection required.

After uploading, go to the "My Submissions" tab and click the checkbox next to the submission you want to use as your official entry. Only the submission you explicitly mark will count toward scoring once the tournament begins.

  • [Important] Click the "My Submissions" tab on the competition page after uploading.
  • [Important] Find the submission you want to count and check the box (or click "Use for final score") next to it.
  • [Optional] You can re-submit and re-select any time before the tournaments start. After tipoff, selections lock.

Step 6 โ€” Submission Limit & Strategy

5 submissions per day

Kaggle enforces a limit of 5 submissions per 24-hour window. Each used slot becomes available again at the same time of day the following day, so plan your uploads accordingly.

Good strategy
  • Submit the current file once to confirm the format is accepted
  • Save remaining submissions for after retraining or tuning
  • Add a description to each submission so you can compare them
  • Always manually select your best submission before tipoff
โŒ Avoid these
  • Submitting the same file repeatedly โ€” wastes your daily limit
  • Relying on Kaggle's automatic selection โ€” always pick manually
  • Submitting CMU-format files here โ€” they will be rejected (wrong columns)
  • Forgetting to select your submission before the tournament starts

Kaggle vs CMU: Quick Reference

These two competitions are independent. Use the right file for the right platform.

| Property | Kaggle (this page) | CMU MMML (submit.html) |
|---|---|---|
| Website | kaggle.com | cs.cmu.edu/~reids/mmml/ |
| Files | 1 combined file | 2 separate files (M + W) |
| File name(s) | KaggleSubmission.csv | MNCAATourneyPredictions.csv, WNCAATourneyPredictions.csv |
| Column format | ID, Pred | WTeamID, LTeamID |
| Row count | 132,133 rows | 72,010 rows (M) + 71,253 rows (W) |
| What each row represents | Win probability for lower-ID team | Predicted winner and loser of a matchup |
| Scoring method | Brier score (MSE of probabilities) | Bracket points (standard NCAA scoring) |
| Submission limit | 5 per day | Once (via team captain before deadline) |
| Who submits | Each team member directly on Kaggle | Team captain submits on behalf of the group |
๐Ÿ† Submit on Kaggle ๐Ÿ“ฅ Download KaggleSubmission.csv ๐ŸŽ“ Go to CMU Submit Page