How to write and defend your machine learning project or thesis
The structure that academic committees and technical teams expect — and that nobody teaches you explicitly.
The problem statement section is the one that gets rewritten the most in real projects. Not because it is the most technically difficult, but because at the start of the project the problem is not clear — and that lack of clarity persists until someone reads it from the outside and asks "but what exactly are you solving?"
An ML project has components that do not exist in classical software engineering or traditional science. It has data decisions that determine what the model can and cannot learn. It has metric justifications that depend on the domain, not the technique. It has limitation analyses that are not optional but part of the result. No generic thesis template captures all of this.
This article lays out a structure that works for applied ML projects in both academic and business contexts. It also includes a guide for the oral defense, because writing and defending are two distinct skills that require different preparation.
Why writing an ML project is different from writing software or traditional science
An ML project is hybrid: it has empirical rigor (like science), it produces executable artifacts (like software engineering), and its results depend on data that changes (unlike both). That hybrid nature means none of the standard structures apply directly.
In software engineering, the product is the code and the documentation describes how it works. In ML, the code is only the vehicle — the product is the model and the decisions that produced it. In classical experimental science, there is a falsifiable hypothesis and an experiment designed to test it. In applied ML, the goal is to solve a practical problem, and modeling decisions are justified by their effect on that goal, not by a prior theoretical framework.
This means that the documentation of an ML project must answer questions that neither an engineering thesis nor a scientific paper is designed to answer: why this dataset and not another? Why this model and not the alternatives? What are the consequences of each type of error in the real context? What can this model not do that might reasonably be expected of it?
The recommended structure for an applied ML project
Section 1: Problem statement and objective
This section answers three questions: what the real problem is, why ML is the correct tool for this specific problem, and what decision or action the results enable. The most frequent mistake is starting with the model — "I used Random Forest to predict X" — instead of the problem: "the business needs to identify which customers have a high probability of churning before it happens, in order to intervene with limited resources."
The problem statement section is the one that gets rewritten the most because at the start of the project the problem is generally not fully clear. In projects with real stakeholders, it is common for the first version of this section to change substantially after the first alignment meetings. Documenting that evolution — how the problem was redefined as the data was better understood — is also part of the work.
Section 2: Data — origin, preprocessing, and decisions made
This is the most underestimated section and the one that can cause the most damage if done poorly. What must be documented: the data source and its limitations, the cleaning process with every non-obvious decision justified, which features were built and why, and how the split between training, validation, and test sets was done.
The mistake that destroys credibility fastest is not documenting the split process or mentioning cross-validation without justifying it. If the committee asks "how do you guarantee there is no data leakage?" and the answer is vague, the work loses credibility regardless of how good the metrics are.
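If the answer needs to be concrete, it helps to show in the document exactly how the split and the preprocessing were ordered. Below is a minimal sketch on a synthetic dataset: the test set is separated before anything is fit, and the scaler lives inside a scikit-learn Pipeline so it can only learn from training data.

```python
# Leakage-safe setup: the test set is split off before anything is fit,
# and the scaler sits inside the Pipeline so it only ever learns from
# training data. Dataset and hyperparameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", RandomForestClassifier(random_state=42)),
])
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # the test set is touched exactly once
```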
A detail that makes a difference: documenting the exact versions of the libraries used. A model trained with scikit-learn 1.2 may not reproduce its results in an environment running scikit-learn 1.4. That is the kind of error that appears months after deployment and that nobody can trace if it is not documented.
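The cheapest version of this is committing the output of `pip freeze > requirements.txt` with the code. A slightly more targeted sketch, saving the key versions next to the model artifact (the filename and the package list here are illustrative):

```python
# Save the exact library versions next to the model artifact so the
# training environment can be reconstructed later. Filename and package
# list are illustrative: extend them to whatever your project uses.
import json
import platform

import numpy
import sklearn

versions = {
    "python": platform.python_version(),
    "numpy": numpy.__version__,
    "scikit-learn": sklearn.__version__,
}
with open("model_versions.json", "w") as f:
    json.dump(versions, f, indent=2)
```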
Section 3: Methodology — the selected model and why
It is not enough to say "I used Random Forest." You must justify why Random Forest over the reasonable alternatives for this problem. That means naming at least two or three alternatives considered and briefly explaining why they were ruled out: greater complexity without justified improvement, an interpretability requirement the model does not meet, or simply that the simplest baseline worked well.
This justification does not need to be exhaustive — it is not a literature review. It needs to show that the choice was deliberate, not arbitrary. An academic committee that asks "why didn't you use a neural network?" expects to hear a reason, not silence.
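One compact way to make that deliberation visible is to evaluate the candidates under the same cross-validation and report all of them, with the variance and not just the average. A sketch on a synthetic dataset, with illustrative candidates:

```python
# Same cross-validation for a trivial baseline, a simple model, and the
# candidate. Model selection uses the training split only, so the test
# set stays untouched. Candidates, metric, and dataset are illustrative.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, _, y_train, _ = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

candidates = {
    "stratified_dummy": DummyClassifier(strategy="stratified", random_state=42),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=42),
}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, X_train, y_train, cv=5, scoring="f1")
    print(f"{name}: F1 = {scores.mean():.3f} ± {scores.std():.3f}")
```

If the simplest model is not clearly beaten in this comparison, that is itself a result worth reporting, not hiding.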
Section 4: Results and analysis
This section has two parts that are frequently confused: the metrics report and the analysis of what they mean. The report without analysis is a table. The analysis without a report is speculation. Both together are the result.
What is mandatory: comparison with a baseline, metrics with variance (not just the cross-validation average), and at least one analysis of the model's errors — not just how many it gets right but where it fails and what that implies for real use.
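A minimal version of that error analysis, on a synthetic dataset for illustration: the confusion matrix shows where the errors concentrate, and keeping the misclassified rows makes manual inspection possible.

```python
# Error analysis sketch: dataset and model are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
y_pred = model.predict(X_test)

print(confusion_matrix(y_test, y_pred))       # which errors, in which direction
print(classification_report(y_test, y_pred))  # per-class precision and recall

misclassified = X_test[y_pred != y_test]      # rows worth inspecting by hand
print(f"{len(misclassified)} of {len(X_test)} test examples misclassified")
```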
What is frequently omitted but adds real value: subgroup analysis. A model can have excellent global F1 and be systematically worse on the most critical segment of the problem. That does not appear in the global number — it only appears when you look for it explicitly.
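A sketch of that subgroup check, reusing the same synthetic setup; the segment labels are random here, but in a real project they come from the data itself (region, customer tenure, device type):

```python
# Per-segment F1 sketch. Everything here is synthetic and illustrative.
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
y_pred = RandomForestClassifier(random_state=42).fit(X_train, y_train).predict(X_test)

# Random segment labels stand in for a real grouping column.
segments = np.random.default_rng(0).choice(["segment_a", "segment_b"], size=len(y_test))

results = pd.DataFrame({"y_true": y_test, "y_pred": y_pred, "segment": segments})
for segment, group in results.groupby("segment"):
    score = f1_score(group["y_true"], group["y_pred"])
    print(f"{segment}: F1 = {score:.3f} (n = {len(group)})")
```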
Section 5: Limitations and future work
Well-documented limitations strengthen the work, they do not weaken it. A project that honestly admits "the model struggles with data from periods of high volatility because it had no examples of that scenario in training" demonstrates understanding of the problem. A project that presents the model as if it had no limitations raises suspicion.
The three types of limitations that must always appear: data limitations (what biases or gaps exist in the dataset), model limitations (what types of errors it makes systematically and why), and generalization limitations (in what contexts or populations the model might not perform the same way). Future work must flow directly from these limitations, not be a generic list of possible improvements.
How to prepare the oral defense
What academic committees ask
Academic committees have recurring questions that are worth preparing for in advance. The most frequent: "what if you had had more data?" There is no single correct answer; it depends on why the project has the data it has and what specific limitation more data would address. But anyone who has not thought about it before entering the room usually gives a vague answer, and that hurts the evaluation.
Other questions that appear consistently: why that metric was chosen and not another, how you guarantee there is no data leakage, what would happen if the model is used in a context slightly different from the training context, and whether the results are statistically significant or could be noise. Having prepared answers to these four does not guarantee a perfect defense, but it covers most of the attack vectors of an average committee.
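For the noise question in particular, one answer that is easy to prepare and defend is a bootstrap confidence interval on the test metric: resample the test set with replacement many times and report the spread, not just the point estimate. A sketch on the same synthetic setup as above:

```python
# Bootstrap CI for the test metric. Same synthetic setup as earlier
# sketches; 1000 resamples is a common but arbitrary choice.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
y_pred = RandomForestClassifier(random_state=42).fit(X_train, y_train).predict(X_test)

rng = np.random.default_rng(42)
boot_scores = []
for _ in range(1000):
    idx = rng.integers(0, len(y_test), size=len(y_test))  # resample with replacement
    boot_scores.append(f1_score(y_test[idx], y_pred[idx]))
low, high = np.percentile(boot_scores, [2.5, 97.5])

print(f"F1 on the test set: {f1_score(y_test, y_pred):.3f}")
print(f"95% bootstrap CI:   [{low:.3f}, {high:.3f}]")
```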
How to respond when you don't know the answer: say it directly and propose how you would find out. "I don't know for certain, but I would evaluate it by doing X" demonstrates methodological maturity. Trying to improvise an incorrect technical answer in an oral defense is the most costly mistake you can make.
What technical teams in companies ask
The key difference from an academic defense is the focus: technical teams in companies evaluate operational utility, not methodological rigor. The most frequent questions in technical interviews about personal projects: what business problem the model solves, what would happen if input data arrived in a different format than expected, and how the operations team would know the model is failing.
These three questions are not technical in the algorithmic sense; they are systems questions. Anyone who can answer them fluently demonstrates that they think of the project as a complete product, not as a notebook exercise.
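For illustration, this is what a minimal answer to the second question can look like: validating incoming records against the schema the model expects before scoring them. The schema here is hypothetical.

```python
# Minimal input validation: check incoming records against the schema the
# model was trained on before scoring them. The schema is hypothetical;
# replace it with your model's actual features and types.
EXPECTED_SCHEMA = {"age": float, "tenure_months": float, "plan": str}

def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is usable."""
    problems = []
    for field, expected_type in EXPECTED_SCHEMA.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    return problems

print(validate_record({"age": 34.0, "plan": "basic"}))
# -> ['missing field: tenure_months']
```

The same check starts answering the third question: a counter of validation failures is one of the simplest signals the operations team can watch to know that something upstream has changed.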
Template structure for an applied ML project or thesis
The structure that covers all the requirements described in this article, adaptable to academic and business projects:
- 1. Problem statement: real problem · why ML · what decision the result enables
- 2. Data: source · cleaning process · feature decisions · documented split · library versions
- 3. Methodology: chosen model · discarded alternatives and why · training process
- 4. Results: comparison with baseline · metrics with variance · error analysis · subgroup analysis if applicable
- 5. Limitations: data · model · generalization · future work derived from real limitations
If you need the template in editable format for your specific project or thesis, I can send it to you directly: contact.
Frequently asked questions about writing and defending ML projects
How many pages should a master's ML thesis be?
There is no universal standard. A well-structured applied project can be solid in 60-80 pages. What matters is not the length but that each section justifies its presence. A long thesis with empty sections is worse than a short one with every section fully developed.
Do I need to include the full code in the thesis?
No. The full code goes in a GitHub repository linked from the thesis. In the written document, snippets are included only when they illustrate a specific technical decision. What must be in the thesis: library versions and enough detail for the experiment to be reproducible.
Can I defend a project with mediocre results?
Yes. What is evaluated is whether the methodological process is rigorous and whether the author understands why they got the results they got. A project with mediocre but well-analyzed results is usually evaluated better than one with good results but no ability to defend them.
What is the difference between a thesis project and a portfolio project?
The audience and the purpose. A thesis evaluates methodological rigor. A portfolio project communicates the ability to solve real problems to recruiters. The same work can be adapted to both formats, but requires significant rewriting of how the problem and results are presented.
If you have a finished or in-progress project and need help structuring it, reviewing it before the defense, or adapting it for a technical interview, I can work on it with you directly.