Reading a Symfony Codebase You Did Not Write

Onboarding into a Symfony application someone else built is a skill that nobody teaches and most engineers learn the slow way: open random files, read until they are confused, hope a pattern emerges, give up, ask a colleague. Six weeks later they are productive, and they could not tell you why.

I have onboarded into roughly thirty Symfony applications over the last decade as a consultant. Some of them were two-week engagements where I had to be useful by Wednesday. The pattern of how to read an unfamiliar Symfony codebase has converged. This essay is the route I walk, in the order I walk it, with the questions I ask at each step.

This is not a tutorial on Symfony. It assumes you know the framework. It is about the meta-skill of mapping an application you have never seen, and arriving at the kind of understanding that lets you confidently change the parts that need changing without breaking the parts that do not.

Step 1: never start in `src/`

The instinct is to open src/ and start reading. Resist it. src/ is where the application’s logic lives, but it is not where the application’s shape lives. The shape lives in five files at the project root or near it.

Read these, in this order:

composer.json. What dependencies are loaded? Symfony version, Doctrine version, PHP version, but also the long tail. The presence of api-platform/core tells you the API surface is probably resource-shaped. The presence of oneup/uploader-bundle tells you uploads exist somewhere. The presence of nelmio/api-doc-bundle tells you somebody has cared about API documentation. The presence of three different test frameworks tells you something else.

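config/bundles.php. The bundles list is a more concise summary of “what does this application know how to do” than composer.json. SecurityBundle is obviously registered, but how many firewalls will you find behind it? Which third-party bundles sit alongside the framework’s own? Note that components such as Mailer and Workflow are not bundles and will not appear here; they are switched on under config/packages/. Mailer configured? There is a notification system somewhere. Workflows configured? There are state machines. Skim and note.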

config/services.yaml and the contents of config/packages/. This is where the application’s character lives. Default service configuration, autowiring exceptions, custom container parameters, environment-specific overrides. After ten minutes here you know what the team’s conventions are: are services autowired? Are interfaces aliased to implementations? Are there a lot of compiler passes?

config/routes.yaml and any route configuration in config/routes/. What URL surface does the application expose? Is it MVC controllers, API Platform resources, or both? Are there route prefixes that segment the app (/admin, /api/v1, /webhook)? The route table is the application’s table of contents.

.env and .env.local.dist (or whichever file is committed). What does the application configure from the environment? Database, mail transport, third-party API keys, feature flags. This tells you what external dependencies exist, which is often a more honest answer than composer.json (which lists what could be used) or service configuration (which lists what is wired).
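Three of these files can be skimmed from the shell before you open an editor. The following is a minimal sketch, not a definitive tool: the function names (list_packages, list_bundles, list_env_keys) are hypothetical, and it deliberately uses grep rather than a JSON or PHP parser, so treat the output as a reading aid rather than ground truth.

```shell
# Hypothetical helpers; each takes the project root as its only argument.

# Every vendor/package key in composer.json (require, require-dev, and so on).
# Crude on purpose: pattern matching, not JSON parsing.
list_packages() {
    grep -oE '"[a-z0-9_.-]+/[a-z0-9_.-]+"[[:space:]]*:' "$1/composer.json" \
        | sed -E 's/^"([^"]+)".*/\1/' \
        | sort -u
}

# Short class name of every bundle referenced in config/bundles.php.
list_bundles() {
    grep -oE '[A-Za-z0-9_]+::class' "$1/config/bundles.php" \
        | sed 's/::class$//' \
        | sort -u
}

# Names (not values) of the variables defined in the committed .env file.
list_env_keys() {
    grep -oE '^[A-Z_][A-Z0-9_]*=' "$1/.env" \
        | sed 's/=$//' \
        | sort -u
}
```

Run from the project root as `list_packages .`, `list_bundles .`, and `list_env_keys .`. If Composer is installed, `composer show --direct` reports the same dependency list with resolved versions.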

By the end of these five files, you should be able to answer:

What kind of application is this? (CLI tool, classic web app, API, hybrid?)
What does it talk to? (Database, mail, third-party APIs, queues?)
What is the team’s coding style? (Autowired heavily? Configured explicitly? Lots of compiler passes?)

That is the orientation. Now you can open src/.

Step 2: walk the entrypoints, not the modules

A Symfony application has exactly four kinds of entrypoint: HTTP requests, console commands, message handlers, and event subscribers (which trigger from one of the first three). Everything else is something one of those four called.

Read them in this order:

Controllers and routes. Open src/Controller/. You are not reading the implementations yet; you are listing them. What endpoints exist? What do their names suggest? Cluster them: src/Controller/Api/User/* is one cluster, src/Controller/Admin/* is another. After fifteen minutes, draw the cluster map: “the application has, roughly, five surfaces: public web, customer dashboard, admin panel, public REST API, and webhook receivers.”

Console commands. Open src/Command/. These are the cron jobs, the data imports, the maintenance scripts. The command list often reveals architecture better than the controllers do, because commands are the parts of the application that operations engineers care about. An app:billing:run command tells you billing is async; an app:fix:orphaned-orders command tells you there has been a class of bug worth a permanent fix script.

Message handlers. Open src/MessageHandler/ (or wherever the team places them). These are the asynchronous operations. Each handler is a story: “when X happens, we do Y.” The list of handlers is the list of side effects the application performs out of band. If there are 30 handlers and you are the one being asked to add an integration, you read this directory carefully.

Event subscribers. Open src/EventSubscriber/ and grep for #[AsEventListener]. Subscribers are the framework’s way of saying “this runs as a side effect of something else.” Doctrine lifecycle events, kernel events, custom domain events. Subscribers are where coupling lives that does not show up in stack traces, and they are where surprises live. Read these now so they do not surprise you later.

For each entrypoint, do not read the implementation. Read the constructor, the docblock, and the first three lines of the body. You are looking for: what services does it pull in, what does its name claim it does, what does it actually do at the top level. That is enough.

By the end of this step you should be able to draw an entrypoint map: a list of every way a request can enter the application, what it claims to do, and what its first-order dependencies are.
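The listing pass can be done without opening a single file. This is a sketch under assumptions: the function names are hypothetical, controllers are assumed to live under src/Controller/, and commands are assumed to be declared with the #[AsCommand] attribute (older codebases that set $defaultName will not show up).

```shell
# Hypothetical helpers; each takes the project root as its only argument.

# Count controller files per sub-directory of src/Controller: a rough cluster map.
controller_clusters() {
    (cd "$1/src/Controller" && find . -type f -name '*.php') \
        | sed -E 's#^\./##; s#/?[^/]+$##; s#^$#(root)#' \
        | awk '{ n[$0]++ } END { for (d in n) print n[d], d }' \
        | sort -rn
}

# Command names declared via #[AsCommand('name')] or #[AsCommand(name: 'name')].
command_names() {
    grep -rhoE "AsCommand\((name: *)?['\"][^'\"]+" "$1/src" \
        | sed -E "s/.*['\"]//" \
        | sort -u
}

# Files that run as a side effect of something else: handlers, listeners, subscribers.
side_effect_files() {
    grep -rlE 'AsMessageHandler|AsEventListener|EventSubscriberInterface' "$1/src" | sort
}
```

When the application boots locally, `php bin/console debug:router`, `php bin/console list`, `php bin/console debug:messenger`, and `php bin/console debug:event-dispatcher` give the container’s own view of the same four lists. The grep version is for the first hour, when it may not boot yet.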

Step 3: follow the data, not the call graph

The temptation now is to start reading implementations top-down: pick a controller, follow its calls, read the services it uses, read the services they use. This produces a picture of the call graph, which is interesting but not useful.

What is useful: the data graph. What entities exist, what they look like, what relationships they have, where they live in the application’s lifecycle.

Open src/Entity/. Read the entities in this order:

The entity that is mentioned in the most relationships. This is usually User, Account, Tenant, or Organization. It is the application’s centre of gravity. Read it carefully.
The entity with the most fields. This is usually the application’s most important domain object: Order, Article, Project, depending on the domain. Read it carefully.
The entities that relate to those two. Skim. Note the relationships.
Everything else. Skim. You are looking for “interesting” entities: ones with state machines, ones that look like audit logs, ones that look like outbox patterns.

The entity model is the application’s domain model, even when the application does not call it that. Two hours reading the entities is more orienting than two days reading the controllers, because every controller, command, and handler is operating on these objects.
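Picking the first two entities to read can be approximated with two counts. The sketch below assumes attribute mapping with an explicit targetEntity and the conventional ORM alias; mappings that infer the target from the property type, or that use XML or YAML, will be undercounted. The function names are hypothetical.

```shell
# Hypothetical helpers; each takes the project root as its only argument.

# Entities ranked by how many relationship mappings point at them.
entities_by_inbound_relations() {
    grep -rhoE 'targetEntity: *[A-Za-z0-9_\\]+::class' "$1/src/Entity" \
        | sed -E 's/::class$//; s/.*[ :\\]//' \
        | awk '{ n[$0]++ } END { for (e in n) print n[e], e }' \
        | sort -rn
}

# Entity files ranked by the number of lines carrying an ORM\Column mapping.
entities_by_field_count() {
    grep -rc 'ORM\\Column' "$1/src/Entity" \
        | sort -t: -k2,2nr
}
```

The top line of the first list is a candidate for the centre of gravity; the top line of the second is a candidate for the most important domain object. Both are only candidates: confirm by reading.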

Then open migrations/. Read the most recent five or six. The migration history is the development history. You can see what features were added recently, what got renamed, what was deprecated. A Version20240312_AddTenantToEverything migration tells you the application was made multi-tenant in March 2024, and there will be a corresponding architectural decision somewhere in the codebase.
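A quick way to pull up the recent history, sketched with hypothetical function names and assuming the default layout where migration files sit in migrations/ and are named Version followed by a timestamp, so that sorting by name is sorting by date:

```shell
# The N most recent migration files (default 6), oldest first.
# $1 = project root, $2 = how many to show.
recent_migrations() {
    find "$1/migrations" -type f -name 'Version*.php' \
        | sort \
        | tail -n "${2:-6}"
}

# Which tables a single migration creates, alters, or drops.
# Misses quoted table names and statements built dynamically.
# $1 = path to one migration file.
migration_statements() {
    grep -oE '(CREATE|ALTER|DROP) TABLE [A-Za-z0-9_]+' "$1" | sort -u
}
```

Feeding each file from `recent_migrations .` into `migration_statements` gives a one-screen summary of what the schema has been through lately.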

Step 4: ask the questions that find the load-bearing code

By now you have a map of entrypoints and a map of the data model. The next step is to find the parts of the code that are load-bearing, which are not always the parts that look interesting.

The questions I ask, in roughly this order:

Where is the security? config/packages/security.yaml. Read every firewall, every access control rule, every voter. This is where the application makes the trust decisions, and where mistakes have the highest cost. If there is a custom authenticator, read it; custom authenticators are where security bugs hide.

Where is the money? Greppable terms: payment, invoice, charge, subscription, stripe, paypal. The money path is the highest-stakes code in most applications, and the team usually knows it well, but you should know it too. Read every controller, command, and handler involved.

Where is the multi-tenancy? Greppable: tenant, organization, account_id. The tenancy model is the most likely source of cross-customer data leaks. Read Doctrine filter configurations (typically classes extending Doctrine\ORM\Query\Filter\SQLFilter), read the repositories, look for places where queries are built without going through the tenant filter.

Where are the integrations? src/Bridge/, src/Integration/, src/External/, or wherever the team has put them. Each external service is a place where a) the contract can change without the team noticing, and b) bugs become hard to debug because the data is not in the application’s database. Read the integration adapters; understand what timeouts, retry policies, and error handling they have.

Where is the queue? config/packages/messenger.yaml plus the message handlers. What transports exist? What is async? What is the retry policy? Async work is where bugs hide that take three days to reproduce.

Where are the cron jobs? crontab, or config/packages/scheduler.yaml, or whatever scheduling mechanism is in use. The cron list is the application’s heartbeat. If a cron stops running, what stops working?

These six questions answer “where is this application most likely to break, and where will the breakage hurt the most.” The intersection of those is the load-bearing code.
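The greppable terms above turn into a ranking with very little work. This is a rough sketch with hypothetical function names: hotspots ranks PHP files by how many lines match a set of terms, and filter_bypass_candidates lists the places that commonly sidestep a Doctrine SQL filter (disabling it, native queries, or going straight to the DBAL connection). A hit is a reason to read the file, not evidence of a bug.

```shell
# Rank PHP files by the number of lines matching a case-insensitive pattern.
# $1 = extended regex, $2 = directory to search.
hotspots() {
    find "$2" -type f -name '*.php' -exec grep -cHiE -e "$1" {} + \
        | grep -v ':0$' \
        | sort -t: -k2,2nr
}

# Lines that may run outside the tenant filter. $1 = directory to search.
filter_bypass_candidates() {
    grep -rnE 'getFilters\(\)->disable|createNativeQuery|getConnection\(\)' "$1"
}
```

For example, `hotspots 'payment|invoice|charge|subscription|stripe|paypal' src` answers the money question, and `hotspots 'tenant|organization|account_id' src` answers the tenancy one. Files that appear near the top of both lists are where I would start reading.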

Step 5: read the seams, not the surface

The last step before you start changing code is reading the seams. A seam is a place where the application’s behaviour can be changed without editing the surrounding code: dependency injection points, event subscribers, decorator definitions, compiler passes. Seams are the levers.

Greppable patterns:

```bash
grep -r "AsDecorator" src/                # decorators declared with the #[AsDecorator] attribute
grep -r "decorates:" config/              # the YAML-config equivalent
grep -r "AsEventListener" src/            # event listeners
grep -r "AutowireDecorated" src/          # constructor arguments that receive the decorated (inner) service
grep -r "interface " src/                 # the application's own interfaces
```

Every interface in src/ is a question: what implements it, and is that implementation swappable? The answer is the difference between “I can change behaviour by writing a new class” and “I have to find every caller and change them all.”

A useful pattern: when an interface has only one implementation, ask why the interface exists. Sometimes the answer is “for testing” (legitimate, indicates a seam intended for test doubles). Sometimes it is “we wanted to decouple but never did the second thing.” Sometimes it is leftover from an architecture decision that was abandoned. Each answer tells you something about the codebase’s history.
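Finding the single-implementation interfaces by hand gets tedious past a dozen. A minimal sketch, with a hypothetical function name, that counts implementers per interface: it matches on short names only, so same-named interfaces in different namespaces are conflated, and an implements clause that wraps onto a second line is missed.

```shell
# For each interface declared under a directory, count the files whose
# implements clause names it. Output: "<count> <InterfaceName>", fewest first.
# $1 = directory to search (typically src).
interface_implementations() {
    grep -rhE '^[[:space:]]*interface[[:space:]]+[A-Za-z0-9_]+' "$1" \
        | sed -E 's/^[[:space:]]*interface[[:space:]]+([A-Za-z0-9_]+).*/\1/' \
        | sort -u \
        | while IFS= read -r name; do
              count=$(grep -rlE "implements[^{]*[^A-Za-z0-9_]${name}([^A-Za-z0-9_]|\$)" "$1" \
                  | wc -l | tr -d ' ')
              printf '%s %s\n' "$count" "$name"
          done \
        | sort -n
}
```

Lines starting with 1 are the ones to ask “why does this interface exist?” about. Lines starting with 0 are either implemented only in tests or vendor code outside the searched directory, or dead.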

The decorator pattern, in particular, is worth looking for explicitly. A decorated cache, a decorated mailer, a decorated security voter: each one is a place where the team has bent the framework’s behaviour without editing the framework’s code, and each one is something you need to know about before you start adding behaviour of your own.

Step 6: the diagnostic questions

After you have done all of the above, you should be able to answer these without grepping further. If you cannot, go back to the relevant step.

What kind of application is this?
What are its main entrypoints, grouped into clusters?
What is the centre of the data model?
Where is multi-tenancy enforced (if at all)?
Where is the money path?
What integrations does it have, and what are their failure modes?
What runs asynchronously, and what is the retry policy?
What runs on cron, and what stops working if cron stops?
What seams exist for changing behaviour without editing code?
What is the team’s testing posture, and where are the tests they trust?

Ten answers, ten minutes each to find. That is the orientation budget. Spending it up front saves you the slow accumulation of “I learned this when I broke it” knowledge that costs three months of false starts and avoidable incidents.

What to do with the map

The point of the map is not to admire it. It is to make better decisions about three things:

Where to add behaviour. A new feature that lives close to the centre of the data model, uses an existing seam, and runs on an established cron is cheap. The same feature that requires touching the security configuration, the queue routing, and a poorly tested integration is expensive. The map tells you which one you are looking at before you commit to a delivery date.

Where to refactor. Refactoring load-bearing legacy code is expensive and risky. Refactoring code that is not load-bearing is cheap and safe. The map tells you which is which. The seventh-most-trafficked utility class can be refactored over a weekend without much risk; the order processing handler cannot, and any team that thinks it can has not read the entrypoint map.

What to leave alone. The most important output of the map is the list of code you are not going to touch. Some parts of an inherited Symfony codebase have survived three years of production traffic, two team turnovers, and a major version upgrade. They are weird, but they work. The map tells you when “weird but works” is the correct verdict and when it is the symptom of a problem worth fixing.

A senior engineer’s superpower is not knowing the framework better than the next person. It is reading an unfamiliar codebase fast enough to make a useful change in the second week instead of the second month. Spending a week mapping the codebase before changing anything looks slower at the start of the engagement and is faster by the third week. I have run this experiment thirty times. The teams that do not skip the mapping always win the schedule, even when they look slower in the first sprint.

If you are about to onboard into an unfamiliar Symfony codebase and want a structured second opinion before you commit to a roadmap, our monolith modernisation engagement includes a one-week codebase audit that produces this map for you, with annotated entrypoints, a data-model overview, and a load-bearing-code list scored by criticality.

References

Symfony best practices: the framework’s official conventions, useful for spotting where a codebase agrees with or departs from them.
Doctrine ORM filter documentation: the multi-tenancy enforcement mechanism most Symfony codebases use, and one of the first places to check when reviewing tenant isolation.
Symfony Messenger documentation: reference for the asynchronous component, including transports and retry strategies, often the highest-leverage subsystem in a codebase.
Symfony service decoration: the framework’s official decorator pattern, the most common seam in mature codebases.
Working Effectively With Legacy Code by Michael Feathers: the canonical reference on seams and on changing code you did not write.
