Eight Lessons Learned in Two Years of Ph.D.

When I started my program two years ago, I started a habit that proved to be quite beneficial: I dedicated a page in my research notebook titled Lessons Learned. I made sure to update this page on a semi-regular basis, typically whenever I learned something from my mentors/advisors, reflected on my progress, or came across an “Aha” moment about something that I should have done differently. This blog post elaborates and expands on some of the key entries on that notebook page. As it is almost the end of the summer (sad, I know), I hope this article is timely for the many students who are starting their Ph.D. in the fall. You will find that the first four lessons in this article are high-level, more abstract, and related to the way we should view ourselves as Ph.D. students and our research. The last four lessons include more practical advice that you can adopt on a day-to-day basis.

Prior to beginning, I’d like to express gratitude to Prof. Rada Mihalcea for invaluable feedback and to Aurelia Bunescu and Katsumi Ibaraki for proofreading.

Note: This blog post was also published at the Michigan AI blog.

1. Enjoy the process, not the product

Getting a Ph.D. is a long journey and not only because it is 4-6 years long, but also because it’s hard and this difficulty contributes significantly to its length. In my interview with Wilka Carvalho, a recent Ph.D. graduate from CSE, Wilka described the Ph.D. as the hardest thing he’s ever done. Now I would argue that the difficulty of doing a Ph.D. largely stems from one thing: being so focused on the end product and not enjoying the process of getting there. The end product is usually getting a paper accepted, passing the prelims, or a successful defense. The process is about the long nights spent debugging our code, running an error analysis, and the things we learn along the way. Under this mindset, the Ph.D. is akin to a reinforcement learning environment with extremely sparse rewards (e.g., reward = 1 if paper accepted otherwise 0). This setting is detrimental to our own happiness since the things we do 99.9% of the time are not associated with any reward of any kind.

Here I would argue that it needn’t be this way and that we can redefine our own research environment to provide plenty of intermediate rewards. Under this new mindset, every code bug we fix is a reward, every draft we write is a reward, every new tool or technique we learn is a reward, etc. This is what Thomas Edison meant when he said “I have not failed. I’ve just found 10,000 ways that won’t work”—Edison defined his own environment where failure is a reward in its own right. Not only did that make him succeed in the end, but also most likely made him enjoy his work. During a Ph.D. (or any research career for that matter) where failure is the norm, we must find satisfaction in the small daily grind. Think about it this way: you are in an environment where you are learning new things every day (and getting paid for it!), surrounded by smart and amazing people, enjoying a great amount of intellectual freedom and autonomy (basically you are your own boss), and you are expected to contribute to the advancement of science. What more can you ask for?

There is a reason why this is lesson #1: Enjoying the process is the foundation for a happy and productive Ph.D. experience. I will feel like I did a good job if that is the sole takeaway you learn from this article.

2. Understand your research area very well.

One of the defining traits of researchers who excel in their work is their profound understanding of their research area. Rushing to generate ideas and publish papers might seem tempting, but neglecting foundational understanding will impede long-term progress. The lesson here is to immerse yourself in the literature and read extensively about your research area. In the book The Five Elements of Effective Thinking (one of my favorite books and an absolute must-read for researchers), the first foundational element is to understand things deeply. The book argues that understanding is a spectrum and that a deeper understanding of a given concept can always be achieved. For example, my co-advisor, Honglak Lee, once sat with many of us for two hours straight going through the derivation of the training objective of diffusion models. Honglak appreciates the importance of understanding things at the foundational level, and this appreciation most likely contributed to his successful career. I have to admit that in my first year of Ph.D., I did not allocate sufficient time to understanding my research domain, mostly because I was rushed to come up with a working idea and publish a paper.

I happened to be one of the students who met with Yang Song while he was giving a faculty candidate talk here at the CSE department. Yang has had a successful Ph.D. by many measures. During the meeting, I asked him to provide one piece of advice for first-year Ph.D. students. His advice was to “Read as many papers as you can in your area and to not worry about publishing in your early years.” At that time it did not make much sense, but now in hindsight, I think this was very valuable advice. Reading a few of the recent papers in your domain is hardly sufficient. What Yang was talking about is reading enough papers to cover most of the literature in your area. Needless to say, it is not only about the number of papers you read, although that number can serve as a good indicator of how engaged you are with the literature in your research area.

While this advice is about building mastery in your own research area, it does not imply you need to know only about your own area. There is another end of the spectrum where students become so narrowly focused on their own direction that they are unaware of what is being done in other areas. This is something that I would advise against. There is incredible value in knowing about different, but possibly relevant areas. Many great research ideas emerge from the combination of ideas from different domains. Limiting your knowledge and skills to only one research area will likely deprive you of this opportunity.

3. Think in terms of goals, not ideas.

Early in my research career, I used to run after scattered ideas here and there. When you’re starting in a new area without a full understanding of the challenges or limitations, it is very tempting to run after a sole idea that you think will work. The main issue is that ideas have a very short lifespan; an idea is unlikely to work at first, might not be novel enough, might be easily scooped, or might turn out to be inapplicable when you finally sit down to implement it. John Schulman argues that goals have more longevity than ideas. Pursuing a research goal has at least two benefits: firstly, it prevents you from fixating on a single idea and allows you to keep an open mind about different directions that serve your goal; secondly, it reduces the likelihood of getting frustrated because an idea does not work (and trust me, most of your ideas will not work at first) since an idea is only a means to an end.

Let’s assume that you have an idea to improve the factuality of language models by pairing them with an information retrieval component. You discuss it with your advisor, who approves the direction. You proceed to implement it, and after two weeks of coding, the idea does not seem to work; in fact, it makes the problem worse. If you are thinking in terms of ideas, you will be easily frustrated and might give the idea a few more attempts before finally giving up and moving on to another, possibly unrelated idea, repeating the same process. However, if your focus is on the higher-level goal of making language models more factual, two scenarios can unfold. You might persist in refining the initial idea while remaining open to trying different techniques, all centered around achieving the ultimate goal. Alternatively, armed with the knowledge gained from working on the first idea, you might move on to a different idea aligned with the same goal, with a higher chance of success. On a personal note, I started developing this way of thinking when I began working with Lu Wang, my co-advisor. Whenever I would present some idea to her, Lu would usually map the idea to the high-level end goal behind it. It might take you some time to make this transition in your way of thinking from ideas to goals, but it is certainly a habit you need to cultivate.

4. Value your own research.

With so much significant progress happening around us, especially in AI, it has become easy to underestimate the value of our own research. Statistically speaking, only a very small portion of the papers published every year are considered groundbreaking or “game-changing”. This is the nature of scientific research and is an instance of the Pareto Principle, which states that roughly 80% of the outcomes in an area come from about 20% of the contributors. I’d argue that the tail of the distribution is even longer for AI research.

I think we all feel this way from time to time. With Large Language Models (LLMs) taking over the field and with many NLP tasks now being considered “solved”, it is easy to question the value of our own research. I felt this way in November 2022 when ChatGPT came out and I am sure I was not alone. The lesson here is that despite all these changes happening around us, we should always value our own research. Obviously, this is easier said than done, so here I offer you two points that should make this task easier.

First, realize that each research problem you may work on is important to some group of people, no matter how small this group is. Each paper you put out there will likely be relevant to someone. Even a simple paper describing some experiments, showing some negative results, and providing a discussion on why something does not work is still a valuable contribution to the community.

When a former Ph.D. student of Richard Feynman, one of the most renowned physicists in history, complained that he was working on problems he did not find “worthwhile”, Feynman’s response was “…The worthwhile problems are the ones you can really solve or help solve, the ones you can really contribute something to.” Feynman goes on to talk about himself: “I have worked on innumerable problems that you would call humble, but which I enjoyed and felt very good about because I sometimes could partially succeed… No problem is too small or too trivial if we can really do something about it.” Moreover, this former student clearly felt bad that he was not famous or a known name in the area, to which Feynman responded: “You say you are a nameless man. You are not to your wife and to your child. You will not long remain so to your immediate colleagues if you can answer their simple questions when they come into your office. You are not nameless to me. Do not remain nameless to yourself.”

The second point I want you to consider is that you ought to work on the research you feel is important. While it is important to work in a direction that your advisor supports or one that is “timely” given the current research progress in the community, you also want to work on problems not only because they are easy, but because you feel there is actual value in solving them. I understand there is a risk involved in working on ambitious projects, especially during a Ph.D. when publication matters. In lesson #8, I will discuss a favorite strategy of mine to minimize this risk.

5. Use a Knowledge Management System.

As a Ph.D. student, you are constantly being bombarded with new information. You read papers or articles, you learn a new coding tool, you attend a talk or tutorial, etc. Keeping all this in your head is simply not going to work. Retaining such information successfully requires regular reinforcement. The setup that I have been developing over the last two years comprises mainly two components: a citation management system (CMS) and a research notebook. I personally use Zotero for the former and Obsidian for the latter. Using Zotero or any other CMS is fairly straightforward. I also recommend having a browser extension so you can quickly add any open PDF to your library.

My research notebook consists of two essential parts. The first is a research journal (per project), where I write down on a daily basis what I did on that day and the results I got. I also use it to manage my to-do list for each day. The other part is a bit more involved: an idea management system known as the Zettelkasten or the slip-box method. There are multiple articles and books on Zettelkasten so I will not go into detail here. Briefly, the main purpose of the Zettelkasten method is the effective organization of ideas in a way that facilitates idea generation. This method has a bit of a learning curve and requires a level of discipline to commit to. However, it can be very fruitful once it is integrated into your research and thinking process. I would recommend reading a book called How to Take Smart Notes by Sönke Ahrens, which was my main entry to this method.

6. Know your codebase very well.

In the realm of computational research, code serves as the foundation of experiments and results. In the words of Andrej Karpathy, code is truth. Understanding your code thoroughly is crucial for avoiding unnecessary bugs and ensuring the reproducibility of your results. Nowadays, it is very rare that you will need to implement things from scratch; there are many open-source tools and libraries that make things easier for us. However, that does not justify spending less time understanding how these tools work under the hood.

In my first year, I built my codebase around an existing repository that was not very well-documented. I did not give myself enough time to understand how that code worked, and as a result, debugging and implementing new ideas took up a large amount of my time. This happened because I did not have a high-level understanding of that base repository. The lesson here is to study your implementation thoroughly (even if it’s not your code) to avoid wasting time or, worse, publishing research that is not reproducible.

7. Talk about your work.

Early in my Ph.D., I did not particularly enjoy talking about my work, partially because I felt I was still early in my path and I was worried that my ideas would sound too simple or naive. This was a mistake. You should always talk about your work and, even more importantly, your work in progress. Discussing your work can bring at least three benefits. First, like writing, it forces you to flesh out your ideas more, spot flaws in your thinking process, and refine your ideas. Second, it makes you a better communicator. As you talk to more people about your work, you start to learn how to explain concepts, what examples to cite to support your argument, and what analogies to use. This in turn will reflect on both your writing and presentation skills, making you a better researcher overall. Lastly and as a bonus, the feedback you get may be very helpful. In fact, I got the idea for my first conference paper as a Ph.D. student during a discussion with a well-known researcher in my area.

When I mention “talking with people,” I don’t necessarily mean individuals who are experts in your field. In fact, engaging in conversations with people from diverse backgrounds can provide entirely novel perspectives. Inspiration does not abide by rules and it can easily emerge from conversations with people who know nothing about your field.

8. Lead more than one project at a time.

Now I realize this advice may be a bit controversial. I asked many of my collaborators and my friends in the department about this strategy and I have to say that the majority favored leading only one project at a time. After trying both strategies, I am absolutely in favor of leading more than one project, precisely two. I first encountered this idea when I was talking to Lili Mou, one of the best mentors I have ever met. Lili argued that you should pick two projects: one project is ambitious, with a high-risk, high-reward nature, and another project is easier with a “low-hanging fruit” flavor. If your ambitious project does not work out, you can still publish your other project and make progress toward your Ph.D. If it works out, you can publish two projects, one of which is a strong contribution to an important problem. While working on a single project can indeed generate more focus on that single project, the downsides are many. First, it is easy to feel down when you are stuck since that project is all you have going on. Second, you can spend (or rather waste) a lot of time waiting for experiments to finish running because you have nothing else to work on. When working on two projects, if you are stuck on one project, you move to the other; there is always something to do and you do not feel bored at all.

Now I will share with you two tricks that I learned in my second year that will make working on two projects super fun and super easy. The first trick is to work on two projects that are fairly related. This way, you can minimize implementation time by reusing code from one project in the other. That is exactly what I did last year: I was even able to repurpose a full piece of training code in one project for evaluation in the other project. The second trick is one I got from Cal Newport, which is to work on only one project per day. I achieved this by assigning each day in my week to a project while taking meeting times into consideration. This serves to minimize the cost of context-switching between the two projects. Ultimately, we are all different people and what works best for me may not work best for you. However, I recommend you try this for at least a semester and see how well it works for you.

Conclusion:

This article offered the main insights I gained over the first two years of my Ph.D. I hope you will find them useful, regardless of whether you’re still beginning your journey or preparing to graduate. While I have talked to people who told me that their Ph.D. was a terrible time, I have talked to many others who believe that their Ph.D. years were the best time of their life. Ultimately, I think it is about both your attitude towards the process as mentioned in Lesson #1, and your day-to-day habits.