How to Effectively Read and Analyze Research Papers: A Practical Guide
Mastering Academic Literature with a Structured Approach and Hands-On Example
Research papers are a cornerstone of academic and scientific progress, providing insights into new findings, methodologies, and theoretical advances. However, the sheer volume of papers and their often complex nature can make reading and understanding them a daunting task for researchers. This article outlines a structured approach to reading research papers, drawing inspiration from S. Keshav's "How to Read a Paper" and extending it into a general framework. We then apply this framework to a practical example using the "DeepSeekMath" paper by Shao et al., to demonstrate how key insights can be efficiently extracted.
Keshav's Three-Pass Approach: A Foundation for Paper Reading
In his seminal work, "How to Read a Paper," S. Keshav, a professor at the University of Waterloo, proposes a three-pass method to help researchers systematically read and understand research papers. This disciplined approach is particularly useful for managing time and understanding papers at different levels of depth, depending on the reader's goals. The following is an overview of Keshav's method:
First Pass: Getting the Big Picture (5-10 minutes)
Objective: Gain a high-level understanding of the paper without delving into the details.
Steps:
Read the title, abstract, and introduction.
Skim the section headings and conclusions.
Look at the figures, tables, and references.
Outcome: After this review, you should be able to answer basic questions such as: What is the problem being addressed? What are the major contributions? Is this work relevant to my research?
Decision Point: Decide whether to proceed further based on relevance and interest.
Second Pass: Comprehending the Content (Up to 1 hour)
Objective: Understand the paper's content and arguments in more detail.
Steps:
Read the paper with greater care, focusing on its main body while setting aside complicated details such as proofs.
Look closely at figures, graphs, and tables to understand the evidence.
Note key ideas, contributions, and supporting arguments.
Outcome: You should be able to summarize the paper's main thrust and supporting evidence to someone else. This level of understanding is sufficient for papers outside your immediate research area.
Next Steps: If the paper remains unclear, you may need to find background material or move on to the third pass.
Third Pass: Deep Understanding (4-5 hours for beginners, ~1 hour for experts)
Objective: To fully understand the paper, including its assumptions, methodology, and limitations.
Steps:
Virtually reimplement the paper by reconstructing the authors' reasoning and assumptions.
Challenge each claim, identify implicit assumptions, and evaluate the strength of the evidence.
Consider how you would present the ideas differently.
Outcome: You should be able to reconstruct the structure of the paper, identify its strengths and weaknesses, and spot potential improvements or future work. This pass is essential for reviewers or researchers who are deeply involved in the topic.
Keshav's approach is particularly valuable because it allows flexibility. For example, a quick first pass can help filter out irrelevant papers, while a thorough third pass is reserved for critical analysis, such as when reviewing or building upon a study. This method also helps in conducting literature reviews by iteratively identifying key papers and researchers in a field.
A General Approach to Reading Research Papers
While Keshav's method provides a solid foundation, a general approach to reading research papers can be adapted to different disciplines and purposes. Below is a comprehensive framework that builds on Keshav's ideas and incorporates additional considerations for effective analysis:
Preparation: Define Your Purpose and Context
Why are you reading this paper? Are you doing a literature review, looking for a solution to a specific problem, or reviewing it for a journal? Your purpose will determine the depth of your reading.
What is your background knowledge? Assess your familiarity with the topic and identify gaps that may require additional reading.
Tools and Resources: Have a note-taking system in place (digital tools such as Zotero or Notion; I use LogSeq) along with access to references and background materials.
Skimming: Evaluate Relevance and Structure (First Pass)
Key Sections: Read the title, abstract, introduction, and conclusion to understand the paper's purpose, contributions, and findings.
Visual Cues: Examine figures, tables, and section headings to get a sense of the paper's organization and evidence.
Questions to answer:
What problem does the paper address?
What are the main findings or contributions?
How does this relate to my interests or research?
Outcome: Decide whether to proceed based on relevance and quality.
Detailed Reading: Understand the Content (Second Pass)
Focus Areas: Carefully read the introduction, methods, results, and discussion sections, paying attention to the problem statement, methodology, and key findings.
Evidence Analysis: Study the figures, tables, and data to evaluate the strength of the arguments.
Critical Thinking: Note any assumptions, limitations, or gaps in the reasoning. Identify the novelty of the paper compared to previous studies.
Outcome: Summarize the paper's main ideas and evidence in your own words. This step is sufficient for general understanding or background knowledge.
Critical Analysis: Evaluate and Synthesize (Third Pass)
Deep Dive: Reconstruct the authors' argument, question every assumption, and verify the methodology and results.
Evaluation: Assess the paper's strengths (e.g., novelty, rigor) and weaknesses (e.g., unaddressed limitations, methodological flaws).
Synthesis: Compare the paper with related work, consider alternative approaches, and identify implications or future research directions.
Outcome: Gain a comprehensive understanding of the paper so that you can critique it thoroughly or incorporate its findings into your own research.
Documentation: Record Notes and Reflect
Notes: Summarize key points, including the problem, contributions, methods, results, and your critiques. Use a consistent format for easy reference.
Citations: Note references for further research and to track the paper's place in the literature.
Reflection: Consider how the paper informs your research or practice. Identify actionable findings or questions for future investigation.
This general approach is flexible and can be adjusted to different disciplines (e.g., experimental sciences vs. theoretical studies) and goals (e.g., quick review vs. in-depth analysis). It emphasizes preparation, critical thinking, and documentation to ensure that the reading process is both efficient and productive.
Practical Example: Applying the Approach to "DeepSeekMath"
To illustrate this framework, we apply it to the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" by Zhihong Shao et al. The paper presents a 7B-parameter language model optimized for mathematical reasoning that achieves impressive performance on benchmarks such as MATH and GSM8K. Below, we walk through each step of the general approach, culminating in a Python simulation of the paper's data selection pipeline to better understand its methodology.
Preparation: Define Your Purpose and Context
Purpose: Understand how DeepSeekMath improves mathematical reasoning in language models and explore its relevance to AI research in mathematics.
Background: Familiarity with large language models (LLMs) and reinforcement learning (RL) is assumed, but specific techniques such as Group Relative Policy Optimization (GRPO) may require further study.
Tools: Use a digital notebook to take notes and a reference manager to track citations.
Skimming: Assessing Relevance and Structure (First Pass)
Key Sections:
Title: "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" suggests a focus on advancing mathematical reasoning in open-source LLMs.
Abstract: Highlights the model's 51.7% score on the MATH benchmark, its pre-training on 120B math tokens from Common Crawl, and its use of GRPO, suggesting significant contributions to mathematical reasoning.
Introduction: Discusses the challenge of mathematical reasoning for LLMs and positions DeepSeekMath as a competitive open-source alternative to models such as GPT-4.
Conclusion: Emphasizes the model's strengths, limitations (e.g., weaker geometry reasoning), and future directions.
Visual Cues:
Figure 1 shows the model's performance compared with other open-source models.
Tables (e.g., Table 5) provide benchmark results.
Questions Answered:
Problem: Improving mathematical reasoning in LLMs.
Contributions: A 7B model pre-trained on a large mathematical corpus and improved with GRPO, achieving near state-of-the-art performance.
Relevance: Highly relevant to AI researchers interested in mathematical reasoning and open-source models.
Conclusions: The paper is worth a closer look due to its novel contributions and relevance.
Detailed Reading: Understanding the Content (Second Pass)
Focus Areas:
Introduction: Outlines the challenge of mathematical reasoning and the gap between open source and proprietary models.
Methods: Describes the creation of the DeepSeekMath corpus (120B tokens from Common Crawl), pre-training on DeepSeek-Coder-Base-v1.5, instruction tuning, and the GRPO algorithm.
Results: DeepSeekMath-Base achieves 64.2% on GSM8K and 36.2% on MATH, while DeepSeekMath-RL achieves 88.2% and 51.7%, respectively (Table 5).
Discussion: Highlights the effectiveness of web data and GRPO, with limitations in geometry and few-shot learning.
Evidence Analysis:
Table 2 compares DeepSeekMath-Base with other base models, showing superior performance despite its smaller size.
Table 5 shows DeepSeekMath-RL's advantage over open-source and some closed-source models.
Critical Thinking:
The use of Common Crawl data is innovative, but the quality control process (e.g., fastText classifier) needs to be scrutinized.
GRPO's efficiency gains over PPO are promising, but the algorithm's generalizability beyond mathematics is unclear.
Results: DeepSeekMath advances mathematical reasoning through large-scale pre-training and efficient RL, outperforming larger open-source models and approaching proprietary models.
Critical Analysis: Evaluate and Synthesize (Third Pass)
Deep Dive:
Pre-training: The iterative data collection pipeline (Section 2.1), which combines fastText classification with human annotation, is rigorous but could introduce bias if the seed corpus (OpenWebMath) is not representative.
GRPO: The algorithm (Section 4.1) eliminates the critic model, reducing memory usage, but relies on group sampling, which may not scale well for diverse tasks.
Evaluation: Benchmarks such as MATH and GSM8K are standard, but the lack of geometry-focused tests limits evaluation of the model's breadth.
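The group-relative baseline that lets GRPO drop the critic model can be sketched in a few lines. This is a simplified illustration of the advantage computation only (not the full training loop): sample a group of outputs for one prompt, score them, and normalize each reward against the group's mean and standard deviation.

```python
# Sketch of GRPO's group-relative advantage: instead of a learned
# value (critic) model, the baseline is the mean reward of a group of
# G sampled outputs for the same prompt; each output's advantage is
# its reward normalized within that group.
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its group's mean and std."""
    mu = mean(rewards)
    # Guard the degenerate single-sample case, where stdev is undefined.
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: binary rewards for G = 4 sampled solutions to one problem.
adv = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
# Outputs above the group mean get positive advantage, those below negative.
```

Because the baseline comes from sampling rather than a second network, memory use drops, but the estimate's quality depends on the group size and reward diversity, which is the scaling concern noted above.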
Evaluation:
Strengths: High quality data curation, efficient RL with GRPO, and strong performance in quantitative reasoning.
Weaknesses: Limited geometry reasoning, weaker few-shot capabilities compared to GPT-4, and potential data selection bias.
Synthesis:
Compared to previous work (e.g., Minerva, Llemma), DeepSeekMath demonstrates that smaller models with targeted pre-training can compete with larger models.
Alternative approaches might include hybrid datasets (e.g., combining web data with arXiv) or advanced sampling strategies for RL.
Future directions could include scaling the model, improving geometry reasoning, and improving few-shot learning.
Outcome: A thorough understanding of DeepSeekMath's innovations and limitations, with clear implications for advancing mathematical reasoning in LLMs.
Simulation: To fully understand the data selection pipeline (Section 2.1), I implemented a Python simulation that mirrors the process of filtering mathematical content from a mock Common Crawl dataset. The simulation uses a TfidfVectorizer and LogisticRegression to classify pages, iteratively refine the dataset, and decontaminate it against benchmark phrases. This hands-on approach validates the paper's methodology and highlights practical challenges, such as classifier sensitivity to the size of the seed data.
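A minimal sketch of such a pipeline follows. The mock documents, the 0.3 threshold, and helper names like select_math_pages are illustrative stand-ins, not the paper's actual fastText setup, which trains on far larger seed corpora:

```python
# Sketch of a data-selection pipeline in the spirit of DeepSeekMath
# Section 2.1: train a classifier on a small seed corpus, score a pool
# of candidate pages, iteratively grow the positive seed set, then
# decontaminate against benchmark phrases. Mock data throughout.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Mock seed corpus: math-positive vs. general-web-negative examples.
seed_math = [
    "solve the quadratic equation x^2 + 3x + 2 = 0",
    "the integral of sin(x) dx equals -cos(x) + C",
    "prove that the sum of two even numbers is even",
    "compute the derivative of the function f(x) = x^3 - 2x",
]
seed_general = [
    "top ten travel destinations for summer vacations",
    "celebrity gossip and entertainment headlines this week",
    "best recipes for homemade pizza dough",
    "latest smartphone reviews and price comparisons",
]

# Mock Common Crawl pool; one page overlaps a benchmark-style phrase.
crawl_pool = [
    "find all real roots of the polynomial x^4 - 5x^2 + 4",
    "breaking celebrity gossip headlines for today",
    "theorem: the square of an odd number is odd",
    "janet's ducks lay 16 eggs per day",
]
benchmark_phrases = ["janet's ducks lay 16 eggs"]

THRESHOLD = 0.3  # illustrative classification threshold

def select_math_pages(pool, pos, neg, rounds=2):
    """Score the pool and iteratively add accepted pages to the seed."""
    selected = []
    for _ in range(rounds):
        texts, labels = pos + neg, [1] * len(pos) + [0] * len(neg)
        vec = TfidfVectorizer()
        clf = LogisticRegression().fit(vec.fit_transform(texts), labels)
        scores = clf.predict_proba(vec.transform(pool))[:, 1]
        selected = [p for p, s in zip(pool, scores) if s >= THRESHOLD]
        pos = pos + selected  # refine the seed corpus with new positives
    return selected

def decontaminate(pages, phrases):
    """Drop any page containing a known benchmark phrase."""
    return [p for p in pages if not any(ph in p.lower() for ph in phrases)]

if __name__ == "__main__":
    math_pages = select_math_pages(crawl_pool, seed_math, seed_general)
    clean_pages = decontaminate(math_pages, benchmark_phrases)
    print(f"selected {len(math_pages)}, kept {len(clean_pages)} after decontamination")
```

Even at this toy scale the sketch surfaces the issues discussed in the paper: pages with vocabulary unseen by the classifier hover near the decision boundary, so the threshold and the breadth of the seed corpus jointly determine what survives, and decontamination is needed as a separate final step.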
Documentation: Recording and Reflecting on Notes
Notes:
Problem: Improving mathematical reasoning in open-source LLMs.
Contributions: DeepSeekMath Corpus (120B tokens), GRPO algorithm, 51.7% on MATH.
Methods: Pre-training on DeepSeek-Coder-Base-v1.5, instruction tuning, and GRPO-based RL.
Results: Outperforms open-source models and approaches GPT-4 on MATH and GSM8K.
Criticisms: Limited geometry reasoning, potential data bias, and weaker few-shot performance.
Insights from the simulation: Running the code revealed that the classifier's effectiveness depends heavily on the size and diversity of the seed corpus. A threshold of 0.3 and extended seed data (16 examples) ensured the collection of math-related pages, while decontamination removed exact benchmark matches, in line with the paper's methodology.
Citations: Key references include Hendrycks et al. (2021) for MATH, Guo et al. (2024) for DeepSeek-Coder, and Schulman et al. (2017) for PPO.
Reflection: DeepSeekMath provides valuable insights into data curation and RL efficiency, inspiring further research into specialized LLMs for mathematics and beyond. The simulation reinforced the importance of robust data selection, a key takeaway for applying similar techniques in other domains.
Conclusion
In today’s world, effective reading of research papers is an important skill not only for researchers but also for software engineers, enabling them to stay current and identify opportunities for innovation. Keshav's three-pass approach provides a structured starting point, which we have extended into a general framework emphasizing preparation, critical analysis, and documentation. By applying this framework to "DeepSeekMath," including a Python simulation of its data pipeline, we demonstrated how to extract actionable insights from a complex paper, highlighting its contributions, limitations, and future directions. Readers in any field can adapt this approach to their specific needs, ensuring efficient and impactful engagement with the vast body of academic literature.
References
Keshav, S. (2007). How to Read a Paper. ACM SIGCOMM Computer Communication Review, 37(3), 83-84.
Shao, Z., et al. (2024). DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models. arXiv Link
Hendrycks, D., et al. (2021). Measuring Mathematical Problem Solving with the MATH Dataset. arXiv Link
Schulman, J., et al. (2017). Proximal Policy Optimization Algorithms. arXiv Link