An In-House Scientific Article Summarizer I Built at Beijing Nievedor


I built an internal scientific article summarizer at Beijing Nievedor Intelligent Technology Co., Ltd. to make AI papers easier to consume for non-experts.

It’s an in-house tool, so there are no links or production code here, just a quick portfolio-style overview.

What I Built

A bilingual (Chinese/English) summarization app that turns papers and articles into readable summaries with user-controlled output.

Key Features

  • Multilingual end-to-end: UI, input, and output support Chinese and English
  • Flexible input: PDF upload, plain-text file, URL (converted to Markdown), or pasted text, with multi-file upload supported
  • Custom summaries: choose length and style (layman-friendly, technical, bullet points)
  • Visual understanding: key figures/plots are identified with PDF-Extract-Kit, cropped out, and embedded into the final summary with inline references
  • Performance at scale: large PDFs were a bottleneck, so a preprocessing step uses the extracted text to identify the pages with the most important information and figures, and only the top 10 pages are passed to the heavier figure-detection models
  • Feedback loop: bilingual ratings and comments are stored and used to iterate on summary quality
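
To make the page-selection step more concrete, here is a minimal sketch of the idea rather than the production code. The scoring heuristic (figure/table mentions plus a few keywords), the weights, and the names `score_page` and `select_top_pages` are all assumptions for illustration; the real tool's ranking signals are not described here.

```python
import re

# Hypothetical cues: mentions such as "Figure 3", "Fig. 2", or "Table 1".
FIGURE_REF = re.compile(r"\b(?:fig(?:ure)?\.?|table)\s*\d+", re.IGNORECASE)
# Hypothetical keywords that hint at content-heavy pages.
KEYWORDS = ("result", "method", "propose", "experiment", "architecture")


def score_page(text: str) -> float:
    """Score one page from its extracted text (illustrative heuristic only)."""
    lowered = text.lower()
    figure_hits = len(FIGURE_REF.findall(text))
    keyword_hits = sum(lowered.count(word) for word in KEYWORDS)
    # Figure mentions are weighted higher because the next stage looks for figures.
    return 2.0 * figure_hits + keyword_hits


def select_top_pages(page_texts: list[str], k: int = 10) -> list[int]:
    """Return the indices of the k highest-scoring pages, in document order."""
    ranked = sorted(
        range(len(page_texts)),
        key=lambda i: score_page(page_texts[i]),
        reverse=True,
    )
    return sorted(ranked[:k])


pages = [
    "Intro text only.",
    "Figure 1 shows the results of the method.",
    "References",
    "See Fig. 2 and Table 3 for experiment results.",
]
top_two = select_top_pages(pages, k=2)
```

Only the pages at the returned indices would then be rendered and handed to the figure-detection stage, which keeps the expensive model calls bounded regardless of how long the PDF is.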

Technical Stack (High Level)

  • Python + Streamlit UI
  • LangChain / LangGraph for pipeline orchestration (stateful, debuggable)
  • PDF-Extract-Kit for figure/plot detection and cropping
  • HTML parsing to Markdown
  • Model API for summarization + multimodal visual descriptions
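
As an illustration of the HTML-to-Markdown step, the sketch below uses only Python's standard-library `html.parser`. It is an assumption about how such a step could look, not the converter the tool actually uses; the class `MarkdownExtractor`, the function `html_to_markdown`, and the small tag subset it handles are hypothetical.

```python
from html.parser import HTMLParser


class MarkdownExtractor(HTMLParser):
    """Convert a small subset of HTML (h1-h3, p, li) to Markdown blocks."""

    PREFIXES = {"h1": "# ", "h2": "## ", "h3": "### ", "li": "- ", "p": ""}
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.blocks = []      # finished Markdown blocks
        self._tag = None      # block-level tag currently being collected
        self._buffer = []     # text fragments inside the current block
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.PREFIXES:
            self._tag, self._buffer = tag, []

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == self._tag:
            # Collapse runs of whitespace, then emit the block with its prefix.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.blocks.append(self.PREFIXES[tag] + text)
            self._tag = None

    def handle_data(self, data):
        if self._tag is not None and self._skip_depth == 0:
            self._buffer.append(data)


def html_to_markdown(html: str) -> str:
    """Return the Markdown blocks separated by blank lines."""
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks)
```

Inline tags such as `<b>` are flattened to plain text, and nested block elements are not handled; a real pipeline would likely rely on a dedicated conversion library for messy pages.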

Deployment

Runs on a local-network machine in the office with secrets managed via environment variables.
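
A minimal sketch of reading secrets from environment variables at startup might look like the following. The variable names `SUMMARIZER_API_KEY` and `SUMMARIZER_API_BASE_URL` and the helper `load_settings` are hypothetical, chosen only for illustration.

```python
import os


def load_settings(env=None):
    """Read required settings from the environment, failing fast if any is missing."""
    env = os.environ if env is None else env
    # Hypothetical variable names; the real deployment's names are not shown here.
    required = ("SUMMARIZER_API_KEY", "SUMMARIZER_API_BASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

Failing at startup with a clear message tends to be easier to debug than a model call that fails later with an authentication error, and passing `env` explicitly keeps the function easy to test.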

What’s Next

Better audience adaptation, more output formats, and smarter performance handling for very large PDFs.