HericLdev

DataScope – Uncover the Data Potential of Any Article with AI

DataScope – Uncover the Data Potential of Any Article with AI

Rethinking Data Journalism Starting from Stories

Ajouté par : heric

✨ DataScope – Rethinking Data Journalism Starting from Stories

DataScope is a lightweight, multilingual web application designed to help journalists and media organizations unlock the hidden data potential of their news articles.

Instead of starting with structured datasets — the traditional "data-first" approach in data journalism — DataScope reverses the paradigm:
It analyzes unstructured texts to reveal entities, patterns, and editorial opportunities for data-driven storytelling.

Built using Flask, spaCy (for local NLP entity extraction), and OpenAI's LLMs (for creative suggestions), the tool makes AI-powered analysis accessible even for small newsrooms, freelance reporters, and journalism students.

This is a Minimum Viable Product (MVP) version, already deployed and publicly available, developed with future extensions and modularity in mind.


📌 Key Features

  • Smart Entity Extraction:
    Detects named entities, dates, quantities, and impactful verbs.
  • Datafication Scoring:
    Assigns a "data potential" score based on the structural richness of the article.
  • AI Editorial Suggestions:
    Generates 3 possible editorial angles and 5 open-data source suggestions to support further investigation.
  • Multilingual Analysis:
    Fully operational in English 🇬🇧 and French 🇫🇷.
  • Lightweight Deployment:
    Dockerized, cloud-deployed (Render), with minimal infrastructure needs.
  • Feedback System and Mini Admin Panel:
    Users can provide direct feedback to continuously improve the system.

🎯 Why DataScope?

Data journalism has expanded rapidly but remains largely confined to large organizations with access to technical resources.
Small newsrooms, freelance journalists, and NGOs often lack tools that bridge the gap between storytelling and data-driven exploration.

DataScope democratizes this opportunity:

  • It offers an accessible first step for journalists curious about integrating data in their workflow.
  • It empowers editorial intuition rather than replacing it.
  • It encourages open-data exploration even when starting from minimal leads.

By focusing on "data-augmenting storytelling" rather than only "data-driven storytelling", DataScope promotes a hybrid model of journalism enhanced by technology, not dictated by it.


📈 Current Status

  • MVP live and tested publicly.
  • Source code open under MIT License.
  • Ongoing iterations based on real user feedback (language refinements, export formats, UI improvements).

🌱 Next Steps

  • Expansion to additional languages (Spanish, German).
  • Introduction of basic data visualization features (timeline of dates, map of locations).
  • Creation of an editorial dashboard to track analysis results over time.
  • Collaboration opportunities with journalism schools and non-profit organizations.

🔗 Useful Links

Heric

Photo de profil

Hello and welcome to my blog. My name is Héric Libong. After a long career as a journalist, I decided 7 years ago to move into data science and then software development. Python is my main programming language. I'm an expert in Web Scraping and data extraction techniques. My development environment is the Django framework.


Skills
Django Flask Git Python SQL PostgreSQL Django Rest Framework Docker Github Action OpenSource

Social Network/websites