News aggregator platform

This news aggregator platform gathers and presents industry-specific news from a range of online publishers. It uses Google News RSS feeds to fetch daily articles that match specific industries or keywords.

Core Features:

  • News Scraping: Automatically scrapes news articles from the RSS feeds of various publishers.
  • News Cleaning: Processes raw content to remove duplicates, irrelevant data, and formatting issues.
  • News Classification: Uses a fine-tuned NLP model based on BART (Bidirectional and Auto-Regressive Transformers) to categorize and summarize news content by topic, sentiment, or relevance.
  • Translation: Translates articles from their original language into English using the Google Translate API.
  • Database Storage: Structured and cleaned news items are stored in a database for quick access and querying.
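The scraping and cleaning steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the platform's actual code. The function names (`build_feed_url`, `parse_items`, `clean_items`), the feed URL shape, and the choice to deduplicate on a case-insensitive title are all assumptions made for the example.

```python
import html
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_feed_url(keyword, lang="en-US", country="US"):
    """Build a Google News RSS search URL for a keyword.

    The URL shape and parameters here are an assumption for illustration.
    """
    query = urlencode({
        "q": keyword,
        "hl": lang,
        "gl": country,
        "ceid": f"{country}:{lang.split('-')[0]}",
    })
    return f"https://news.google.com/rss/search?{query}"


def parse_items(rss_xml):
    """Extract title, link, and description from each <item> in an RSS document."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]


def clean_text(raw):
    """Decode HTML entities, strip tags, and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", html.unescape(raw))
    return re.sub(r"\s+", " ", text).strip()


def clean_items(items):
    """Drop items with empty or repeated titles and tidy the remaining text."""
    seen = set()
    cleaned = []
    for item in items:
        title = clean_text(item["title"])
        key = title.casefold()  # hypothetical duplicate key: case-insensitive title
        if not title or key in seen:
            continue
        seen.add(key)
        cleaned.append({
            "title": title,
            "link": item["link"].strip(),
            "summary": clean_text(item["description"]),
        })
    return cleaned
```

In a real pipeline the XML would be downloaded from the URL returned by `build_feed_url`, and the cleaned items would then go on to classification, translation, and storage. A production system would likely also need fuzzier duplicate detection than an exact title match.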

User Features:

  • Super-Fast Search: Algolia powers fast full-text search on the frontend.
  • Bookmarking: Users can save news articles for future reading.
  • Engagement Options: Users can like or dislike articles to personalize their experience.
  • Personalized Feed: A My News section shows articles selected according to each user's preferences and interactions.
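One simple way likes and dislikes could drive a personalized feed is sketched below. This is only an illustration under stated assumptions: the source does not describe the ranking logic, and the per-category scoring, the `category` field, and the function names `category_weights` and `rank_feed` are hypothetical.

```python
from collections import Counter


def category_weights(interactions):
    """Turn (category, action) pairs into a net score per category.

    Each like adds 1 and each dislike subtracts 1 (an assumed scoring rule).
    """
    weights = Counter()
    for category, action in interactions:
        if action == "like":
            weights[category] += 1
        elif action == "dislike":
            weights[category] -= 1
    return weights


def rank_feed(articles, interactions):
    """Order articles so categories the user liked more often come first.

    Categories with no interactions score 0, and ties keep their original order.
    """
    weights = category_weights(interactions)
    return sorted(articles, key=lambda a: weights[a["category"]], reverse=True)
```

For example, a user who liked a finance article and disliked an energy article would see finance items first, unrated categories next, and energy items last.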

This platform streamlines industry news discovery and offers an intelligent, user-centric experience powered by modern NLP and data processing techniques.

Technology used

  • Python
  • NLP
  • Algolia
  • Postgres
  • Laravel