This is a news aggregator platform that gathers and presents industry-specific news from various online publishers. It uses Google News RSS feeds to fetch daily news articles tailored to specific industries or keywords.
Core Features:
- News Scraping: Automatically scrapes news articles from the RSS feeds of various publishers.
- News Cleaning: Processes raw content to remove duplicates, irrelevant data, and formatting issues.
- News Classification: Uses a fine-tuned NLP model based on BART (Bidirectional and Auto-Regressive Transformers) to categorize and summarize news content by topic, sentiment, or relevance.
- Translation: Translates articles from their original language into English using the Google Translate API.
- Database Storage: Structured and cleaned news items are stored in a database for quick access and querying.
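
The scraping and cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual code: the function names (`build_feed_url`, `parse_feed`, `clean_title`, `deduplicate`) are hypothetical, the Google News RSS search URL pattern and its `hl`/`gl`/`ceid` parameters are assumed, and duplicates are detected here by normalized title only.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_feed_url(keyword, lang="en-US", country="US"):
    """Build a Google News RSS search URL for a keyword (assumed URL pattern)."""
    params = {
        "q": keyword,
        "hl": lang,
        "gl": country,
        "ceid": f"{country}:{lang.split('-')[0]}",
    }
    return "https://news.google.com/rss/search?" + urlencode(params)


def parse_feed(xml_text):
    """Extract title, link, and publication date from each <item> in an RSS document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
        })
    return items


def clean_title(title):
    """Strip leftover HTML tags and collapse repeated whitespace."""
    text = re.sub(r"<[^>]+>", "", title)
    return re.sub(r"\s+", " ", text).strip()


def deduplicate(items):
    """Keep the first item for each case-insensitive cleaned title."""
    seen = set()
    unique = []
    for item in items:
        title = clean_title(item["title"])
        key = title.lower()
        if key and key not in seen:
            seen.add(key)
            unique.append({**item, "title": title})
    return unique
```

In practice the XML would be downloaded from the URL returned by `build_feed_url` (for example with `urllib.request.urlopen`) before being passed to `parse_feed`, and the cleaned items would then go on to classification, translation, and storage.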
User Features:
- Super-Fast Search: Uses Algolia to provide fast full-text search on the frontend.
- Bookmarking: Users can save news articles for future reading.
- Engagement Options: Users can like or dislike articles to personalize their experience.
- Personalized Feed: A "My News" section shows articles selected according to each user's preferences and interactions.
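
One simple way likes and dislikes could feed into a personalized section is sketched below. This is an assumption for illustration only: the scoring rule (+1 per like, -1 per dislike, summed per category) and the names `category_weights` and `rank_feed` are hypothetical and are not taken from the platform's code.

```python
from collections import Counter


def category_weights(interactions):
    """Sum +1 for each like and -1 for each dislike, per article category."""
    weights = Counter()
    for category, action in interactions:
        if action == "like":
            weights[category] += 1
        elif action == "dislike":
            weights[category] -= 1
    return weights


def rank_feed(articles, interactions):
    """Order articles so that categories the user liked most come first.

    Categories with no interactions score 0; the sort is stable, so articles
    with equal scores keep their original relative order.
    """
    weights = category_weights(interactions)
    return sorted(articles, key=lambda a: weights[a["category"]], reverse=True)
```

Calling `rank_feed(articles, interactions)` with a list of article dictionaries (each with a `category` key) and a list of `(category, action)` pairs returns the articles reordered by preference. A production system would likely also weigh recency and bookmarks.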
This platform streamlines industry news discovery and offers an intelligent, user-centric experience powered by modern NLP and data processing techniques.
Screenshots
Technologies used
- Python
- NLP
- Algolia
- Postgres
- Laravel