MAJD AYOUB.SYS

Projects & Open Source

Selected projects and contributions showing how I build AI, backend, security, and open source systems.

CORE_SYS // PROJECT_01
CLASSIFICATION: GRADUATION_PROJECT // DEFENDED: JAN_2026

TruthLens: An AI-Assisted Credibility Assessment Platform

Academic Supervisor: Eng. Nawal Al-Zabin
Institution: Al-Balqa Applied University (BAU)

Strategic Vision: Complementary Intelligence. Built as a Decision Support System (DSS) that augments human judgment rather than replacing it. It delivers multi-dimensional, bilingual (Arabic and English) analysis that combines linguistic patterns, contextual cues, and source credibility to help readers assess fast-spreading misinformation.

TruthLens Interface Preview

Core Ownership // Graduation Project
Project Origin + Technical Ownership

Original Concept + End-to-End Technical Owner

Proposed the original TruthLens concept and led the core technical implementation across backend architecture, AI/NLP orchestration, Celery/Redis asynchronous processing, ChromaDB-backed retrieval, Docker-based local deployment, frontend integration support, technical documentation, and privacy-focused PII handling.

Academic Context

Completed as a team graduation project at Al-Balqa Applied University (BAU), supervised by Eng. Nawal Al-Zabin, and defended in January 2026.

Impact Snapshot

Multi-Layer AI System

  • Local LLM reasoning and ML classifiers.
  • NLP, OCR, retrieval, and security screening.
  • Layered analysis for source context, bias, toxicity, OSINT signals, and report generation.

Performance Work

Local AI Optimization

Reduced deep-analysis wait time from roughly 8 minutes to about 5–6 minutes in local testing on an RTX 4050 (6 GB VRAM). The gain came from combining Celery/Redis background tasks, composite prompting, asynchronous coordination, and model-availability improvements within consumer-grade hardware limits.

Local benchmark; hardware-dependent.

Engineering Challenge

Structured AI Output

Worked on composite prompt design, JSON-oriented output handling, validation logic, cleanup steps, and fallback-oriented processing to make local LLM outputs easier to parse, explain, and present.
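The parse, clean up, validate, and fall back flow described above can be sketched roughly as follows. This is a minimal illustration and not the TruthLens code: the required keys, the fallback report, and the function names are all hypothetical.

```python
import copy
import json
import re
from typing import Any, Optional

# Hypothetical schema: keys a parsed report is expected to contain.
REQUIRED_KEYS = {"verdict", "confidence", "reasons"}

# Hypothetical safe default returned whenever the model output is unusable.
FALLBACK_REPORT: dict = {
    "verdict": "unavailable",
    "confidence": 0.0,
    "reasons": ["Model output could not be parsed."],
}


def extract_json_block(raw: str) -> Optional[str]:
    """Return the outermost {...} span, discarding chatter or fences around it."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    return raw[start : end + 1]


def parse_report(raw: str) -> Any:
    """Parse and validate model output; fall back to a safe default on failure."""
    block = extract_json_block(raw)
    if block is None:
        return copy.deepcopy(FALLBACK_REPORT)
    # Cleanup step: drop trailing commas before a closing brace or bracket.
    # This is a simple heuristic and could also touch commas inside string values.
    block = re.sub(r",\s*([}\]])", r"\1", block)
    try:
        data = json.loads(block)
    except json.JSONDecodeError:
        return copy.deepcopy(FALLBACK_REPORT)
    # Validation step: require a JSON object with every expected key.
    if not isinstance(data, dict) or not REQUIRED_KEYS <= data.keys():
        return copy.deepcopy(FALLBACK_REPORT)
    return data
```

Returning a fixed fallback keeps the report view renderable even when a local model produces malformed output.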

Technical Trade-Off

Local Privacy vs Cloud Speed

Chose a local-first AI direction to prioritize privacy, cost control, and independence from third-party LLM APIs. The latency trade-off was handled with background processing, progress-update design, and asynchronous workflows.

Deliverable

Technical report and architecture documentation.

Deployment

Docker-based local deployment workflow.

UX & Reporting

PWA interface, real-time feedback, visual analysis, and exportable reports.

Scope

Bilingual Arabic/English credibility analysis.

Academic portfolio project; TruthLens is an AI-assisted credibility assessment and decision-support system, not a production fact-checking authority.

Full System Demonstration

Technical Signals

OCR, multilingual NLP, RAG context, source checks, and AI-assisted reasoning.

Positioning

Decision-support system for credibility analysis, not an official fact-checking authority.

Demo Chapters

  • 0:00 - 1:15: Interface Tour & Input Methods
  • 1:15 - 3:00: Asynchronous Job Lifecycle & Real-time Progress
  • 3:00 - 4:45: NLP & Sentiment Layers Inspection
  • 4:45 - 6:00: Report Generation & Credibility Insights

What This Demo Shows

  • Interactive PWA UI: Real-time feedback and responsive workspace layout.
  • Backend Workflow: FastAPI gateway, Celery background tasks, and Redis queuing.
  • Structured AI Reports: Validated parsing of multi-model LLM outputs into readable insights.

Multi-Model Consensus

  • 01 · Qwen 3 (8B): Primary Reasoning Engine for complex semantic context and logical coherence analysis.
  • 02 · Phi-3 Mini (3.8B): Self-Healing Agent ensuring structural data integrity and JSON validation/repair.
  • 03 · Custom Logistic Regression: High-performance clickbait detection model deployed via Hugging Face.

Advanced NLP Layer

Integrates CAMeLBERT for Arabic linguistic nuance and Twitter-RoBERTa for English social sentiment. The NLP layer uses spaCy NER to detect named entities such as people, organizations, and locations, along with referenced topics, then enriches those entities through the Wikipedia API to provide additional context during credibility analysis. It is augmented with Detoxify for toxicity and hate-speech signals, Tesseract OCR for visual text extraction, and fastText for fast language identification.

Performance Arch

Engineered on a Python 3 / FastAPI / Uvicorn stack where the API gateway validates requests, dispatches long-running analysis jobs, and returns a task ID without blocking the user interface. Celery and Redis handle background processing, task queues, result caching, and progress updates, while Server-Sent Events and Firestore listeners support real-time frontend feedback. Local RTX 4050 6GB VRAM testing informed the hardware-constrained optimization strategy, including model persistence, composite prompting, and worker lifecycle controls.

Data Intelligence

Adaptive scraping pipelines extract article content and supporting context, with fallback layers across Newspaper3k, Trafilatura, DuckDuckGo Search, and ScraperAPI. RAG via ChromaDB provides context injection, entity-aware retrieval, and comparison against supporting evidence. The system also considers source context, publication timing, recency cues, citation quality, domain signals, and available external evidence when framing credibility analysis beyond the text alone.
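A layered extraction fallback of the kind mentioned above might look like the sketch below. It is an assumption about the general pattern, not the project's pipeline: the extractors are passed in as plain callables, and the `min_chars` threshold and function name are hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple

# An extractor takes a URL and returns article text, or None if it found nothing.
Extractor = Callable[[str], Optional[str]]


def extract_with_fallback(
    url: str,
    extractors: Iterable[Tuple[str, Extractor]],
    min_chars: int = 200,
) -> Tuple[Optional[str], Optional[str]]:
    """Try each extractor in order and return (name, text) for the first usable result."""
    for name, extractor in extractors:
        try:
            text = extractor(url)
        except Exception:
            # A failing layer should not abort the chain; move to the next one.
            continue
        # Treat very short output as a failed extraction (hypothetical threshold).
        if text and len(text.strip()) >= min_chars:
            return name, text.strip()
    return None, None
```

Recording which layer produced the text makes it easier to see how often each fallback is actually needed.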

Defense & Privacy

Implements Proactive Threat Defense through VirusTotal API integration for automated URL scanning, alongside privacy-first preprocessing that redacts emails and phone numbers before analysis. The system is designed around content-based evaluation, local processing where feasible, and responsible AI disclaimers so users receive decision-support guidance rather than an absolute fact-checking verdict.
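As a rough illustration of the redaction step described above, the sketch below masks emails and phone-like numbers with regular expressions before text is passed on. The patterns and placeholder tokens are assumptions for illustration; the phone pattern is deliberately loose and can over-match other numeric strings such as dates.

```python
import re

# Simple email pattern; it does not try to cover every valid address form.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Loose phone pattern: an optional "+", a digit, 6 to 18 digits or separators
# (spaces, dots, dashes, parentheses), and a closing digit.
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d\s().-]{6,18}\d(?!\w)")


def redact_pii(text: str) -> str:
    """Replace emails and phone-like numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)
```

Running redaction before any model call means downstream layers never see the raw identifiers.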

Future Roadmap

The architecture is structured to support future integration with the Google Fact Check Tools API. Future iterations focus on browser-extension analysis, trusted news feeds, trend monitoring, educational media-literacy prompts, and expanded cross-platform misinformation monitoring while preserving the platform’s decision-support positioning.

Architecture Status: Multi-Model Consensus + RAG Pipeline

// Merged Open Source Contribution

OpenSSF cve-bin-tool

What it is: Merged Python pull request to OpenSSF cve-bin-tool, an open-source security scanner.
Problem: mypy reported "cve_bin_tool.log.LOGGER is not a valid type" for logger annotations (Issue #2870).
Fix: Added explicit logging.Logger type annotations and restored use of the centralized shared logger across files.
Result: Resolved the type-checking errors and aligned the related tests; the PR was merged.
Stack: Python · mypy · logging · unit tests · security scanning
Proof / Links:
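For readers unfamiliar with this class of mypy error, the general idea behind the fix can be sketched as follows. This is a generic illustration, not code from cve-bin-tool: the logger name and the `Scanner` class are hypothetical, and the assumption is that the error comes from using a logger instance where a type is expected.

```python
import logging
from typing import Optional

# Hypothetical shared logger, standing in for a project's central log module.
LOGGER: logging.Logger = logging.getLogger("example_tool")


class Scanner:
    """Hypothetical consumer that accepts an optional logger."""

    def __init__(self, logger: Optional[logging.Logger] = None) -> None:
        # Annotate with the logging.Logger class, not the LOGGER instance:
        # an annotation like `logger: LOGGER` uses a variable as a type,
        # which mypy rejects.
        self.logger: logging.Logger = logger or LOGGER.getChild(type(self).__name__)
```

Deriving child loggers from one shared instance keeps configuration centralized while still labeling messages per component.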

PhishGuard Pro

What it is: AI/NLP-powered detector for phishing emails, SMS scams, and financial fraud.
What I built: BERT-based classification engine and security guidance utility.
Stack: BERT, RAG, Python, PyTorch
Proof / Links:
Quick Demo:

AI-powered phishing and scam email analysis with threat probability, indicators, and explainable guidance.

BayForge AI

What it is: Educational AI prototype for exploring California ADU zoning data.
What I built: Frontend interface and API routes for property analysis. ⚠️ Not legal advice; development stopped due to data integrity and legal-risk concerns.
Stack: Next.js, TypeScript, Tailwind CSS
Proof / Links:

MCP Notebook

What it is: Reference guide and development suite for Anthropic's Model Context Protocol (MCP).
What I built: Transport pattern simulators, an interactive Vercel prototype, and Notion engineering docs.
Stack: MCP API, Next.js, Vercel, Notion
Proof / Links:

Clickbait Detector

What it is: Automated classification engine to identify and flag sensationalized headlines.
What I built: Machine learning classifier achieving 96.39% test accuracy.
Stack: NLP, Python, Scikit-learn
Proof / Links:

Fake News Detector

What it is: Machine learning model trained to evaluate veracity and potential bias in news articles.
What I built: TF-IDF feature pipeline and logistic regression classifier packaged using Skops.
Stack: Scikit-learn, TF-IDF, Skops, Gradio, Hugging Face Spaces
Proof / Links:

Full Stack Intern

What it is: Structured 280-hour Full-Stack Web Development Internship at Jordan Computer Society.
What I built: Flask/Jinja2 server-side prototypes, responsive layouts, SQL CRUD interfaces, and custom DOM logic.
Stack: Flask, Jinja2, SQL, Bootstrap, JavaScript
Proof / Links: JCS Internship Certificate (verified)