Master Prompt: Reproducing the HemoVisionAI Project
Goal: Reproduce the HemoVisionAI project, a Flask-based web application with a TensorFlow/Keras backend for medical image (blood cell) classification, featuring a hybrid Python/C++ preprocessing pipeline, user authentication, prediction history, API access, background task processing, an admin interface, and Dockerization.
Developed By (Original Project): MRI (MaCarson Research Institute) in collaboration with Qasel Gh Ltd. Senior Software Engineering Architect: Kwaku Manu Amponsem.
Core Technologies:
Backend: Python, Flask
Machine Learning: TensorFlow, Keras, Scikit-learn
Database: PostgreSQL, SQLAlchemy, Flask-Migrate (Alembic)
Web Frontend: HTML, CSS, JavaScript (potentially using libraries like Bootstrap, Chart.js)
Asynchronous Tasks: Celery, Redis
API: Flask-RESTful or Flask Blueprints, Flask-JWT-Extended
Authentication: Flask-Login, Flask-Bcrypt
Containerization: Docker, Docker Compose
Optional C++ Extension: C++, pybind11, CMake (for image preprocessing)
Utilities: ReportLab (PDF reports), Pillow (Image handling), python-dotenv, pytest
Instructions: Generate the code and configuration for the HemoVisionAI project by following these progressive tasks. Ensure code follows best practices (modularity, separation of concerns, security considerations) and includes comments where necessary. Adhere to the specified file structure as closely as possible based on the original project analysis.
Progressive Tasks:
Task 0: Project Goal & Core Tech Stack Review
Action: Acknowledge the project goal and the core technologies listed above. Briefly confirm understanding of the overall architecture (Web frontend -> Flask backend -> ML model w/ optional C++ preprocessing -> DB/Redis).
Task 1: Project Setup & Configuration
Goal: Create the basic project structure and essential configuration files.
Actions:
Define the main directory structure (core/, tests/, migrations/, entrypoints/, docker/, docs/, scripts/, data/, logs/).
Create core/config/settings.py with a base Config class holding default settings (debug, testing flags, secret keys placeholders, DB URI placeholder, Redis settings, JWT settings, storage paths, default ML params like input shape). Load settings from environment variables using python-dotenv.
Create core/config/development.py, production.py, testing.py inheriting from Config and overriding relevant settings.
Create core/config/__init__.py to export configurations.
Create .env.example listing necessary environment variables (SECRET_KEY, JWT_SECRET_KEY, DATABASE_URL, REDIS_URL, FLASK_CONFIG, etc.).
Create requirements.txt and requirements-dev.txt listing necessary Python packages (Flask, SQLAlchemy, Flask-Migrate, psycopg2-binary, TensorFlow, scikit-learn, Pillow, Celery, Redis, Flask-Login, Flask-JWT-Extended, Flask-Bcrypt, python-dotenv, pytest, reportlab, requests, Flask-WTF, Flask-Mail, etc.).
Create a basic setup.py (even if minimal, for potential packaging/C++ build integration later).
Create empty __init__.py files in necessary directories to make them Python packages.
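The configuration hierarchy described above might be sketched as follows. This is an illustrative shape only: the attribute names, defaults, and the `config_by_name` mapping are assumptions based on the structure in this task, not the original project's exact code.

```python
import os

# Optionally load variables from a .env file when python-dotenv is installed.
try:
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass


class Config:
    """Base settings shared by all environments (core/config/settings.py)."""
    DEBUG = False
    TESTING = False
    SECRET_KEY = os.environ.get("SECRET_KEY", "change-me")
    JWT_SECRET_KEY = os.environ.get("JWT_SECRET_KEY", "change-me-too")
    SQLALCHEMY_DATABASE_URI = os.environ.get("DATABASE_URL", "sqlite:///dev.db")
    REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
    UPLOAD_FOLDER = os.environ.get("UPLOAD_FOLDER", "data/uploads")
    MODEL_INPUT_SHAPE = (300, 300, 3)  # default ML input shape (assumed)


class DevelopmentConfig(Config):
    DEBUG = True


class ProductionConfig(Config):
    pass


class TestingConfig(Config):
    TESTING = True
    SQLALCHEMY_DATABASE_URI = "sqlite:///:memory:"


# Exported from core/config/__init__.py and selected via FLASK_CONFIG.
config_by_name = {
    "development": DevelopmentConfig,
    "production": ProductionConfig,
    "testing": TestingConfig,
}
```

Keeping the environment lookups in the base class means each subclass only overrides what differs, and `FLASK_CONFIG` can pick a class by name at startup.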
Task 2: Database Setup
Goal: Define database models and configure migrations.
Actions:
In core/db/models.py, define SQLAlchemy models: User (with fields for username, email, password hash, roles, timestamps), PredictionHistory (linking to user, storing input details, predictions, timestamps), ModelVersion (tracking ML model details like name, path, version, activation status, metrics). Include necessary relationships and helper methods (e.g., password hashing/checking).
In core/db/db.py (or within core/web/__init__.py), initialize SQLAlchemy and Migrate extensions.
Configure Alembic for migrations (migrations/ directory, alembic.ini, migrations/env.py). Ensure env.py correctly reads the database URL from Flask config and registers the models.
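The password helpers on the User model could take roughly this shape. The real model would be a SQLAlchemy declarative class and would use Flask-Bcrypt as specified above; stdlib `pbkdf2_hmac` stands in here so the sketch runs without extra dependencies, and the field names are assumptions.

```python
import hashlib
import hmac
import os
from datetime import datetime, timezone


class User:
    """Sketch of the User model's password helpers (core/db/models.py).

    Illustrative only: the project uses SQLAlchemy columns and Flask-Bcrypt;
    this plain class shows the helper-method shape with stdlib hashing.
    """

    def __init__(self, username: str, email: str):
        self.username = username
        self.email = email
        self.password_hash = b""
        self.role = "user"
        self.created_at = datetime.now(timezone.utc)

    def set_password(self, password: str) -> None:
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
        self.password_hash = salt + digest  # store salt alongside the digest

    def check_password(self, password: str) -> bool:
        salt, stored = self.password_hash[:16], self.password_hash[16:]
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
        return hmac.compare_digest(stored, candidate)  # constant-time compare
```

The same `set_password`/`check_password` interface carries over unchanged when the body is swapped for `bcrypt.generate_password_hash`/`bcrypt.check_password_hash`.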
Task 3: Core Utilities
Goal: Implement shared utility functions.
Actions: Create modules within core/utils/:
logging_config.py: Set up logging configuration (e.g., based on Flask config).
security.py: Password hashing/verification helpers (using Flask-Bcrypt), JWT token generation/validation helpers (can use Flask-JWT-Extended decorators later), rate limiting decorator (using Flask-Limiter or custom Redis logic).
file_validation.py: Functions to validate uploaded file types securely (e.g., using python-magic or checking image headers with Pillow).
image_utils.py: Helpers for loading, resizing (using Pillow or TF), and potentially basic processing of images.
report_generator.py: Function to generate PDF reports (e.g., prediction results) using ReportLab.
email_sender.py: Utility for sending emails (using Flask-Mail) for verification, password resets, etc.
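The file-validation idea in this task can be sketched by checking magic bytes rather than trusting extensions. The signature table and function names below are assumptions; in the real project `python-magic` or a Pillow `Image.verify()` pass could replace this table.

```python
from typing import Optional

# Sketch of core/utils/file_validation.py: identify uploads by their
# leading magic bytes so a renamed .exe cannot masquerade as a .png.
_SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF8",
    "bmp": b"BM",
}

ALLOWED_FORMATS = {"png", "jpeg"}  # assumed whitelist for this project


def detect_image_format(header: bytes) -> Optional[str]:
    """Return the detected format name, or None if unrecognized."""
    for name, sig in _SIGNATURES.items():
        if header.startswith(sig):
            return name
    return None


def is_allowed_upload(header: bytes) -> bool:
    return detect_image_format(header) in ALLOWED_FORMATS
```

Callers would read only the first few bytes of the uploaded stream, validate, then rewind before handing the file to storage.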
Task 4: Storage Management
Goal: Create modules to abstract interactions with storage systems (filesystem, Redis, DB).
Actions: Create modules within core/storage/:
file_manager.py: Handles saving, retrieving, deleting files (uploads, models, reports) based on configuration paths. Generates consistent paths.
redis_manager.py: Manages Redis connection and provides functions for caching, storing JWT blocklist entries, managing rate limits, or handling Celery backend results.
history_manager.py: Contains functions specifically for creating, retrieving, and querying PredictionHistory records in the database.
model_store.py: Functions for saving ModelVersion records to the DB, managing model file storage (using FileManager), activating specific model versions, and retrieving the active model path/details.
upload_manager.py: Coordinates receiving an uploaded file, validating it (using file_validation), and saving it (using FileManager).
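The "generates consistent paths" requirement for FileManager might look like this sketch (class and method names are assumptions): the user-supplied filename contributes only its extension, and the basename is replaced with a UUID so uploads can neither collide nor traverse directories.

```python
import uuid
from pathlib import Path


class FileManager:
    """Sketch of core/storage/file_manager.py: collision-free storage
    paths for uploads, models, and reports under a configured root."""

    def __init__(self, root: str):
        self.root = Path(root)

    def make_upload_path(self, original_name: str) -> Path:
        # Keep only the extension of the user-supplied name; anything
        # without an extension falls back to a generic ".bin".
        ext = Path(original_name).suffix.lower().lstrip(".") or "bin"
        return self.root / "uploads" / f"{uuid.uuid4().hex}.{ext}"

    def save(self, data: bytes, path: Path) -> Path:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return path
```

UploadManager would then call `make_upload_path` after `file_validation` has approved the content, keeping path policy in one place.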
Task 5: Basic Flask Web Application
Goal: Set up the core Flask app factory, extensions, and basic structure.
Actions:
In core/web/__init__.py, create the create_app factory function. It should:
Initialize the Flask app instance.
Load configuration based on FLASK_CONFIG.
Initialize Flask extensions (SQLAlchemy, Migrate, LoginManager, JWTManager, Bcrypt, Mail, CSRFProtect, possibly SocketIO, CORS).
Register Blueprints (to be created in later tasks).
Set up basic error handlers (core/web/error_handlers.py).
Register middleware if needed (core/web/middleware.py).
Create a main Blueprint (core/web/routes/main_routes.py) with a simple index route (/).
Create base Jinja2 templates (core/web/templates/base.html, index.html) with basic HTML structure and blocks for content, styles, scripts.
Set up static file handling (core/web/static/css/main.css, core/web/static/js/main.js).
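A minimal version of the factory described above is sketched below. The extension wiring is left as comments because each extension instance lives elsewhere in the package; only plain Flask is used here, and the route body is a placeholder.

```python
from flask import Blueprint, Flask

main_bp = Blueprint("main", __name__)


@main_bp.route("/")
def index():
    # Placeholder; the real route renders templates/index.html.
    return "HemoVisionAI"


def create_app(config_name: str = "development") -> Flask:
    """Sketch of the app factory in core/web/__init__.py."""
    app = Flask(__name__)

    # In the real project the config class comes from core.config:
    #   app.config.from_object(config_by_name[config_name])
    app.config["TESTING"] = config_name == "testing"

    # Extension wiring (instances defined at module level elsewhere):
    #   db.init_app(app); migrate.init_app(app, db)
    #   login_manager.init_app(app); jwt.init_app(app)
    #   bcrypt.init_app(app); mail.init_app(app); csrf.init_app(app)

    app.register_blueprint(main_bp)
    return app
```

The factory pattern lets tests build an isolated app per fixture (`create_app("testing")`) instead of sharing one module-level instance.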
Task 6: User Authentication & Management
Goal: Implement user sign-up, login, logout, and profile features.
Actions:
Create authentication forms (core/web/forms/auth_forms.py) using Flask-WTF (LoginForm, RegistrationForm, PasswordResetForm, etc.).
Implement authentication routes in a dedicated blueprint or within main_routes.py (/register, /login, /logout, /reset_password). Use Flask-Login for session management (login_user, logout_user, @login_required) and Flask-Bcrypt for password handling. Integrate email verification if required using EmailSender.
Implement profile viewing/editing routes and forms (core/web/forms/profile_forms.py, /profile route).
Configure Flask-Login (LoginManager in create_app) with a user loader function.
Configure Flask-JWT-Extended (JWTManager in create_app) for API authentication later (user identity loader, token blocklist using RedisManager).
Update base.html template with navigation links for login/logout/profile/register.
Task 7: ML - C++ Image Preprocessor (Optional but Recommended for Reproduction)
Goal: Implement and integrate the optimized C++ preprocessing component.
Actions:
Create C++ source files (core/classifier/cpp_utils/image_processing.h/.cpp, matrix_ops.h/.cpp) containing functions for image normalization, filtering (e.g., Gaussian blur), contrast adjustment, etc.
Create core/classifier/image_preprocessor.cpp using pybind11 to create Python bindings for the C++ functions. Expose a class or functions callable from Python.
Create core/classifier/CMakeLists.txt to define the C++ build process.
Modify setup.py and create core/classifier/build.py to use CMake and pybind11 helper functions to compile the C++ extension during pip install or a build step.
Create core/classifier/cpp_bindings.py to import the compiled C++ module safely, handling potential import errors if the module wasn't built.
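The "import safely" requirement for cpp_bindings.py can be sketched as a guarded import with a pure-Python fallback. The compiled module name `_image_preprocessor` and the fallback normalization are assumptions for illustration.

```python
# Sketch of core/classifier/cpp_bindings.py: use the pybind11-compiled
# extension when it was built, otherwise degrade gracefully to Python.

try:
    import _image_preprocessor as _cpp  # assumed compiled module name
    CPP_AVAILABLE = True
except ImportError:
    _cpp = None
    CPP_AVAILABLE = False


def preprocess(pixels, use_cpp: bool = True):
    """Dispatch to the C++ preprocessor when available, else fall back."""
    if use_cpp and CPP_AVAILABLE:
        return _cpp.preprocess(pixels)
    # Pure-Python fallback: simple min-max normalization to [0, 1].
    lo, hi = min(pixels), max(pixels)
    span = (hi - lo) or 1
    return [(p - lo) / span for p in pixels]
```

Callers never need to know whether the extension compiled; the rest of the classifier imports `preprocess` and `CPP_AVAILABLE` (e.g. to log which path is active).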
Task 8: ML - Core Classifier Logic
Goal: Implement the core machine learning components (dataset, model, prediction).
Actions:
In core/classifier/dataset.py, create DatasetLoader class using tf.data. Implement methods for:
Loading image paths and labels.
Parsing/resizing images (tf.image).
Applying data augmentation (using Keras preprocessing layers based on DATA_AUGMENTATION_CONFIG from settings).
Creating train/validation/test splits.
Batching and prefetching datasets.
In core/classifier/model.py, create HemoVisionModel class:
Implement _build_model method: Load a pre-trained Keras application base model (e.g., EfficientNetB3 specified in settings SUPPORTED_BASE_MODELS), freeze its layers, add custom classification head (GlobalAveragePooling, BatchNormalization, Dropout, Dense layers - use dropout rates from settings).
Implement compile_model method (using optimizer and loss from settings).
Implement train method (placeholder, logic will be in train.py).
Implement predict_on_image method: Takes an image path/array, preprocesses it (resizing, calls C++ preprocessor if available via cpp_bindings, applies Keras preprocess_input), and returns predictions.
Implement methods to load/save model weights.
In core/classifier/predict.py, create functions that utilize HemoVisionModel and ModelStore to load the active model and make predictions on new data.
In core/classifier/evaluation.py, implement functions to evaluate a trained model on the test set using scikit-learn metrics (accuracy, precision, recall, F1, confusion matrix).
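The decode step at the end of the prediction path might look like this NumPy sketch: raw logits become calibrated class probabilities and are mapped to labels. The class list is an assumption; in the real project the labels would come from the dataset or the ModelVersion record.

```python
import numpy as np

# Sketch of the post-prediction decode step (core/classifier/predict.py).
CLASS_NAMES = ["eosinophil", "lymphocyte", "monocyte", "neutrophil"]  # assumed


def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


def decode_prediction(logits, top_k: int = 3):
    """Return the top_k (label, probability) pairs, highest first."""
    probs = softmax(np.asarray(logits, dtype=np.float64))
    order = np.argsort(probs)[::-1][:top_k]
    return [(CLASS_NAMES[i], float(probs[i])) for i in order]
```

If the Keras head already ends in a softmax activation, the `softmax` call here would be dropped and the model output used directly.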
Task 9: ML - Training Pipeline
Goal: Create the script and logic to train the ML model.
Actions:
In core/classifier/train.py, implement the main training orchestration logic:
Parse command-line arguments (base model, data dir, output dir, retrain ID, activate flag).
Initialize DatasetLoader to load data.
Initialize HemoVisionModel. Handle retraining logic (loading previous weights).
Compile the model.
Set up Keras callbacks (ModelCheckpoint, EarlyStopping, ReduceLROnPlateau, TensorBoard - using parameters from settings).
Run initial training (model.fit) with base layers frozen.
Optionally unfreeze layers (based on TRAINING_FINE_TUNE_LAYER setting) and run fine-tuning with a lower learning rate.
Evaluate the trained model using evaluation.py.
Save the trained model weights and its metadata/metrics to the database using ModelStore.
Handle activation if the --activate flag is passed.
In entrypoints/cli.py, create a Flask CLI command (flask model train) that calls the main training function from core/classifier/train.py.
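The command-line interface described at the start of this task can be sketched with argparse. The flag names follow the task description; the defaults are assumptions.

```python
import argparse

# Sketch of the argument parsing in core/classifier/train.py.


def build_arg_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Train the HemoVisionAI model")
    parser.add_argument("--base-model", default="EfficientNetB3",
                        help="Keras application to use as the frozen base")
    parser.add_argument("--data-dir", default="data/",
                        help="Directory containing the labeled image dataset")
    parser.add_argument("--output-dir", default="data/models/",
                        help="Where to write checkpoints and final weights")
    parser.add_argument("--retrain-id", type=int, default=None,
                        help="ModelVersion id whose weights to continue from")
    parser.add_argument("--activate", action="store_true",
                        help="Mark the trained model as active on success")
    return parser
```

Exposing `build_arg_parser()` as a function (rather than parsing at import time) lets the Flask CLI command in entrypoints/cli.py reuse it with its own argument list.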
Task 10: Web App - Image Upload & Prediction
Goal: Implement the user flow for uploading an image and viewing prediction results.
Actions:
Create upload form (core/web/forms/upload_forms.py).
Create upload route (/upload in main_routes.py). Handle POST request:
Validate form and uploaded file (using UploadManager).
Save the uploaded file.
Trigger prediction: Either directly call the prediction function (core/classifier/predict.py) or (better) create a Celery task for prediction (see Task 14).
Store prediction request details and results in the database using HistoryManager.
Redirect to a result page, passing the history ID or results.
Create result page template (core/web/templates/result.html) to display the input image and prediction outcomes.
Implement JavaScript (core/web/static/js/upload.js) for potential frontend enhancements (preview, progress bar).
Task 11: Web App - Prediction History
Goal: Allow users to view their past predictions.
Actions:
Create a route (/history in main_routes.py) protected by @login_required.
Fetch the user's prediction history from the database using HistoryManager (implement pagination).
Create a template (core/web/templates/history.html) to display the history records in a table or list, possibly with thumbnails and links to detailed results. Include pagination controls.
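The pagination contract for the history view might be sketched as below. This is an in-memory illustration with assumed names; the real HistoryManager would issue a LIMIT/OFFSET query (or SQLAlchemy's `paginate()`) rather than slicing a list.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Page:
    """What the history.html template needs to render pagination controls."""
    items: List
    page: int
    pages: int
    total: int

    @property
    def has_next(self) -> bool:
        return self.page < self.pages


def paginate(records: Sequence, page: int = 1, per_page: int = 10) -> Page:
    total = len(records)
    pages = max(1, -(-total // per_page))  # ceiling division, at least 1 page
    page = min(max(1, page), pages)        # clamp out-of-range page numbers
    start = (page - 1) * per_page
    return Page(list(records[start:start + per_page]), page, pages, total)
```

Clamping the page number means a stale bookmark like `/history?page=999` renders the last page instead of erroring.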
Task 12: API Endpoints
Goal: Provide RESTful API access for core functionalities.
Actions:
Create an API blueprint (core/web/routes/api_routes.py).
Implement endpoints protected by @jwt_required() from Flask-JWT-Extended:
/api/predict (POST): Accepts image data, performs prediction (ideally via Celery task), saves history, returns prediction results/history ID.
/api/history (GET): Returns paginated prediction history for the authenticated user.
/api/history/<id> (GET): Returns details for a specific prediction.
(Optional) /api/auth/login (POST), /api/auth/refresh (POST).
Ensure consistent JSON responses for success and errors. Use API rate limiting defined in core/utils/security.py.
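The "consistent JSON responses" requirement might be met with a small envelope helper pair, sketched below. In Flask these return values would be passed through `jsonify()`; the field names are assumptions.

```python
from typing import Optional

# Sketch of response helpers for core/web/routes/api_routes.py.


def api_success(data, status: int = 200):
    """Wrap a successful payload in the common envelope."""
    return {"status": "success", "data": data}, status


def api_error(message: str, status: int = 400, code: Optional[str] = None):
    """Wrap an error in the common envelope, with an optional machine code."""
    body = {"status": "error", "message": message}
    if code:
        body["error_code"] = code
    return body, status
```

Every endpoint (and the Flask error handlers) returning through these two functions keeps clients from having to special-case response shapes.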
Task 13: Admin Interface
Goal: Create a section for administrators to manage users and models.
Actions:
Create an admin blueprint (core/web/routes/admin_routes.py) protected by a role check or specific admin login logic.
Implement routes and forms (core/web/forms/admin_forms.py) for:
User Management: List users, view details, edit roles/status, delete users.
Model Management: List trained models (ModelVersion from DB), view metrics, activate/deactivate models, potentially trigger training (via Celery task).
Create corresponding admin templates (core/web/templates/admin/). Use a distinct base template for admin pages if desired.
Create a Flask CLI command (flask db create_admin) in entrypoints/cli.py to create an initial admin user.
Task 14: Asynchronous Tasks (Celery)
Goal: Offload long-running tasks like model training and prediction to background workers.
Actions:
Configure Celery: Set up broker (Redis) and result backend URLs in Flask config. Create Celery app instance, ensuring it integrates with Flask application context (core/tasks/__init__.py or entrypoints/celery_worker.py).
Define tasks in core/tasks/:
training_tasks.py: Task to run the main training logic from core/classifier/train.py.
prediction_tasks.py: Task to run prediction using core/classifier/predict.py, taking image path/ID as input, updating PredictionHistory upon completion.
Modify prediction triggers (in API and web upload routes) to call .delay() or .apply_async() on the Celery prediction task instead of running it synchronously.
Modify model training trigger (in admin panel or CLI if desired) to use the Celery training task.
Create entrypoints/celery_worker.py to run the Celery worker.
Create entrypoints/celery_beat.py if any scheduled tasks are needed.
Task 15: Testing
Goal: Implement automated tests for different parts of the application.
Actions: Create tests within the tests/ directory using pytest:
tests/unit/: Test individual functions/classes in isolation (e.g., model building blocks, utility functions, form validation). Mock external dependencies (DB, Redis, API calls).
tests/integration/: Test interactions between components (e.g., API endpoint calls trigger correct actions and DB changes, web form submissions work correctly). Requires setting up a test database/Redis.
tests/performance/ (Optional): Basic load tests for API endpoints or benchmarks for image processing.
Create tests/conftest.py to define fixtures (e.g., Flask test client, test database setup/teardown, sample data).
Ensure tests cover core logic, authentication, API endpoints, ML pipeline steps.
Task 16: Dockerization
Goal: Containerize the application and its dependencies for deployment.
Actions:
Create Dockerfile: Define base image (Python), set up workdir, copy requirements, install dependencies (including handling C++ build if applicable), copy application code, expose port, define entrypoint/CMD (e.g., using gunicorn or uWSGI). Handle multi-stage builds if needed (e.g., for C++ compilation).
Create .dockerignore to exclude unnecessary files/dirs from the build context.
Create docker-compose.yml: Define services for web (using the Dockerfile), worker (running celery_worker.py), beat (optional, running celery_beat.py), db (PostgreSQL image), redis (Redis image), nginx (optional, as reverse proxy). Configure networking, volumes (for persistent data like DB, models, uploads), environment variables (loading from .env).
Create docker-compose.dev.yml: Override docker-compose.yml for development. Mount source code directly into containers for live reloading, use development config, map different ports if needed.
Task 17: Documentation & Final Files
Goal: Create essential documentation and supporting project files.
Actions:
Write README.md covering project overview, features, setup, usage (running, training, testing), configuration, accuracy improvement tips, API usage basics, and attribution.
Create detailed documentation in docs/: api.md (API endpoint details), user_guide.md, deployment.md, contributing.md.
Create a Makefile with helper commands for common tasks (e.g., make build, make run-dev, make test, make lint, make migrations).
Create .gitignore.
Create CHANGELOG.md.
Add a LICENSE file.
Task 18: Review and Refine
Goal: Perform a final review of the generated project code.
Actions: Check for:
Consistency in coding style and structure.
Completeness based on the tasks.
Correct implementation of core features (auth, ML pipeline, API, etc.).
Security best practices (input validation, password hashing, secret management).
Functionality of Docker setup and basic commands.
Clarity and accuracy of documentation.