MACHINE LEARNING-BASED STATIC MALWARE DETECTION

Detect malware in Windows executables using static PE file analysis with machine learning.

Single file: Enter the PE header values for one file to get a prediction.

Batch: Upload a CSV covering multiple files to get predictions and, when labels are included, evaluation metrics.

Feature Extraction: Extract 27 features from the PE header of a Windows executable. The analysis is static, so the file is never run.
Preprocessing: Handle missing values, scale features, and prepare data for ML models.
Model Training: Train multiple models (Random Forest, XGBoost, etc.) on the Brazilian malware dataset (~50,000 samples).
Prediction: Predict whether a file is malware (1) or goodware (0), with a confidence score.
Evaluation: For batch uploads that include labels, report AUC, accuracy, and the confusion matrix.
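
The preprocessing and evaluation steps above can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual implementation: median imputation and z-score scaling are assumed choices for "handle missing values" and "scale features", the 0.5 decision threshold is an assumption, and the function names (impute_and_scale, confusion_matrix, accuracy, roc_auc) and the toy labels and scores are hypothetical.

```python
from statistics import mean, median, pstdev


def impute_and_scale(rows):
    """Fill None with the column median, then z-score each column (assumed choices)."""
    scaled_cols = []
    for col in zip(*rows):
        present = [v for v in col if v is not None]
        fill = median(present)
        filled = [fill if v is None else v for v in col]
        mu = mean(filled)
        sigma = pstdev(filled) or 1.0  # constant column: avoid dividing by zero
        scaled_cols.append([(v - mu) / sigma for v in filled])
    return [list(row) for row in zip(*scaled_cols)]


def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] with goodware = 0 and malware = 1."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random malware sample outscores a random
    goodware sample, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two features per file, one missing value.
features = [[1.0, None], [3.0, 4.0], [5.0, 8.0]]
scaled = impute_and_scale(features)

# Toy labels and model confidence scores (hypothetical).
y_true = [0, 0, 1, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 threshold

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(cm, round(acc, 3), round(auc, 3))
```

The pairwise AUC computation is quadratic in the number of samples, which is fine for illustration; on a dataset of roughly 50,000 samples a rank-based or library implementation would be the more practical choice.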

This system uses machine learning models trained on PE file characteristics: