CatBoost in MATLAB for a high-dimensional dataset

Dear friend,
I am currently trying various approaches to improve my model's performance on a high-dimensional spectrometry dataset for binary classification. My aim is to improve on the 0.74 AUC that LightGBM achieves in Python on this dataset. However, I am struggling to get anywhere close to this using MATLAB's variable selection and statistics/machine learning packages. Would it be possible to provide CatBoost for MATLAB, or is there a model that would perform better than LightGBM on a high-dimensional dataset (e.g., a spectrometry dataset with 6000 variables)?
Thanks,
s0810110

Answers (1)

Shubham on 18 Jan 2024

Hi Tim,
There isn't a direct implementation of CatBoost for MATLAB. However, there are a few strategies you could consider to potentially improve the performance of your models on high-dimensional data in MATLAB:
Feature Selection/Reduction:
  • Use MATLAB's built-in functions for feature selection, such as sequentialfs (sequential feature selection), relieff (ReliefF algorithm), or fscmrmr (Minimum Redundancy Maximum Relevance). Refer to this documentation link: https://in.mathworks.com/help/stats/sequentialfs.html
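  • Consider dimensionality reduction with PCA (pca function) to reduce the number of variables while retaining most of the variance in the data. t-SNE (tsne function) can help you visualize class structure, but it does not preserve variance and MATLAB's tsne cannot embed new observations, so it is better suited to exploration than to building model inputs. Refer to this documentation link: https://in.mathworks.com/help/stats/tsne.html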
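As a rough illustration of the two options above, here is a minimal sketch using fscmrmr and pca from Statistics and Machine Learning Toolbox. The data are synthetic stand-ins, and the sizes, the number of kept features k, and the 95% variance threshold are all assumptions chosen for the example, not recommendations for your spectra.

```matlab
% Synthetic stand-in for a spectrometry matrix (assumption: rows are samples).
rng(0);                                   % reproducible random data
n = 200; p = 500;                         % hypothetical sizes, kept small for speed
X = randn(n, p);
y = double(X(:, 1) + X(:, 2) + 0.5*randn(n, 1) > 0);   % labels driven by two columns

% Option 1: rank predictors with MRMR and keep the top k (k is arbitrary here).
[idx, mrmrScores] = fscmrmr(X, y);
k = 50;
Xsel = X(:, idx(1:k));

% Option 2: keep the principal components that explain 95% of the variance.
[coeff, score, ~, ~, explained, mu] = pca(X);
nComp = find(cumsum(explained) >= 95, 1);
Xpca = score(:, 1:nComp);

% New observations must be projected with the same centering and loadings.
Xnew = randn(5, p);
XnewPca = (Xnew - mu) * coeff(:, 1:nComp);
```

If you estimate performance with cross-validation, run the selection or PCA step inside each training fold only. Selecting features on the full dataset first tends to inflate the AUC.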
Ensemble Methods:
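One way to try this in MATLAB is boosted decision trees via fitcensemble, which is probably the closest built-in relative of LightGBM or CatBoost (it is not an implementation of either). The sketch below is only a starting point: the data are synthetic, and the method, number of learning cycles, learning rate, and tree size are assumed values for illustration.

```matlab
% Synthetic binary problem (assumption: stands in for the real spectra).
rng(0);
n = 300; p = 100;
X = randn(n, p);
y = double(X(:, 1) - X(:, 2) + 0.5*randn(n, 1) > 0);

% Stratified 70/30 hold-out split.
cv  = cvpartition(y, 'HoldOut', 0.3);
Xtr = X(training(cv), :);  ytr = y(training(cv));
Xte = X(test(cv), :);      yte = y(test(cv));

% Boosted shallow trees; all settings here are illustrative guesses.
t   = templateTree('MaxNumSplits', 10);
mdl = fitcensemble(Xtr, ytr, 'Method', 'LogitBoost', ...
    'NumLearningCycles', 200, 'LearnRate', 0.1, 'Learners', t);

% Score columns follow mdl.ClassNames, so column 2 belongs to class 1.
[~, scores] = predict(mdl, Xte);
[~, ~, ~, auc] = perfcurve(yte, scores(:, 2), 1);
fprintf('Hold-out AUC: %.3f\n', auc);
```

Other values for 'Method', such as 'GentleBoost', 'AdaBoostM1', or 'Bag', can be swapped in to compare boosting with bagging on the same split.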
Hyperparameter Optimization:
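For example, fitcensemble can tune its own settings through the 'OptimizeHyperparameters' option, which uses Bayesian optimization by default. The following is a sketch under assumptions: synthetic data, a deliberately tiny budget of 10 evaluations, and an arbitrary choice of which three parameters to tune.

```matlab
% Synthetic binary problem (assumption: stands in for the real spectra).
rng(0);
n = 200; p = 50;
X = randn(n, p);
y = double(X(:, 1) - X(:, 2) + 0.5*randn(n, 1) > 0);

% Small search budget so the example finishes quickly; increase it for real use.
opts = struct('MaxObjectiveEvaluations', 10, ...
              'AcquisitionFunctionName', 'expected-improvement-plus', ...
              'ShowPlots', false, 'Verbose', 0);

% Tune the number of trees, the learning rate, and the tree size.
mdl = fitcensemble(X, y, 'Method', 'LogitBoost', ...
    'OptimizeHyperparameters', {'NumLearningCycles', 'LearnRate', 'MaxNumSplits'}, ...
    'HyperparameterOptimizationOptions', opts);

% Best observed setting, returned as a one-row table.
best = mdl.HyperparameterOptimizationResults.XAtMinObjective;
disp(best);
```

Note that the default objective is cross-validated misclassification loss rather than AUC, so the selected setting is not guaranteed to be the one with the best AUC.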
Advanced Preprocessing:
Deep Learning:
AUC is a good metric for binary classification problems, but you should also consider others such as accuracy, precision, recall, and F1-score for a comprehensive evaluation.
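To make the metrics concrete, here is a small worked sketch that computes them from a confusion matrix and from perfcurve. The labels and scores are made-up toy values, and the 0.5 decision threshold is an assumption.

```matlab
% Toy labels and classifier scores (made-up values for illustration only).
yTrue = [1 1 1 0 0 0 1 0]';
score = [0.9 0.8 0.4 0.3 0.6 0.1 0.7 0.2]';
yPred = double(score >= 0.5);             % assumed decision threshold of 0.5

% With 'Order' [0 1], rows are true classes and columns are predicted classes.
C  = confusionmat(yTrue, yPred, 'Order', [0 1]);
TN = C(1, 1);  FP = C(1, 2);
FN = C(2, 1);  TP = C(2, 2);

accuracy  = (TP + TN) / sum(C(:));
precision = TP / (TP + FP);
recall    = TP / (TP + FN);
f1        = 2 * precision * recall / (precision + recall);

% AUC uses the raw scores, so it does not depend on the threshold.
[~, ~, ~, auc] = perfcurve(yTrue, score, 1);

fprintf('acc %.3f  prec %.3f  rec %.3f  F1 %.3f  AUC %.4f\n', ...
    accuracy, precision, recall, f1, auc);
```

Accuracy, precision, recall, and F1 all change with the threshold while AUC does not, which is one reason to report them together.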
