Abstract:
Despite being first described in the late 1970s, hot subdwarf stars are still not fully understood, especially regarding the conditions and mechanisms responsible for their formation. One of the biggest problems in this field is the small number of catalogued and confirmed objects, which hinders the development and testing of hypotheses against observational data. Even so, hot subdwarfs are important to several fields of astronomy, impacting studies that range from stellar evolution to the physical properties of our own and other galaxies. Thus, the main objective of this study is to develop solutions for identifying new hot subdwarf candidates and thereby expand the current catalogues. To this end, Machine Learning (ML) algorithms are used, which can process large amounts of data efficiently. The models created in this work are based on Random Forest algorithms, and the data used are the 12 photometric magnitudes provided by J-PLUS and S-PLUS. After crossmatching these two surveys with a catalogue of confirmed hot subdwarfs, samples of about 17,000 objects (comprising both general stars and hot subdwarfs) were generated for each survey. With careful optimization of their hyperparameters, the models obtained excellent results on the test samples (F1 score of 0.88 for J-PLUS and 0.94 for S-PLUS) and were used to identify 2896 new hot subdwarf candidates within the two surveys. Furthermore, as a way of validating the approach, Random Forest models for predicting stellar parameters ($T_{eff}$, log(g) and [Fe/H]) were also developed for J-PLUS and S-PLUS objects. The development samples for these models were created by crossmatching the two surveys with LAMOST stellar parameter data, resulting in about 211 thousand objects for J-PLUS and about 66 thousand for S-PLUS.
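The classification pipeline summarized above (12 photometric magnitudes as features, a Random Forest classifier, F1 score on a held-out test sample) can be illustrated with a minimal sketch. The synthetic photometry, the class balance, and the hyperparameters below are purely illustrative assumptions, not the values used in this work; hot subdwarfs are simulated simply as objects that are brighter in the bluest bands.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in for the 12 J-PLUS/S-PLUS magnitudes.
# Hot subdwarfs are hot and blue, so the positive class is made
# brighter (smaller magnitudes) in the four bluest bands.
n_stars, n_sd = 1500, 500
X_stars = rng.normal(16.0, 1.0, size=(n_stars, 12))
X_sd = rng.normal(16.0, 1.0, size=(n_sd, 12))
X_sd[:, :4] -= 1.5

X = np.vstack([X_stars, X_sd])
y = np.concatenate([np.zeros(n_stars), np.ones(n_sd)])  # 1 = hot subdwarf

# Held-out test sample, stratified to preserve the class imbalance.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Illustrative hyperparameters; in practice these would be optimized.
clf = RandomForestClassifier(
    n_estimators=300, class_weight="balanced", random_state=0
)
clf.fit(X_tr, y_tr)

print(f"F1 score on test sample: {f1_score(y_te, clf.predict(X_te)):.2f}")
```

The same structure carries over to the stellar parameter models by swapping the classifier for a `RandomForestRegressor` with $T_{eff}$, log(g) or [Fe/H] as the target.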
For these predictions, the use of absolute magnitudes as input variables was tested, which improved the log(g) predictions by $\sim30\%$. After hyperparameter optimization, the models obtained in this work showed good performance on the test samples, with errors of $\sim$50 K for $T_{eff}$, $\sim$0.07 dex for log(g) and $\sim$0.08 dex for [Fe/H]. Finally, the models were used to create stellar parameter catalogues for J-PLUS and S-PLUS with 3 million and 5 million objects, respectively.

Keywords: Random Forests; Hot subdwarfs; Stellar Parameter Prediction; J-PLUS; S-PLUS