The widespread adoption of machine learning has inadvertently amplified societal biases and discrimination, with many consequential decisions now influenced by data-driven systems. In this scenario, fair machine learning techniques have become a frontier for AI researchers and practitioners. Addressing fairness is intricate: one cannot rely solely on the data used to train models or on the metrics that assess them, since this data is often the primary source of bias, much as it is in settings with noisy data. This paper delves into the convergence of these two research domains, highlighting the similarities and differences between fairness and noise in machine learning. We introduce the Fair Transition Loss, a novel method for fair classification inspired by label noise robustness techniques. Traditional loss functions tend to ignore the distributions of sensitive features and their impact on outcomes. Our approach uses transition matrices to adjust predicted label probabilities based on this otherwise ignored information. The empirical evaluation indicates that this method outperforms many benchmarked approaches in a variety of scenarios and remains competitive when compared with prominent fair classification strategies.
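As a rough illustration of the transition-matrix idea, below is a minimal sketch in the spirit of forward loss correction from the label noise literature. The function name, the per-group matrices `T`, and how they would be estimated are our assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def transition_corrected_loss(logits, labels, groups, T):
    """Forward-corrected cross-entropy with one transition matrix per sensitive group.

    logits: (B, C) raw model outputs
    labels: (B,)   observed class indices
    groups: (B,)   sensitive-group indices
    T:      (G, C, C) row-stochastic matrices, where T[g, i, j] is an assumed
            P(observed label j | true label i, group g)
    """
    clean_probs = F.softmax(logits, dim=1)  # model's estimate of the true label
    # Map clean probabilities to observed-label probabilities via the group's matrix.
    noisy_probs = torch.bmm(clean_probs.unsqueeze(1), T[groups]).squeeze(1)
    return F.nll_loss(torch.log(noisy_probs + 1e-12), labels)
```

With every `T[g]` set to the identity matrix, this reduces to plain cross-entropy, so the correction only acts where the sensitive-feature distributions are modeled as distorting the labels.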
LLM
The Economic Implications of Large Language Model Selection on Earnings and Return on Investment: A Decision Theoretic Model
Geraldo Xexéo, Filipe Braida, Marcus Parreiras, and 1 more author
Selecting language models in business contexts requires careful analysis of the final financial benefits of the investment. However, academia and industry tend to analyze LLMs solely in terms of performance. This work introduces a framework to evaluate LLMs that focuses on the earnings and return on investment that should be taken into account in business decision-making. We use a decision-theoretic approach to compare the financial impact of different LLMs, considering variables such as the cost per token, the probability of success in the specific task, and the gains and losses associated with LLM use. The study reveals how the superior accuracy of more expensive models can, under certain conditions, justify a greater investment through more significant earnings, but not necessarily a larger RoI. This article provides a framework for companies looking to optimize their technology choices, ensuring that investment in cutting-edge technology aligns with strategic financial objectives. In addition, we discuss how changes in operational variables influence the economics of using LLMs, offering practical insights for enterprise settings; we find that the predicted gains and losses, together with the probabilities of success and failure, are the variables to which the models are most sensitive.
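The decision-theoretic comparison can be made concrete with a small calculation. Below is a minimal sketch with illustrative numbers of our own; the gain, loss, cost, and success-probability values are assumptions, not figures from the paper.

```python
def expected_earnings(p_success, gain, loss, cost_per_call):
    """Expected profit of one LLM call: weighted gain minus weighted loss minus cost."""
    return p_success * gain - (1.0 - p_success) * loss - cost_per_call

def roi(p_success, gain, loss, cost_per_call):
    """Return on investment: expected profit relative to the amount spent."""
    return expected_earnings(p_success, gain, loss, cost_per_call) / cost_per_call

# A cheaper, weaker model versus a pricier, stronger one (hypothetical numbers).
print(expected_earnings(p_success=0.80, gain=1.00, loss=0.50, cost_per_call=0.002))  # 0.698
print(expected_earnings(p_success=0.95, gain=1.00, loss=0.50, cost_per_call=0.020))  # 0.905
print(roi(0.80, 1.00, 0.50, 0.002))  # 349.0
print(roi(0.95, 1.00, 0.50, 0.020))  # 45.25
```

With these numbers the stronger model earns more per call (0.905 versus 0.698) yet returns far less per unit of cost, matching the abstract's point that higher earnings do not necessarily imply a larger RoI.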
2023
IA
An online platform for COVID-19 diagnostic screening using a machine learning algorithm
Erito Marques de Souza Filho, Rodrigo de Souza Tavares, Bruno José Dembogurski, and 6 more authors
Class noise consists of errors in the labeling of the class. It can negatively affect a model's performance, and this effect can vary with the chosen model. For this reason, studies have emerged that evaluate the natural resistance of machine learning models to class noise. It is therefore relevant to evaluate the natural resistance of artificial neural networks to class noise, given their importance to deep learning. The goal of this work is to carry out an experiment to assess the influence of class noise on artificial neural networks by training them on noisy datasets. The results showed that the complexity of a network can influence its resistance to class noise.
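The experiment hinges on training networks over deliberately corrupted labels. Below is a minimal sketch of symmetric class-noise injection, assuming uniform label flips; the abstract does not specify the exact noise model used.

```python
import numpy as np

def inject_class_noise(y, noise_rate, n_classes, seed=0):
    """Flip a fraction `noise_rate` of labels uniformly to a different class."""
    rng = np.random.default_rng(seed)
    y_noisy = np.asarray(y).copy()
    flip = rng.random(len(y_noisy)) < noise_rate       # which samples get corrupted
    for i in np.nonzero(flip)[0]:
        others = [c for c in range(n_classes) if c != y_noisy[i]]
        y_noisy[i] = rng.choice(others)                # never flip to the same class
    return y_noisy
```

Training the same architecture on `inject_class_noise(y, r, k)` for increasing rates `r` yields the accuracy-versus-noise curves that such a study compares across network complexities.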
Patent: Computer Program. Registration number: BR512022002678-3, registration date: 31/10/2022, registering institution: INPI - Instituto Nacional da Propriedade Industrial
2021
Recsys
Simulating real profiles for shilling attacks: A generative approach
Collaborative Filtering (CF) approaches are vulnerable to Shilling Attacks, in which malicious users or companies inject a large number of fake profiles into a system in order to manipulate its recommendations. One problem with current Shilling Attack models is that they commonly use straightforward statistical templates, producing profiles with rating patterns different from the actual system data, which facilitates their detection and requires a larger number of profiles to achieve their goals. To address this problem and create profiles closer to reality, we propose using a generative model, the Variational Autoencoder (VAE), to learn the original data distribution. With a VAE, it is possible to generate new profiles based on real data without explicitly copying their actual ratings. The generated profiles are converted into malicious profiles by adding a rating for the target item. We test our attack model on the MovieLens 100k data set and compare it to attack models from the literature. Our results indicate that our model outperforms all other models on a model-based CF system, especially at low attack sizes (from 3% to 5%). Also, an analysis comparing the profiles generated by our model and by other approaches shows that our model's rating patterns are very similar to those of real profiles, which may indicate that attacks mounted using our approach are less likely to be detected. Thus, we show that our attack model represents an advance in Shilling Attack models, since its superior results on model-based CF and its possible indistinguishability from real profiles make it useful as a baseline for testing detection techniques and other tasks in the Shilling Attack area.
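Below is a minimal sketch of the attack-mounting step, assuming a trained decoder handle `decode` (a hypothetical name) that maps latent samples to rows of predicted ratings; the VAE training itself is omitted.

```python
import numpy as np

def mount_shilling_attack(decode, latent_dim, n_profiles, target_item, r_max=5.0, seed=0):
    """Sample realistic-looking profiles from a VAE decoder, then push one item."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_profiles, latent_dim))  # draws from the VAE prior
    profiles = decode(z)                               # (n_profiles, n_items) rating rows
    profiles[:, target_item] = r_max                   # inject the maximum target rating
    return profiles
```

Because the rating rows are sampled from a model of the real distribution rather than filled in from a statistical template, the injected profiles inherit the rating patterns of genuine users, which is the property the abstract credits for evading detection.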
2020
IA
Redes de bits como alternativa às redes neurais em problemas de aprendizado por reforço
Nickolas R Machado, Pedro CF Machado, Juliana MNS Zamith, and 3 more authors
In Anais da VI Escola Regional de Alto Desempenho do Rio de Janeiro, 2020
Artificial neural networks (ANNs) are widely used, for example in digital games, through intelligent agents that replicate human behavior. However, the computational cost of training them is usually high, forcing a choice between more processing or lower quality. This work therefore proposes the use of bit networks (BNs) as an alternative capable of maximizing processing while minimizing memory consumption. Compared with single-precision ANNs, the first results show a speedup of up to 91 times while using 32 times less memory.
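As a rough illustration of where those savings come from, below is a sketch of a single neuron operating on bits packed into one machine word; this shows the generic XNOR-and-popcount trick behind binary networks, which we assume here, rather than the paper's exact formulation.

```python
def bit_neuron(inputs: int, weights: int, n_bits: int, threshold: int) -> int:
    """One bit-network neuron over `n_bits` inputs packed into a single integer.

    XNOR counts how many input bits agree with the weight bits; the neuron
    fires when the agreement reaches the threshold. A 32-input neuron thus
    costs one XOR, one mask, and one popcount instead of 32 float multiplies,
    and its weights take 32 times less memory than single-precision floats.
    """
    agreement = ~(inputs ^ weights) & ((1 << n_bits) - 1)  # XNOR, masked to n_bits
    return int(bin(agreement).count("1") >= threshold)
```

For example, `bit_neuron(0b1011, 0b1001, n_bits=4, threshold=3)` returns 1, since three of the four bit positions agree.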
HD
Evidências, códigos e classificações: o ofício do historiador e o mundo digital
Alexandre Fortes, and Leandro Guimarães Marques Alvim
The article examines the impact of the global spread of digital technologies on the historian's craft. It starts from the analysis of the relationship between professional practice and the nature of historical knowledge formulated by some of the greatest historians of the twentieth century. It examines the social nature of language and its role in the constitution of historical evidence and sources, connecting this analysis with the technological advances of natural language processing. It discusses concepts from several branches of the social sciences that are relevant to understanding the development of human knowledge and the role of the encoding of information in the construction of narratives and in historical research. Finally, it presents an overview of the main methodologies in the field of artificial intelligence currently applied to historical research.
Patent: Computer Program. Registration number: BR512020002469-6, registration date: 13/05/2020, registering institution: INPI - Instituto Nacional da Propriedade Industrial
Patent: Computer Program. Registration number: BR512020002470-0, registration date: 19/04/2020, registering institution: INPI - Instituto Nacional da Propriedade Industrial
2019
IA
AMANDA: Semi-supervised density-based adaptive model for non-stationary data with extreme verification latency
Concept drift refers to a change over time in the relationship between input and output data distributions. A gradual concept drift, then, is a smooth and gradual change in these relations; it renders models obsolete and degrades prediction quality. There is also a challenging constraint: extreme verification latency in obtaining the true labels. In batch scenarios, state-of-the-art methods do not properly tackle these problems, whether because of high computational time, failure to select samples that represent the drift, or the need to tune several hyperparameters. Therefore, we propose AMANDA, a semi-supervised density-based adaptive model for non-stationary data. It has two variations: AMANDA-FCP, which selects a fixed number of samples, and AMANDA-DCP, which dynamically selects samples from the data. Our results indicate that these two variations outperform state-of-the-art methods on almost all synthetic and real datasets, with an improvement of up to 27.98% in average error. AMANDA-FCP improved the results under gradual concept drift even with a small amount of initial labeled data. Moreover, our results indicate that semi-supervised classifiers are improved when combined with our density-based methods. We therefore emphasize the importance of research directions based on this approach.
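A minimal sketch of the density-based selection at the heart of such methods, in the spirit of AMANDA-FCP's fixed selection size, is shown below; the `keep_fraction` and bandwidth values are assumptions, and the DCP variant would instead derive the fraction from the data.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def select_core_samples(X, keep_fraction=0.5, bandwidth=1.0):
    """Keep the densest fraction of samples as the core of the current concept."""
    log_density = KernelDensity(bandwidth=bandwidth).fit(X).score_samples(X)
    k = max(1, int(len(X) * keep_fraction))
    return X[np.argsort(log_density)[-k:]]      # highest-density samples survive
```

At each step, the retained core samples (with their predicted labels) seed the semi-supervised classifier for the next batch, letting the model track the drift without ever seeing new true labels.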
2018
IA
A Cognitive Architecture for Agent-Based Artificial Life Simulation
Ronaldo Vieira, Bruno Dembogurski, Leandro Alvim, and 1 more author
In Computational Science and Its Applications–ICCSA 2018: 18th International Conference, Melbourne, VIC, Australia, July 2-5, 2018, Proceedings, Part I, Jun 2018
The ability to simulate living beings that behave in a credible way is a fundamental aspect of digital games. This is due to their interdisciplinary character, which brings together different fields of knowledge to better understand biological life and its processes. In this context, the design of an intelligent agent is a hard task, as it involves a complex system with several interconnected components. In this work, a virtual mind architecture for intelligent agents is proposed that simulates the cognitive processes of an actual brain, in this case attention and memory, in order to reproduce behaviors similar to those of real living beings. A prototype is then presented in which the architecture is applied to agents representing virtual animals in a semantically modeled ecosystem, and a proof-of-concept experiment is conducted to demonstrate its effectiveness. In this experiment, the behavior of the virtual animals was consistent with reality, thus validating the architecture's ability to simulate living beings.
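As a loose illustration of an attention-plus-memory agent loop, here is a minimal sketch; the class shape, field names, and thresholds are our assumptions for illustration, not the paper's architecture.

```python
from collections import deque

class VirtualMind:
    """Toy agent mind: an attention filter feeding a bounded memory."""

    def __init__(self, memory_size=50, salience_threshold=0.5):
        self.memory = deque(maxlen=memory_size)        # bounded short-term memory
        self.salience_threshold = salience_threshold

    def attend(self, stimuli):
        # Attention: only stimuli salient enough get through to cognition.
        return [s for s in stimuli if s["salience"] >= self.salience_threshold]

    def step(self, stimuli):
        attended = self.attend(stimuli)
        self.memory.extend(attended)                   # memorize what was attended
        if attended:                                   # act on the most salient stimulus
            return max(attended, key=lambda s: s["salience"])["action"]
        return "wander"                                # default when nothing stands out
```

A virtual animal would call `step` once per simulation tick, with each stimulus a dict such as `{"salience": 0.8, "action": "flee"}`; richer behavior would come from letting the stored memories feed back into how salience is computed.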
IA
Density-Based Core Support Extraction for Non-stationary Environments with Extreme Verification Latency
Raul Sena Ferreira, Bruno Silva, Wendell Teixeira, and 2 more authors
In 2018 7th Brazilian Conference on Intelligent Systems (BRACIS), Jun 2018
Machine learning solutions usually assume that the training and test data have the same probability distribution, that is, that the data are stationary. However, in streaming scenarios the data distribution generally changes over time, that is, the data are non-stationary. The main challenge in such online environments is adapting the model to the constant drifts in the data distribution. Moreover, another important restriction can arise in online scenarios: extreme latency in verifying the labels. It is worth mentioning that the incremental drift assumption is that class distributions overlap at subsequent time steps. Hence, the core regions of the data distribution have significant overlap with the incoming data. Therefore, selecting samples from these core regions helps retain the most important instances, those that represent the new distribution. This selection is called core support extraction (CSE). Thus, we present a study of density-based algorithms applied in non-stationary environments. We compared KDE, GMM, and two variations of DBSCAN against standalone semi-supervised approaches. We validated these approaches on seventeen synthetic datasets and one real dataset, showing the strengths and weaknesses of these CSE methods across many metrics. We show that a semi-supervised classifier is improved by up to 68% on a real dataset when it is applied along with a density-based CSE algorithm. The results of KDE and GMM as CSE methods were close, but the KDE approach is more practical because it has fewer parameters.
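Mirroring the KDE sketch shown earlier, here is a minimal sketch of CSE with a GMM, the other main density model compared; keeping a fixed densest fraction is an assumption for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_core_support(X, keep_fraction=0.5, n_components=2):
    """Core support extraction: keep the samples the fitted GMM finds most likely."""
    gmm = GaussianMixture(n_components=n_components, random_state=0).fit(X)
    log_likelihood = gmm.score_samples(X)              # log p(x) under the mixture
    k = max(1, int(len(X) * keep_fraction))
    return X[np.argsort(log_likelihood)[-k:]]
```

The GMM variant trades KDE's single bandwidth for a component count plus covariance options, which illustrates why the abstract finds KDE the more practical choice.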
2017
Recsys
Autoencoders and recommender systems: COFILS approach
Collaborative Filtering to Supervised Learning (COFILS) transforms a Collaborative Filtering (CF) problem into a classical Supervised Learning (SL) problem. Applying COFILS reduces data sparsity and makes it possible to test a variety of SL algorithms rather than matrix decomposition methods. Its main steps are extraction, mapping, and prediction. First, a Singular Value Decomposition (SVD) generates a set of latent variables from the ratings matrix. Next, in the mapping phase, a new data set is generated in which each sample contains the latent variables of a user and of a rated item, together with a target that corresponds to the user's rating for that item. Finally, in the last phase, an SL algorithm is applied. One problem with COFILS is its dependency on SVD, which cannot extract non-linear features from the data and is not robust to noisy data. To address this problem, we propose replacing the SVD with a Stacked Denoising Autoencoder (SDA) in the first phase of COFILS. With an SDA, more useful and complex representations can be learned in a neural network with a local denoising criterion. We test our novel technique, namely Autoencoder COFILS (A-COFILS), on the MovieLens, R3 Yahoo! Music, and Movie Tweetings data sets and compare it to COFILS, as a baseline, and to state-of-the-art CF techniques. Our results indicate that A-COFILS outperforms COFILS on all the data sets, with an improvement of up to 5.9%. Also, A-COFILS achieves the best result on the MovieLens 100k data set and ranks among the top three algorithms on these data sets. Thus, we show that our technique represents an advance in the COFILS methodology, improving its results and making it a suitable method for the CF problem.
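Below is a minimal sketch of the denoising building block that would replace the SVD in the extraction phase, assuming masking corruption and a single layer; stacking, training loops, and the paper's exact hyperparameters are omitted.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """One layer of a stacked denoising autoencoder over user rating rows."""

    def __init__(self, n_items, n_latent, corruption=0.2):
        super().__init__()
        self.corruption = corruption
        self.encoder = nn.Linear(n_items, n_latent)
        self.decoder = nn.Linear(n_latent, n_items)

    def forward(self, x):
        # Local denoising criterion: corrupt the input with masking noise,
        # then learn to reconstruct the clean rating row from it.
        mask = (torch.rand_like(x) > self.corruption).float()
        z = torch.sigmoid(self.encoder(x * mask))  # latent features for the mapping phase
        return self.decoder(z), z
```

Training minimizes the reconstruction error of the first output against the uncorrupted row; the latent code `z` then plays the role that the SVD factors play in plain COFILS.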
2015
Recsys
Transforming collaborative filtering into supervised learning
Filipe Braida, Carlos E. Mello, Marden B. Pasinato, and 1 more author
Collaborative Filtering (CF) is a well-known approach for Recommender Systems (RS). This approach extrapolates rating predictions from ratings given by users on items, which are represented by a user-item matrix filled with a rating r_{i,j} given by a user i on an item j. Therefore, CF has been confined to this data structure, relying mostly on adaptations of supervised learning methods to deal with rating prediction and on matrix decomposition schemes to complete the unfilled positions of the rating matrix. Although there have been proposals to apply Machine Learning (ML) to RS, these works had to transform the rating matrix into the typical Supervised Learning (SL) data set, i.e., a set of pairwise tuples (x, y), where y is the corresponding class (the rating) of the instance x ∈ R^k. So far, the proposed transformations have been carefully crafted using domain information. However, in many applications this kind of information can be incomplete, uncertain, or stated in ways that are not machine-readable. Even when it is available, its usage can be very complex, requiring specialists to craft the transformation. In this context, this work proposes a domain-independent transformation from the rating matrix representation to a supervised learning dataset that enables SL methods to be fully explored in RS. In addition, our transformation is straightforward, in the sense that it is an automatic process that any layperson can perform, requiring no domain specialist. Our experiments show that our transformation, combined with SL methods, greatly outperforms classical CF methods.
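A minimal sketch of one such automatic transformation is shown below, assuming truncated-SVD latent factors and mean imputation of the missing entries; the paper's exact pipeline may differ.

```python
import numpy as np

def rating_matrix_to_sl_dataset(R, k=10):
    """Turn a rating matrix into a supervised-learning dataset (X, y).

    R: (n_users, n_items) array with np.nan marking the unfilled positions.
    Each rated (user, item) pair becomes one instance x in R^(2k) whose
    class y is the rating r_{i,j}.
    """
    filled = np.where(np.isnan(R), np.nanmean(R), R)     # naive imputation for the SVD
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    P, Q = U[:, :k] * s[:k], Vt[:k].T                    # user and item latent factors
    users, items = np.nonzero(~np.isnan(R))              # only observed ratings
    X = np.hstack([P[users], Q[items]])
    y = R[users, items]
    return X, y
```

Any off-the-shelf classifier or regressor can then be fit on `(X, y)`, with no domain knowledge needed beyond the rating matrix itself.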
2013
Recsys
Group Recommender Systems: Exploring Underlying Information of the User Space
This work proposes a new methodology for the Group Recommendation problem. In this approach, we choose the Most Representative User (MRU) as the group medoid in a user-space projection, and then generate the recommendation list based on their preferences. We evaluate our proposal using the well-known MovieLens dataset, employing two different measures to assess the group recommendation strategies. The obtained results seem promising, and our strategy shows empirical robustness compared with baselines from the literature.
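A minimal sketch of the medoid step, assuming Euclidean distances in the projected user space, is given below; the projection itself and the downstream list generation are omitted.

```python
import numpy as np

def most_representative_user(U):
    """Return the index of the group medoid in a user-space projection.

    U: (n_members, k) coordinates of the group's members. The MRU is the
    member whose total distance to all other members is smallest.
    """
    pairwise = np.linalg.norm(U[:, None, :] - U[None, :, :], axis=-1)
    return int(np.argmin(pairwise.sum(axis=1)))
```

The recommendation list for the whole group is then the list produced for this single user, which makes the approach as cheap as ordinary single-user recommendation once the medoid is found.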