Due to the global pandemic caused by the COVID-19 virus, in 2020 several challenges arose in natural language processing research aimed at providing automatic systems that efficiently answer questions related to the disease. This project presents an adaptation of an existing question answering system developed to address one of these challenges, EPIC-QA. Within this setting, the challenge proposes two tasks: answering a set of questions posed by general users and a set of questions formulated by scientific or medical profiles. To this end, the dataset includes two distinct document collections, one of scientific documents and another of more general texts, each better suited to its corresponding task. The goal of the project was to analyse the impact of using pre-trained models in the answer extraction module, comparing three different models. The experimental system is prepared to answer any type of question and consists of an information retrieval module built on an index of the dataset documents, followed by the answer extraction module, for which three different configurations were implemented using Transformer-based BERT models trained on corpora adapted to the scientific and medical domains: SciBERT, ELECTRA and RoBERTa. To analyse the results, the method proposed in EPIC-QA, based on nuggets manually annotated by the evaluators, was used. The experimental results show that one of the proposed configurations adapts best to the different scenarios, obtaining the highest scores in all cases.
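As an illustration of the kind of answer extraction module described above, the following is a minimal sketch using the HuggingFace Transformers question-answering pipeline. The checkpoint name and the example question and context are assumptions for demonstration; the abstract does not specify the exact fine-tuned checkpoints used in the project.

```python
# Minimal sketch of an extractive answer-extraction step, assuming a
# HuggingFace Transformers environment. The checkpoint below is an
# illustrative RoBERTa QA model, not necessarily the one used in the project.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

# A retrieved passage would normally come from the information-retrieval
# module; here a hard-coded context stands in for it.
context = (
    "COVID-19 is caused by the SARS-CoV-2 virus and spreads mainly "
    "through respiratory droplets produced when an infected person "
    "coughs, sneezes or talks."
)
result = qa(question="How does COVID-19 spread?", context=context)
print(result["answer"], result["score"])  # extracted span and its confidence
```

In a full pipeline of this kind, the score returned for each candidate span can be used to rank answers extracted from the top-k retrieved documents.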
The employment of modern technologies is widespread in our society, so the inclusion of practical activities in education has become both essential and useful. These activities are most noticeable in Engineering, in areas such as cybersecurity, data science, artificial intelligence, etc. They acquire even more relevance with a distance education methodology, as in our case. The inclusion of these practical activities has clear advantages, such as (1) promoting critical thinking and (2) improving students' abilities and skills for their professional careers. There are several options, such as the use of remote and virtual laboratories, virtual reality and game-based platforms, among others. This work addresses the development of a new cloud game-based educational platform, which defines a modular and flexible architecture (using light containers). This architecture provides interactive and monitoring services and data storage in a transparent way. The platform uses gamification to integrate the game as part of the instructional process. The CyberScratch project is a particular implementation of this architecture focused on cybersecurity game-based activities. Data privacy management is a critical issue for these kinds of platforms, so the architecture is designed with this feature integrated into its components. To achieve this goal, we first focus on all the privacy aspects of the data generated by our cloud game-based platform, considering the European legal context for data privacy following the GDPR and the ISO/IEC TR 20748-1:2016 recommendations for Learning Analytics (LA). Our second objective is to provide implementation guidelines for efficient data privacy management for our cloud game-based educational platform. These contributions are not found in current related works. The CyberScratch project, which was approved by UNED for the year 2020, considers using the xAPI standard for data handling and services for the game editor, game engine and game monitor modules of CyberScratch. Therefore, apart from considering GDPR privacy and LA recommendations, our cloud game-based architecture covers all phases from game creation to the final users' interactions with the game.
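To make the xAPI data handling concrete, the sketch below builds a minimal xAPI statement for a game event, with the actor pseudonymized via a salted hash as one possible GDPR-friendly measure. The URIs, salt and hashing scheme are illustrative assumptions, not the platform's actual implementation.

```python
# Minimal sketch of building an xAPI statement for a game-based activity,
# with a pseudonymized actor as one illustrative GDPR-friendly measure.
# The URIs and hashing scheme are assumptions, not CyberScratch's design.
import hashlib
import json
from datetime import datetime, timezone

def pseudonymize(user_id: str, salt: str) -> str:
    """Replace a direct identifier with a salted hash (pseudonymization)."""
    return hashlib.sha256((salt + user_id).encode()).hexdigest()

statement = {
    "actor": {
        # Account-based actor carrying a hashed ID instead of name/email.
        "objectType": "Agent",
        "account": {
            "homePage": "https://example.org/cyberscratch",  # hypothetical
            "name": pseudonymize("student42", salt="per-course-salt"),
        },
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/completed",
        "display": {"en-US": "completed"},
    },
    "object": {
        "id": "https://example.org/activities/cyber-challenge-1",  # hypothetical
        "definition": {"name": {"en-US": "Cybersecurity challenge 1"}},
    },
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
print(json.dumps(statement, indent=2))
```

A statement like this would be emitted by the game engine and stored by the monitoring service, keeping direct identifiers out of the Learning Analytics records.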
Disruptions are dangerous events in tokamaks that require mitigation methods to alleviate their detrimental effects. A prerequisite to trigger any mitigation action is the existence of a reliable disruption predictor. This article assesses a predictor that linearly relates consecutive samples of a single quantity (in particular, the magnetic perturbation time derivative signal has been used). With this kind of predictor, the recognition of disruptions does not depend on how large the signal amplitude is but on how large the signal increments are: small increments mean smooth plasma evolution, whereas abrupt increments reflect a non-smooth evolution and a potential risk of disruption. Results are presented with data from high-beta discharges of the JT-60U tokamak. Two training methods have been tested: a classical approach, in which the more training data the better, and an adaptive method that starts from scratch. In both cases the success rate is about 95%. It should be noted that predictors based on signal increments, and their adaptive versions, can be of great interest for next-generation devices such as JT-60SA or ITER.
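A minimal sketch of the increment-based idea follows, assuming a uniformly sampled signal: the alarm condition depends on the size of consecutive-sample increments rather than on the signal amplitude itself. The synthetic signal and the threshold calibration are illustrative assumptions, not the paper's actual data or settings.

```python
# Minimal sketch of an increment-based detector: the alarm depends on how
# large consecutive-sample increments are, not on the signal amplitude.
# The signal, sampling and threshold are synthetic/illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
signal = np.cumsum(rng.normal(0.0, 0.01, n))          # smooth evolution
signal[800:] += np.cumsum(rng.normal(0.0, 0.2, 200))  # abrupt final phase

increments = np.abs(np.diff(signal))            # |s_k - s_{k-1}|
threshold = 10.0 * np.median(increments[:500])  # calibrated on quiet data

alarm_idx = np.argmax(increments > threshold)   # first sample above threshold
if increments[alarm_idx] > threshold:
    print(f"alarm raised at sample {alarm_idx + 1}")
else:
    print("no alarm")
```

The key design choice, as stated in the abstract, is that the threshold acts on increments, so a large but smoothly evolving signal never triggers the alarm.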
The development of the Internet of Things (IoT) benefits from 1) the connections between devices equipped with multiple sensors; 2) wireless networks; and 3) the processing and analysis of the gathered data. The growing interest in the use of IoT technologies has led to the development of numerous diverse applications, many of which are based on knowledge of the end user's location and profile. This paper investigates the characterization of Bluetooth signal behavior using 12 different supervised learning algorithms as a first step toward the development of fingerprint-based localization mechanisms. We then explore the use of metaheuristics to determine the best radio power transmission setting, evaluated in terms of the accuracy and mean error of the localization mechanism. We further tune the hyperparameters of the supervised algorithms. A comparative evaluation of the 12 supervised learning algorithms and two metaheuristics under two different system parameter settings provides valuable insights into the use and capabilities of the various algorithms for the development of indoor localization mechanisms.
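To illustrate fingerprint-based localization with one supervised learner of the kind compared above, the sketch below trains a k-nearest-neighbours classifier on synthetic Bluetooth RSSI vectors and tunes its hyperparameters with a grid search. The beacon count, zones and RSSI values are assumptions for demonstration, and k-NN stands in for any of the 12 algorithms evaluated.

```python
# Minimal sketch of RSSI fingerprint localization with one supervised
# learner (k-NN) plus hyperparameter tuning, using synthetic data.
# Beacon count, zones and RSSI values are illustrative assumptions.
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
n_beacons, n_zones, per_zone = 4, 3, 60
# Each zone has a characteristic mean RSSI per beacon (in dBm).
zone_means = rng.uniform(-90, -40, size=(n_zones, n_beacons))
X = np.vstack([zone_means[z] + rng.normal(0, 4, (per_zone, n_beacons))
               for z in range(n_zones)])
y = np.repeat(np.arange(n_zones), per_zone)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Grid search over k and distance weighting, as one simple tuning strategy.
grid = GridSearchCV(KNeighborsClassifier(),
                    {"n_neighbors": [1, 3, 5, 7],
                     "weights": ["uniform", "distance"]},
                    cv=5)
grid.fit(X_tr, y_tr)
print("best params:", grid.best_params_)
print("test accuracy:", grid.score(X_te, y_te))
```

In a real deployment the rows of X would be RSSI fingerprints collected at known positions, and the labels would be the zones or reference points where they were measured.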
Detecting disruptions with sufficient anticipation time is essential to undertake any form of remedial strategy, mitigation or avoidance. Traditional predictors based on machine learning techniques can perform very well if properly optimised, but they do not provide a natural estimate of the quality of their outputs and they typically age very quickly. In this paper a new set of tools, based on probabilistic extensions of support vector machines (SVM), are introduced and applied for the first time to JET data. The probabilistic output constitutes a natural qualification of the prediction quality and provides additional flexibility. An adaptive 'from scratch' training strategy has also been devised, which preserves the performance even when the experimental conditions change significantly. Large JET databases of disruptions, covering entire campaigns and thousands of discharges, have been analysed, both for the graphite wall and for the ITER-Like Wall. Performance significantly better than that of any previous predictor using adaptive training has been achieved, satisfying even the requirements of the next generation of devices. The adaptive approach to training has also provided unique information about the evolution of the operational space. The fact that the developed tools give the probability of disruption improves the interpretability of the results, provides an estimate of the predictor quality and gives new insights into the physics. Moreover, the probabilistic treatment makes it easier to integrate these classifiers into general decision support and control systems.
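A minimal sketch of a probabilistic SVM output follows, assuming scikit-learn: SVC with probability=True applies Platt-style calibration, so each prediction comes with a disruption probability that can qualify its own quality. The two features and the synthetic data are assumptions; the actual JET predictors use real diagnostic signals and the adaptive training scheme described above.

```python
# Minimal sketch of a probabilistic SVM classifier: probability=True adds
# Platt-style calibration so each output is a probability rather than a
# hard label. Data here are synthetic; real predictors use JET diagnostics.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# Two illustrative features per sample (e.g. normalized diagnostic values).
X_safe = rng.normal([0.0, 0.0], 0.5, (200, 2))
X_disr = rng.normal([2.0, 2.0], 0.5, (200, 2))
X = np.vstack([X_safe, X_disr])
y = np.array([0] * 200 + [1] * 200)   # 1 = disruptive

clf = SVC(kernel="rbf", probability=True).fit(X, y)

# The probabilistic output both classifies and qualifies the prediction.
sample = np.array([[1.4, 1.1]])
p_disruption = clf.predict_proba(sample)[0, 1]
print(f"probability of disruption: {p_disruption:.2f}")
```

A probability of this kind, unlike a hard label, can feed directly into a decision support or control system that weighs the cost of a false alarm against that of a missed disruption.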