Seminar - Improving Medical Document Classification via Feature Engineering

ECS PhD Proposal

Speaker: Mahdi Abdollahi
Time: Monday 17th December 2018 at 02:00 PM - 03:00 PM
Location: Cotton Club, Cotton 350

Abstract

Document classification (DC) is the task of assigning pre-defined labels to unseen documents using a model trained on the available labeled documents. DC has recently attracted much attention in the medical field because many problems can be formulated as classification problems. It can assist doctors in decision making, and correct decisions can reduce medical expenses. Medical documents have special attributes that distinguish them from other texts and make them difficult to analyze. For example, their many acronyms, abbreviations, and short expressions make it more challenging to extract information. The main limitations of current medical DC methods are that most state-of-the-art methods are not interpretable and that the training data is often insufficient. Furthermore, the classification accuracy is not satisfactory. The goal of this thesis is to enhance the input feature sets of DC methods to improve accuracy and interpretability. To this end, this work will develop new feature manipulation methods (such as feature weighting, feature selection, and feature construction) for supervised learning systems to introduce new, meaningful feature sets. This thesis will use information extraction techniques and Evolutionary Computation (EC) techniques such as Genetic Algorithms (GAs), Particle Swarm Optimisation (PSO), and Genetic Programming (GP) to achieve its objectives.
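
As a rough illustration of one technique named in the abstract, the sketch below uses a simple genetic algorithm to select bag-of-words features for a toy classifier. It is not the speaker's method: the example notes, the nearest-centroid classifier, the fitness function, and all names and parameter values (DOCS, loo_accuracy, evolve, pop_size, and so on) are assumptions made only for this example.

```python
import random

# Toy labelled "clinical notes", invented for illustration only.
DOCS = [
    ("pt c/o chest pain sob", "cardiac"),
    ("chest pain radiating to left arm", "cardiac"),
    ("sob on exertion ecg abnormal", "cardiac"),
    ("fever cough sore throat", "respiratory"),
    ("productive cough fever chills", "respiratory"),
    ("sore throat cough congestion", "respiratory"),
]

# Bag-of-words vocabulary: one candidate feature per distinct token.
VOCAB = sorted({w for text, _ in DOCS for w in text.split()})


def vectorise(text):
    """Return term counts for `text` in VOCAB order."""
    words = text.split()
    return [words.count(w) for w in VOCAB]


X = [vectorise(text) for text, _ in DOCS]
y = [label for _, label in DOCS]


def loo_accuracy(mask):
    """Leave-one-out accuracy of a nearest-centroid classifier
    that only sees the features switched on in `mask`."""
    if not any(mask):
        return 0.0
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

        def dist(centroid):
            # Squared Euclidean distance over the selected features only.
            return sum(m * (a - b) ** 2 for m, a, b in zip(mask, X[i], centroid))

        # Iterate labels in sorted order so ties are broken deterministically.
        pred = min(sorted(centroids), key=lambda lab: dist(centroids[lab]))
        correct += pred == y[i]
    return correct / len(X)


def fitness(mask):
    """Prefer higher accuracy first, then fewer selected features."""
    return (loo_accuracy(mask), -sum(mask))


def evolve(pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Evolve a binary feature mask with elitism, tournament selection,
    one-point crossover, and bit-flip mutation."""
    rng = random.Random(seed)
    n = len(VOCAB)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = pop[:2]  # elitism: carry the two best masks over unchanged
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)  # tournament of size 3
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)                  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)


best = evolve()
selected = [w for w, keep in zip(VOCAB, best) if keep]
```

After running, `selected` holds the tokens kept by the best mask and `loo_accuracy(best)` gives its leave-one-out accuracy on the toy data. A short list of selected tokens is also easier to inspect by hand, which is one way feature selection can support interpretability.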
