Dr Vahid Seydi

Research Fellow in Data Science

Contact info

Vahid Seydi is a Research Fellow in Data Science and Machine Learning in the School of Ocean Science at Bangor University. Prior to Bangor, he was an Assistant Professor in the Department of AI at Azad University, South Tehran Branch.

As a member of the iMarDIS project team, he works on a data infrastructure that brings together diverse ocean science datasets and makes them available to research and industrial partners within the offshore renewable and ocean sciences community. The project has two further aims: visualising this data so that it is informative and well exploited in education, research, and industry (iMarDIS-Analyser is an example of such visualisation), and investigating how AI and machine learning can be applied to the collected marine data.

He received a B.Sc. (2005) in software engineering and an M.Sc. (2007) and PhD (2014) in AI from the Department of Computer Science at Azad University, Science and Research Branch, Tehran, Iran. He was awarded his current research fellowship in 2020 and a merit-based scholarship to attend the School of AI in Rome, Italy, in 2019. He also held a full scholarship from Azad University (2010-2014) and a KNTU ISLAB Research Fellowship (2007-2010). He ranked first among the graduates of 2004-2005.

Research Interests:

Applying the following concepts to data relevant to ocean science:

  • Deep Learning
  • Explainable Machine Learning
  • Reinforcement Learning
  • Optimization

Overview

Background

Vahid has been teaching and researching in various fields of machine learning, data mining, and optimization since 2014. Previously, he researched derivative-free optimization algorithms and their hybridization with derivative-based optimization methods for training neural networks and fuzzy systems. He also researched multi-objective optimization methods. Since 2014 he has concentrated on large-scale data modelling using machine learning methods to solve real-world problems.

Vahid has worked across a wide range of machine learning topics, including regression, classification, retrieval, clustering, reinforcement learning, probabilistic graphical models, Gaussian processes, recommender systems, social network analysis, association rule mining, and optimization methods. This work has involved diverse models and data types such as tabular data, text, images, audio signals, and time series.

He has implemented different types of deep learning models, such as CNN-based models, RNN-based models, attention-based models, and graph CNNs, as well as other well-known machine learning models, such as regression-based models, support vector machines, non-parametric models, ensemble decision-tree-based models, Gaussian mixture models, HMMs, and PGMs, in various types of applications.

 

Current project

As a member of the iMarDIS project team, he works on a data infrastructure that brings together diverse ocean science datasets and makes them available to research and industrial partners within the offshore renewable and ocean sciences community. The project has two further aims: visualising this data so that it is informative and well exploited in education, research, and industry, and investigating how AI and machine learning can be applied to the collected marine data.

Some of my previous projects

(2020) Deep Domain Adaptation for Ophthalmic Image Classification

Domain adaptation is an attractive approach when a large amount of labelled data is available with similar properties but from a different domain. It is effective in image classification tasks where obtaining sufficient labelled data is challenging. We propose a novel method, named SELDA, for stacking ensemble learning that extends three domain adaptation methods to solve real-world problems effectively. The main assumption is that combining base domain adaptation models yields a more accurate and robust model by exploiting the strengths of each base model. We extend Maximum Mean Discrepancy (MMD), low-rank coding, and Correlation Alignment (CORAL) to compute the adaptation loss in the three base models. We also use a network with two fully connected layers as a meta-model that stacks the output predictions of these three well-performing domain adaptation models to obtain high accuracy in ophthalmic image classification tasks.
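As a rough illustration of two of the adaptation losses named above, the sketch below computes a linear-kernel MMD and the CORAL loss between a source and a target feature batch in plain Python. It shows the standard textbook forms only, not the extended versions used in SELDA, and the function names and toy batches are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # n samples x d features


def column_means(x: Matrix) -> List[float]:
    """Mean of each feature column."""
    n = len(x)
    return [sum(row[j] for row in x) / n for j in range(len(x[0]))]


def covariance(x: Matrix) -> Matrix:
    """Unbiased d x d feature covariance of an n x d batch (requires n >= 2)."""
    n, d = len(x), len(x[0])
    mu = column_means(x)
    return [
        [sum((row[i] - mu[i]) * (row[j] - mu[j]) for row in x) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def linear_mmd(source: Matrix, target: Matrix) -> float:
    """Squared MMD with a linear kernel: squared distance between feature means."""
    ms, mt = column_means(source), column_means(target)
    return sum((a - b) ** 2 for a, b in zip(ms, mt))


def coral_loss(source: Matrix, target: Matrix) -> float:
    """Squared Frobenius distance between covariances, scaled by 1 / (4 d^2)."""
    d = len(source[0])
    cs, ct = covariance(source), covariance(target)
    sq = sum((cs[i][j] - ct[i][j]) ** 2 for i in range(d) for j in range(d))
    return sq / (4 * d * d)


# Toy batches (made up): the target is shifted and more spread out.
source_batch = [[0.0, 0.0], [2.0, 2.0]]
target_batch = [[0.0, 0.0], [4.0, 4.0]]
print(linear_mmd(source_batch, target_batch))  # 2.0
print(coral_loss(source_batch, target_batch))  # 9.0
```

In a deep model, either quantity would typically be added to the classification loss so that source and target features are pulled towards matching statistics.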

(2019) Predicting failures in medium voltage lines from a sequence of SCADA events

One of the goals of reliability is to identify and manage the risks around assets that could fail and cause unnecessary and expensive downtime. Organizations know it is important to identify areas of potential failures and rate them in terms of likelihood and consequence.

ENEL distribution must manage a very complex reality: several control centers (STUX and STM Systems), more than 2200 primary substations (HV/MV), and more than 100000 remote-controlled secondary substations (MV/LV). In substation automation systems, SCADA performs operations such as bus voltage control, bus load balancing, circulating current control, overload control, transformer fault protection, and bus fault protection.

The main idea investigated in this project was to apply predictive maintenance to the medium voltage lines using only SCADA event messages (each of which is coded as a unique string) in order to predict component failures in the distribution grid.

In this project, a wide range of methods was used to identify patterns and signals in the data and to classify anomalies within the SCADA events that lead to faults in the medium voltage lines. First, the work focused on analysing the data in order to understand how to frame the problem. Second, both supervised and unsupervised techniques were applied to learn patterns and identify which sequences of events are anomalous.

The problem was formulated both as a sentiment analysis problem (supervised) and as an anomaly detection problem (unsupervised) in NLP. This formulation rests on the observation that, for a given horizon and sliding window, a sequence of SCADA events that either does or does not lead to a fault can be treated as a sequence of words in a sentence. After applying embedding methods, a CNN for text classification, RNN-based structures (RNN, LSTM, GRU), a customized WaveNet, a VAE, and a GAN were used to predict sequences of events that lead to a fault. Moreover, attention mechanisms were used to extract the SCADA events that are important to the failure and to aggregate the representations of those informative events.
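The sliding-window and horizon framing can be sketched as follows. This is a minimal illustration under assumed conventions, not the project's actual preprocessing: the event codes, the `make_windows` helper, and the choice to label a window positive when a fault falls within the next `horizon` positions are all hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(
    events: Sequence[str],
    fault_positions: Sequence[int],
    window: int,
    horizon: int,
) -> List[Tuple[List[str], int]]:
    """Slice an event stream into (sentence, label) pairs.

    Each sentence is events[start:start + window]. Its label is 1 when a
    fault occurs at one of the `horizon` positions right after the window,
    and 0 otherwise.
    """
    faults = set(fault_positions)
    samples = []
    for start in range(len(events) - window + 1):
        end = start + window  # first position after the window
        label = int(any(p in faults for p in range(end, end + horizon)))
        samples.append((list(events[start:end]), label))
    return samples


# Made-up event codes; position 4 is assumed to coincide with a line fault.
stream = ["E1", "E2", "E3", "E4", "E5", "E6"]
for sentence, label in make_windows(stream, fault_positions=[4], window=2, horizon=2):
    print(sentence, label)
```

Each window then plays the role of a sentence whose tokens can be embedded and passed to a sequence classifier, while the unlabelled windows alone are enough for the anomaly detection variant.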

(2018) Zillow's Home Value Prediction (Kaggle Competition)

Zillow's Zestimate home valuation has shaken up the U.S. real estate industry. Zillow Prize, a competition with a one-million-dollar grand prize, is challenging the data science community to help push the accuracy of the Zestimate even further. Winning algorithms stand to impact the home values of 110M homes across the U.S.

To solve this challenge, after some feature engineering, I used several regression models, including CatBoost and LGBM with a new implementation of CART based on transductive learning, and then ensembled them using a stacking framework. I ranked 71st among 3775 teams.

 

Research

Deep Learning

In many applications we have a large amount of data, and the task is to find a mapping from the inputs to the desired outputs. This requires extracting features from the input that are highly correlated with the desired outputs.

Essentially, deep learning methods are nonlinear mappings that exploit the local dependencies of the inputs and, with respect to the target, try to extract features automatically and hierarchically. The dependency sometimes appears regionally, as in an image, where we often use convolution filters; or it appears sequentially, as in text, where we often use recurrent structures (RNN, GRU, LSTM).

In some applications there are no desired outputs, and the goal is to extract a good representation of the data automatically (unsupervised learning). In these applications, deep auto-encoders are powerful models for extracting a lower-dimensional representation of the data.

I am also interested in deep generative models (VAE, GAN), both to discover the distribution of the data for anomaly detection applications and to create bridges between data of different natures (e.g. text and image).

 

Explainable Machine Learning and Probabilistic Graphical Models

In probabilistic models, everything (inputs and outputs, latent features, parameters, hyper-parameters) is a random variable. By estimating the dependencies between them, it is possible to answer inference queries about real-world application questions.

With their Bayesian nature, probabilistic models are well suited to capturing causality. Probabilistic matrix factorization methods are powerful tools for community detection and recommender system applications. Using probabilistic matrix factorization, it is possible to embed high-dimensional data in lower-dimensional spaces, which reduces the effects of noise and uncovers latent relations.

Deep Reinforcement Learning

Many applications can be described by a set of states and a set of actions. Because the environment is unknown, the solution to this kind of problem is a map (a policy) that guides the selection of the best action in each state. If there is no prior knowledge of the dynamics of the environment, learning has to proceed by experimentation, with encouragement and punishment delivered as a reward. Learning therefore depends directly on the definition of the reward. Estimating the reward function from some desired output by supervised learning is one of the interesting topics in this area.

In reinforcement learning, in order to reach a solution, each state should be assigned a value. This value is the maximum expected cumulative reward obtainable from that state. In the model-based approach, the value is estimated by a moving average over the future states. Estimating the value function as a conditional distribution over future states, given the current state, is another interesting area.
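To make the notion of a state value concrete, here is a minimal value iteration sketch for the model-based case, where the transition probabilities and rewards are assumed to be known. The tiny three-state environment, the `value_iteration` name, and the discount factor are hypothetical choices for illustration.

```python
from typing import Dict, List, Tuple

# transitions[state][action] -> list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def value_iteration(
    transitions: Transitions, gamma: float = 0.9, tol: float = 1e-10
) -> Dict[str, float]:
    """Apply the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state: value stays at 0
                continue
            # Best expected discounted return over the available actions.
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


# Made-up chain: A -> B -> goal, with a reward of 1 on reaching the goal.
toy_mdp: Transitions = {
    "A": {"go": [(1.0, "B", 0.0)], "stay": [(1.0, "A", 0.0)]},
    "B": {"go": [(1.0, "goal", 1.0)]},
    "goal": {},
}
print(value_iteration(toy_mdp, gamma=0.5))
```

With a discount of 0.5, state B is worth 1.0 and state A is worth 0.5, since the reward from A arrives one step later. Acting greedily with respect to these values gives the map from states to actions described earlier.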

Moreover, the relation between GANs and the actor-critic reinforcement learning method is another absorbing field in this area.

Teaching and Supervision

I taught a range of undergraduate and postgraduate modules in computer science and AI, such as:

PhD and master's degree: 2014-2020

  • Machine Learning
  • Deep Learning
  • Natural Language Processing
  • Mining of Massive Data Sets
  • Advanced Artificial Intelligence (Reinforcement learning and PGM)

Bachelor's degree: 2010-2020

  • Artificial Intelligence (Search, CSP, Adversarial Search, Logic Programming)
  • Foundation of Programming (C / Python)
  • Object-Oriented Programming (Java)
  • Formal Language and Automata Theory

I supervised (as lead supervisor or joint supervisor) these students

Other

TECHNICAL SKILLS

  • Programming Languages: Python, Java, C, C++, MATLAB, R, Prolog.
  • Python Stack: PyTorch, Keras, TensorFlow, Gensim, NLTK, NumPy, Pandas, scikit-learn, SciPy, Scrapy, Matplotlib, Seaborn, ...
  • Database Management: SQL
  • Writing: LaTeX, Microsoft Word, Markdown, HTML
  • Others: AWS cloud platform, Git, GitHub, Software Development, RUP, Agile.

Education / academic qualifications

  • 2014 - PhD, Artificial Intelligence (thesis: Job’s interaction theory to train hyper-parameters of the cultural optimization algorithm.)
  • 2007 - MSc, Artificial Intelligence (thesis: multi-objective optimization to train neural networks and neuro-fuzzy systems.)
  • 2005 - BSc, Computer Software Eng. (concentrations: RUP methodology, database management, SQL, object-oriented programming, algorithm design, data structures, Java, Visual C++.)

Research outputs (20)
