Main Topics
My research primarily involves machine learning and intelligence-based systems. This ranges from proposing methods for improving a solution to optimizing existing systems. Primarily, my interests lie in improving renewable energy and food-based systems for a better future. I believe machine learning and data analysis can be used to their fullest potential to benefit not only people, but the planet itself. We must respect our planet to ensure healthier crops and better overall weather conditions.
Automated Detection of Harmful Substances in Crops
For a computing topics research team, I carried out group-based survey research on food safety across the food supply chain. Because little research covers the supply chain as a whole and how its stages relate, my team aimed to fill in gaps and find relations while proposing solutions to improve safety. This matters because, as globalization increases and networks become more interconnected, keeping food safe and minimizing the spread of disease becomes more important. My main research task focused on the beginning of the supply chain: agriculture. Agriculture can be broken into crops and livestock. For crops, the three main areas of concern are soil, water, and the crops themselves. I proposed a methodology that uses soil and water sensors alongside cameras to directly monitor farms for harmful substances such as foodborne pathogens, pesticides, and toxins. The concept is to train machine learning models on data from different regions and combine them through ensemble-based stacking to improve a model's generalization and standardization. If these models can be trained on different regional data and pick up on the nuances of different farms, detection of these harmful substances should improve. At the same time, detection for soil, water, and crop-based substances would be independent of one another, allowing for a more flexible implementation.
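
To make the stacking idea concrete, here is a minimal sketch using only the Python standard library. It is an illustration, not the proposed system: the sensor readings, the per-region contamination levels, and the helper names (`fit_base_model`, `fit_meta_model`) are all hypothetical. One simple base model is fitted per region, and a small logistic-regression meta-model is then trained on the base models' outputs using held-out samples (a holdout form of stacking, sometimes called blending).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_base_model(samples):
    """Fit a one-feature base model on one region's (reading, label) pairs."""
    positives = [x for x, y in samples if y == 1]
    negatives = [x for x, y in samples if y == 0]
    # Threshold halfway between the mean reading of each class.
    threshold = (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2
    return lambda x: sigmoid(x - threshold)


def fit_meta_model(base_models, samples, epochs=3000, lr=0.5):
    """Logistic regression over base-model outputs, trained by gradient descent."""
    weights = [0.0] * len(base_models)
    bias = 0.0
    # Meta-features: each base model's predicted probability for the sample.
    rows = [([m(x) for m in base_models], y) for x, y in samples]
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for feats, y in rows:
            err = sigmoid(bias + sum(w * f for w, f in zip(weights, feats))) - y
            grad_b += err
            for k, f in enumerate(feats):
                grad_w[k] += err * f
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return lambda x: sigmoid(bias + sum(w * m(x) for w, m in zip(weights, base_models)))


# Hypothetical readings: each region flags contamination above a different level.
region_a = [(i / 2, 1 if i / 2 > 2 else 0) for i in range(20)]
region_b = [(i / 2, 1 if i / 2 > 6 else 0) for i in range(20)]

# Even-indexed samples train the base models; odd-indexed samples are held out
# to train the meta-model, so it never sees the base models' training data.
base_models = [fit_base_model(region[0::2]) for region in (region_a, region_b)]
meta_train = region_a[1::2] + region_b[1::2]
stacked = fit_meta_model(base_models, meta_train)

for reading in (0.5, 4.5, 9.5):
    print(reading, round(stacked(reading), 3))
```

In a real deployment the base models would be trained on actual regional sensor and camera data, and separate stacks could be kept for soil, water, and crops so that each can be deployed independently.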

Top Placement in a Kaggle Competition on WISDM Accelerometer Data

In a Machine Learning course, I ranked within the top two positions on the final Kaggle assessment. The target data included both raw accelerometer readings and features extracted from those readings. The goal was to predict whether each sample represented walking, jogging, moving up and down stairs, standing, or sitting. Two submissions were allowed, and it was recommended that one use a traditional machine learning approach and the other a deep learning approach. I cleaned, imputed, and normalized both sets of data.
For the feature-based readings, I ran the data through several tests with several iterations each, using random search to find the best-performing parameters. After evaluating a random forest classifier (RFC), a support vector machine (SVM), Gaussian naive Bayes (GNB), and XGBoost, I found that XGBoost performed the best, with RFC following closely behind. Choosing XGBoost led to my best results and the top score.
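
As an illustration of how random search works, the sketch below samples hyperparameter combinations at random and keeps the best-scoring one. It is not the competition code: the search space, the `toy_score` function (a stand-in for a real validation score), and the number of trials are assumptions chosen so the example runs with the standard library alone.

```python
import math
import random


def random_search(score_fn, space, n_trials, seed=0):
    """Sample n_trials random configurations and return the best one found."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter from its sampler.
        params = {name: sampler(rng) for name, sampler in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space for a gradient-boosted tree model.
space = {
    "max_depth": lambda rng: rng.randint(2, 10),
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, 0),  # log-uniform
    "n_estimators": lambda rng: rng.choice([100, 200, 400, 800]),
}


def toy_score(params):
    # Stand-in for cross-validated accuracy; peaks at depth 6, learning rate 0.1.
    depth_term = (params["max_depth"] - 6) ** 2
    rate_term = (math.log10(params["learning_rate"]) + 1) ** 2
    return -(depth_term + rate_term)


best_params, best_score = random_search(toy_score, space, n_trials=50)
print(best_params, best_score)
```

In practice `toy_score` would be replaced by a function that trains the model with the sampled parameters and returns its validation accuracy.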
For the raw signal data, I trained a 1D CNN because of its proficiency with time-sequence data. I wrote a system that trained and ran the model on different sets of shuffled training and testing data, averaging the results of each run so that configurations could be compared during hyperparameter tuning. On the provided dataset the model tended to score highly, but it reached only around 65% accuracy on the Kaggle dataset. I was unable to build a model that pushed past this point. However, after the deadline, when the final Kaggle data was integrated, the model still performed at around this accuracy, while the traditional models dropped across the board for all participants. This showed me that although the CNN did not achieve the highest score, it performed with the highest consistency.
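
The repeated shuffle-and-average evaluation described above can be sketched as follows. This is an assumed reconstruction of the general idea rather than the original system: a tiny nearest-centroid classifier on made-up one-dimensional data stands in for the CNN, and the function names and parameters are hypothetical.

```python
import random
import statistics


def repeated_shuffle_eval(data, train_and_score, n_repeats=5, test_fraction=0.2, seed=0):
    """Score a model on several shuffled train/test splits; return mean and spread."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_fraction))
        test, train = shuffled[:n_test], shuffled[n_test:]
        scores.append(train_and_score(train, test))
    return statistics.mean(scores), statistics.pstdev(scores)


def centroid_accuracy(train, test):
    """Stand-in model: classify each value by the nearest class mean."""
    centroids = {}
    for label in {y for _, y in train}:
        values = [x for x, y in train if y == label]
        centroids[label] = sum(values) / len(values)
    correct = 0
    for x, y in test:
        predicted = min(centroids, key=lambda label: abs(x - centroids[label]))
        correct += predicted == y
    return correct / len(test)


# Made-up, well-separated data: class 0 covers 0..19, class 1 covers 30..49.
data = [(i, 0) for i in range(20)] + [(i + 30, 1) for i in range(20)]

mean_acc, spread = repeated_shuffle_eval(data, centroid_accuracy)
print(mean_acc, spread)
```

Comparing the mean and spread across hyperparameter settings, rather than a single split, makes it harder for one lucky shuffle to decide which configuration looks best.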
Optimizing the Placement of Components in Hybrid Wind-Solar Farms
For my first research-based assignment, the topic was as broad as "focus on an optimization problem." As my interests lie in sustainability and the environment, I looked into different systems for improving renewable energy. This led me to hybrid wind-solar farms, which can have a variety of configurations. The main idea is to increase energy production while reducing costs. This involves deciding where to place wind and solar components based on the geography and weather of the region. Each component carries an installation fee as well as a maintenance fee. The methodology I looked into involved particle swarm optimization, genetic algorithms, and cuckoo search. I found that the cuckoo search algorithm performed the best and was the most robust approach. Further work on the problem could see improvements from dynamic yet robust algorithms and from including more regional data. It is a challenging problem, as each region receives different amounts of wind and sunlight.
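
For readers unfamiliar with cuckoo search, the sketch below shows a simplified version of the algorithm placing a single component on a unit-square site. It is illustrative only: the cost function (`site_cost`), the location of the best resource, the cabling penalty, and every parameter value are hypothetical, and a real study would use regional wind and solar data with installation and maintenance fees.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw a heavy-tailed step length using Mantegna's algorithm."""
    sigma = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, bounds, n_nests=15, n_iter=200, pa=0.25, alpha=0.01, seed=0):
    """Minimize objective over box bounds with a simplified cuckoo search."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(point):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(point, bounds)]

    nests = [random_nest() for _ in range(n_nests)]
    costs = [objective(n) for n in nests]
    n_abandon = int(pa * n_nests)
    for _ in range(n_iter):
        for i in range(n_nests):
            # Levy flight from nest i, scaled to the width of each dimension.
            candidate = clip([
                x + alpha * (hi - lo) * levy_step(rng)
                for x, (lo, hi) in zip(nests[i], bounds)
            ])
            cand_cost = objective(candidate)
            j = rng.randrange(n_nests)  # compare against a randomly chosen nest
            if cand_cost < costs[j]:
                nests[j], costs[j] = candidate, cand_cost
        # Abandon the worst nests and rebuild them at random positions.
        order = sorted(range(n_nests), key=costs.__getitem__)
        for idx in order[n_nests - n_abandon:]:
            nests[idx] = random_nest()
            costs[idx] = objective(nests[idx])
    best_idx = costs.index(min(costs))
    return nests[best_idx], costs[best_idx]


def site_cost(point):
    # Hypothetical cost: lost energy grows with distance from the best resource
    # at (0.7, 0.3); cabling cost grows with distance from a grid link at (0, 0).
    x, y = point
    return (x - 0.7) ** 2 + (y - 0.3) ** 2 + 0.1 * (x + y)


bounds = [(0.0, 1.0), (0.0, 1.0)]
best_site, best_cost = cuckoo_search(site_cost, bounds)
print(best_site, best_cost)
```

The random rebuilding of abandoned nests keeps the search exploring the whole site, while the Levy flights mix many small moves with occasional long jumps.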