How to Think About and Implement AI Models: A Personal Journey
Artificial Intelligence (AI) has become a vital tool in both professional and personal spheres, revolutionizing business operations. With the increasing accessibility of AI tools, the possibilities for creating tailored solutions to everyday challenges are vast.
In this article, I’ll share insights from my recent machine learning and data engineering project, inspired by Thu Vu Analytics, along with a framework I developed for implementing AI effectively.
The article begins by breaking down my project, followed by an in-depth discussion of the framework. I’ll illustrate how this framework was applied in my project and conclude with a list of resources that have been invaluable in my journey into machine learning. As a ‘noob’ in the field, I believe this resource list will be especially helpful for beginners like myself.
The Inspiration: Thu Vu Analytics and LLaMA
My project was sparked by a video from the YouTube channel Thu Vu Analytics, where the creator utilized a large language model, LLaMA, to categorize transaction data. Thu Vu highlighted how generative models like LLaMA can be applied to tasks that traditionally relied on rule-based methods, showcasing their versatility and power. When I first watched the video, I was struck by the potential of using such models for personal finance management. The idea of creating an AI-powered system to categorize my spending habits and generate insightful reports seemed both innovative and practical.
While I greatly respect Thu Vu, who has been an integral part of my learning journey, I realized I needed to personalize certain aspects of the project to maximize its effectiveness and sustainability. Unlike Thu Vu’s approach, which focused on a one-off task, my goal was to create a long-term personal intelligence tool. This required developing pipelines for data ingestion and processing and implementing a long-term storage solution, such as a data warehouse.
However, as I delved deeper, I quickly recognized that some changes to the project were not immediately apparent. Before exploring these changes, let’s consider the project’s details and the problem it aimed to solve.
Defining and Breaking Down the Problem
The core of my project was straightforward: I wanted to categorize my transaction data to gain a better understanding of my spending patterns. However, as with any AI project, the devil is in the details. Here’s how I approached it:
Problem Definition: The goal was to create a system that could automatically categorize transactions into predefined categories (e.g., groceries, utilities, entertainment) and generate a dashboard providing descriptive statistics about my cash flow. This would help me get a clearer picture of my financial standing at any given moment. Let’s break down the problem into manageable tasks:
- Extract transaction data from bank servers.
- Categorize the transactions into multiple classes using a fine-tuned model:
  - Create a dataset for fine-tuning.
  - Select a pre-trained model.
  - Train and evaluate the model.
- Develop a scalable pipeline and storage solution (sketched below):
  - Create Python scripts to handle data cleaning and processing.
  - Ensure the pipeline runs through the fine-tuned ML model.
  - Set up a data warehouse for long-term data storage.
  - Implement archival mechanisms to prevent data loss.
- Extract insights and create a dashboard for visualizing data.
- Maintain bug tracking sheets to address recurring issues.
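To make the breakdown more concrete, here is a minimal sketch of how the pipeline stages could be wired together in Python. The function names, the "description" column, the CSV path, and the SQLite database standing in for the data warehouse are all placeholders I'm using for illustration, not the actual project code.

```python
import sqlite3
import pandas as pd


def extract_transactions(path: str) -> pd.DataFrame:
    """Load raw transaction exports (e.g., a CSV downloaded from the bank)."""
    return pd.read_csv(path)


def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Basic cleaning: drop duplicates and normalize the (assumed) description column."""
    df = df.drop_duplicates()
    df["description"] = df["description"].str.lower().str.strip()
    return df


def categorize(df: pd.DataFrame) -> pd.DataFrame:
    """Placeholder for the fine-tuned classifier step, which comes later."""
    df["category"] = "uncategorized"
    return df


def load_to_warehouse(df: pd.DataFrame, db_path: str = "warehouse.db") -> None:
    """Append the processed batch to a local store standing in for the data warehouse."""
    with sqlite3.connect(db_path) as conn:
        df.to_sql("transactions", conn, if_exists="append", index=False)


if __name__ == "__main__":
    raw = extract_transactions("transactions.csv")
    load_to_warehouse(categorize(clean_transactions(raw)))
```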
Lastly, all of this needed to happen locally. I was uncomfortable with my data leaving my machine, as I was dealing with sensitive transactional information. Most resources I encountered utilized large models provided by OpenAI, which operate through APIs that would transmit my data outside my control.
Throughout my personal journey, I developed a framework for how to approach fine-tuning AI for specific use cases. In this article, I will focus on the second step in the process: fine-tuning a language model while walking you through the framework.
The Problem: When a Large Language Model Might Be Too Large
Remember I talked about a problem? As I began to delve into the project, I quickly realized that using a large language model like LLaMA could be overkill for my specific needs. While LLaMA is capable of analyzing long sequences of text, my task was not that complex. These models are powerful but come with significant challenges:
- Resource Intensity: Large models require substantial computational resources, making them challenging to run on personal devices without specialized hardware. The size of a model is also positively correlated with its cost: according to The Wall Street Journal, some Microsoft AI products lose the company up to $80 per user per month.
- Inconsistency in Responses: Generative models, while versatile, can sometimes produce inconsistent results, which is problematic for tasks requiring precision, like financial categorization.
- Data Requirements: If I were to train my own machine learning model, it should at least be personalized (fine-tuned). Fine-tuning or retraining may significantly increase accuracy because you provide specific data to the model. However, the LLaMA model is extremely large and data-hungry; training it would require a considerable amount of data.
These challenges led me to reconsider my approach. Through my research and implementation, I developed a framework that streamlines AI tasks like these while being cost-effective and maintaining accuracy.
The Framework for Fine-Tuning
- Start with a clear problem definition.
- Attempt to solve the problem with a truly large language model or existing solutions.
- Divide and conquer: Break down the problem and fine-tune for each sub-section.
- Find or create a ‘gold’ dataset.
- Choose a model that fits the specific needs of the task.
- Experiment with training and optimize.
Now I will walk through the framework step-by-step, providing examples to demonstrate its effectiveness.
Having clearly defined the problem, I initially attempted to use a large language model to solve it. As previously mentioned, the solutions it provided were not what I needed, which is why I developed this framework in the first place. However, it is important to keep a log of how the large model performed on these tasks; we will use it later as a baseline to compare against our own solution.
Divide and Conquer: Breaking Down Problems and Fine-Tuning AI for Each Piece
Large language models like GPTs excel in versatility, handling a wide range of tasks from generating poetry to solving complex equations. However, when it comes to specialized or repetitive tasks, these models can be inefficient and resource-intensive.
A better approach is to break down the problem into smaller, manageable parts, each suited to a specialized model. While my project only needed one model to solve the problem, that might not always be the case. Let’s consider an example to illustrate this.
Consider the AI systems used in autonomous driving vehicles. You might assume that all the sensor data is fed into one large model, which then determines and executes the appropriate action, such as turning right. However, the reality is quite different.
In practice, autonomous vehicles rely on a series of specialized models working together to solve distinct parts of the overall problem. For instance, one model might focus on computer vision to identify and classify objects using the car’s cameras. Another model might handle sensor fusion, determining the position of these objects in physical space and predicting their movement.
Applying this “divide and conquer” strategy to your own projects can lead to more efficient and accurate solutions. Break the problem down, and see if smaller, specialized models can handle the subtasks. This method is more effective than relying on a single large model for everything.
This sets up the next steps in the framework: building a gold dataset and then choosing the right model for your task. Remember that you have to do this for each subtask of the problem.
Find or Create a ‘Gold’ Dataset
When fine-tuning a model, especially in a supervised learning scenario, having high-quality labeled data is crucial. In fact, the quality of your dataset can make or break your model’s performance. If your dataset is small, it’s even more important that it be both accurate and well-balanced in terms of frequency distributions.
For small datasets, aim for 100% accuracy. Unlike large datasets, where some degree of mislabeled data might be acceptable, a small dataset must be flawless to prevent skewing the model’s understanding.
Once you have this small “gold” dataset, you can explore traditional data augmentation techniques to expand it. This might involve creating variations of existing data points or synthetically generating new ones to increase diversity and quantity.
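If you do go down the augmentation route, even simple text-level perturbations can help with short strings like transaction descriptions. Below is a minimal hand-rolled sketch; the merchant descriptions, categories, and noise tokens are made up for illustration, and heavier techniques such as back-translation or paraphrasing would slot into the same pattern.

```python
import random

# Toy labeled examples (made-up merchants and categories for illustration).
gold = [
    ("starbucks coffee #1234", "dining"),
    ("shell gas station", "transport"),
]

# Boilerplate tokens that often appear in real bank statement lines.
NOISE_TOKENS = ["pos", "debit", "card purchase", "ref 0001"]


def augment(description: str, n_variants: int = 3) -> list[str]:
    """Create simple variations: insert boilerplate tokens and vary casing."""
    variants = []
    for _ in range(n_variants):
        tokens = description.split()
        tokens.insert(random.randrange(len(tokens) + 1), random.choice(NOISE_TOKENS))
        text = " ".join(tokens)
        variants.append(text.upper() if random.random() < 0.5 else text)
    return variants


# Expand the gold dataset while keeping the original labels.
augmented = [(variant, label) for desc, label in gold for variant in augment(desc)]
print(augmented[:3])
```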
In my case, I had only six months’ worth of transaction data, so I focused on ensuring it was perfectly labeled. I used an online service provider to generate the initial labels and then manually reviewed the entire dataset to confirm its accuracy. It’s like the old saying goes: “Garbage in, garbage out.” If you feed your model inaccurate data, you’ll get poor results. So, take the time to ensure your dataset is pristine.
Choosing a Model That Fits Your Task
Experimentation is crucial when selecting the right model. Start with some quality and performance benchmarks, and run sample data through different models using tools like Hugging Face pipelines. It’s also essential to check the documentation to understand what kind of data each pretrained model was trained on. At this point, consider your own dataset as well: if it is small, be careful not to pick a huge model. You can also freeze some layers of the model to reduce the risk of overfitting.
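For example, a quick way to screen candidates is to run a few sample descriptions through a Hugging Face pipeline before committing to fine-tuning. The sketch below uses zero-shot classification as a rough first pass; the sample descriptions and labels are made up, and the checkpoint is just one commonly used option, not necessarily the one you would end up with.

```python
from transformers import pipeline

# Made-up sample descriptions and candidate categories for illustration.
samples = ["STARBUCKS COFFEE #1234", "SHELL GAS STATION 42"]
candidate_labels = ["dining", "transport", "groceries", "utilities"]

# Zero-shot classification gives a rough feel for how well a model separates
# the categories before any fine-tuning is done.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

for text in samples:
    result = classifier(text, candidate_labels)
    print(text, "->", result["labels"][0], round(result["scores"][0], 3))
```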
For my project, I chose DistilBERT, a smaller and more efficient version of BERT (Bidirectional Encoder Representations from Transformers). Here’s why it was the right choice:
- Efficiency: DistilBERT requires fewer computational resources, making it ideal for projects with shorter sequences, like transaction descriptions. This meant I could run it on my local setup, which addressed a key constraint of the project.
- Consistency: Unlike generative models, DistilBERT can be fine-tuned for specific tasks with fixed classes, ensuring consistent outputs. It handled the 21 transaction categories I needed effectively.
- Data Requirements: Given my small dataset, DistilBERT was a perfect fit. I also discovered that freezing the early layers of the model reduces the number of parameters that need training, leveraging transfer learning while requiring less data for fine-tuning (a minimal sketch of this follows the list).
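As a rough illustration, here is how loading DistilBERT for a fixed set of classes and freezing its earlier layers might look. The number of frozen blocks and the label count (21, from my category list) are choices to experiment with rather than a prescription.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NUM_CATEGORIES = 21  # the fixed set of transaction categories

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=NUM_CATEGORIES
)

# Freeze the embeddings and the first few transformer blocks so that only the
# later blocks and the classification head are updated during fine-tuning.
# Note: this reduces the number of trainable parameters, not the model's size on disk.
for param in model.distilbert.embeddings.parameters():
    param.requires_grad = False
for layer in model.distilbert.transformer.layer[:4]:  # DistilBERT has 6 blocks
    for param in layer.parameters():
        param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable:,}")
```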
Choosing DistilBERT meant sacrificing some flexibility, but the trade-off resulted in a model that was both efficient and accurate for my specific needs.
Experimentation and Optimization
Experimentation is at the heart of successful machine learning projects. Fine-tuning a model requires careful adjustments and continuous testing to ensure that it performs optimally for your specific task. There are plenty of resources on best practices for training and fine-tuning; here are the ones that helped me most:
- Start with Baselines: Before diving into complex configurations, establish a baseline performance. This baseline will serve as a reference point to measure improvements as you iterate.
- Control Your Variables: When experimenting, change one factor at a time. This could be learning rate, batch size, or the number of epochs.
- Monitor Loss and Validation Curves: The loss curve is a key indicator of your model’s training process. Track both the training and validation loss curves to ensure your model is learning correctly.
- Implement Effective Logging: Keep detailed logs of your experiments. This includes model parameters, training duration, and performance metrics (see the sketch after this list).
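As a rough illustration of the last two points, the Hugging Face Trainer can log training and validation loss at a regular cadence. The tiny dataset below is made up purely so the sketch is self-contained; the hyperparameters are starting points to adjust one at a time, not recommendations.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Tiny made-up dataset so the sketch runs end to end; a real run would use
# the gold transaction dataset instead.
texts = ["starbucks coffee", "shell gas station", "netflix subscription", "whole foods market"]
labels = [0, 1, 2, 3]

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=4
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=32)

ds = Dataset.from_dict({"text": texts, "label": labels}).map(tokenize, batched=True)
split = ds.train_test_split(test_size=0.5, seed=42)

args = TrainingArguments(
    output_dir="finetune-runs",
    num_train_epochs=1,             # tune one variable at a time
    per_device_train_batch_size=2,
    learning_rate=5e-5,
    evaluation_strategy="steps",    # newer transformers releases call this `eval_strategy`
    eval_steps=1,
    logging_steps=1,                # log training loss at the same cadence
    save_strategy="no",
)

trainer = Trainer(model=model, args=args,
                  train_dataset=split["train"], eval_dataset=split["test"])
trainer.train()
print(trainer.state.log_history[-3:])  # inspect the most recent logged metrics
```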
Remember that machine learning is an iterative process. After the initial experiments, review the results, research, and refine your approach.
Remember I told you to run your problem through a large language model and store its performance results? Now it’s time to pull those out and compare them with your own fine-tuned model. In my case, I also compared it with an online service provider called Wallet, and found great results.
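One simple way to make that comparison concrete is to score each set of predictions against the same held-out labels. The categories and predictions below are made up for illustration; in practice they would come from the gold test split, the logged LLM outputs, and the fine-tuned model.

```python
from sklearn.metrics import accuracy_score, f1_score

# Made-up ground truth and predictions standing in for the real held-out set.
true_labels     = ["dining", "transport", "groceries", "dining", "utilities"]
llm_predictions = ["dining", "groceries", "groceries", "shopping", "utilities"]
finetuned_preds = ["dining", "transport", "groceries", "dining", "utilities"]

for name, preds in [("LLM baseline", llm_predictions),
                    ("Fine-tuned DistilBERT", finetuned_preds)]:
    print(f"{name}: accuracy={accuracy_score(true_labels, preds):.2f}, "
          f"macro F1={f1_score(true_labels, preds, average='macro'):.2f}")
```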
Conclusion
As a ‘noob’ in machine learning, I did a tonne of research and reading. In this article I showed a framework that worked for me during my personal journey. Just like machine learning, creating such frameworks is an iterative process. So I would love any constructive criticism I can get on the framework, and on the project itself.
Secondly, as a noob I can also say that implementing AI models doesn’t have to be a daunting task. With the right approach and tools, you can create powerful, efficient systems tailored to your specific needs. Just believe in yourself and keep pushing the boundaries of your knowledge.
The project can be found in this GitHub repository.
Other resources that I used while implementing this project can be found here.