Leveraging AI to Create Affordable Professional Headshots

 

In this post, you will learn about several of the AI projects that Crafted has worked on, and how one of those ideas eventually led to the creation of PhotoPacks.AI, a platform that lets users leverage AI to create photo-quality professional headshots…without the price tag.

Utilizing Machine Learning

During my time as Director of Engineering at Crafted, I had the opportunity to participate in the successful delivery of many interesting projects across a variety of business domains. These projects ranged from the modernization and decomposition of legacy monolithic systems, to building an insurance company’s first consumer-facing web application, to a B2C application that memorializes loved ones who have passed away. The last of these was one of my favorite projects, due in no small part to a research spike that popped up in our backlog in early 2022:

“Research methods to identify and pull valuable data out of our large corpus of unstructured text data.”

The team thought about various strategies we could use to parse this data for useful information. Could we compare words in the corpus to a dictionary of valuable terms? Should we do something with regex? We formulated several half-baked ideas that could solve the problem, but I couldn’t help but think there might be a better way. What about machine learning (ML)? ChatGPT had yet to take the world by storm, but the AI hype train was starting to build steam nonetheless.
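To make those two baseline ideas concrete, here is a minimal sketch that combines a dictionary lookup with a regex. Everything in it is hypothetical and invented for illustration: the term list, the labels, the year pattern, and the sample sentence are not taken from the actual project.

```python
import re

# Hypothetical dictionary of "valuable" terms mapped to a label.
VALUABLE_TERMS = {
    "teacher": "OCCUPATION",
    "nurse": "OCCUPATION",
    "denver": "CITY",
}

# Hypothetical pattern for four-digit years such as 1948 or 2021.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")

# Splits text into alphabetic words (apostrophes allowed).
WORD_PATTERN = re.compile(r"[A-Za-z']+")


def extract_baseline(text):
    """Return (matched_text, label) pairs found by lookup, then by regex."""
    found = []
    # Dictionary lookup: case-insensitive match of each word.
    for match in WORD_PATTERN.finditer(text):
        label = VALUABLE_TERMS.get(match.group().lower())
        if label:
            found.append((match.group(), label))
    # Regex: anything that looks like a year.
    for match in YEAR_PATTERN.finditer(text):
        found.append((match.group(), "YEAR"))
    return found


sample = "Born in 1948, she worked as a nurse in Denver."
print(extract_baseline(sample))
```

A baseline like this only finds what is already in the list or the pattern, and it cannot use surrounding context to disambiguate a word, which is roughly why a learned model looked more promising.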

Recent deep learning advances, such as the Transformer architecture (as proposed in the Attention Is All You Need paper), were beginning to push the boundaries of natural language processing and computer vision. Given our time constraints, was recent AI technology accessible enough for our team to use it to quickly deliver value? There was enough evidence that it was mature enough for us to run a deeper experiment with deep learning.

We decided to develop a named-entity recognition (NER) system to pull important pieces of data out of our text. No off-the-shelf model existed at the time to perform the exact task we needed, so we knew we would have to train our own. We went through the familiar process of data labeling with a few tools: first one built in-house, then Mechanical Turk, and finally scale.com for the highest-quality labeling results. We then used our labeled data to train a custom NER model with spaCy. Finally, we wrapped our model with a Flask API and deployed it so we could execute real-time inference. The process of building and deploying the model was incredibly satisfying and, dare I say, fun!
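The train-then-serve flow described above might look roughly like the sketch below. It is not the project's actual code: the training sentences, the `PERSON` and `CITY` labels, the `/entities` route, and the helper names (`find_span`, `train_ner`, `create_app`) are all assumptions made for illustration, and it assumes spaCy v3 and Flask 2.x are installed. The third-party imports sit inside the functions so the data-preparation helper can be used on its own.

```python
def find_span(text, phrase, label):
    """Return a (start, end, label) character span for the first match."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in text")
    return (start, start + len(phrase), label)


# Hypothetical labeled examples, converted to spaCy's
# (text, {"entities": [(start, end, label), ...]}) shape.
_LABELED = [
    ("Maria taught school in Denver.", [("Maria", "PERSON"), ("Denver", "CITY")]),
    ("John retired to Boulder.", [("John", "PERSON"), ("Boulder", "CITY")]),
]
TRAIN_DATA = [
    (text, {"entities": [find_span(text, p, lbl) for p, lbl in spans]})
    for text, spans in _LABELED
]


def train_ner(train_data, epochs=20):
    """Train a blank English pipeline containing only an NER component."""
    import random

    import spacy  # assumed: spaCy v3
    from spacy.training import Example

    nlp = spacy.blank("en")
    ner = nlp.add_pipe("ner")
    for _, annotations in train_data:
        for _, _, label in annotations["entities"]:
            ner.add_label(label)

    examples = [
        Example.from_dict(nlp.make_doc(text), annotations)
        for text, annotations in train_data
    ]
    optimizer = nlp.initialize(lambda: examples)
    for _ in range(epochs):
        random.shuffle(examples)
        nlp.update(examples, sgd=optimizer)
    return nlp


def create_app(nlp):
    """Wrap a trained pipeline in a small Flask app for real-time inference."""
    from flask import Flask, jsonify, request  # assumed: Flask 2.x

    app = Flask(__name__)

    @app.post("/entities")  # hypothetical route name
    def entities():
        payload = request.get_json(silent=True) or {}
        doc = nlp(payload.get("text", ""))
        return jsonify(
            entities=[
                {"text": ent.text, "label": ent.label_,
                 "start": ent.start_char, "end": ent.end_char}
                for ent in doc.ents
            ]
        )

    return app
```

To try it locally, something like `create_app(train_ner(TRAIN_DATA)).run(port=8000)` would start a development server. Two sentences are nowhere near enough data for a useful model; a real training set would come from the labeling process described above.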

Shortly after our project ended, OpenAI amazed the world with its announcement of DALL-E 2, a text-to-image model, on April 6, 2022. Not long after that, on August 22, 2022, Stability AI released Stable Diffusion, an open source, latent-diffusion-based text-to-image model. On November 30, 2022, OpenAI released ChatGPT, which was widely reported to be the fastest-growing consumer application in history at the time, reaching an estimated 100 million monthly active users in about two months. The paradigm shift in computing had arrived.

Generative AI

After gaining exposure to machine learning on the Crafted project, I was incredibly enthusiastic about all things AI.  I started using Midjourney to perform image-to-image transformations on pictures of a few of my coworkers.  The results were interesting; occasionally horrifying, but nearly always funny.  Not exactly high quality, though.  I decided to play around with Stable Diffusion to see if I could achieve increased quality, likeness, and control of the image composition.  I was instantly hooked.

What is Stable Diffusion? Developed by Stability AI, Stable Diffusion is an open-source, diffusion-based image generation model. Taking text or other images as input, it can generate lifelike images that have never existed before. The model can also be fine-tuned on your own data to generate images of people, objects, and scenes that it was not initially trained on. Thanks to its permissive license, a vibrant community of developers has created an impressive ecosystem of tools around Stable Diffusion, enabling smart inpainting, pose control, and customized styling.

At this point, I decided the best thing to do would be to train a model on photos of my boss, Adam. Why? So that I could create images of him in embarrassing situations and post them to our company Slack, of course.

After numerous (like, a lot of them) iterations of fine-tuning Stable Diffusion 1.5 to produce a model based on Adam’s images, I was able to reliably produce images that were far better than the image-to-image strategy I was using with Midjourney.  I posted a few images to our Slack and the team thoroughly enjoyed it!

As funny as it was for me to be able to create these images, I figured it would be much better if everybody on the team had the power to generate images of Adam with their own text prompts.  The next step was to get this model deployed so that my coworkers could execute inference against it.  With no small amount of effort, our custom slackbot, txt2adam, was released!

It was a hit! People generated all kinds of photos of Adam in ridiculous scenarios. The story of the creation and deployment of txt2adam became the subject of a Crafted talk at Denver Startup Week: “Stable Diffusion for the Unstable Mind.”

It was evident there was something there: an idea worth pursuing. Creating and deploying custom text-to-image models, however, is incredibly laborious. But what if it wasn't? What if it could be done easily, so that people without machine learning knowledge or an understanding of the intricacies of Stable Diffusion could do this with pictures of their own friends and family?

PhotoPacks.AI

I decided to get to work on this idea. Automating the data collection, training, and deployment of models is a difficult task. I worked around the clock to build a proof of concept. Within a few weeks, I had something working…kinda. Then I spent the next couple of months refining it to improve the user experience and output quality. This proved that the idea, while challenging, was technically feasible.

To evaluate the possibility of product-market fit, I spent some time selling AI-generated pet photos on Etsy. For each order, I manually trained a custom model with the photos of the customer’s pets and sent them a collection of generated images (now that is an MVP!). After consistent sales and good feedback, I realized this was absolutely a valuable, solvable problem. Building a fleshed-out, production-ready application that solved it in a fully automated fashion, however, would require a substantial amount of effort and dedication.

It was at this point that PhotoPacks.AI was born. An LLC was created and we started selling custom AI photo packs to any users we could find. After a couple of pivots, and after narrowing our focus to creating affordable, professional headshots, orders started to roll in. They proved that the product, still in its infancy, was solving a real business need and had the legs to grow into something special.

Developing PhotoPacks.AI would not have been possible without the skills I learned while surrounded by the amazing product managers, product designers, and engineers at Crafted. The business, design, and technical problem solving at PhotoPacks.AI is imbued with the balanced team and lean development philosophies championed at Crafted.

Crafted has not slowed down on its AI development efforts. The team has developed a custom GPT and Assistant using OpenAI. More recently, they have leveraged Retrieval Augmented Generation to build a chatbot that can answer questions about Crafted based on a large corpus of text data from Crafted’s website. Crafted is building in public; check it out!
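For readers new to Retrieval Augmented Generation, the core idea is to look up the most relevant pieces of text first and then hand them to the language model alongside the question. The sketch below illustrates only that retrieval-and-prompt step, and it is not Crafted's implementation: it swaps in plain word-count cosine similarity where a real system would typically use learned embeddings and a vector store, it stops short of calling a language model, and the text chunks and function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical snippets standing in for chunks of website text.
CHUNKS = [
    "Crafted builds digital products with balanced teams.",
    "PhotoPacks.AI creates affordable professional headshots.",
    "The team gave a talk at Denver Startup Week.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine(query, Counter(tokenize(chunk))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks, k=1):
    """Assemble the prompt that would be sent to a language model."""
    context = "\n".join(retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


print(build_prompt("Who makes professional headshots?", CHUNKS))
```

Grounding the prompt in retrieved text is what lets a chatbot answer from a specific corpus rather than from whatever the model happened to learn in training.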


About the Author

Jeremy Gustine is the Founder of PhotoPacks.AI. He was previously the Director of Engineering at Crafted.
