Even though we expect side hustles to take time before breaking even, the shorter the path, the better. After all, we never really know which idea will hit the jackpot. Success often comes from unexpected avenues, so we need to approach this journey strategically. This reminds me of two concepts from decision science: multi-armed bandits and reinforcement learning. Both aim to maximize rewards by balancing two essential actions: exploration and exploitation. Let me explain.
Imagine walking into a casino filled with slot machines (commonly called "one-armed bandits"). Each machine has an unknown probability of payout—some machines might give big rewards, others may rarely pay at all. Your goal is to maximize cumulative rewards. To do so, you face a dilemma:
- Explore: Try different machines to discover which one has the best payout.
- Exploit: Stick to the machine you think will give the most rewards based on what you've learned so far.
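To make the dilemma concrete, here is a minimal sketch of one simple strategy, often called epsilon-greedy: most of the time we play the machine that looks best so far, and a small fraction of the time we pick one at random. The payout probabilities, the number of rounds, and the 10% exploration rate below are made-up values for illustration, and this is only one of several ways to balance the two actions.

```python
import random


def epsilon_greedy(payout_probs, rounds=5000, epsilon=0.1, seed=0):
    """Play a set of slot machines with an epsilon-greedy strategy.

    payout_probs: hypothetical win probability of each machine
                  (unknown to the player, used only to simulate pulls).
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    pulls = [0] * n_arms        # how many times each machine was played
    estimates = [0.0] * n_arms  # running average payout seen per machine
    total_reward = 0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: try a machine at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: play the machine with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulate the pull: reward is 1 on a win, 0 otherwise.
        reward = 1 if rng.random() < payout_probs[arm] else 0

        # Update the running average for the machine we just played.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    return pulls, estimates, total_reward


# Three machines with assumed (illustrative) win rates of 2%, 5%, and 10%.
pulls, estimates, total_reward = epsilon_greedy([0.02, 0.05, 0.10])
print("pulls per machine:", pulls)
print("estimated win rates:", [round(e, 3) for e in estimates])
print("total reward:", total_reward)
```

With epsilon set to 0 the player never explores and can get stuck on the first machine that pays anything; with epsilon set to 1 the player never uses what was learned. The interesting region is in between.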
Now think of side hustles in the same way. Each "slot machine" represents an opportunity:
- A social media channel (YouTube, TikTok, Instagram)
- An online store (Etsy, Amazon, Shopify)
- A mobile app, digital product, freelance gig, or blog
The challenge is that we don’t know in advance which opportunity has the best odds for success or how profitable it could be. Like slot machines, some might surprise us with rapid gains, while others yield slow or no returns.
Reinforcement learning (RL) takes this idea a step further. In RL, actions taken today don’t just provide rewards now; they also influence future opportunities. For example:
- A YouTube video might not perform well initially, but it builds momentum and teaches you about your audience.
- Creating an MVP (Minimum Viable Product) for a mobile app could uncover user insights that shape a more profitable version.
- Testing a new product on an online store might help you fine-tune pricing, marketing, or design.
In short, every action updates what we know about the probabilities of success and helps us make smarter decisions down the line.
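One way to picture "updating what we know" is to keep a simple belief about each opportunity's success rate and revise it after every attempt. The sketch below assumes each attempt can be scored as a success or a failure, and uses a Beta distribution as the belief, with a Thompson-sampling style pick for the next attempt. The class name, the opportunity names, and the success/failure scoring are all hypothetical choices for illustration. It is still a bandit-style model: it does not capture how one action changes future options the way full reinforcement learning would.

```python
import random


class BetaBelief:
    """Belief about one opportunity's success rate.

    Modeled as Beta(successes + 1, failures + 1), i.e. starting from
    a uniform prior (an assumption made for this sketch).
    """

    def __init__(self):
        self.successes = 0
        self.failures = 0

    def update(self, succeeded):
        # Each attempt shifts the belief a little.
        if succeeded:
            self.successes += 1
        else:
            self.failures += 1

    def mean(self):
        # Expected success rate under the current belief.
        return (self.successes + 1) / (self.successes + self.failures + 2)

    def sample(self, rng):
        # Draw one plausible success rate from the current belief.
        return rng.betavariate(self.successes + 1, self.failures + 1)


def thompson_pick(beliefs, rng):
    """Pick the opportunity whose sampled success rate is highest."""
    draws = {name: belief.sample(rng) for name, belief in beliefs.items()}
    return max(draws, key=draws.get)


rng = random.Random(0)
# Hypothetical opportunities, all starting with no evidence.
beliefs = {"youtube": BetaBelief(), "etsy": BetaBelief(), "app": BetaBelief()}

beliefs["youtube"].update(True)   # one attempt that worked
beliefs["etsy"].update(False)     # one attempt that did not

next_try = thompson_pick(beliefs, rng)
print("next opportunity to try:", next_try)
```

Because the pick is based on a random draw from each belief, untested opportunities still get chosen now and then, while ones with a good track record get chosen more often. That is the same explore-versus-exploit balance, handled by the uncertainty itself.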
The beauty of this process lies in its experimental nature. There is no single “right” or “wrong” path. Each attempt is a learning opportunity that feeds into smarter future decisions. For me, this is where the excitement truly lies: testing ideas, embracing uncertainty, and discovering hidden opportunities along the way.
Let’s enjoy the process. Let’s play the game. And who knows? One of these “machines” might just surprise us.
